[GitHub] [drill] dobesv opened a new pull request #1977: DRILL-7573: Support htpasswd based authentication

2020-02-10 Thread GitBox
dobesv opened a new pull request #1977: DRILL-7573: Support htpasswd based 
authentication
URL: https://github.com/apache/drill/pull/1977
 
 
   # [DRILL-7573](https://issues.apache.org/jira/browse/DRILL-7573): Support 
htpasswd based authentication
   
   ## Description
   
   This allows you to specify `htpasswd` as your authentication implementation. 
 In this case, users will be authenticated using usernames and password taken 
from a text file in `htpasswd` format.
   
   This gives some more flexibility compared to the PAM authenticator.  For 
example, in docker / kubernetes you can mount a folder with an htpasswd file 
and update that file when you want to add/remove users, without any concern 
about interfering with the contents of /etc/passwd and /etc/shadow.
   
   ## Documentation
   
   # Using a password file for authentication
   
   Apache Drill allows you to store valid usernames and passwords in a text 
file in the popular "htpasswd" format.
   
   This can be more convenient than using PAM in containerized environments, 
because you do not have to modify any system files like `passwd`, `shadow`, or 
files in `pam.d`.  Instead you can mount a volume with the `htpasswd` file in 
it and tell drill to use that file for authentication.
   
   To configure this feature:
   
   1. Create an htpasswd file and copy/mount it to/on the drillbit 
machines/containers:
   
   $ htpasswd /path/to/htpasswd $USER
   
   2. Add the following configuration to the `drill.exec` block in the 
`/conf/drill-override.conf` file:  
 
 drill.exec: {
   security.auth.mechanisms : ["PLAIN"],
   security.user.auth: {
 enabled: true,
 packages += "org.apache.drill.exec.rpc.user.security",
 impl: "htpasswd",
 htpasswd: { file: "/path/to/htpasswd" }
   }
 }
   
   3. Restart the drillbit(s)
   4. Now you must use a username/password from the `htpasswd` file when 
logging into Drill
   
   Note: Currently the `crypt` and `bcrypt` algorithms are not supported, you 
should probably use the  MD5 hashing algorithm used by default by the 
`htpasswd` command.
   
   ## Testing
   
   I created an `htpasswd` file using `htpasswd`, configured the auth mechanism 
as shown above, and testing logging in with both valid and invalid passwords 
with MD5, SHA-1, and plantext password hashes in the `htpasswd` files.
   
   No automated tests so far, but I'm open to advice on how/where to add them.  
Still very new to the code base.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] cgivre commented on issue #1976: [SECURITY] Use HTTPS to resolve dependencies in Maven Build

2020-02-10 Thread GitBox
cgivre commented on issue #1976: [SECURITY] Use HTTPS to resolve dependencies 
in Maven Build
URL: https://github.com/apache/drill/pull/1976#issuecomment-584436772
 
 
   Thanks for the contribution.  This should be an easy contribution however, 
Drill's procedure for PRs is to first create a JIRA issue.  Could you please:
   1.   Create a JIRA issue (https://issues.apache.org)
   2.   Include the issue number in the title and commit message
   
   Thank you very much for the contribution.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] JLLeitschuh opened a new pull request #1976: [SECURITY] Use HTTPS to resolve dependencies in Maven Build

2020-02-10 Thread GitBox
JLLeitschuh opened a new pull request #1976: [SECURITY] Use HTTPS to resolve 
dependencies in Maven Build
URL: https://github.com/apache/drill/pull/1976
 
 
   
[![mitm_build](https://user-images.githubusercontent.com/1323708/59226671-90645200-8ba1-11e9-8ab3-39292bef99e9.jpeg)](https://medium.com/@jonathan.leitschuh/want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-1fc329d898fb?source=friends_link=3c99970c55a899ad9ef41f126efcde0e)
   
   - [Want to take over the Java ecosystem? All you need is a 
MITM!](https://medium.com/@jonathan.leitschuh/want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-1fc329d898fb?source=friends_link=3c99970c55a899ad9ef41f126efcde0e)
   - [Update: Want to take over the Java ecosystem? All you need is a 
MITM!](https://medium.com/bugbountywriteup/update-want-to-take-over-the-java-ecosystem-all-you-need-is-a-mitm-d069d253fe23?source=friends_link=8c8e52a7d57b98d0b7e541665688b454)
   
   ---
   
   This is a security fix for a  vulnerability in your [Apache 
Maven](https://maven.apache.org/) `pom.xml` file(s).
   
   The build files indicate that this project is resolving dependencies over 
HTTP instead of HTTPS.
   This leaves your build vulnerable to allowing a [Man in the 
Middle](https://en.wikipedia.org/wiki/Man-in-the-middle_attack) (MITM) 
attackers to execute arbitrary code on your or your computer or CI/CD system.
   
   This vulnerability has a CVSS v3.0 Base Score of 
[8.1/10](https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H).
   
   [POC 
code](https://max.computer/blog/how-to-take-over-the-computer-of-any-java-or-clojure-or-scala-developer/)
 has existed since 2014 to maliciously compromise a JAR file in-flight.
   MITM attacks against HTTP are [increasingly 
common](https://security.stackexchange.com/a/12050), for example [Comcast is 
known to have done it to their own 
users](https://thenextweb.com/insights/2017/12/11/comcast-continues-to-inject-its-own-code-into-websites-you-visit/#).
   
   This contribution is a part of a submission to the [GitHub Security 
Lab](https://securitylab.github.com/) Bug Bounty program.
   
   ## Detecting this and Future Vulnerabilities
   
   This vulnerability was automatically detected by 
[LGTM.com](https://lgtm.com) using this [CodeQL 
Query](https://lgtm.com/rules/155648721/).
   
   As of September 2019 LGTM.com and Semmle are [officially a part of 
GitHub](https://github.blog/2019-09-18-github-welcomes-semmle/).
   
   You can automatically detect future vulnerabilities like this by enabling 
the free (for open-source) [LGTM App](https://github.com/marketplace/lgtm).
   
   I'm not an employee of GitHub nor of Semmle, I'm simply a user of 
[LGTM.com](https://lgtm.com) and an open-source security researcher.
   
   ## Source
   
   Yes, this contribution was automatically generated, however, the code to 
generate this PR was lovingly hand crafted to bring this security fix to your 
repository.
   
   The source code that generated and submitted this PR can be found here:
   
[JLLeitschuh/bulk-security-pr-generator](https://github.com/JLLeitschuh/bulk-security-pr-generator)
   
   ## Opting-Out
   
   If you'd like to opt-out of future automated security vulnerability fixes 
like this, please consider adding a file called
   `.github/GH-ROBOTS.txt` to your repository with the line:
   
   ```
   User-agent: JLLeitschuh/bulk-security-pr-generator
   Disallow: *
   ```
   
   This bot will respect the [ROBOTS.txt](https://moz.com/learn/seo/robotstxt) 
format for future contributions.
   
   Alternatively, if this project is no longer actively maintained, consider 
[archiving](https://help.github.com/en/github/creating-cloning-and-archiving-repositories/about-archiving-repositories)
 the repository.
   
   ## CLA Requirements
   
   _This section is only relevant if your project requires contributors to sign 
a Contributor License Agreement (CLA) for external contributions._
   
   It is unlikely that I'll be able to directly sign CLAs. However, all 
contributed commits are already automatically signed-off.
   
   > The meaning of a signoff depends on the project, but it typically 
certifies that committer has the rights to submit this work under the same 
license and agrees to a Developer Certificate of Origin 
   > (see 
[https://developercertificate.org/](https://developercertificate.org/) for more 
information).
   >
   > \- [Git Commit Signoff documentation](https://developercertificate.org/)
   
   If signing your organization's CLA is a strict-requirement for merging this 
contribution, please feel free to close this PR.
   
   ## Tracking
   
   All PR's generated as part of this fix are tracked here: 
   https://github.com/JLLeitschuh/bulk-security-pr-generator/issues/2


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL 

[GitHub] [drill] cgivre commented on issue #1973: DRILL-7575: Fix typo in class name of FormSecurityHandler

2020-02-10 Thread GitBox
cgivre commented on issue #1973: DRILL-7575: Fix typo in class name of 
FormSecurityHandler
URL: https://github.com/apache/drill/pull/1973#issuecomment-584256826
 
 
   LGTM +1


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] dobesv commented on issue #1973: DRILL-7575: Fix typo in class name of FormSecurityHandler

2020-02-10 Thread GitBox
dobesv commented on issue #1973: DRILL-7575: Fix typo in class name of 
FormSecurityHandler
URL: https://github.com/apache/drill/pull/1973#issuecomment-584238274
 
 
   OK, done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] dobesv commented on issue #1972: DRILL-7562: Support HTTP Basic authentication for REST API calls

2020-02-10 Thread GitBox
dobesv commented on issue #1972: DRILL-7562: Support HTTP Basic authentication 
for REST API calls
URL: https://github.com/apache/drill/pull/1972#issuecomment-584238503
 
 
   I just updated the commit message to include the JIRA issue.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376942618
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/WriterRecordBatch.java
 ##
 @@ -152,11 +154,18 @@ private void addOutputContainerData() {
 container.setRecordCount(1);
   }
 
-  protected void setupNewSchema() throws IOException {
+  protected void setupNewSchema() {
 try {
   // update the schema in RecordWriter
   stats.startSetup();
-  recordWriter.updateSchema(incoming);
+  try {
+recordWriter.updateSchema(incoming);
+  } catch (IOException e) {
+// TODO: This is better handled inside updateSchema()
 
 Review comment:
   same question as above


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376999100
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/orderedpartitioner/OrderedPartitionRecordBatch.java
 ##
 @@ -87,6 +87,8 @@
  * value is determined by where each record falls in the partition table. This
  * column is used by PartitionSenderRootExec to determine which bucket to 
assign
  * each record to.
+ * 
+ * This code is not used.
 
 Review comment:
   if the class is never used why don't just delete it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r377003326
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/FrameSupportTemplate.java
 ##
 @@ -175,7 +177,7 @@ private int processROWS(int row) throws DrillException {
 return row;
   }
 
-  private int processRANGE(int row) throws DrillException {
+  private int processRANGE(int row) throws SchemaChangeException {
 
 Review comment:
   ```suggestion
 private int processRange(int row) throws SchemaChangeException {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376908070
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/RootExec.java
 ##
 @@ -20,34 +20,38 @@
 import org.apache.drill.exec.proto.ExecProtos.FragmentHandle;
 
 /**
- * Functionality
- * 
- *   A FragmentRoot is a node which is the last processing node in a query 
plan. FragmentTerminals include Exchange
- *   output nodes and storage nodes.  They are there driving force behind the 
completion of a query.
- * 
- * Assumptions
- * 
- *   All implementations of {@link RootExec} assume that all their methods are 
called by the same thread.
+ * Node which is the last processing node in a query plan. FragmentTerminals
+ * include Exchange output nodes and storage nodes. They are there driving 
force
+ * behind the completion of a query.
  * 
+ * Assumes that all implementations of {@link RootExec} assume that all their
+ * methods are called by the same thread.
  */
 public interface RootExec extends AutoCloseable {
+
   /**
* Do the next batch of work.
-   * @return Whether or not additional batches of work are necessary. False 
means that this fragment is done.
+   *
+   * @return Whether or not additional batches of work are necessary. False
+   * means that this fragment is done.
*/
   boolean next();
 
   /**
-   * Inform sender that receiving fragment is finished and doesn't need any 
more data. This can be called multiple
-   * times (once for each downstream receiver). If all receivers are finished 
then a subsequent call to {@link #next()}
-   * will return false.
-   * @param handle The handle pointing to the downstream receiver that does 
not need anymore data.
+   * Inform sender that receiving fragment is finished and doesn't need any 
more
+   * data. This can be called multiple times (once for each downstream
+   * receiver). If all receivers are finished then a subsequent call to
+   * {@link #next()} will return false.
+   *
+   * @param handle
+   *  The handle pointing to the downstream receiver that does not need
+   *  anymore data.
*/
   void receivingFragmentFinished(FragmentHandle handle);
 
   /**
-   * Dump failed batches' state preceded by its parent's state to logs. 
Invoked when there is a
-   * failure during fragment execution.
+   * Dump failed batches' state preceded by its parent's state to logs. Invoked
+   * when there is a failure during fragment execution.
 
 Review comment:
   Please add ```@param``` section to the JavaDoc describing where the trowable 
came from. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376903426
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/BaseRootExec.java
 ##
 @@ -184,4 +179,38 @@ public void close() throws Exception {
   }
 }
   }
+
+  /**
+   * Given a list of operators and a stack trace, walks the stack trace and
+   * the operator list to find the leaf-most operator, which is the one
+   * that was active when the exception was thrown. Handle the cases in
+   * which no operator was active, each operator had multiple methods on
+   * the stack, or the exception was thrown in some class called by
+   * the operator.
+   * 
+   * Not all operators leave a mark in the trace. In particular if a the
+   * call stack is only through base-class methods, then we have no way to
+   * know the actual class during the call. This is OK because the leaf
+   * methods are just pass-through operations, they are unlikely to fail.
+   *
+   * @param  the type of the operator. Parameterized to allow easier
+   * testing
+   * @param dag the list of operators from root-most to leaf-most
+   * @param e the exception thrown somewhere in the operator tree
+   * @return the leaf-most operator, if any
+   */
+  public static  T findLeaf(List dag, Throwable e) {
+StackTraceElement[] trace = e.getStackTrace();
 
 Review comment:
   Minor comment: here we could collect a ```Set``` of class name strings 
ignoring those names which don't start with ```org.apache.drill``` package 
name.   Like: 
   ```java
   Set drillStackTrace = Stream.of(trace)
   .map(StackTraceElement::getClassName)
   .filter(cn -> cn.startsWith(DRILL_PACKAGE_NAME))
   .collect(Collectors.toSet());
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376919500
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScreenCreator.java
 ##
 @@ -83,47 +83,47 @@ public boolean innerNext() {
   IterOutcome outcome = next(incoming);
   logger.trace("Screen Outcome {}", outcome);
   switch (outcome) {
-  case STOP:
-return false;
-  case NONE:
-if (firstBatch) {
-  // this is the only data message sent to the client and may contain 
the schema
-  QueryWritableBatch batch;
-  QueryData header = QueryData.newBuilder()
-.setQueryId(context.getHandle().getQueryId())
-.setRowCount(0)
-.setDef(RecordBatchDef.getDefaultInstance())
-.build();
-  batch = new QueryWritableBatch(header);
+case STOP:
+  return false;
+case NONE:
+  if (firstBatch) {
+// this is the only data message sent to the client and may 
contain the schema
+QueryWritableBatch batch;
+QueryData header = QueryData.newBuilder()
+  .setQueryId(context.getHandle().getQueryId())
+  .setRowCount(0)
+  .setDef(RecordBatchDef.getDefaultInstance())
+  .build();
+batch = new QueryWritableBatch(header);
+
+stats.startWait();
+try {
+  userConnection.sendData(batch);
+} finally {
+  stats.stopWait();
+}
+firstBatch = false; // we don't really need to set this. But who 
knows!
+  }
 
+  return false;
+case OK_NEW_SCHEMA:
+  materializer = new VectorRecordMaterializer(context, oContext, 
incoming);
+  //$FALL-THROUGH$
+case OK:
+  injector.injectPause(context.getExecutionControls(), "sending-data", 
logger);
+  final QueryWritableBatch batch = materializer.convertNext();
+  updateStats(batch);
   stats.startWait();
   try {
 userConnection.sendData(batch);
   } finally {
 stats.stopWait();
   }
-  firstBatch = false; // we don't really need to set this. But who 
knows!
-}
-
-return false;
-  case OK_NEW_SCHEMA:
-materializer = new VectorRecordMaterializer(context, oContext, 
incoming);
-//$FALL-THROUGH$
-  case OK:
-injector.injectPause(context.getExecutionControls(), "sending-data", 
logger);
-final QueryWritableBatch batch = materializer.convertNext();
-updateStats(batch);
-stats.startWait();
-try {
-  userConnection.sendData(batch);
-} finally {
-  stats.stopWait();
-}
-firstBatch = false;
-
-return true;
-  default:
-throw new UnsupportedOperationException();
+  firstBatch = false;
+
+  return true;
+default:
+  throw new UnsupportedOperationException();
 
 Review comment:
   Here would be nice to add message containing ```outcome```  which caused the 
exception.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376951117
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/filter/RuntimeFilterBatchCreator.java
 ##
 @@ -29,7 +29,9 @@
 
 public class RuntimeFilterBatchCreator implements 
BatchCreator{
 
 Review comment:
   ```suggestion
   public class RuntimeFilterBatchCreator implements 
BatchCreator {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376947666
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/SpilledRecordBatch.java
 ##
 @@ -163,11 +162,12 @@ public IterOutcome next() {
 container = vas.get();
   }
 } catch (IOException e) {
-  lastOutcome = IterOutcome.STOP;
-  throw UserException.dataReadError(e).addContext("Failed reading from a 
spill file").build(logger);
+  throw UserException.dataReadError(e)
+  .addContext("Failed reading from a spill file")
+  .build(logger);
 } catch (Exception e) {
-  lastOutcome = IterOutcome.STOP;
-  throw e;
+  // TODO: Catch the error closer to the cause and create a better error 
message.
 
 Review comment:
can we avoid adding new todos?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376951049
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/filter/RuntimeFilterBatchCreator.java
 ##
 @@ -29,7 +29,9 @@
 
 public class RuntimeFilterBatchCreator implements 
BatchCreator{
   @Override
-  public CloseableRecordBatch getBatch(ExecutorFragmentContext context, 
RuntimeFilterPOP config, List children) throws 
ExecutionSetupException {
+  public CloseableRecordBatch getBatch(ExecutorFragmentContext context,
+   RuntimeFilterPOP config, List children)
+   throws ExecutionSetupException {
 
 Review comment:
   ```java
 public CloseableRecordBatch getBatch(ExecutorFragmentContext context, 
RuntimeFilterPOP config,
  List children) throws 
ExecutionSetupException {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376945843
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java
 ##
 @@ -340,16 +340,12 @@ public IterOutcome innerNext() {
   }
 
 case UPDATE_AGGREGATOR:
-  context.getExecutorState().fail(UserException.unsupportedError()
+  throw UserException.unsupportedError()
   .message(SchemaChangeException.schemaChanged(
   "Hash aggregate does not support schema change",
   incomingSchema,
   incoming.getSchema()).getMessage())
-  .build(logger));
-  close();
-  killIncoming(false);
-  firstBatch = false;
-  return IterOutcome.STOP;
+  .build(logger);
 
 Review comment:
   could you please explain why ```close()``` and ```killIncoming(false)``` 
invocations were removed for the case?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376923043
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/StatisticsWriterRecordBatch.java
 ##
 @@ -91,41 +93,47 @@ public IterOutcome innerNext() {
 }
 // process the complete upstream in one next() call
 IterOutcome upstream;
-try {
-  do {
-upstream = next(incoming);
-
-switch(upstream) {
-  case STOP:
-return upstream;
-
-  case NOT_YET:
-  case NONE:
-break;
-
-  case OK_NEW_SCHEMA:
-setupNewSchema();
-// $FALL-THROUGH$
-  case OK:
+do {
+  upstream = next(incoming);
+
+  switch(upstream) {
+case STOP:
+  return upstream;
+
+case NOT_YET:
+case NONE:
+  break;
+
+case OK_NEW_SCHEMA:
+  setupNewSchema();
+  // $FALL-THROUGH$
+case OK:
+  try {
 counter += 
statsRecordWriterImpl.writeStatistics(incoming.getRecordCount());
-logger.debug("Total records written so far: {}", counter);
-
-for(final VectorWrapper v : incoming) {
-  v.getValueVector().clear();
-}
-break;
-
-  default:
-throw new UnsupportedOperationException();
-}
-  } while(upstream != IterOutcome.NONE);
-  // Flush blocking writers now
+  } catch (IOException e) {
+// TODO: Better handled inside the write() method.
 
 Review comment:
   such todos usually live forever in the code, could you please improve the 
```StatisticsRecordWriterImpl.writeStatistics``` in this PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r377003222
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/FrameSupportTemplate.java
 ##
 @@ -156,7 +158,7 @@ private int processPartition(final int currentRow) throws 
DrillException {
 }
   }
 
-  private int processROWS(int row) throws DrillException {
+  private int processROWS(int row) throws SchemaChangeException {
 
 Review comment:
   ```suggestion
 private int processRows(int row) throws SchemaChangeException {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376926556
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/TopN/PriorityQueueTemplate.java
 ##
 @@ -142,11 +143,7 @@ public void add(RecordBatchData batch) throws 
SchemaChangeException{
   public void generate() {
 Stopwatch watch = Stopwatch.createStarted();
 final DrillBuf drillBuf = allocator.buffer(4 * queueSize);
-try {
-  finalSv4 = new SelectionVector4(drillBuf, queueSize, 4000);
-} catch (SchemaChangeException e) {
-  throw AbstractRecordBatch.schemaChangeException(e, "Priority Queue", 
logger);
-}
+finalSv4 = new SelectionVector4(drillBuf, queueSize, 4000);
 
 Review comment:
   please convert magic number to constant. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376942403
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/WriterRecordBatch.java
 ##
 @@ -77,54 +80,53 @@ public BatchSchema getSchema() {
   @Override
   public IterOutcome innerNext() {
 if (processed) {
-//  cleanup();
   // if the upstream record batch is already processed and next() is 
called by
   // downstream then return NONE to indicate completion
   return IterOutcome.NONE;
 }
 
 // process the complete upstream in one next() call
 IterOutcome upstream;
-try {
-  do {
-upstream = next(incoming);
-
-switch(upstream) {
-  case STOP:
-return upstream;
-
-  case NOT_YET:
+do {
+  upstream = next(incoming);
+
+  switch(upstream) {
+case STOP:
+  return upstream;
+
+case NOT_YET:
+  break;
+case NONE:
+  if (schema != null) {
+// Schema is for the output batch schema which is setup in 
setupNewSchema(). Since the output
+// schema is fixed ((Fragment(VARCHAR), Number of records written 
(BIGINT)) we should set it
+// up even with 0 records for it to be reported back to the client.
 break;
-  case NONE:
-if (schema != null) {
-  // Schema is for the output batch schema which is setup in 
setupNewSchema(). Since the output
-  // schema is fixed ((Fragment(VARCHAR), Number of records 
written (BIGINT)) we should set it
-  // up even with 0 records for it to be reported back to the 
client.
-  break;
-}
-
-  case OK_NEW_SCHEMA:
-setupNewSchema();
-// $FALL-THROUGH$
-  case OK:
-counter += eventBasedRecordWriter.write(incoming.getRecordCount());
-logger.debug("Total records written so far: {}", counter);
+  }
 
-for(final VectorWrapper v : incoming) {
-  v.getValueVector().clear();
-}
-break;
-
-  default:
-throw new UnsupportedOperationException();
-}
-  } while(upstream != IterOutcome.NONE);
-} catch(IOException ex) {
-  logger.error("Failure during query", ex);
-  kill(false);
-  context.getExecutorState().fail(ex);
-  return IterOutcome.STOP;
-}
+case OK_NEW_SCHEMA:
+  setupNewSchema();
+  // $FALL-THROUGH$
+case OK:
+  try {
+counter += eventBasedRecordWriter.write(incoming.getRecordCount());
+  } catch (IOException e) {
+// TODO: Better handled inside the write() method.
 
 Review comment:
   can we avoid leaving new todos in the code?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376890243
 
 

 ##
 File path: 
common/src/main/java/org/apache/drill/common/exceptions/DrillRuntimeException.java
 ##
 @@ -17,8 +17,11 @@
  */
 package org.apache.drill.common.exceptions;
 
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
 public class DrillRuntimeException extends RuntimeException {
-  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DrillRuntimeException.class);
+  private static final Logger logger = 
LoggerFactory.getLogger(DrillRuntimeException.class);
 
 Review comment:
   Seems like the logger is unused and may be safely removed. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376924072
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/StatisticsWriterRecordBatch.java
 ##
 @@ -154,11 +162,18 @@ private void addOutputContainerData() {
 container.setRecordCount(1);
   }
 
-  protected void setupNewSchema() throws IOException {
+  protected void setupNewSchema() {
 try {
   // update the schema in RecordWriter
   stats.startSetup();
-  recordWriter.updateSchema(incoming);
+  try {
+recordWriter.updateSchema(incoming);
+  } catch (IOException e) {
+// TODO: This is better handled inside updateSchema()
 
 Review comment:
   I would like to ask for fixing it, thank you in advance.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376954034
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/filter/RuntimeFilterRecordBatch.java
 ##
 @@ -196,20 +194,24 @@ private void setupHashHelper() {
   ValueVectorReadExpression toHashFieldExp = new 
ValueVectorReadExpression(typedFieldId);
   hashFieldExps.add(toHashFieldExp);
 }
-hash64 = hashHelper.getHash64(hashFieldExps.toArray(new 
LogicalExpression[hashFieldExps.size()]), typedFieldIds.toArray(new 
TypedFieldId[typedFieldIds.size()]));
+hash64 = hashHelper.getHash64(hashFieldExps.toArray(
+new LogicalExpression[hashFieldExps.size()]),
+typedFieldIds.toArray(new TypedFieldId[typedFieldIds.size()]));
   } catch (Exception e) {
 throw UserException.internalError(e).build(logger);
   }
 }
   }
 
   /**
-   * If RuntimeFilter is available then applies the filter condition on the 
incoming batch records and creates an SV2
-   * to store indexes which passes the filter condition. In case when 
RuntimeFilter is not available it just pass
+   * If RuntimeFilter is available then applies the filter condition on the
+   * incoming batch records and creates an SV2 to store indexes which passes 
the
+   * filter condition. In case when RuntimeFilter is not available it just pass
* through all the records from incoming batch to downstream.
+   *
* @throws SchemaChangeException
 
 Review comment:
   ```suggestion
   
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for operator errors

2020-02-10 Thread GitBox
ihuzenko commented on a change in pull request #1975: DRILL-7576: Fail fast for 
operator errors
URL: https://github.com/apache/drill/pull/1975#discussion_r376912149
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java
 ##
 @@ -212,13 +213,24 @@ private boolean shouldContinueAfterNoRecords() throws 
Exception {
   }
   return true;
 } else { // Regular scan
-  currentReader.close();
-  currentReader = null;
+  closeCurrentReader();
   return true; // In regular case, we always continue the iteration, if no 
more reader, we will break out at the head of loop
 }
   }
 
-  private IterOutcome internalNext() throws Exception {
+  private void closeCurrentReader() {
+try {
+  if (currentReader != null) {
 
 Review comment:
   I would suggest putting whole ```try-catch-finally``` inside the if, or 
using our ```Autocloseables``` like: 
   ```java
   AutoCloseables.closeSilently(currentReader);
   currentReader = null;
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services