[GitHub] drill pull request #911: - DRILL-5507 Made verbose info logging message debu...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/911#discussion_r133838389 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/schedule/BlockMapBuilder.java --- @@ -246,12 +249,16 @@ public EndpointByteMap getEndpointByteMap(FileWork work) throws IOException { DrillbitEndpoint endpoint = getDrillBitEndpoint(host); if (endpoint != null) { endpointByteMap.add(endpoint, bytes); -} else { - logger.info("Failure finding Drillbit running on host {}. Skipping affinity to that host.", host); +} else if (logger.isDebugEnabled()) { --- End diff -- and `} else if (noDrillbitHosts != null && noDrillbitHosts.add(host)) { logger.debug(...); }` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] drill pull request #911: - DRILL-5507 Made verbose info logging message debu...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/911#discussion_r133838104 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/schedule/BlockMapBuilder.java --- @@ -228,6 +230,7 @@ public EndpointByteMap getEndpointByteMap(FileWork work) throws IOException { // Find submap of ranges that intersect with the rowGroup ImmutableRangeMap<Long,BlockLocation> subRangeMap = blockMap.subRangeMap(rowGroupRange); +Set noDrillbitHosts = Sets.newHashSet(); --- End diff -- Consider `final Set noDrillbitHosts = logger.isDebugEnabled() ? Sets.newHashSet() : null;` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] drill pull request #930: DRILL-5761: Disable Lilith ClassicMultiplexSocketAp...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/930#discussion_r138722693 --- Diff: common/src/test/resources/logback-test.xml --- @@ -0,0 +1,92 @@ (quoted logback-test.xml fragment; the XML tags were stripped by the archive. Remaining values: Lilith appender settings `true`, `1`, `true`, `${LILITH_HOSTNAME:-localhost}` and the console pattern `%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n`) ---
[GitHub] drill issue #926: DRILL-5269 Make DirectSubScan Jackson JSON deserializable
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/926 rebased and squashed ---
[GitHub] drill issue #926: DRILL-5269 Make DirectSubScan Jackson JSON deserializable
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/926 @paul-rogers Added a unit test and adapted the query to use an existing parquet file. ---
[GitHub] drill issue #977: DRILL-5849: Add freemarker lib to dependencyManagement to ...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/977 The proposed changes simplify the dependency management, but do not change the dependency resolution, AFAIK. ---
[GitHub] drill issue #977: DRILL-5849: Add freemarker lib to dependencyManagement to ...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/977 @arina-ielchiieva I am referring to the java-exec and fmpp modules' dependency resolution. Both, AFAIK, should not be affected by the change, so I suppose it is another module that has a transitive dependency on freemarker through java-exec and another 3rd party library that has the issue described in the JIRA. Please confirm and provide details on the module and the 3rd party library. ---
[GitHub] drill pull request #991: DRILL-5876: Remove netty-tcnative dependency from j...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/991#discussion_r144697258 --- Diff: exec/java-exec/pom.xml --- @@ -701,18 +707,21 @@ - -kr.motd.maven -os-maven-plugin -1.5.0.Final - - + --- End diff -- Looking at the netty code, it seems to use `Class.forName("io.netty.internal.tcnative.SSL", false, OpenSsl.class.getClassLoader());` ([see OpenSsl](https://github.com/netty/netty/blob/4.1/handler/src/main/java/io/netty/handler/ssl/OpenSsl.java)), so if all netty-tcnative OS dependent jars are on the classpath, it should load the first one on the classpath (if delegated to the system classloader) or the one that the OpenSsl classloader finds. My understanding is that the OpenSsl class is loaded by the system classloader (I may be wrong), but in this case, having all variants of netty-tcnative on the classpath will not resolve the issue, as netty will try to load the "io.netty.internal.tcnative.SSL" class from only one of the jars and, if it happens to be the wrong jar, will disable OpenSsl functionality. ---
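The classloading behavior described in this comment can be probed with a small standalone snippet; `java.util.ArrayList` stands in for `io.netty.internal.tcnative.SSL` (which is not on an ordinary classpath), so this is a sketch of the call shape rather than of netty itself:

```java
// Demonstrates the probe discussed above: Class.forName with
// initialize=false resolves a class through a specific loader without
// running its static initializers. The netty class name is real, but it
// is expected to be absent here, which mirrors how OpenSsl support ends
// up disabled when the probe cannot find the tcnative classes.
public class ForNameProbe {
    public static void main(String[] args) throws Exception {
        ClassLoader cl = ForNameProbe.class.getClassLoader();

        // Same call shape as netty's availability check; ArrayList is a
        // stand-in that is guaranteed to be loadable.
        Class<?> c = Class.forName("java.util.ArrayList", false, cl);
        System.out.println(c.getName());

        // A class absent from the classpath makes the probe fail.
        try {
            Class.forName("io.netty.internal.tcnative.SSL", false, cl);
            System.out.println("ssl-found");
        } catch (ClassNotFoundException e) {
            System.out.println("ssl-missing");
        }
    }
}
```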
[GitHub] drill pull request #991: DRILL-5876: Remove netty-tcnative dependency from j...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/991#discussion_r144659314 --- Diff: exec/java-exec/pom.xml --- @@ -701,18 +707,21 @@ - -kr.motd.maven -os-maven-plugin -1.5.0.Final - - + --- End diff -- For example `-P apache,openssl'. Another option will be to enable openssl profile based on a property. ---
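A minimal sketch of what such an opt-in profile might look like in `exec/java-exec/pom.xml`. The groupId, artifactId, and version appear elsewhere in this thread; the activation property name and the use of `os.detected.classifier` (which would require the os-maven-plugin being discussed) are assumptions:

```xml
<!-- Hypothetical opt-in profile; activate with -P openssl or -Dopenssl=true -->
<profile>
  <id>openssl</id>
  <activation>
    <property>
      <name>openssl</name>
    </property>
  </activation>
  <dependencies>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-tcnative</artifactId>
      <version>2.0.1.Final</version>
      <!-- assumes os-maven-plugin is registered as a build extension -->
      <classifier>${os.detected.classifier}</classifier>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</profile>
```

Keeping the profile disabled by default means the OS-specific native jar is pulled in only by builds that explicitly opt into OpenSSL support.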
[GitHub] drill pull request #985: DRILL-5862 Update project parent pom xml to the lat...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/985#discussion_r144937286 --- Diff: pom.xml --- @@ -15,7 +15,8 @@ org.apache apache -14 +18 --- End diff -- @paul-rogers This is the latest version that the majority of Apache projects use (if I remember correctly, it was released in May 2016; version 14 is quite old). The Apache parent pom is a maven artifact that defines plugins and other maven parameters common to ASF projects, and it needs to be versioned. I'd suggest keeping the pom file as is; the discussion from the PR is mirrored in DRILL-5862 and the commit references it. ---
[GitHub] drill pull request #991: DRILL-5876: Remove netty-tcnative dependency from j...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/991#discussion_r145008784 --- Diff: exec/java-exec/pom.xml --- @@ -701,18 +707,21 @@ - -kr.motd.maven -os-maven-plugin -1.5.0.Final - - + --- End diff -- Sounds like a hack. I'd recommend keeping it commented out or trying the `.mvn/extensions.xml` solution. ---
[GitHub] drill pull request #991: DRILL-5876: Remove netty-tcnative dependency from j...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/991#discussion_r144624216 --- Diff: exec/java-exec/pom.xml --- @@ -701,18 +707,21 @@ - -kr.motd.maven -os-maven-plugin -1.5.0.Final - - + --- End diff -- Can it be in a separate profile (openssl) that is disabled by default? ---
[GitHub] drill pull request #934: DRILL-3449 When Foreman node dies, the FragmentExec...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/934#discussion_r137941417 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentStatusReporter.java --- @@ -113,4 +120,14 @@ void fail(final UserException ex) { sendStatus(status); } + @Override + public void close() + { +final ControlTunnel tunnel = this.tunnel.getAndSet(null); +if (tunnel != null) { + logger.debug("Closing {}", this); --- End diff -- No, close() is not a placeholder. It closes the FragmentStatusReporter; after the close, a request to send status becomes a no-op. ---
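The close-once semantics described here can be sketched with an `AtomicReference`, following the `getAndSet(null)` shape in the diff; `ControlTunnel` is stubbed and the counters exist only to make the behavior observable:

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the close() pattern under discussion: swap the tunnel
// reference to null exactly once, so a repeated (or concurrent) close is
// a no-op and later status sends silently do nothing.
public class ReporterCloseSketch {
    static class ControlTunnel { }               // placeholder for the real class

    final AtomicReference<ControlTunnel> tunnel =
        new AtomicReference<>(new ControlTunnel());
    int closeActions = 0;                        // how many closes took effect
    int statusesSent = 0;                        // how many sends went through

    void sendStatus() {
        if (tunnel.get() != null) {              // no-op after close
            statusesSent++;
        }
    }

    void close() {
        final ControlTunnel t = tunnel.getAndSet(null);
        if (t != null) {                         // only the first close wins
            closeActions++;
        }
    }

    public static void main(String[] args) {
        ReporterCloseSketch r = new ReporterCloseSketch();
        r.sendStatus();
        r.close();
        r.close();                               // second close is a no-op
        r.sendStatus();                          // dropped after close
        System.out.println(r.closeActions + " " + r.statusesSent);
    }
}
```

One close action takes effect and only the pre-close status is sent, matching the "send status becomes a no-op" description above.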
[GitHub] drill issue #926: DRILL-5269 Make DirectSubScan Jackson JSON deserializable
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/926 @paul-rogers I guess you refer to a system/integration test (execute the query provided in JIRA), not a unit test. ---
[GitHub] drill pull request #934: DRILL-3449 When Foreman node dies, the FragmentExec...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/934 DRILL-3449 When Foreman node dies, the FragmentExecutor still tries to send status updates to Foreman You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-3449 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/934.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #934 commit 06f15eeaf34c610fce0e3a62ae635c10f9885608 Author: Vlad Rozov <vro...@apache.org> Date: 2017-09-07T01:29:02Z DRILL-3449 When Foreman node dies, the FragmentExecutor still tries to send status updates to Foreman ---
[GitHub] drill issue #934: DRILL-3449 When Foreman node dies, the FragmentExecutor st...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/934 @paul-rogers Please review ---
[GitHub] drill pull request #926: DRILL-5269 Make DirectSubScan Jackson JSON deserial...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/926#discussion_r137658224 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/RecordReader.java --- @@ -26,6 +26,9 @@ import org.apache.drill.exec.record.MaterializedField; import org.apache.drill.exec.vector.ValueVector; +import com.fasterxml.jackson.annotation.JsonTypeInfo; + +@JsonTypeInfo(use=JsonTypeInfo.Id.NAME, include=JsonTypeInfo.As.WRAPPER_OBJECT, property="type") --- End diff -- AFAIK, it will not impact implementations of the `RecordReader` interface. The annotation affects Jackson type inference when a physical operator has a reference to a `RecordReader` and Jackson needs to construct a concrete implementation of the `RecordReader`. Such information needs to be passed in any case; the annotation only specifies the JSON syntax used to pass the type information. To address the concern, I'll move the annotation to the field declaration in `DirectSubScan`. The effect of that move is that it will be necessary to annotate `RecordReader` in every operator that may be passed as part of a fragment to a remote node for execution. ---
[GitHub] drill pull request #934: DRILL-3449 When Foreman node dies, the FragmentExec...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/934#discussion_r137941124 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentStatusReporter.java --- @@ -113,4 +120,14 @@ void fail(final UserException ex) { sendStatus(status); } + @Override + public void close() + { +final ControlTunnel tunnel = this.tunnel.getAndSet(null); +if (tunnel != null) { + logger.debug("Closing {}", this); --- End diff -- We are closing the FragmentStatusReporter, not the `tunnel` that it references. The ControlTunnel is not Closeable even though it has a reference to a resource that is Closeable, and it should provide a way to release the resource it holds. Please let me know if a comment is required here; I do plan to make ControlTunnel Closeable. As that requires code refactoring not directly related to the JIRA/PR, I plan to do it in a separate PR. ---
[GitHub] drill pull request #930: DRILL-5761: Disable Lilith ClassicMultiplexSocketAp...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/930#discussion_r138407993 --- Diff: common/src/test/resources/logback-test.xml --- @@ -0,0 +1,111 @@ + + + + + + + +true +1 +true +${LILITH_HOSTNAME:-localhost} + + + + + + + + %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n + + + + + + + --- End diff -- Can this logger be defined between line 21 and 22 in a single if condition? ---
[GitHub] drill pull request #930: DRILL-5761: Disable Lilith ClassicMultiplexSocketAp...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/930#discussion_r138408349 --- Diff: common/src/test/resources/logback-test.xml --- @@ -0,0 +1,92 @@ (quoted logback-test.xml fragment; the XML tags were stripped by the archive. Remaining values: Lilith appender settings `true`, `1`, `true`, `${LILITH_HOSTNAME:-localhost}` and the console pattern `%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n`) ---
[GitHub] drill pull request #913: - DRILL-5729 Fix Travis Build
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/913#discussion_r135148431 --- Diff: .travis.yml --- @@ -13,8 +13,10 @@ # See the License for the specific language governing permissions and # limitations under the License. -sudo: false +sudo: required before_install: git fetch --unshallow language: java +jdk: + - openjdk7 install: mvn install --batch-mode -DskipTests=true -Dmaven.javadoc.skip=true -Dsource.skip=true > mvn_install.log || (cat mvn_install.log && false) script: mvn package -DskipTests=true --- End diff -- `install` includes the `package` goal, and I don't think the javadoc or source plugins are enabled in the default profile, so `script` just repeats the `install`. Also, if the goal is to skip source jar generation, the proper property is `maven.source.skip`, I believe. ---
[GitHub] drill issue #913: - DRILL-5729 Fix Travis Build
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/913 @ilooner-mapr Do you know why it fails on Trusty (the new default)? Going back to using Precise is OK for a while, but my guess is that it will eventually become obsolete. ---
[GitHub] drill issue #913: - DRILL-5729 Fix Travis Build
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/913 @ilooner-mapr I tested it on my branch and it works with the default jdk and without *sudo*. Add `MAVEN_OPTS="-Xms1G -Xmx1G"` before `mvn`. Can you remove the leading dash from the commit message as part of the rebase? ---
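Taken together, the suggestions in this thread (default dist and jdk, no sudo, a bounded maven heap, `maven.source.skip` instead of `source.skip`, and no log redirection) imply a `.travis.yml` along these lines; the exact values are assumptions, not the merged configuration:

```yaml
# Hypothetical .travis.yml reflecting the review feedback above:
# default dist and jdk, no sudo, bounded maven heap, correct skip
# property, and no redirection to mvn_install.log.
language: java
before_install: git fetch --unshallow
install: MAVEN_OPTS="-Xms1G -Xmx1G" mvn install --batch-mode -DskipTests=true -Dmaven.javadoc.skip=true -Dmaven.source.skip=true
```

Dropping the separate `script` step avoids re-running what `install` already covers, since `install` runs all goals up to and including `package`.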
[GitHub] drill pull request #913: - DRILL-5729 Fix Travis Build
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/913#discussion_r135143313 --- Diff: .travis.yml --- @@ -13,8 +13,10 @@ # See the License for the specific language governing permissions and # limitations under the License. -sudo: false +sudo: required before_install: git fetch --unshallow language: java +jdk: + - openjdk7 install: mvn install --batch-mode -DskipTests=true -Dmaven.javadoc.skip=true -Dsource.skip=true > mvn_install.log || (cat mvn_install.log && false) --- End diff -- @ilooner-mapr can you remove redirection to mvn_install.log as part of the same PR. It is not necessary. ---
[GitHub] drill pull request #913: - DRILL-5729 Fix Travis Build
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/913#discussion_r135291374 --- Diff: .travis.yml --- @@ -13,8 +13,9 @@ # See the License for the specific language governing permissions and # limitations under the License. -sudo: false before_install: git fetch --unshallow language: java -install: mvn install --batch-mode -DskipTests=true -Dmaven.javadoc.skip=true -Dsource.skip=true > mvn_install.log || (cat mvn_install.log && false) -script: mvn package -DskipTests=true +jdk: + - openjdk7 --- End diff -- @ilooner-mapr The switch to jdk7 is not required (it works with jdk8 as well), but enforcing jdk7 helps with validating that the project is Java 1.7 compliant, so I am OK with the change. ---
[GitHub] drill issue #913: - DRILL-5729 Fix Travis Build
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/913 @parthchandra LGTM. There are questions I have about how the travis-ci build is configured, but they relate to pre-existing settings and can be addressed in a follow-up PR if necessary. ---
[GitHub] drill pull request #930: DRILL-5761: Disable Lilith ClassicMultiplexSocketAp...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/930#discussion_r136835528 --- Diff: common/src/test/resources/logback.xml --- @@ -16,17 +16,22 @@ (quoted logback.xml fragment; the XML tags were stripped by the archive. Remaining values: `1`, `true`, `${LILITH_HOSTNAME:-localhost}`) ---
[GitHub] drill pull request #926: DRILL-5269 Make DirectSubScan Jackson JSON deserial...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/926 DRILL-5269 Make DirectSubScan Jackson JSON deserializable @amansinha100 Please review You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-5269 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/926.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #926 commit 685653fed38beb4eea76cfe3460907386eb1a6c0 Author: Vlad Rozov <vro...@apache.org> Date: 2017-08-30T00:05:24Z DRILL-5269 Make DirectSubScan Jackson JSON deserializable ---
[GitHub] drill pull request #965: DRILL-5811 reduced repeated log messages further.
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/965#discussion_r14359 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/schedule/BlockMapBuilder.java --- @@ -104,12 +104,16 @@ public BlockMapReader(FileStatus status, boolean blockify) { @Override protected List runInner() throws Exception { final List work = Lists.newArrayList(); + + final Set noDrillbitHosts = logger.isDebugEnabled() ? Sets.newHashSet() : null; --- End diff -- Consider moving `noDrillbitHosts` to the `BlockMapBuilder` class (use `Sets.newConcurrentHashSet()` in this case) as it does not seem to belong to `BlockMapReader`. With such a change, the other changes are not necessary and this will likely reduce repeated log messages even further. Drop `<String>` from `Sets.newHashSet()`. ---
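The suggested move to a shared concurrent set can be sketched with the stdlib equivalent of Guava's `Sets.newConcurrentHashSet()`; the class and host names here are illustrative, not Drill code:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of moving the dedup set to the shared builder class: a
// concurrent set lets multiple reader threads suppress repeats of the
// same host across the whole build, not just within one reader.
// ConcurrentHashMap.newKeySet() is the stdlib counterpart of Guava's
// Sets.newConcurrentHashSet().
public class SharedDedup {
    static final Set<String> noDrillbitHosts = ConcurrentHashMap.newKeySet();

    public static void main(String[] args) throws Exception {
        Runnable reader = () -> {
            for (String host : new String[] {"h1", "h2", "h1"}) {
                if (noDrillbitHosts.add(host)) {
                    // first sighting across ALL threads: logger.debug(...)
                    // would go here in the real code
                }
            }
        };
        Thread t1 = new Thread(reader);
        Thread t2 = new Thread(reader);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(noDrillbitHosts.size());
    }
}
```

Even with both readers seeing the same hosts, only two distinct entries (and therefore at most two debug lines) result.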
[GitHub] drill pull request #985: DRILL-5862 Update project parent pom xml to the lat...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/985 DRILL-5862 Update project parent pom xml to the latest ASF version You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-5862 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/985.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #985 commit cdcfa4d991f7fa5598bb9f74291c37bd6204209f Author: Vlad Rozov <vro...@apache.org> Date: 2017-10-11T15:05:25Z DRILL-5862 Update project parent pom xml to the latest ASF version ---
[GitHub] drill pull request #985: DRILL-5862 Update project parent pom xml to the lat...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/985#discussion_r144339857 --- Diff: pom.xml --- @@ -15,7 +15,8 @@ org.apache apache -14 +18 + --- End diff -- Yes, to make maven happy. Otherwise, it expects to find the parent pom in a parent directory, and if it finds one (for example, if there is an uber pom that builds multiple projects), maven will complain. ---
[GitHub] drill pull request #985: DRILL-5862 Update project parent pom xml to the lat...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/985#discussion_r144351062 --- Diff: pom.xml --- @@ -15,7 +15,8 @@ org.apache apache -14 +18 + --- End diff -- Try adding an empty pom to the parent of your drill repo :). I guess most projects assume that there will be no pom.xml file in a parent directory, but that is not always the case, so it is better to follow the maven guidelines (http://maven.apache.org/ref/3.0/maven-model/maven.html) ---
[GitHub] drill pull request #991: DRILL-5876: Remove netty-tcnative dependency from j...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/991#discussion_r145289493 --- Diff: exec/java-exec/pom.xml --- @@ -693,6 +699,19 @@ + + openssl + + + io.netty + netty-tcnative + 2.0.1.Final --- End diff -- Please add the `provided` scope. ---
[GitHub] drill pull request #991: DRILL-5876: Remove netty-tcnative dependency from j...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/991#discussion_r145291048 --- Diff: exec/java-exec/pom.xml --- @@ -22,7 +22,10 @@ 1.8-rev1 + --- End diff -- Please uncomment (it should be harmless) or move to the openssl profile. ---
[GitHub] drill pull request #1048: DRILL-5987: Use one version of javassist
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1048#discussion_r152690825 --- Diff: common/pom.xml --- @@ -63,12 +63,29 @@ org.msgpack msgpack 0.6.6 + --- End diff -- Exclusions should not be necessary. ---
[GitHub] drill issue #1031: DRILL-5917: Ban org.json:json library in Drill
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1031 rebased to upstream/master and squashed commits. ---
[GitHub] drill pull request #1058: DRILL-6002: Avoid memory copy from direct buffer t...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1058 DRILL-6002: Avoid memory copy from direct buffer to heap while spilling to local disk @paul-rogers Please review You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-6002 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1058.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1058 commit 8e9124de681d3a8cd70bf0bb243460cb78dcb295 Author: Vlad Rozov <vro...@apache.org> Date: 2017-11-22T22:06:13Z DRILL-6002: Avoid memory copy from direct buffer to heap while spilling to local disk ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1073#discussion_r157605488 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/partitionsender/TestPartitionSender.java --- @@ -350,7 +350,7 @@ public void testAlgorithm() throws Exception { public MockPartitionSenderRootExec(FragmentContext context, RecordBatch incoming, HashPartitionSender operator) throws OutOfMemoryException { - super(context, incoming, operator); + super(context, incoming, operator, false); --- End diff -- Is the change still necessary? ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1073#discussion_r157605837 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderCreator.java --- @@ -35,7 +35,5 @@ public RootExec getRoot(FragmentContext context, assert children != null && children.size() == 1; return new PartitionSenderRootExec(context, children.iterator().next(), config); - --- End diff -- Will be good to exclude format only change from the PR. ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1073#discussion_r157545737 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java --- @@ -98,10 +100,12 @@ public int metricId() { public PartitionSenderRootExec(FragmentContext context, RecordBatch incoming, - HashPartitionSender operator) throws OutOfMemoryException { + HashPartitionSender operator, + boolean closeIncoming) throws OutOfMemoryException { --- End diff -- Introduce back constructor without `closeIncoming` argument and call this constructor from it with `closeIncoming` set to `false`. ---
[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1075 DRILL-6030: Managed sort should minimize number of batches in a k-way merge @paul-rogers Please review You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-6030 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1075.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1075 commit f40d962cf35574c0406937de1ee86ef853234b3e Author: Vlad Rozov <vro...@apache.org> Date: 2017-12-17T17:25:55Z DRILL-6030: Managed sort should minimize number of batches in a k-way merge ---
[GitHub] drill issue #1073: DRILL-5967: Fixed memory leak in OrderedPartitionSender
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1073 LGTM ---
[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1075#discussion_r157885846 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java --- @@ -84,7 +85,7 @@ public SortConfig(DrillConfig config) { if (limit > 0) { mergeLimit = Math.max(limit, MIN_MERGE_LIMIT); } else { - mergeLimit = Integer.MAX_VALUE; + mergeLimit = DEFAULT_MERGE_LIMIT; --- End diff -- IMO, it is better to change the default to avoid upgrade problems. In an upgrade scenario, users may simply overwrite `drill-override.conf` with the one from their prior installation and forget to set the merge limit. Is there a reason not to change the default merge limit? ---
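The merge-limit logic in the quoted `SortConfig` diff can be sketched as follows; the constant values are illustrative, not Drill's actual defaults:

```java
// Sketch of the SortConfig merge-limit logic discussed above: a
// configured positive limit is clamped to a floor, and an unset or
// non-positive limit falls back to a bounded default instead of
// Integer.MAX_VALUE. Both constants here are assumed values.
public class MergeLimitSketch {
    static final int MIN_MERGE_LIMIT = 2;        // illustrative floor
    static final int DEFAULT_MERGE_LIMIT = 128;  // illustrative bounded default

    static int mergeLimit(int configured) {
        return configured > 0
            ? Math.max(configured, MIN_MERGE_LIMIT)  // clamp explicit settings
            : DEFAULT_MERGE_LIMIT;                   // bounded default when unset
    }

    public static void main(String[] args) {
        System.out.println(mergeLimit(0));    // unset: falls back to the default
        System.out.println(mergeLimit(1));    // below the floor: clamped up
        System.out.println(mergeLimit(500));  // valid setting: kept as-is
    }
}
```

A bounded default bounds the number of spill runs merged at once even for users who carry an old `drill-override.conf` forward, which is the upgrade concern raised in the comment.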
[GitHub] drill issue #1075: DRILL-6030: Managed sort should minimize number of batche...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1075 The scenario when all batches can be merged in memory is covered by the `if (canUseMemoryMerge())` check in `SortImpl.java:399`. The affected code path applies only to cases where a merge between spilled and in-memory batches is necessary. Note that this is a short-term fix to improve managed sort performance; in the long run, it is necessary to be able to merge all batches in memory (using SV4) without spilling and to merge the result with the spilled data. ---
[GitHub] drill issue #1070: DRILL-6004: Direct buffer bounds checking should be disab...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1070 @paul-rogers Please review ---
[GitHub] drill pull request #1070: DRILL-6004: Direct buffer bounds checking should b...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1070 DRILL-6004: Direct buffer bounds checking should be disabled by default You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-6004 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1070.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1070 commit b863f2a0c9c595dacb924747dea6c449c9ee9917 Author: Vlad Rozov <vro...@apache.org> Date: 2017-12-12T21:37:38Z DRILL-6004: Direct buffer bounds checking should be disabled by default ---
[GitHub] drill issue #1031: DRILL-5917: Ban org.json:json library in Drill
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1031 @arina-ielchiieva Please test ---
[GitHub] drill issue #1031: DRILL-5917: Ban org.json:json library in Drill
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1031 I already tested mapr and default profile locally. Once test pass on your cluster, let me know and I'll squash commits. ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r150259975 --- Diff: pom.xml --- @@ -442,7 +442,7 @@ -Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on" -Ddrill.test.query.printing.silent=true -Ddrill.catastrophic_to_standard_out=true - -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M + -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M --- End diff -- It may be better to open a separate PR for DRILL-5926 ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r150322554 --- Diff: pom.xml --- @@ -442,7 +442,7 @@ -Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on" -Ddrill.test.query.printing.silent=true -Ddrill.catastrophic_to_standard_out=true - -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M + -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M --- End diff -- DRILL-5926 affects not only this PR but my PR #1031 as well even though they are not related at all. At a minimum the change should be in a separate commit that refers to DRILL-5926 as it is a workaround and not an actual fix for the problem. In the future, the workaround may need to be reverted and if it is mixed with other changes, it will not be clear what needs to be reverted. Note that with other changes I would expect tests to require less memory, not more. ---
[GitHub] drill pull request #1025: DRILL-5936: Refactor MergingRecordBatch based on c...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1025#discussion_r150331389 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java --- @@ -795,6 +788,8 @@ private void generateComparisons(final ClassGenerator g, final VectorAccessib * @param node Reference to the next record to copy from the incoming batches */ private boolean copyRecordToOutgoingBatch(final Node node) { +assert outgoingPosition < OUTGOING_BATCH_SIZE --- End diff -- I added the assert to avoid possible errors during further code refactoring. As it is an assert that will not affect performance in production and there is another assert already, I'd prefer to keep it. ---
[GitHub] drill pull request #1025: DRILL-5936: Refactor MergingRecordBatch based on c...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1025#discussion_r150326663 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java --- @@ -177,11 +177,11 @@ public IterOutcome innerNext() { } boolean schemaChanged = false; -if (prevBatchWasFull) { +if (!prevBatchNotFull) { --- End diff -- @amansinha100 Addressed minor comment, please review delta and I'll squash it. ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r150348155 --- Diff: pom.xml --- @@ -442,7 +442,7 @@ -Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on" -Ddrill.test.query.printing.silent=true -Ddrill.catastrophic_to_standard_out=true - -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M + -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M --- End diff -- I am in favor of 1 as well. @paul-rogers One of the tests that fail initially tries to allocate 2GB (without re-allocation). The test intermittently fails because the Pooled Allocator does not release memory back to the system. I don't know if it is supposed to return memory back to the system when it is closed and whether it is supposed to be closed at all during the unit tests. ---
[GitHub] drill issue #1031: DRILL-5917: Ban org.json:json library in Drill
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1031 @arina-ielchiieva Please attach the full log to JIRA. ---
[GitHub] drill issue #1031: DRILL-5917: Ban org.json:json library in Drill
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1031 @arina-ielchiieva Please test with the latest commit. I changed the mapr version back to 5.2.1; note that 2.7.0-mapr-1707 is compiled against 5.2.2. ---
[GitHub] drill pull request #1041: DRILL-5961: For long running queries (> 10 min) Dr...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1041 DRILL-5961: For long running queries (> 10 min) Drill may raise FragmentSetupException for completed/cancelled fragments You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-5961 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1041.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1041 commit f7e0c44c1277cc3fa0cdf466e01521974c98262d Author: Vlad Rozov <vro...@apache.org> Date: 2017-11-15T00:24:01Z DRILL-5961: For long running queries (> 10 min) Drill may raise FragmentSetupException for completed/cancelled fragments commit 575dec0fdc0dc4efb50569afa568c06f21546e6e Author: Vlad Rozov <vro...@apache.org> Date: 2017-11-15T00:35:09Z DRILL-5961: For long running queries (> 10 min) Drill may raise FragmentSetupException for completed/cancelled fragments ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r149230560 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,49 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); +final long endTime = startTime + EXIT_TIMEOUT; + +exitLock.lock(); + +try { + long currentTime; + while ((currentTime = System.currentTimeMillis()) < endTime) { --- End diff -- I'd recommend changing `while` condition to `queries` and `runningFragments` not empty check as a primary condition for exiting. Timeout (waitTime >0) is the secondary condition. ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r149234360 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,49 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); +final long endTime = startTime + EXIT_TIMEOUT; + +exitLock.lock(); + +try { + long currentTime; + while ((currentTime = System.currentTimeMillis()) < endTime) { +if (queries.isEmpty() && runningFragments.isEmpty()) { + break; +} + +try { + if (!exitCondition.await(endTime - currentTime, TimeUnit.MILLISECONDS)) { +break; + } +} catch (InterruptedException e) { + logger.error("Interrupted while waiting to exit"); +} } - exitLatch = new ExtendedLatch(); -} + if (!(queries.isEmpty() && runningFragments.isEmpty())) { +logger.warn("Timed out after {} millis. Shutting down before all fragments and foremen " + + "have completed.", EXIT_TIMEOUT); + } -// Wait for at most 5 seconds or until the latch is released. -exitLatch.awaitUninterruptibly(5000); +} finally { + exitLock.unlock(); +} } /** - * If it is safe to exit, and the exitLatch is in use, signals it so that waitToExit() will - * unblock. + * A thread calling the {@link #waitToExit()} method is notified when a foreman is retired. 
*/ private void indicateIfSafeToExit() { -synchronized(this) { - if (exitLatch != null) { -if (queries.isEmpty() && runningFragments.isEmpty()) { - exitLatch.countDown(); -} - } -} +exitLock.lock(); +exitCondition.signal(); --- End diff -- I'd recommend adding try/finally and checking the condition before signaling `exitCondition`. Consider renaming `exitCondition` to `isEmpty` or `isQueriesAndFragmentsEmpty`. ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r149232323 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,49 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); +final long endTime = startTime + EXIT_TIMEOUT; + +exitLock.lock(); + +try { + long currentTime; + while ((currentTime = System.currentTimeMillis()) < endTime) { +if (queries.isEmpty() && runningFragments.isEmpty()) { + break; +} + +try { + if (!exitCondition.await(endTime - currentTime, TimeUnit.MILLISECONDS)) { +break; + } +} catch (InterruptedException e) { + logger.error("Interrupted while waiting to exit"); +} } - exitLatch = new ExtendedLatch(); -} + if (!(queries.isEmpty() && runningFragments.isEmpty())) { +logger.warn("Timed out after {} millis. Shutting down before all fragments and foremen " + --- End diff -- It may be good to add what queries and runningFragments are running to the log message. ---
[GitHub] drill pull request #1027: DRILL-4779 : Kafka storage plugin
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1027#discussion_r150019576 --- Diff: contrib/storage-kafka/pom.xml --- @@ -0,0 +1,130 @@ + + +http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd; + xmlns="http://maven.apache.org/POM/4.0.0; xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance;> + 4.0.0 + + +drill-contrib-parent +org.apache.drill.contrib +1.12.0-SNAPSHOT + + + drill-storage-kafka + contrib/kafka-storage-plugin + + +UTF-8 --- End diff -- If the setting is necessary, it will be better to set it at the root pom. ---
[GitHub] drill pull request #1027: DRILL-4779 : Kafka storage plugin
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1027#discussion_r150028303 --- Diff: contrib/storage-kafka/pom.xml --- @@ -0,0 +1,130 @@ + + +http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd; + xmlns="http://maven.apache.org/POM/4.0.0; xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance;> + 4.0.0 + + +drill-contrib-parent +org.apache.drill.contrib +1.12.0-SNAPSHOT + + + drill-storage-kafka + contrib/kafka-storage-plugin + + +UTF-8 +0.11.0.1 +**/KafkaTestSuit.class --- End diff -- What is the reason to define `kafka.TestSuite` property? ---
[GitHub] drill pull request #1027: DRILL-4779 : Kafka storage plugin
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1027#discussion_r150030044 --- Diff: contrib/storage-kafka/pom.xml --- @@ -0,0 +1,130 @@ + + +http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd; + xmlns="http://maven.apache.org/POM/4.0.0; xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance;> + 4.0.0 + + +drill-contrib-parent +org.apache.drill.contrib +1.12.0-SNAPSHOT + + + drill-storage-kafka + contrib/kafka-storage-plugin + + +UTF-8 +0.11.0.1 +**/KafkaTestSuit.class + + + + + +org.apache.maven.plugins +maven-surefire-plugin + + +${kafka.TestSuite} + + +**/TestKafkaQueries.java + + + + logback.log.dir + ${project.build.directory}/surefire-reports + + + + + + + + + + org.apache.drill.exec + drill-java-exec + ${project.version} + + --- End diff -- Why is it necessary to exclude zookeeper? If a specific version of zookeeper is required, will it be better to explicitly add zookeeper to the dependency management? ---
[GitHub] drill pull request #1027: DRILL-4779 : Kafka storage plugin
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1027#discussion_r150029170 --- Diff: contrib/storage-kafka/pom.xml --- @@ -0,0 +1,130 @@ + + +http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd; + xmlns="http://maven.apache.org/POM/4.0.0; xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance;> + 4.0.0 + + +drill-contrib-parent +org.apache.drill.contrib +1.12.0-SNAPSHOT + + + drill-storage-kafka + contrib/kafka-storage-plugin + + +UTF-8 +0.11.0.1 +**/KafkaTestSuit.class + + + + + +org.apache.maven.plugins +maven-surefire-plugin + --- End diff -- It will be better to go with the default `maven-surefire-plugin` configuration unless there is a good justification to use custom config. Most of the time this can be achieved by using default test name convention. ---
[GitHub] drill pull request #1031: DRILL-5917: Ban org.json:json library in Drill
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1031 DRILL-5917: Ban org.json:json library in Drill @arina-ielchiieva Please review You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-5917 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1031.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1031 commit 525989d8603954bfd5e12a680acde7ac00e26300 Author: Vlad Rozov <vro...@apache.org> Date: 2017-11-08T01:07:38Z DRILL-5917: Ban org.json:json library in Drill ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r14668 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,51 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); + +try { + exitLock.lock(); --- End diff -- Usually, lock is called outside of try to avoid calling unlock in finally if lock was not successful. ---
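The idiom referenced here can be shown in a minimal form: if `lock()` itself were to fail, a `finally` clause entered afterwards would call `unlock()` on a lock that was never acquired, which for a `ReentrantLock` throws `IllegalMonitorStateException`. A small self-contained example of the correct shape:

```java
import java.util.concurrent.locks.ReentrantLock;

// Minimal illustration of the idiom from the comment: lock() is called
// before the try block, so finally (and unlock) only runs once the lock
// is actually held.
public class LockIdiom {
  private final ReentrantLock lock = new ReentrantLock();
  private int counter;

  public int increment() {
    lock.lock(); // outside try: reach finally only with the lock held
    try {
      return ++counter;
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    LockIdiom l = new LockIdiom();
    System.out.println(l.increment()); // prints 1
  }
}
```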
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r148889092 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,51 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); + +try { + exitLock.lock(); + long diff; + while ((diff = (System.currentTimeMillis() - startTime)) < EXIT_TIMEOUT) { +if (queries.isEmpty() && runningFragments.isEmpty()) { + break; +} + +try { + final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, TimeUnit.MILLISECONDS); + + if (!success) { +logger.warn("Timed out after %d millis while waiting to exit.", EXIT_TIMEOUT); +exitLock.lock(); --- End diff -- Why is this lock necessary? ---
[GitHub] drill pull request #1058: DRILL-6002: Avoid memory copy from direct buffer t...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1058#discussion_r154499154 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/cache/VectorSerializer.java --- @@ -62,27 +72,65 @@ public Writer write(VectorAccessible va) throws IOException { @SuppressWarnings("resource") public Writer write(VectorAccessible va, SelectionVector2 sv2) throws IOException { + checkNotNull(va); WritableBatch batch = WritableBatch.getBatchNoHVWrap( va.getRecordCount(), va, sv2 != null); return write(batch, sv2); } public Writer write(WritableBatch batch, SelectionVector2 sv2) throws IOException { - VectorAccessibleSerializable vas; - if (sv2 == null) { -vas = new VectorAccessibleSerializable(batch, allocator); - } else { -vas = new VectorAccessibleSerializable(batch, sv2, allocator); - } - if (retain) { -vas.writeToStreamAndRetain(stream); - } else { -vas.writeToStream(stream); + return write(batch, sv2, false); +} + +public Writer write(WritableBatch batch, SelectionVector2 sv2, boolean retain) throws IOException { + checkNotNull(batch); + checkNotNull(channel); + final Timer.Context timerContext = metrics.timer(WRITER_TIMER).time(); + + final DrillBuf[] incomingBuffers = batch.getBuffers(); + final UserBitShared.RecordBatchDef batchDef = batch.getDef(); + + try { +/* Write the metadata to the file */ +batchDef.writeDelimitedTo(output); + + +/* If we have a selection vector, dump it to file first */ +if (sv2 != null) { + final int dataLength = sv2.getCount() * SelectionVector2.RECORD_SIZE; + channel.write(sv2.getBuffer(false).nioBuffer(0, dataLength)); +} + +/* Dump the array of ByteBuf's associated with the value vectors */ +for (DrillBuf buf : incomingBuffers) { + /* dump the buffer into the OutputStream */ + channel.write(buf.nioBuffer()); --- End diff -- I agree that we should only write necessary payload and avoid spilling unused buffers. 
Note that `channel.write()` writes only bytes written to a `ByteBuffer` (`ByteBuffer.remaining()`) and not the whole allocated buffer. Do you mean that there are bytes written to a buffer but they should not be spilled? In this case, I'd suggest limiting the scope of this PR to using `WritableByteChannel` to avoid memory copy from off-heap during spill to local files and handling extra bytes in a separate JIRA/PR. ---
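The point about `channel.write()` transferring only `ByteBuffer.remaining()` bytes, never the full allocated capacity, is easy to demonstrate. A heap-backed channel stands in for the spill file here, and a direct `ByteBuffer` for the `DrillBuf` in the PR:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

// WritableByteChannel.write() transfers only position..limit of the
// buffer (i.e. remaining()), not its allocated capacity.
public class ChannelWriteDemo {
  static int writeRemaining() {
    try {
      ByteBuffer buf = ByteBuffer.allocateDirect(1024); // 1 KiB allocated
      buf.put(new byte[100]);                           // only 100 bytes filled
      buf.flip();                                       // position=0, limit=100
      ByteArrayOutputStream sink = new ByteArrayOutputStream();
      try (WritableByteChannel ch = Channels.newChannel(sink)) {
        return ch.write(buf);                           // writes 100, not 1024
      }
    } catch (IOException e) {
      return -1;
    }
  }

  public static void main(String[] args) {
    System.out.println(writeRemaining()); // prints 100
  }
}
```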
[GitHub] drill pull request #1058: DRILL-6002: Avoid memory copy from direct buffer t...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1058#discussion_r154499915 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/cache/VectorSerializer.java --- @@ -40,20 +52,18 @@ */ public static class Writer { +static final MetricRegistry metrics = DrillMetrics.getRegistry(); +static final String WRITER_TIMER = MetricRegistry.name(VectorAccessibleSerializable.class, "writerTime"); -private final OutputStream stream; -private final BufferAllocator allocator; -private boolean retain; +private final SpillSet spillSet; +private final WritableByteChannel channel; +private final OutputStream output; private long timeNs; -public Writer(BufferAllocator allocator, OutputStream stream) { - this.allocator = allocator; - this.stream = stream; -} - -public Writer retain() { - retain = true; - return this; +private Writer(SpillSet spillSet, String path) throws IOException { --- End diff -- The `Writer` in `VectorSerializer` is used only for spilling. I'd suggest moving `VectorSerializer` to the `spill`package to make it more explicit and keep the dependency on the `SpillSet`. I am not sure if there is a need for generic `Writer` that can handle multiple use cases. I think that `Writer` needs to be optimized to handle spill use case. `SpillSet` is also needed to get `WritableByteChannel` and encapsulates what `Writer` uses from operators. ---
[GitHub] drill pull request #1041: DRILL-5961: For long running queries (> 10 min) Dr...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1041#discussion_r155433860 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/WorkEventBus.java --- @@ -74,83 +70,42 @@ public void statusUpdate(final FragmentStatus status) { public void addFragmentManager(final FragmentManager fragmentManager) { if (logger.isDebugEnabled()) { - logger.debug("Manager created: {}", QueryIdHelper.getQueryIdentifier(fragmentManager.getHandle())); + logger.debug("Fragment {} manager created: {}", QueryIdHelper.getQueryIdentifier(fragmentManager.getHandle()), fragmentManager); } final FragmentManager old = managers.putIfAbsent(fragmentManager.getHandle(), fragmentManager); - if (old != null) { -throw new IllegalStateException( -"Tried to set fragment manager when has already been set for the provided fragment handle."); -} - } - - public FragmentManager getFragmentManagerIfExists(final FragmentHandle handle) { -synchronized (this) { - return managers.get(handle); +if (old != null) { + throw new IllegalStateException( + String.format("Manager {} for fragment {} already exists.", old, QueryIdHelper.getQueryIdentifier(fragmentManager.getHandle(; } } - public FragmentManager getFragmentManager(final FragmentHandle handle) throws FragmentSetupException { -synchronized (this) { - // Check if this was a recently finished (completed or cancelled) fragment. If so, throw away message. - if (recentlyFinishedFragments.asMap().containsKey(handle)) { -if (logger.isDebugEnabled()) { - logger.debug("Fragment: {} was cancelled. Ignoring fragment handle", handle); -} -return null; - } - - // since non-leaf fragments are sent first, it is an error condition if the manager is unavailable. 
- final FragmentManager m = managers.get(handle); - if (m != null) { -return m; - } -} -throw new FragmentSetupException("Failed to receive plan fragment that was required for id: " -+ QueryIdHelper.getQueryIdentifier(handle)); + public FragmentManager getFragmentManager(final FragmentHandle handle) { +return managers.get(handle); } /** - * Removes fragment manager (for the corresponding the handle) from the work event bus. This method can be called - * multiple times. The manager will be removed only once (the first call). - * @param handle the handle to the fragment - */ - public void removeFragmentManager(final FragmentHandle handle) { -if (logger.isDebugEnabled()) { - logger.debug("Removing fragment manager: {}", QueryIdHelper.getQueryIdentifier(handle)); -} - -synchronized (this) { - final FragmentManager manager = managers.get(handle); - if (manager != null) { -recentlyFinishedFragments.put(handle, 1); -managers.remove(handle); - } else { -logger.warn("Fragment {} not found in the work bus.", QueryIdHelper.getQueryIdentifier(handle)); - } -} - } - - /** - * Cancels and removes fragment manager (for the corresponding the handle) from the work event bus, Currently, used - * for fragments waiting on data (root and intermediate). + * Optionally cancels and removes fragment manager (for the corresponding the handle) from the work event bus. Currently, used + * for fragments waiting on data (root and intermediate). This method can be called multiple times. The manager will be removed + * only once (the first call). 
* @param handle the handle to the fragment + * @param cancel * @return if the fragment was found and removed from the event bus */ - public boolean cancelAndRemoveFragmentManagerIfExists(final FragmentHandle handle) { -if (logger.isDebugEnabled()) { - logger.debug("Cancelling and removing fragment manager: {}", QueryIdHelper.getQueryIdentifier(handle)); -} - -synchronized (this) { - final FragmentManager manager = managers.get(handle); - if (manager == null) { -return false; + public boolean removeFragmentManager(final FragmentHandle handle, final boolean cancel) { +final FragmentManager manager = managers.remove(handle); +if (manager != null) { + assert !manager.isCancelled() : String.format("Fragment {} manager {} is already cancelled.", QueryIdHelper.getQueryIdentifier(handle), manager); + if (cancel) { +manager.can
[GitHub] drill pull request #1041: DRILL-5961: For long running queries (> 10 min) Dr...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1041#discussion_r155433245 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -277,7 +277,9 @@ public void startFragmentPendingRemote(final FragmentManager fragmentManager) { @Override protected void cleanup() { runningFragments.remove(fragmentHandle); - workBus.removeFragmentManager(fragmentHandle); + if (!fragmentManager.isCancelled()) { +workBus.removeFragmentManager(fragmentHandle, false); --- End diff -- If cleanup is not called as a result of FragmentManager cancellation, it is part of the regular cleanup after run is complete. ---
[GitHub] drill issue #1012: DRILL-5911: Upgrade esri-geometry-api version to 2.0.0 to...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1012 OK. LGTM. ---
[GitHub] drill issue #1013: DRILL-5910: Logging exception when custom AuthenticatorFa...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1013 LGTM ---
[GitHub] drill pull request #1009: DRILL-5905: Exclude jdk-tools from project depende...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1009 DRILL-5905: Exclude jdk-tools from project dependencies You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-5905 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1009.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1009 commit 27b1deb03dbf6697715a9f368512f73b7b4e59c8 Author: Vlad Rozov <vro...@apache.org> Date: 2017-10-25T02:10:37Z DRILL-5905: Exclude jdk-tools from project dependencies ---
[GitHub] drill pull request #1004: DRILL-5876: Use openssl profile to include netty-t...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1004 DRILL-5876: Use openssl profile to include netty-tcnative dependency with the platform specific classifier You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-5876 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1004.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1004 commit 575d09e8e4de6c9ff61ab53cb23e4aede29d2398 Author: Vlad Rozov <vro...@apache.org> Date: 2017-10-18T22:00:37Z DRILL-5876: Use openssl profile to include netty-tcnative dependency with the platform specific classifier ---
[GitHub] drill issue #991: DRILL-5876: Remove netty-tcnative dependency from java-exe...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/991 @parthchandra opened PR #1004 ---
[GitHub] drill issue #1012: DRILL-5911: Upgrade esri-geometry-api version to 2.0.0 to...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1012 @vvysotskyi I refer to JSON format that ESRI supports by itself. ---
[GitHub] drill issue #1013: DRILL-5910: Add factory name to ClassNotFoundException me...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1013 @vladimirtkach Who is the target of the message? If it is Java developers, the message makes sense. If it is for drill users or devops, the message is useless. ---
[GitHub] drill issue #1013: DRILL-5910: Add factory name to ClassNotFoundException me...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1013 @arina-ielchiieva I guess just logging an exception does not help unless it targets Java developers. Consider logging a message that has information regarding which authentication provider can't be instantiated/initialized and continue. ---
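The suggestion — log which authentication provider cannot be instantiated and continue, rather than surfacing a bare `ClassNotFoundException` — might look like the sketch below. `loadFactories`, the factory names, and the error wording are hypothetical stand-ins, not Drill's actual AuthenticatorFactory API:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch: name the failing factory in the log message (actionable
// for users and devops, not only Java developers) and keep loading the rest.
public class FactoryLoadSketch {

  static List<Object> loadFactories(List<String> classNames) {
    final List<Object> factories = new ArrayList<>();
    for (String name : classNames) {
      try {
        factories.add(Class.forName(name).getDeclaredConstructor().newInstance());
      } catch (ReflectiveOperationException e) {
        // the message identifies the provider that is being skipped
        System.err.printf("Authenticator factory '%s' cannot be instantiated; skipping it.%n", name);
      }
    }
    return factories;
  }

  public static void main(String[] args) {
    List<Object> loaded = loadFactories(List.of("java.lang.Object", "com.example.MissingFactory"));
    System.out.println(loaded.size()); // prints 1: the missing factory is skipped, not fatal
  }
}
```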
[GitHub] drill pull request #1238: DRILL-6281: Refactor TimedRunnable
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1238#discussion_r184694930 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store; + +import java.io.IOException; +import java.util.List; +import java.util.Objects; +import java.util.concurrent.Callable; +import java.util.concurrent.CancellationException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.RejectedExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.function.Consumer; +import java.util.function.Function; +import java.util.stream.Collectors; + +import org.apache.drill.common.exceptions.UserException; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.google.common.base.Preconditions; +import com.google.common.base.Stopwatch; +import com.google.common.util.concurrent.MoreExecutors; +import com.google.common.util.concurrent.ThreadFactoryBuilder; + +/** + * Class used to allow parallel executions of tasks in a simplified way. Also maintains and reports timings of task completion. + * TODO: look at switching to fork join. + * @param The time value that will be returned when the task is executed. 
+ */ +public abstract class TimedCallable implements Callable { + private static final Logger logger = LoggerFactory.getLogger(TimedCallable.class); + + private static long TIMEOUT_PER_RUNNABLE_IN_MSECS = 15000; + + private volatile long startTime = 0; + private volatile long executionTime = -1; + + private static class FutureMapper implements Function<Future, V> { +int count; +Throwable throwable = null; + +private void setThrowable(Throwable t) { + if (throwable == null) { +throwable = t; + } else { +throwable.addSuppressed(t); + } +} + +@Override +public V apply(Future future) { + Preconditions.checkState(future.isDone()); + if (!future.isCancelled()) { +try { + count++; + return future.get(); +} catch (InterruptedException e) { + // there is no wait as we are getting result from the completed/done future + logger.error("Unexpected exception", e); + throw UserException.internalError(e) + .message("Unexpected exception") + .build(logger); +} catch (ExecutionException e) { + setThrowable(e.getCause()); +} + } else { +setThrowable(new CancellationException()); + } + return null; +} + } + + private static class Statistics implements Consumer<TimedCallable> { +final long start = System.nanoTime(); +final Stopwatch watch = Stopwatch.createStarted(); +long totalExecution; +long maxExecution; +int count; +int startedCount; +private int doneCount; +// measure thread creation times +long earliestStart; +long latestStart; +long totalStart; + +@Override +public void accept(TimedCallable task) { + count++; + long threadStart = task.getStartTime(TimeUnit.NANOSECONDS) - start; + if (threadStart >= 0) { +startedCount++; +earliestStart = Math.min(earliestStart, threadStart); +latestStart = Math.max(latestStart, threadStart); +totalStart += threadStart; +long executionTime = task.getExecutionTime(TimeUnit.NANOSECONDS); +if (executionTime != -1) { + done
[GitHub] drill issue #1214: DRILL-6331: Revisit Hive Drill native parquet implementat...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1214 When moving files around, please preserve the modification history of the file. ---
[GitHub] drill pull request #1238: DRILL-6281: Refactor TimedRunnable
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1238#discussion_r184691926 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java --- @@ -0,0 +1,258 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+package org.apache.drill.exec.store;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Objects;
+import java.util.concurrent.Callable;
+import java.util.concurrent.CancellationException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.RejectedExecutionException;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Consumer;
+import java.util.function.Function;
+import java.util.stream.Collectors;
+
+import org.apache.drill.common.exceptions.UserException;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Preconditions;
+import com.google.common.base.Stopwatch;
+import com.google.common.util.concurrent.MoreExecutors;
+import com.google.common.util.concurrent.ThreadFactoryBuilder;
+
+/**
+ * Class used to allow parallel executions of tasks in a simplified way. Also maintains and reports timings of task completion.
+ * TODO: look at switching to fork join.
+ * @param <V> The time value that will be returned when the task is executed.
+ */
+public abstract class TimedCallable<V> implements Callable<V> {
+  private static final Logger logger = LoggerFactory.getLogger(TimedCallable.class);
+
+  private static long TIMEOUT_PER_RUNNABLE_IN_MSECS = 15000;
+
+  private volatile long startTime = 0;
+  private volatile long executionTime = -1;
+
+  private static class FutureMapper<V> implements Function<Future<V>, V> {
+    int count;
+    Throwable throwable = null;
+
+    private void setThrowable(Throwable t) {
+      if (throwable == null) {
+        throwable = t;
+      } else {
+        throwable.addSuppressed(t);
+      }
+    }
+
+    @Override
+    public V apply(Future<V> future) {
+      Preconditions.checkState(future.isDone());
+      if (!future.isCancelled()) {
+        try {
+          count++;
+          return future.get();
+        } catch (InterruptedException e) {
+          // there is no wait as we are getting result from the completed/done future
+          logger.error("Unexpected exception", e);
+          throw UserException.internalError(e)
+              .message("Unexpected exception")
+              .build(logger);
+        } catch (ExecutionException e) {
+          setThrowable(e.getCause());
+        }
+      } else {
+        setThrowable(new CancellationException());
+      }
+      return null;
+    }
+  }
+
+  private static class Statistics<V> implements Consumer<TimedCallable<V>> {
+    final long start = System.nanoTime();
+    final Stopwatch watch = Stopwatch.createStarted();
+    long totalExecution = 0;
+    long maxExecution = 0;
+    int startedCount = 0;
+    private int doneCount = 0;
+    // measure thread creation times
+    long earliestStart = Long.MAX_VALUE;
+    long latestStart = 0;
+    long totalStart = 0;
+
+    @Override
+    public void accept(TimedCallable<V> task) {
+      long threadStart = task.getStartTime(TimeUnit.NANOSECONDS) - start;
+      if (threadStart >= 0) {
+        startedCount++;
+        earliestStart = Math.min(earliestStart, threadStart);
+        latestStart = Math.max(latestStart, threadStart);
+        totalStart += threadStart;
+        long executionTime = task.getExecutionTime(TimeUnit.NANOSECONDS);
+        if (executionTime != -1) {
+          done
[GitHub] drill pull request #1238: DRILL-6281: Refactor TimedRunnable
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1238#discussion_r184724657 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+package org.apache.drill.exec.store;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Objects;
+import java.util.concurrent.Callable;
+import java.util.concurrent.CancellationException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.RejectedExecutionException;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Consumer;
+import java.util.function.Function;
+import java.util.stream.Collectors;
+
+import org.apache.drill.common.exceptions.UserException;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.base.Preconditions;
+import com.google.common.base.Stopwatch;
+import com.google.common.util.concurrent.MoreExecutors;
+import com.google.common.util.concurrent.ThreadFactoryBuilder;
+
+/**
+ * Class used to allow parallel executions of tasks in a simplified way. Also maintains and reports timings of task completion.
+ * TODO: look at switching to fork join.
+ * @param <V> The time value that will be returned when the task is executed.
+ */
+public abstract class TimedCallable<V> implements Callable<V> {
+  private static final Logger logger = LoggerFactory.getLogger(TimedCallable.class);
+
+  private static long TIMEOUT_PER_RUNNABLE_IN_MSECS = 15000;
+
+  private volatile long startTime = 0;
+  private volatile long executionTime = -1;
+
+  private static class FutureMapper<V> implements Function<Future<V>, V> {
+    int count;
+    Throwable throwable = null;
+
+    private void setThrowable(Throwable t) {
+      if (throwable == null) {
+        throwable = t;
+      } else {
+        throwable.addSuppressed(t);
+      }
+    }
+
+    @Override
+    public V apply(Future<V> future) {
+      Preconditions.checkState(future.isDone());
+      if (!future.isCancelled()) {
+        try {
+          count++;
+          return future.get();
+        } catch (InterruptedException e) {
+          // there is no wait as we are getting result from the completed/done future
+          logger.error("Unexpected exception", e);
+          throw UserException.internalError(e)
+              .message("Unexpected exception")
+              .build(logger);
+        } catch (ExecutionException e) {
+          setThrowable(e.getCause());
+        }
+      } else {
+        setThrowable(new CancellationException());
+      }
+      return null;
+    }
+  }
+
+  private static class Statistics<V> implements Consumer<TimedCallable<V>> {
+    final long start = System.nanoTime();
+    final Stopwatch watch = Stopwatch.createStarted();
+    long totalExecution;
+    long maxExecution;
+    int count;
+    int startedCount;
+    private int doneCount;
+    // measure thread creation times
+    long earliestStart;
+    long latestStart;
+    long totalStart;
+
+    @Override
+    public void accept(TimedCallable<V> task) {
+      count++;
+      long threadStart = task.getStartTime(TimeUnit.NANOSECONDS) - start;
+      if (threadStart >= 0) {
+        startedCount++;
+        earliestStart = Math.min(earliestStart, threadStart);
+        latestStart = Math.max(latestStart, threadStart);
+        totalStart += threadStart;
+        long executionTime = task.getExecutionTime(TimeUnit.NANOSECONDS);
+        if (executionTime != -1) {
+          done
[GitHub] drill issue #1237: DRILL-6348: Fixed code so that Unordered Receiver reports...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1237 IMO, it would be good to understand what other operators do as well. For example, what do the Project or Filter operators do? Do they take ownership of incoming batches? And if they do, when is the ownership taken? I do not suggest that we change how Sender and Receiver control **all** aspects of communication, at least not as part of this JIRA/PR. The difference between my approach and yours is whether or not UnorderedReceiver and other receivers are pass-through operators. My view is that receivers are not pass-through operators; they are buffering operators, as they receive batches from the network and buffer them before downstream operators are ready to consume those batches. In your view, receivers are pass-through operators that get batches from the fragment queue or some other queue and pass them downstream. As there is no wait and no processing between getting a batch from the fragment queue and passing it to the next operator, I don't see why a receiver needs to take ownership. ---
[GitHub] drill pull request #1237: DRILL-6348: Fixed code so that Unordered Receiver ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1237#discussion_r184804819 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unorderedreceiver/UnorderedReceiverBatch.java --- @@ -149,25 +149,32 @@ private RawFragmentBatch getNextBatch() throws IOException { } } + private RawFragmentBatch getNextNotEmptyBatch() throws IOException { +RawFragmentBatch batch; +try { + stats.startWait(); --- End diff -- it may throw `AssertException` now and other exceptions may be added in the future. ---
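The concern above is about where `stats.startWait()` sits relative to the try block: if the guarded call can throw, the matching `stopWait()` must run in a finally block. A minimal stdlib-only sketch of that pattern, where `WaitStats` and `fetch()` are illustrative stand-ins rather than Drill classes:

```java
// Toy wait-time accounting; WaitStats stands in for Drill's operator stats.
public class WaitStatsDemo {
  static class WaitStats {
    int open;                       // startWait() calls not yet matched by stopWait()
    void startWait() { open++; }
    void stopWait() { open--; }
  }

  // Stand-in for the guarded call that may throw (an assertion error today,
  // other exceptions in the future).
  static String fetch(boolean fail) {
    if (fail) {
      throw new RuntimeException("receiver failed");
    }
    return "batch";
  }

  static String timedFetch(WaitStats stats, boolean fail) {
    stats.startWait();
    try {
      return fetch(fail);
    } finally {
      stats.stopWait();             // runs on success and on failure alike
    }
  }

  public static void main(String[] args) {
    WaitStats stats = new WaitStats();
    timedFetch(stats, false);
    try {
      timedFetch(stats, true);
    } catch (RuntimeException ignored) {
      // expected: the timer is still stopped by the finally block
    }
    System.out.println(stats.open); // prints 0: accounting stays balanced
  }
}
```

Without the finally block, the failing call would leave one `startWait()` unmatched and skew the reported wait time.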
[GitHub] drill pull request #1224: DRILL-6321: Customize Drill's conformance. Allow s...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1224#discussion_r185351651 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillConformance.java --- @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.planner.sql; + +import org.apache.calcite.sql.validate.SqlConformanceEnum; +import org.apache.calcite.sql.validate.SqlDelegatingConformance; + +/** + * Drill's SQL conformance is SqlConformanceEnum.DEFAULT except for method isApplyAllowed(). + * Since Drill is going to allow OUTER APPLY and CROSS APPLY to allow each row from left child of Join + * to join with output of right side (sub-query or table function that will be invoked for each row). + * Refer to DRILL-5999 for more information. + */ +public class DrillConformance extends SqlDelegatingConformance { --- End diff -- Why not introduce a top-level class only when it is needed? To override the behavior of a single method, an anonymous class is more than sufficient. ---
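The suggested alternative looks roughly like the sketch below. It is deliberately self-contained: `Conformance` and `DelegatingConformance` are stand-ins for Calcite's `SqlConformance` and `SqlDelegatingConformance`, not the real API.

```java
// Demonstrates the review suggestion: override a single method with an
// anonymous class instead of introducing a top-level subclass.
interface Conformance {
  default boolean isApplyAllowed() { return false; }
  default boolean isGroupByAlias() { return false; }
}

class DelegatingConformance implements Conformance {
  private final Conformance delegate;

  DelegatingConformance(Conformance delegate) { this.delegate = delegate; }

  @Override public boolean isApplyAllowed() { return delegate.isApplyAllowed(); }
  @Override public boolean isGroupByAlias() { return delegate.isGroupByAlias(); }
}

public class ConformanceDemo {
  static final Conformance DEFAULT = new Conformance() { };

  // The anonymous class overrides only isApplyAllowed(); everything else
  // is delegated to the default conformance.
  static final Conformance DRILL = new DelegatingConformance(DEFAULT) {
    @Override public boolean isApplyAllowed() { return true; }
  };

  public static void main(String[] args) {
    System.out.println(DRILL.isApplyAllowed());  // prints true (overridden)
    System.out.println(DRILL.isGroupByAlias());  // prints false (delegated)
  }
}
```

The anonymous-class constant keeps the override next to its single use and avoids a one-method public class in the package.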
[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1225#discussion_r185374827 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestCTTAS.java --- @@ -164,121 +166,113 @@ public void testResolveTemporaryTableWithPartialSchema() throws Exception { @Test public void testPartitionByWithTemporaryTables() throws Exception { String temporaryTableName = "temporary_table_with_partitions"; -mockRandomUUID(UUID.nameUUIDFromBytes(temporaryTableName.getBytes())); +cleanSessionDirectory(); test("create TEMPORARY table %s partition by (c1) as select * from (" + "select 'A' as c1 from (values(1)) union all select 'B' as c1 from (values(1))) t", temporaryTableName); -checkPermission(temporaryTableName); +checkPermission(); } - @Test(expected = UserRemoteException.class) + @Test public void testCreationOutsideOfDefaultTemporaryWorkspace() throws Exception { -try { - String temporaryTableName = "temporary_table_outside_of_default_workspace"; - test("create TEMPORARY table %s.%s as select 'A' as c1 from (values(1))", temp2_schema, temporaryTableName); -} catch (UserRemoteException e) { - assertThat(e.getMessage(), containsString(String.format( - "VALIDATION ERROR: Temporary tables are not allowed to be created / dropped " + - "outside of default temporary workspace [%s].", DFS_TMP_SCHEMA))); - throw e; -} +String temporaryTableName = "temporary_table_outside_of_default_workspace"; + +thrown.expect(UserRemoteException.class); +thrown.expectMessage(containsString(String.format( +"VALIDATION ERROR: Temporary tables are not allowed to be created / dropped " + +"outside of default temporary workspace [%s].", DFS_TMP_SCHEMA))); + +test("create TEMPORARY table %s.%s as select 'A' as c1 from (values(1))", temp2_schema, temporaryTableName); } - @Test(expected = UserRemoteException.class) + @Test public void testCreateWhenTemporaryTableExistsWithoutSchema() throws Exception { String temporaryTableName = 
"temporary_table_exists_without_schema"; -try { - test("create TEMPORARY table %s as select 'A' as c1 from (values(1))", temporaryTableName); - test("create TEMPORARY table %s as select 'A' as c1 from (values(1))", temporaryTableName); -} catch (UserRemoteException e) { - assertThat(e.getMessage(), containsString(String.format( - "VALIDATION ERROR: A table or view with given name [%s]" + - " already exists in schema [%s]", temporaryTableName, DFS_TMP_SCHEMA))); - throw e; -} + +thrown.expect(UserRemoteException.class); +thrown.expectMessage(containsString(String.format( +"VALIDATION ERROR: A table or view with given name [%s]" + +" already exists in schema [%s]", temporaryTableName, DFS_TMP_SCHEMA))); + +test("create TEMPORARY table %s as select 'A' as c1 from (values(1))", temporaryTableName); +test("create TEMPORARY table %s as select 'A' as c1 from (values(1))", temporaryTableName); } - @Test(expected = UserRemoteException.class) + @Test public void testCreateWhenTemporaryTableExistsCaseInsensitive() throws Exception { String temporaryTableName = "temporary_table_exists_without_schema"; -try { - test("create TEMPORARY table %s as select 'A' as c1 from (values(1))", temporaryTableName); - test("create TEMPORARY table %s as select 'A' as c1 from (values(1))", temporaryTableName.toUpperCase()); -} catch (UserRemoteException e) { - assertThat(e.getMessage(), containsString(String.format( - "VALIDATION ERROR: A table or view with given name [%s]" + - " already exists in schema [%s]", temporaryTableName.toUpperCase(), DFS_TMP_SCHEMA))); - throw e; -} + +thrown.expect(UserRemoteException.class); +thrown.expectMessage(containsString(String.format( +"VALIDATION ERROR: A table or view with given name [%s]" + --- End diff -- and possibly `expectUserRemoteExceptionWithTableExistsMessage(String tableName, String schemaName)`. ---
[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1225#discussion_r185353418 --- Diff: exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributedConcurrent.java --- @@ -177,7 +177,7 @@ public void run() { } } - //@Test --- End diff -- What is the reason the test was disabled before? ---
[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1225#discussion_r185375540 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestCTTAS.java --- @@ -498,47 +489,50 @@ public void testDropTemporaryTableAsViewWithoutException() throws Exception { .go(); } - @Test(expected = UserRemoteException.class) + @Test public void testDropTemporaryTableAsViewWithException() throws Exception { String temporaryTableName = "temporary_table_to_drop_like_view_with_exception"; test("create TEMPORARY table %s as select 'A' as c1 from (values(1))", temporaryTableName); -try { - test("drop view %s.%s", DFS_TMP_SCHEMA, temporaryTableName); -} catch (UserRemoteException e) { - assertThat(e.getMessage(), containsString(String.format( - "VALIDATION ERROR: Unknown view [%s] in schema [%s]", temporaryTableName, DFS_TMP_SCHEMA))); - throw e; +thrown.expect(UserRemoteException.class); +thrown.expectMessage(containsString(String.format( +"VALIDATION ERROR: Unknown view [%s] in schema [%s]", temporaryTableName, DFS_TMP_SCHEMA))); + +test("drop view %s.%s", DFS_TMP_SCHEMA, temporaryTableName); + } + + private static String getSessionId() throws Exception { --- End diff -- Consider mocking getSessionId() in the `UserSession`. This method needs to be tested by itself. ---
[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1225#discussion_r185369981 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/coord/zk/TestZookeeperClient.java --- @@ -125,7 +125,7 @@ public void testHasPathThrowsDrillRuntimeException() { Mockito .when(client.getCache().getCurrentData(absPath)) -.thenThrow(Exception.class); +.thenThrow(RuntimeException.class); --- End diff -- OK, but I am not sure what this method tests. `ZookeeperClient.hasPath(String path)` is not used in production. ---
[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1225#discussion_r185374205 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestCTTAS.java --- @@ -164,121 +166,113 @@ public void testResolveTemporaryTableWithPartialSchema() throws Exception { @Test public void testPartitionByWithTemporaryTables() throws Exception { String temporaryTableName = "temporary_table_with_partitions"; -mockRandomUUID(UUID.nameUUIDFromBytes(temporaryTableName.getBytes())); +cleanSessionDirectory(); test("create TEMPORARY table %s partition by (c1) as select * from (" + "select 'A' as c1 from (values(1)) union all select 'B' as c1 from (values(1))) t", temporaryTableName); -checkPermission(temporaryTableName); +checkPermission(); } - @Test(expected = UserRemoteException.class) + @Test public void testCreationOutsideOfDefaultTemporaryWorkspace() throws Exception { -try { - String temporaryTableName = "temporary_table_outside_of_default_workspace"; - test("create TEMPORARY table %s.%s as select 'A' as c1 from (values(1))", temp2_schema, temporaryTableName); -} catch (UserRemoteException e) { - assertThat(e.getMessage(), containsString(String.format( - "VALIDATION ERROR: Temporary tables are not allowed to be created / dropped " + - "outside of default temporary workspace [%s].", DFS_TMP_SCHEMA))); - throw e; -} +String temporaryTableName = "temporary_table_outside_of_default_workspace"; + +thrown.expect(UserRemoteException.class); --- End diff -- Consider introducing a new method to set `thrown` and message, something like `void expectUserRemoteExceptionWithMessage(String message)`. ---
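In code, the suggested refactor has roughly the shape below. This is a stdlib-only sketch: `ExpectationStub` stands in for JUnit's `ExpectedException` rule (`thrown`), and a plain `RuntimeException` replaces `UserRemoteException`; only the factoring of the repeated expect/expectMessage pair into helpers is the point.

```java
import static java.lang.String.format;

// Sketch of the suggested refactor: one helper owns the expected-exception
// wording instead of each test repeating it.
public class ExpectHelperDemo {
  static class ExpectationStub {
    Class<? extends Throwable> type;
    String messageSubstring;

    void expect(Class<? extends Throwable> type) { this.type = type; }
    void expectMessage(String substring) { this.messageSubstring = substring; }
  }

  final ExpectationStub thrown = new ExpectationStub();

  // The base helper the reviewer suggests.
  void expectUserRemoteExceptionWithMessage(String message) {
    thrown.expect(RuntimeException.class);
    thrown.expectMessage(message);
  }

  // The table-exists variant suggested in the related comment, built on top.
  void expectUserRemoteExceptionWithTableExistsMessage(String tableName, String schemaName) {
    expectUserRemoteExceptionWithMessage(format(
        "VALIDATION ERROR: A table or view with given name [%s] already exists in schema [%s]",
        tableName, schemaName));
  }

  public static void main(String[] args) {
    ExpectHelperDemo demo = new ExpectHelperDemo();
    demo.expectUserRemoteExceptionWithTableExistsMessage("t1", "dfs.tmp");
    System.out.println(demo.thrown.messageSubstring);
  }
}
```

Each test then calls a single helper before the statement under test, rather than repeating the multi-line expectation setup.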
[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1225#discussion_r185385028 --- Diff: exec/java-exec/src/test/resources/drill-udf/pom.xml --- @@ -0,0 +1,90 @@
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+
+  <groupId>org.apache.drill.udf</groupId>
+  <artifactId>drill-udf</artifactId>
+  <version>1.0</version>
+
+  ${project.name}
+  1.13.0
--- End diff -- Is it OK to use an old version? Does Drill support semver API compatibility for UDFs? If yes, how is it enforced? If no, compilation may fail. ---
[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1225#discussion_r185383432 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/udf/dynamic/JarBuilder.java --- @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.udf.dynamic; + +import ch.qos.logback.classic.Level; +import ch.qos.logback.classic.Logger; +import ch.qos.logback.classic.LoggerContext; +import ch.qos.logback.classic.spi.ILoggingEvent; +import ch.qos.logback.core.ConsoleAppender; +import org.apache.maven.cli.MavenCli; +import org.apache.maven.cli.logging.Slf4jLogger; +import org.codehaus.plexus.DefaultPlexusContainer; +import org.codehaus.plexus.PlexusContainer; +import org.codehaus.plexus.logging.BaseLoggerManager; +import org.slf4j.LoggerFactory; + +import java.util.LinkedList; +import java.util.List; + +public class JarBuilder { + + private final MavenCli cli; + + public JarBuilder() { +this.cli = new MavenCli() { + @Override + protected void customizeContainer(PlexusContainer container) { +((DefaultPlexusContainer) container).setLoggerManager(new BaseLoggerManager() { + @Override + protected org.codehaus.plexus.logging.Logger createLogger(String s) { +return new Slf4jLogger(setupLogger(JarBuilder.class.getName(), Level.INFO)); + } +}); + } +}; + } + + /** + * Builds jars using embedded maven. Includes files / resources based given pattern, + * otherwise using defaults provided in pom.xml. 
+ * + * @param jarName jar name + * @param projectDir project dir + * @param includeFiles pattern indicating which files should be included + * @param includeResources pattern indicating which resources should be included + * + * @return build exit code, 0 if build was successful + */ + public int build(String jarName, String projectDir, String includeFiles, String includeResources) { +System.setProperty("maven.multiModuleProjectDirectory", projectDir); +List<String> params = new LinkedList<>(); +params.add("clean"); +params.add("package"); +params.add("-DskipTests"); +params.add("-Djar.finalName=" + jarName); +if (includeFiles != null) { + params.add("-Dinclude.files=" + includeFiles); +} +if (includeResources != null) { + params.add("-Dinclude.resources=" + includeResources); +} +return cli.doMain(params.toArray(new String[params.size()]), projectDir, System.out, System.err); + } + + private static Logger setupLogger(String string, Level logLevel) { --- End diff -- Is this necessary? ---
[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1225#discussion_r185497944 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/coord/zk/TestZookeeperClient.java --- @@ -125,7 +125,7 @@ public void testHasPathThrowsDrillRuntimeException() { Mockito .when(client.getCache().getCurrentData(absPath)) -.thenThrow(Exception.class); +.thenThrow(RuntimeException.class); --- End diff -- IMO, this method needs to be changed to test `ZookeeperClient.hasPath(String path, boolean consistent)`. It is OK to do it in this PR or in a separate commit/JIRA/PR. If you decide to do it in a separate commit, please file JIRA. ---
[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1225#discussion_r184245358 --- Diff: pom.xml --- @@ -798,7 +798,7 @@
   <groupId>com.googlecode.jmockit</groupId>
   <artifactId>jmockit</artifactId>
-  <version>1.3</version>
+  <version>1.7</version>
--- End diff -- Can it be done as a precursor PR? The 1.7 version is quite old too. Can it be upgraded to the latest (org.jmockit:jmockit:1.39)? ---
[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1225#discussion_r184411700 --- Diff: exec/java-exec/pom.xml --- @@ -593,6 +593,48 @@
   <artifactId>netty-tcnative</artifactId>
   <classifier>${netty.tcnative.classifier}</classifier>
+
+<dependency>
+  <groupId>org.apache.maven</groupId>
+  <artifactId>maven-embedder</artifactId>
+  <version>3.3.9</version>
--- End diff -- Consider using the latest release available. ---
[GitHub] drill issue #1238: DRILL-6281: Refactor TimedRunnable
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1238 The step is necessary to do RCA for DRILL-5908. There are way too many issues with the current implementation to list them in JIRA or PR, and the major issue is the use of homegrown solutions where Java (or other 3rd-party libraries) already provides the required functionality out of the box. There is no need to use `Runnable` instead of `Callable` and provide custom `Callable` functionality. It is not necessary to wait on a `CountDownLatch` when `ExecutorService` provides the ability to invoke all tasks and return results when all tasks complete or a timeout expires. ---
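The `invokeAll()` point can be illustrated with plain JDK classes; the tasks below are arbitrary examples, not Drill's scan tasks:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class InvokeAllDemo {
  public static int runAll(List<Callable<Integer>> tasks, long timeoutMs) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      // invokeAll() blocks until every task completes or the timeout expires;
      // tasks still running at the deadline are cancelled - no CountDownLatch needed.
      List<Future<Integer>> futures = pool.invokeAll(tasks, timeoutMs, TimeUnit.MILLISECONDS);
      int sum = 0;
      for (Future<Integer> f : futures) {
        if (!f.isCancelled()) {
          try {
            sum += f.get();          // the future is already done; get() does not block
          } catch (ExecutionException e) {
            // a task failed; e.getCause() carries the task's exception
          }
        }
      }
      return sum;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    List<Callable<Integer>> tasks = Arrays.asList(
        () -> 1,
        () -> 2,
        () -> { TimeUnit.MILLISECONDS.sleep(20); return 3; });
    System.out.println(runAll(tasks, 15000)); // prints 6
  }
}
```

The executor enforces the timeout and collects the results, which is the out-of-the-box behavior the comment refers to.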
[GitHub] drill issue #1238: DRILL-6281: Refactor TimedRunnable
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1238 I did not change how tasks (`Runnable` or `Callable`) behave and did not look into converting `Callable/Runnable` to a `ForkJoinTask`. Whether existing tasks can be scheduled recursively or not depends on the nature of those tasks and is out of scope for this PR. I'd suggest filing a JIRA, if one does not exist already, to look into `Fork/Join` (this is what I would expect from the developer who left the "TODO"). ---
[GitHub] drill issue #1238: DRILL-6281: Refactor TimedRunnable
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1238 There is not enough info available to debug and/or troubleshoot DRILL-5908, and instead of trying to find bugs in the homegrown solution I prefer to replace it with Java's out-of-the-box functionality and, at the same time, provide the ability to log enough information to do RCA for DRILL-5908. IMO, there are no unreasonable requests on a PR/JIRA. :) ---
[GitHub] drill pull request #1237: DRILL-6348: Fixed code so that Unordered Receiver ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1237#discussion_r184590436 --- Diff: exec/memory/base/src/main/java/org/apache/drill/exec/memory/AllocationManager.java --- @@ -253,10 +261,12 @@ public boolean transferBalance(final BufferLedger target) { target.historicalLog.recordEvent("incoming(from %s)", owningLedger.allocator.name); } -boolean overlimit = target.allocator.forceAllocate(size); +// Release first to handle the case where the current and target allocators were part of the same +// parent / child tree. allocator.releaseBytes(size); +boolean allocationFit = target.allocator.forceAllocate(size); --- End diff -- In this case, changing the order of `forceAllocate()` and `releaseBytes()` is incorrect as ownership is not changed, but the old owner does not account for that memory anymore. ---
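A toy model (not Drill's `AllocationManager`) of why the order matters when source and target allocators share a parent: allocating on the target before releasing from the source double-counts the bytes against the shared parent, so a transfer that actually fits can be reported as over-limit.

```java
// Toy accounting model; ToyAllocator is illustrative, not Drill's BufferAllocator.
public class TransferDemo {
  static class ToyAllocator {
    final ToyAllocator parent;
    final long limit;
    long allocated;

    ToyAllocator(ToyAllocator parent, long limit) {
      this.parent = parent;
      this.limit = limit;
    }

    // Force-allocate in the spirit of forceAllocate(): always charges the
    // bytes, returns whether every level is still within its limit.
    boolean allocate(long size) {
      allocated += size;
      boolean fits = allocated <= limit;
      if (parent != null) {
        fits &= parent.allocate(size);
      }
      return fits;
    }

    void release(long size) {
      allocated -= size;
      if (parent != null) {
        parent.release(size);
      }
    }
  }

  public static void main(String[] args) {
    ToyAllocator root = new ToyAllocator(null, 100);
    ToyAllocator src = new ToyAllocator(root, 100);
    ToyAllocator dst = new ToyAllocator(root, 100);
    src.allocate(80);                  // root accounts 80 bytes

    // Allocate-then-release: root briefly accounts 160 bytes for 80 real
    // bytes, so the transfer is spuriously reported as over-limit.
    boolean fitAllocateFirst = dst.allocate(80);
    src.release(80);

    // Release-then-allocate: the accounted peak never exceeds real usage.
    dst.release(80);
    src.allocate(80);
    src.release(80);
    boolean fitReleaseFirst = dst.allocate(80);

    System.out.println(fitAllocateFirst + " " + fitReleaseFirst); // prints false true
  }
}
```

The comment's counterpoint also shows up in this model: if the transfer can be aborted after `release()` but before `allocate()` succeeds, the bytes are momentarily unaccounted, which is why the ordering change needs the ownership transfer to be unconditional.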
[GitHub] drill pull request #1237: DRILL-6348: Fixed code so that Unordered Receiver ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1237#discussion_r184596895 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/RawFragmentBatch.java --- @@ -77,4 +83,46 @@ public long getByteCount() { public boolean isAckSent() { return ackSent.get(); } + + /** + * Transfer ownership of this DrillBuf to the target allocator. This is done for better memory + * accounting (that is, the operator should be charged with the body's Drillbuf memory). + * + * NOTES - + * + * This operation is a NOOP when a) the current allocator (associated with the DrillBuf) is not the + * owning allocator or b) the target allocator is already the owner + * When transfer happens, a new RawFragmentBatch instance is allocated; this is done for proper + * DrillBuf reference count accounting + * The RPC handling code caches a reference to this RawFragmentBatch object instance; release() + * calls should be routed to the previous DrillBuf + * + * + * @param targetAllocator target allocator + * @return a new {@link RawFragmentBatch} object instance on success (where the buffer ownership has + * been switched to the target allocator); otherwise this operation is a NOOP (current instance + * returned) + */ + public RawFragmentBatch transferBodyOwnership(BufferAllocator targetAllocator) { +if (body == null) { + return this; // NOOP +} + +if (!body.getLedger().isOwningLedger() + || body.getLedger().isOwner(targetAllocator)) { + + return this; +} + +int writerIndex = body.writerIndex(); +TransferResult transferResult = body.transferOwnership(targetAllocator); + +// Set the index and increment reference count +transferResult.buffer.writerIndex(writerIndex); + +// Clear the current Drillbuffer since caller will perform release() on the new one +body.release(); + +return new RawFragmentBatch(getHeader(), transferResult.buffer, getSender(), false); --- End diff -- This actually brings a question why `newRawFragmentBatch` is released in 
`IncomingBuffers.batchArrived()` instead of releasing `transferredBuffer` after `RawFragmentBatch` is constructed in `newRawFragmentBatch`. ---
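The release bookkeeping under discussion reduces to a reference-count toy, where `Buf` stands in for `DrillBuf` and `transferTo()` for the ownership transfer: the transfer yields a handle holding its own reference, so the original handle must still be released exactly once, somewhere, by whichever party the code designates.

```java
// Toy reference counting; Buf stands in for DrillBuf and transferTo() for the
// ownership transfer that backs the new RawFragmentBatch. Not Drill code.
public class RefCountDemo {
  static class Buf {
    int refCnt = 1;

    // Models the transfer: the new handle adds a reference of its own,
    // like the buffer wrapped into the newly constructed batch.
    Buf transferTo() {
      refCnt++;
      return this;
    }

    void release() {
      refCnt--;
    }

    boolean freed() {
      return refCnt == 0;
    }
  }

  public static void main(String[] args) {
    Buf body = new Buf();
    Buf transferred = body.transferTo(); // handle held by the new batch
    body.release();                      // the original handle is dropped once...
    transferred.release();               // ...and the consumer drops the new one
    System.out.println(body.freed());    // prints true: no leak, no double free
  }
}
```

The review question is exactly about which layer performs the first `release()`: doing it where the transfer happens keeps the pairing local, while deferring it to the caller spreads the invariant across classes.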