[GitHub] drill pull request #656: DRILL-5034: Select timestamp from hive generated pa...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/656#discussion_r98070065

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetReaderUtility.java ---
@@ -323,18 +323,28 @@ public static DateCorruptionStatus checkForCorruptDateValuesInStatistics(Parquet
   * @param binaryTimeStampValue
   *          hive, impala timestamp values with nanoseconds precision
   *          are stored in parquet Binary as INT96 (12 constant bytes)
-  *
+  * @param retainLocalTimezone
+  *          parquet files don't keep local timeZone according to the
+  *          Parquet spec (https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md#timestamp),
+  *          but some tools (hive, for example) retain local timezone for parquet files by default
+  *          Note: Impala doesn't retain local timezone by default
   * @return Unix Timestamp - the number of milliseconds since January 1, 1970, 00:00:00 GMT
   *         represented by @param binaryTimeStampValue .
   */
- public static long getDateTimeValueFromBinary(Binary binaryTimeStampValue) {
+ public static long getDateTimeValueFromBinary(Binary binaryTimeStampValue, boolean retainLocalTimezone) {
    // This method represents binaryTimeStampValue as ByteBuffer, where timestamp is stored as sum of
    // julian day number (32-bit) and nanos of day (64-bit)
    NanoTime nt = NanoTime.fromBinary(binaryTimeStampValue);
    int julianDay = nt.getJulianDay();
    long nanosOfDay = nt.getTimeOfDayNanos();
-   return (julianDay - JULIAN_DAY_NUMBER_FOR_UNIX_EPOCH) * DateTimeConstants.MILLIS_PER_DAY
+   long dateTime = (julianDay - JULIAN_DAY_NUMBER_FOR_UNIX_EPOCH) * DateTimeConstants.MILLIS_PER_DAY
        + nanosOfDay / NANOS_PER_MILLISECOND;
+   if (retainLocalTimezone) {
+     return new org.joda.time.DateTime(dateTime, org.joda.time.chrono.JulianChronology.getInstance())
+         .withZoneRetainFields(org.joda.time.DateTimeZone.UTC).getMillis();
--- End diff --

Trying to understand this: why are you calling .withZoneRetainFields(org.joda.time.DateTimeZone.UTC) if retainLocalTimezone is true?

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
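The question is about the direction of the zone adjustment. One plausible reading, assuming the INT96 fields were written as local wall-clock time: the raw Julian-day arithmetic yields millis as if those fields were UTC, and a zone-retaining conversion reinterprets the same wall-clock fields in another zone. A minimal java.time sketch of that reinterpretation (class and method names are illustrative, not from the PR; this does not reproduce Joda's JulianChronology handling):

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class Int96TimezoneSketch {
    // Reinterpret epoch millis that encode wall-clock fields (as if UTC)
    // as those same fields in the given zone, returning the true instant.
    static long localWallClockToUtc(long millisAsIfUtc, ZoneId zone) {
        LocalDateTime wallClock = LocalDateTime.ofEpochSecond(
            Math.floorDiv(millisAsIfUtc, 1000L),
            (int) (Math.floorMod(millisAsIfUtc, 1000L) * 1_000_000L),
            ZoneOffset.UTC);
        return wallClock.atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        // 1970-01-01T00:00 wall clock in UTC+2 is 1969-12-31T22:00Z
        System.out.println(localWallClockToUtc(0L, ZoneOffset.ofHours(2))); // -7200000
    }
}
```

The shift is by the zone's offset, which is exactly what keeping the calendar fields while swapping the zone does.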
[GitHub] drill pull request #578: DRILL-4280: Kerberos Authentication
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/578#discussion_r98512248

--- Diff: contrib/native/client/src/clientlib/saslAuthenticatorImpl.cpp ---
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include
+#include
+#include
+#include "saslAuthenticatorImpl.hpp"
+
+#include "drillClientImpl.hpp"
+#include "logger.hpp"
+
+namespace Drill {
+
+#define DEFAULT_SERVICE_NAME "drill"
+
+#define KERBEROS_SIMPLE_NAME "kerberos"
+#define KERBEROS_SASL_NAME "gssapi"
+#define PLAIN_NAME "plain"
+
+const std::map<std::string, std::string> SaslAuthenticatorImpl::MECHANISM_MAPPING = boost::assign::map_list_of
+    (KERBEROS_SIMPLE_NAME, KERBEROS_SASL_NAME)
+    (PLAIN_NAME, PLAIN_NAME)
+;
+
+boost::mutex SaslAuthenticatorImpl::s_mutex;
+bool SaslAuthenticatorImpl::s_initialized = false;
+
+SaslAuthenticatorImpl::SaslAuthenticatorImpl(const DrillUserProperties* const properties) :
+    m_properties(properties), m_pConnection(NULL), m_secret(NULL) {
+
+    if (!s_initialized) {
+        boost::lock_guard<boost::mutex> lock(SaslAuthenticatorImpl::s_mutex);
+        if (!s_initialized) {
+            s_initialized = true;
+
+            // set plugin path if provided
+            if (DrillClientConfig::getSaslPluginPath()) {
+                char *saslPluginPath = const_cast<char *>(DrillClientConfig::getSaslPluginPath());
+                sasl_set_path(0, saslPluginPath);
+            }
+
+            sasl_client_init(NULL);
+            { // for debugging purposes
+                const char **mechanisms = sasl_global_listmech();
+                int i = 1;
--- End diff --

Why does this index start from 1 ?
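The constructor above guards one-time SASL library initialization with double-checked locking. A minimal Java analogue of the pattern (illustrative names; in Java the flag must be volatile, and unlike the C++ code above this sketch flips the flag only after initialization succeeds):

```java
public class OneTimeInitSketch {
    private static final Object LOCK = new Object();
    // volatile so the unlocked fast-path check can't observe a stale value
    private static volatile boolean initialized = false;
    static int initCount = 0; // visible for the demo below

    static void ensureInitialized(Runnable init) {
        if (!initialized) {                // fast path, no lock taken
            synchronized (LOCK) {
                if (!initialized) {        // re-check under the lock
                    init.run();
                    initialized = true;
                }
            }
        }
    }

    public static void main(String[] args) {
        ensureInitialized(() -> initCount++);
        ensureInitialized(() -> initCount++); // second call is a no-op
        System.out.println(initCount); // 1
    }
}
```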
[GitHub] drill pull request #578: DRILL-4280: Kerberos Authentication
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/578#discussion_r100938180

--- Diff: contrib/native/client/src/clientlib/drillClientImpl.hpp ---
@@ -362,6 +363,7 @@ class DrillClientImpl : public DrillClientImplBase{
    m_handshakeVersion(0),
    m_handshakeStatus(exec::user::SUCCESS),
    m_bIsConnected(false),
+   m_saslDone(false),
--- End diff --

m_saslAuthenticator should be NULL-initialized.
[GitHub] drill pull request #600: DRILL-4373: Drill and Hive have incompatible timest...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/600#discussion_r82314071

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetReaderUtility.java ---
@@ -45,4 +53,34 @@ public static int getIntFromLEBytes(byte[] input, int start) {
     }
     return out;
   }
+
+  /**
+   * Utilities for converting from parquet INT96 binary (impala, hive timestamp)
+   * to date time value. This utilizes the Joda library.
+   */
+  public static class NanoTimeUtils {
+
+    public static final long NANOS_PER_DAY = TimeUnit.DAYS.toNanos(1);
+    public static final long NANOS_PER_HOUR = TimeUnit.HOURS.toNanos(1);
+    public static final long NANOS_PER_MINUTE = TimeUnit.MINUTES.toNanos(1);
+    public static final long NANOS_PER_SECOND = TimeUnit.SECONDS.toNanos(1);
+    public static final long NANOS_PER_MILLISECOND = TimeUnit.MILLISECONDS.toNanos(1);
+
+    /**
+     * @param binaryTimeStampValue
+     *          hive, impala timestamp values with nanoseconds precision
+     *          are stored in parquet Binary as INT96
+     *
+     * @return the number of milliseconds since January 1, 1970, 00:00:00 GMT
+     *         represented by @param binaryTimeStampValue .
+     */
+    public static long getDateTimeValueFromBinary(Binary binaryTimeStampValue) {
+      NanoTime nt = NanoTime.fromBinary(binaryTimeStampValue);
+      int julianDay = nt.getJulianDay();
+      long nanosOfDay = nt.getTimeOfDayNanos();
+      return DateTimeUtils.fromJulianDay(julianDay-0.5d) + nanosOfDay/NANOS_PER_MILLISECOND;
--- End diff --

1. I would recommend not using Joda. Do the calculations directly, like in ConvertFromImpalaTimestamp. Joda uses non-standard, hence confusing, terminology: what Joda calls and uses as JulianDay is actually the Julian Date. It seems you have identified this discrepancy and adjusted for it by subtracting 0.5 from _julianDay_. Note (I guess you have already figured this out): the actual code and the Joda code in the comment in ConvertFromImpalaTimestamp are inconsistent. It took me a day to figure out the reason behind this! A bug should be opened to delete the comment.
2. Can you please also leave a comment stating that 2440588 is the JDN for the Unix Epoch.
3. Please leave a comment stating that the order of the calls to get _julianDay_ and _nanosOfDay_ matters. You can do this by just stating how timestamps are stored in INT96, i.e. a 32-bit JDN followed by 64-bit nanosOfDay.
4. Consistent (single or none) spacing for the binary operators (+, -, /) used here would be nice; single spacing would be preferable.
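The direct, Joda-free calculation recommended in point 1 can be sketched as follows (a minimal standalone version; the constant names follow the review comments, the class name is illustrative):

```java
public class NanoTimeSketch {
    // 2440588 is the Julian Day Number (JDN) of the Unix epoch, 1970-01-01.
    static final long JULIAN_DAY_NUMBER_FOR_UNIX_EPOCH = 2440588L;
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;
    static final long NANOS_PER_MILLISECOND = 1_000_000L;

    // INT96 stores a 32-bit JDN followed by 64-bit nanos-of-day, so the
    // two fields must be read from the 12-byte value in that order.
    static long toUnixMillis(int julianDay, long nanosOfDay) {
        return (julianDay - JULIAN_DAY_NUMBER_FOR_UNIX_EPOCH) * MILLIS_PER_DAY
            + nanosOfDay / NANOS_PER_MILLISECOND;
    }

    public static void main(String[] args) {
        System.out.println(toUnixMillis(2440588, 0L));         // 0 (the epoch itself)
        System.out.println(toUnixMillis(2440589, 1_000_000L)); // 86400001
    }
}
```

Working in whole days and integer millis avoids the fractional 0.5-day adjustment that Joda's JulianDay/Julian Date mismatch forces.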
[GitHub] drill issue #634: DRILL-4974: NPE in FindPartitionConditions.analyzeCall() f...
Github user bitblender commented on the issue: https://github.com/apache/drill/pull/634

@amansinha100 Can you please review this change?
[GitHub] drill pull request #600: DRILL-4373: Drill and Hive have incompatible timest...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/600#discussion_r83761501

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetReaderUtility.java ---
@@ -45,4 +53,34 @@ public static int getIntFromLEBytes(byte[] input, int start) {
     }
     return out;
   }
+
+  /**
+   * Utilities for converting from parquet INT96 binary (impala, hive timestamp)
+   * to date time value. This utilizes the Joda library.
+   */
+  public static class NanoTimeUtils {
+
+    public static final long NANOS_PER_DAY = TimeUnit.DAYS.toNanos(1);
+    public static final long NANOS_PER_HOUR = TimeUnit.HOURS.toNanos(1);
+    public static final long NANOS_PER_MINUTE = TimeUnit.MINUTES.toNanos(1);
+    public static final long NANOS_PER_SECOND = TimeUnit.SECONDS.toNanos(1);
+    public static final long NANOS_PER_MILLISECOND = TimeUnit.MILLISECONDS.toNanos(1);
+
+    /**
+     * @param binaryTimeStampValue
+     *          hive, impala timestamp values with nanoseconds precision
+     *          are stored in parquet Binary as INT96
+     *
+     * @return the number of milliseconds since January 1, 1970, 00:00:00 GMT
+     *         represented by @param binaryTimeStampValue .
+     */
+    public static long getDateTimeValueFromBinary(Binary binaryTimeStampValue) {
+      NanoTime nt = NanoTime.fromBinary(binaryTimeStampValue);
+      int julianDay = nt.getJulianDay();
+      long nanosOfDay = nt.getTimeOfDayNanos();
+      return DateTimeUtils.fromJulianDay(julianDay-0.5d) + nanosOfDay/NANOS_PER_MILLISECOND;
--- End diff --

Sorry for the late reply. For some reason, I did not see these comments till now. About 1): yes, you are correct. I just want the comments in ConvertFromImpalaTimestamp to be removed.
[GitHub] drill pull request #634: DRILL-4974: NPE in FindPartitionConditions.analyzeC...
GitHub user bitblender opened a pull request: https://github.com/apache/drill/pull/634

DRILL-4974: NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions

Changes: Added a missing null check in FindPartitionConditions.analyzeCall() so that the value of opStack.peek() is dereferenced only after a null check. Without this check, if the expression is holistic, opStack.peek() can return null, and dereferencing that value causes an NPE.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bitblender/drill DRILL-4974

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/634.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #634

commit a519a0987280abeb00e33a8088d2f7d6c9809eed
Author: karthik <kmanivan...@maprtech.com>
Date: 2016-10-20T20:43:17Z

    DRILL-4974: Add missing null check in FindPartitionConditions.analyzeCall()
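The shape of the fix can be sketched in isolation (the OpState class and topIsHolistic method here are illustrative stand-ins, not the actual Drill classes):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class NullCheckSketch {
    // Illustrative stand-in for the operator state Drill keeps on its stack.
    static class OpState { boolean holistic; }

    // Before the fix, code along these lines dereferenced peek() directly;
    // Deque.peek() returns null on an empty deque, which caused the NPE.
    static boolean topIsHolistic(Deque<OpState> opStack) {
        OpState top = opStack.peek();       // may be null
        return top != null && top.holistic; // null check before dereference
    }

    public static void main(String[] args) {
        Deque<OpState> empty = new ArrayDeque<>();
        System.out.println(topIsHolistic(empty)); // false, no NPE
    }
}
```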
[GitHub] drill pull request #690: DRILL-5121 A memory leak is observed when exact cas...
GitHub user bitblender opened a pull request: https://github.com/apache/drill/pull/690

DRILL-5121: A memory leak is observed when the exact case is not specified for a column in a filter condition

Fix for https://issues.apache.org/jira/browse/DRILL-5121. Changes fieldVectorMap in ScanBatch to a CaseInsensitiveMap to work around the memory leak described in the bug.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bitblender/drill DRILL-5121

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/690.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #690

commit 8178e2ac620cdde534b8235e1b699b92bbad87c8
Author: karthik <kmanivan...@maprtech.com>
Date: 2016-11-14T18:36:53Z

    DRILL-5121 Fix for memory leak. Changes fieldVectorMap in ScanBatch to a CaseInsensitiveMap
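The effect of the change can be illustrated with a plain JDK analogue of a case-insensitive map (this sketch is not the Drill CaseInsensitiveMap class; it only shows why case-insensitive keys prevent a duplicate entry):

```java
import java.util.Map;
import java.util.TreeMap;

public class CaseInsensitiveMapSketch {
    public static void main(String[] args) {
        // With a case-sensitive map, "Age" and "age" create two entries; in
        // ScanBatch a duplicate entry could leave an extra value vector
        // unreleased, which is the leak described in DRILL-5121.
        Map<String, Integer> fieldVectorMap =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        fieldVectorMap.put("Age", 1);
        fieldVectorMap.put("age", 2); // replaces the entry instead of duplicating it
        System.out.println(fieldVectorMap.size());     // 1
        System.out.println(fieldVectorMap.get("AGE")); // 2
    }
}
```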
[GitHub] drill pull request #638: DRILL-4982: Separate Hive reader classes for differ...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/638#discussion_r90574793

--- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveAbstractReader.java ---
@@ -218,17 +229,18 @@ private void init() throws ExecutionSetupException {
         }
       }
     } catch (Exception e) {
-      throw new ExecutionSetupException("Failure while initializing HiveRecordReader: " + e.getMessage(), e);
+      throw new ExecutionSetupException("Failure while initializing HiveOrcReader: " + e.getMessage(), e);
--- End diff --

"HiveOrcReader:" in the exception string should be changed so that it prints the actual reader type.
[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/697#discussion_r94350780

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableFixedByteAlignedReaders.java ---
@@ -110,9 +110,14 @@ protected void readField(long recordsToReadInThisPass) {
   /**
    * Class for reading parquet fixed binary type INT96, which is used for storing hive,
-   * impala timestamp values with nanoseconds precision. So it reads such values as a drill timestamp.
+   * impala timestamp values with nanoseconds precision (12 bytes). So it reads such values as a drill timestamp (8 bytes).
    */
   static class NullableFixedBinaryAsTimeStampReader extends NullableFixedByteAlignedReader {
+    /**
+     * The width of each element of the TimeStampVector is 8 byte(s).
+     */
+    private static final int timestampLengthInBits = 64;
--- End diff --

I would recommend the use of uppercase in naming constants.
[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/697#discussion_r94350967

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableFixedByteAlignedReaders.java ---
@@ -132,6 +137,9 @@ protected void readField(long recordsToReadInThisPass) {
         valueVec.getMutator().setSafe(valuesReadInCurrentPass + i, getDateTimeValueFromBinary(binaryTimeStampValue));
       }
     }
+    // The nanos precision is cut to millis. Therefore the length of a single timestamp value is 8 byte(s)
+    // instead of 12 byte(s).
+    dataTypeLengthInBits = timestampLengthInBits;
--- End diff --

Just trying to understand this better: is it OK to set the length to 64 bits even if PARQUET_READER_INT96_AS_TIMESTAMP is set to false?
[GitHub] drill pull request #695: DRILL-4868: fix how hive function set DrillBuf.
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/695#discussion_r93133495

--- Diff: contrib/storage-hive/core/src/main/codegen/templates/ObjectInspectorHelper.java ---
@@ -172,48 +172,35 @@ public static JBlock getDrillObject(JCodeModel m, ObjectInspector oi,
         booleanJC._then().assign(returnValueHolder.ref("value"), JExpr.lit(1));
         booleanJC._else().assign(returnValueHolder.ref("value"), JExpr.lit(0));
-      <#elseif entry.hiveType == "VARCHAR">
-        JVar data = jc._else().decl(m.directClass(byte[].class.getCanonicalName()), "data",
-          castedOI.invoke("getPrimitiveJavaObject").arg(returnValue)
+      <#elseif entry.hiveType == "VARCHAR" || entry.hiveType == "CHAR" || entry.hiveType == "STRING" || entry.hiveType == "BINARY">
+        <#if entry.hiveType == "VARCHAR">
+        JVar data = jc._else().decl(m.directClass(byte[].class.getCanonicalName()), "data",
+          castedOI.invoke("getPrimitiveJavaObject").arg(returnValue)
            .invoke("getValue")
            .invoke("getBytes"));
-
-        jc._else().add(returnValueHolder.ref("buffer")
-          .invoke("setBytes").arg(JExpr.lit(0)).arg(data));
-
-        jc._else().assign(returnValueHolder.ref("start"), JExpr.lit(0));
-        jc._else().assign(returnValueHolder.ref("end"), data.ref("length"));
-      <#elseif entry.hiveType == "CHAR">
-        JVar data = jc._else().decl(m.directClass(byte[].class.getCanonicalName()), "data",
-          castedOI.invoke("getPrimitiveJavaObject").arg(returnValue)
-          .invoke("getStrippedValue")
-          .invoke("getBytes"));
-
-        jc._else().add(returnValueHolder.ref("buffer")
-          .invoke("setBytes").arg(JExpr.lit(0)).arg(data));
-
-        jc._else().assign(returnValueHolder.ref("start"), JExpr.lit(0));
-        jc._else().assign(returnValueHolder.ref("end"), data.ref("length"));
-      <#elseif entry.hiveType == "STRING">
-        JVar data = jc._else().decl(m.directClass(byte[].class.getCanonicalName()), "data",
-          castedOI.invoke("getPrimitiveJavaObject").arg(returnValue)
+        <#elseif entry.hiveType == "CHAR">
+        JVar data = jc._else().decl(m.directClass(byte[].class.getCanonicalName()), "data",
+          castedOI.invoke("getPrimitiveJavaObject").arg(returnValue)
+          .invoke("getStrippedValue")
+          .invoke("getBytes"));
+        <#elseif entry.hiveType == "STRING">
+        JVar data = jc._else().decl(m.directClass(byte[].class.getCanonicalName()), "data",
+          castedOI.invoke("getPrimitiveJavaObject").arg(returnValue)
            .invoke("getBytes"));
-        jc._else().add(returnValueHolder.ref("buffer")
-          .invoke("setBytes").arg(JExpr.lit(0)).arg(data));
-        jc._else().assign(returnValueHolder.ref("start"), JExpr.lit(0));
-        jc._else().assign(returnValueHolder.ref("end"), data.ref("length"));
-      <#elseif entry.hiveType == "BINARY">
+        <#elseif entry.hiveType == "BINARY">
+        JVar data = jc._else().decl(m.directClass(byte[].class.getCanonicalName()), "data",
+          castedOI.invoke("getPrimitiveJavaObject").arg(returnValue));
+
-        JVar data = jc._else().decl(m.directClass(byte[].class.getCanonicalName()), "data",
-          castedOI.invoke("getPrimitiveJavaObject").arg(returnValue));
-        jc._else().add(returnValueHolder.ref("buffer")
+        JConditional jnullif = jc._else()._if(data.eq(JExpr._null()));
+        jnullif._then().assign(returnValueHolder.ref("isSet"), JExpr.lit(0));
--- End diff --

Why is isSet set to 0 here ?
[GitHub] drill pull request #783: DRILL-5324: Provide simplified column reader/writer...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/783#discussion_r111289506

--- Diff: exec/vector/src/main/codegen/templates/ColumnAccessors.java ---
@@ -0,0 +1,333 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile name="/org/apache/drill/exec/vector/accessor/ColumnAccessors.java" />
+<#include "/@includes/license.ftl" />
+<#macro getType label>
+@Override
+public ValueType valueType() {
+  <#if label == "Int">
+  return ValueType.INTEGER;
+  <#else>
+  return ValueType.${label?upper_case};
+  </#if>
+}
+</#macro>
+<#macro bindReader prefix drillType>
+  <#if drillType = "Decimal9" || drillType == "Decimal18">
+private MaterializedField field;
+  </#if>
+private ${prefix}${drillType}Vector.Accessor accessor;
+
+@Override
+public void bind(RowIndex vectorIndex, ValueVector vector) {
+  bind(vectorIndex);
+  <#if drillType = "Decimal9" || drillType == "Decimal18">
+  field = vector.getField();
+  </#if>
+  accessor = ((${prefix}${drillType}Vector) vector).getAccessor();
+}
+
+  <#if drillType = "Decimal9" || drillType == "Decimal18">
+@Override
+public void bind(RowIndex vectorIndex, MaterializedField field, VectorAccessor va) {
+  bind(vectorIndex, field, va);
+  this.field = field;
+}
+  </#if>
+
+private ${prefix}${drillType}Vector.Accessor accessor() {
+  if (vectorAccessor == null) {
+    return accessor;
+  } else {
+    return ((${prefix}${drillType}Vector) vectorAccessor.vector()).getAccessor();
+  }
+}
+</#macro>
+<#macro get drillType accessorType label isArray>
+@Override
+public ${accessorType} get${label}(<#if isArray>int index</#if>) {
+  <#if isArray>
+    <#assign index=", index"/>
+    <#assign getObject="getSingleObject">
+  <#else>
+    <#assign index=""/>
+    <#assign getObject="getObject">
+  </#if>
+  <#if drillType == "VarChar">
+  return new String(accessor().get(vectorIndex.index()${index}), Charsets.UTF_8);
+  <#elseif drillType == "Var16Char">
+  return new String(accessor().get(vectorIndex.index()${index}), Charsets.UTF_16);
+  <#elseif drillType == "VarBinary">
+  return accessor().get(vectorIndex.index()${index});
+  <#elseif drillType == "Decimal9" || drillType == "Decimal18">
+  return DecimalUtility.getBigDecimalFromPrimitiveTypes(
+      accessor().get(vectorIndex.index()${index}),
+      field.getScale(),
+      field.getPrecision());
+  <#elseif accessorType == "Decimal18">
+  return DecimalUtilities.getBigDecimalFromPrimitiveTypes(accessor().${getObject}(vectorIndex.index()${index});
--- End diff --

As discussed offline, this seems to be dead code, as there is no DecimalUtilities class in the Drill source base.
[GitHub] drill pull request #875: DRILL-5671 Set secure ACLs (Access Control List) fo...
GitHub user bitblender opened a pull request: https://github.com/apache/drill/pull/875

DRILL-5671: Set secure ACLs (Access Control List) for Drill ZK nodes in a secure cluster

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bitblender/drill DRILL-5671-PR-2

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/875.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #875

commit 0b43d451df019026cc6d50128d229340580bb82e
Author: karthik <kmanivan...@maprtech.com>
Date: 2017-06-20T17:45:27Z

    DRILL 5671 - Set secure ACLs (Access Control List) for Drill ZK nodes in a secure cluster
[GitHub] drill pull request #875: DRILL-5671 Set secure ACLs (Access Control List) fo...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/875#discussion_r127367344

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKSecureACLProvider.java ---
@@ -0,0 +1,71 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.coord.zk;
+
+import com.google.common.collect.ImmutableList;
+import org.apache.curator.framework.api.ACLProvider;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.data.ACL;
+
+import java.util.List;
+
+/**
+ * ZKSecureACLProvider restricts access to znodes created by Drill.
+ * The cluster discovery znode, i.e. the znode containing the list of Drillbits, is
+ * readable by anyone.
+ * For all other znodes, only the creator of the znode, i.e. the Drillbit user, has full access.
+ */
+
+public class ZKSecureACLProvider implements ACLProvider {
+
+    static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ZKSecureACLProvider.class);
+    // Creator has full access
+    static ImmutableList<ACL> DEFAULT_ACL = new ImmutableList.Builder<ACL>()
+        .addAll(Ids.CREATOR_ALL_ACL.iterator())
+        .build();
+    // Creator has full access
+    // Everyone else has only read access
+    static ImmutableList<ACL> DRILL_CLUSTER_ACL = new ImmutableList.Builder<ACL>()
--- End diff --

Done
[GitHub] drill pull request #875: DRILL-5671 Set secure ACLs (Access Control List) fo...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/875#discussion_r127367354

--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -112,7 +112,8 @@ drill.exec: {
    retry: {
      count: 7200,
      delay: 500
-    }
+    },
+    use.secure_acl: false
--- End diff --

Opened https://maprdrill.atlassian.net/browse/MD-2278
[GitHub] drill pull request #875: DRILL-5671 Set secure ACLs (Access Control List) fo...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/875#discussion_r127540793

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKACLProviderFactory.java ---
@@ -0,0 +1,44 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.coord.zk;
+
+import org.apache.curator.framework.api.ACLProvider;
+import org.apache.curator.framework.imps.DefaultACLProvider;
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ExecConstants;
+
+
+public class ZKACLProviderFactory {
+
+    static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ZKACLProviderFactory.class);
+
+    public static ACLProvider getACLProvider(DrillConfig config, String clusterId, String zkRoot) {
--- End diff --

Will make that change.
[GitHub] drill pull request #875: DRILL-5671 Set secure ACLs (Access Control List) fo...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/875#discussion_r127540711

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -39,6 +39,7 @@
   String ZK_TIMEOUT = "drill.exec.zk.timeout";
   String ZK_ROOT = "drill.exec.zk.root";
   String ZK_REFRESH = "drill.exec.zk.refresh";
+  String ZK_SECURE_ACL = "drill.exec.zk.use.secure_acl";
--- End diff --

I was going to make this "zk.use_secure_acl", but when I was looking for an example I found "bit.use.ip", so I modeled it on that. I will change this to "zk.apply_secure_acl".
[GitHub] drill pull request #866: DRILL-5657: Implement size-aware result set loader
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/866#discussion_r131459203 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/ResultSetLoaderImpl.java --- @@ -0,0 +1,412 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.physical.rowSet.impl; + +import java.util.Collection; + +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.memory.BufferAllocator; +import org.apache.drill.exec.physical.rowSet.ResultSetLoader; +import org.apache.drill.exec.physical.rowSet.TupleLoader; +import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.ValueVector; + +/** + * Implementation of the result set loader. 
+ * @see {@link ResultSetLoader} + */ + +public class ResultSetLoaderImpl implements ResultSetLoader, WriterIndexImpl.WriterIndexListener { + + public static class ResultSetOptions { +public final int vectorSizeLimit; +public final int rowCountLimit; +public final boolean caseSensitive; +public final ResultVectorCache inventory; --- End diff -- The name 'inventory' does not convey the intent clearly.
[GitHub] drill pull request #866: DRILL-5657: Implement size-aware result set loader
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/866#discussion_r131684173 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/TupleSetImpl.java --- @@ -0,0 +1,551 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.physical.rowSet.impl; + +import java.util.ArrayList; +import java.util.List; + +import org.apache.drill.common.types.TypeProtos.DataMode; +import org.apache.drill.exec.expr.TypeHelper; +import org.apache.drill.exec.physical.rowSet.ColumnLoader; +import org.apache.drill.exec.physical.rowSet.TupleLoader; +import org.apache.drill.exec.physical.rowSet.TupleSchema; +import org.apache.drill.exec.physical.rowSet.impl.ResultSetLoaderImpl.VectorContainerBuilder; +import org.apache.drill.exec.record.BatchSchema; +import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode; +import org.apache.drill.exec.record.MaterializedField; +import org.apache.drill.exec.vector.AllocationHelper; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VectorOverflowException; +import org.apache.drill.exec.vector.accessor.impl.AbstractColumnWriter; +import org.apache.drill.exec.vector.accessor.impl.ColumnAccessorFactory; + +/** + * Implementation of a column when creating a row batch. + * Every column resides at an index, is defined by a schema, + * is backed by a value vector, and and is written to by a writer. + * Each column also tracks the schema version in which it was added + * to detect schema evolution. Each column has an optional overflow + * vector that holds overflow record values when a batch becomes + * full. + * + * Overflow vectors require special consideration. The vector class itself + * must remain constant as it is bound to the writer. To handle overflow, + * the implementation must replace the buffer in the vector with a new + * one, saving the full vector to return as part of the final row batch. + * This puts the column in one of three states: + * + * Normal: only one vector is of concern - the vector for the active + * row batch. + * Overflow: a write to a vector caused overflow. 
For all columns, + * the data buffer is shifted to a harvested vector, and a new, empty + * buffer is put into the active vector. + * Excess: a (small) column received values for the row that will + * overflow due to a later column. When overflow occurs, the excess + * column value, from the overflow record, resides in the active + * vector. It must be shifted from the active vector into the new + * overflow buffer. + */ + +public class TupleSetImpl implements TupleSchema { + + public static class TupleLoaderImpl implements TupleLoader { + +public TupleSetImpl tupleSet; + +public TupleLoaderImpl(TupleSetImpl tupleSet) { + this.tupleSet = tupleSet; +} + +@Override +public TupleSchema schema() { return tupleSet; } + +@Override +public ColumnLoader column(int colIndex) { + // TODO: Cache loaders here + return tupleSet.columnImpl(colIndex).writer; +} + +@Override +public ColumnLoader column(String colName) { + ColumnImpl col = tupleSet.columnImpl(colName); + if (col == null) { +throw new UndefinedColumnException(colName); + } + return col.writer; +} + +@Override +public TupleLoader loadRow(Object... values) { --- End diff -- Is there a need to verify the types of the incoming args?
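One way to answer the type-verification question raised on loadRow(Object... values) is a defensive check over the varargs before any column writer is touched. The sketch below is purely illustrative — the RowTypeChecker class and its method are hypothetical, not part of the PR:

```java
// Hypothetical sketch: validating the runtime types of varargs passed to a
// loadRow(Object...) style method before any vector is written.
public class RowTypeChecker {

    // 'expected' holds the Java class per column, e.g. {Integer.class, String.class}
    public static boolean typesMatch(Class<?>[] expected, Object... values) {
        if (values.length != expected.length) {
            return false; // arity mismatch is the cheapest error to catch early
        }
        for (int i = 0; i < values.length; i++) {
            // Nulls are allowed (nullable columns); only non-null values are checked
            if (values[i] != null && !expected[i].isInstance(values[i])) {
                return false;
            }
        }
        return true;
    }
}
```

The check costs one `isInstance` call per column, so it is cheap enough to run unconditionally, or it could be guarded by an `assert` in hot paths.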
[GitHub] drill pull request #866: DRILL-5657: Implement size-aware result set loader
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/866#discussion_r131685349 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/TupleSetImpl.java --- @@ -0,0 +1,551 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.physical.rowSet.impl; + +import java.util.ArrayList; +import java.util.List; + +import org.apache.drill.common.types.TypeProtos.DataMode; +import org.apache.drill.exec.expr.TypeHelper; +import org.apache.drill.exec.physical.rowSet.ColumnLoader; +import org.apache.drill.exec.physical.rowSet.TupleLoader; +import org.apache.drill.exec.physical.rowSet.TupleSchema; +import org.apache.drill.exec.physical.rowSet.impl.ResultSetLoaderImpl.VectorContainerBuilder; +import org.apache.drill.exec.record.BatchSchema; +import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode; +import org.apache.drill.exec.record.MaterializedField; +import org.apache.drill.exec.vector.AllocationHelper; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.VectorOverflowException; +import org.apache.drill.exec.vector.accessor.impl.AbstractColumnWriter; +import org.apache.drill.exec.vector.accessor.impl.ColumnAccessorFactory; + +/** + * Implementation of a column when creating a row batch. + * Every column resides at an index, is defined by a schema, + * is backed by a value vector, and and is written to by a writer. + * Each column also tracks the schema version in which it was added + * to detect schema evolution. Each column has an optional overflow + * vector that holds overflow record values when a batch becomes + * full. + * + * Overflow vectors require special consideration. The vector class itself + * must remain constant as it is bound to the writer. To handle overflow, + * the implementation must replace the buffer in the vector with a new + * one, saving the full vector to return as part of the final row batch. + * This puts the column in one of three states: + * + * Normal: only one vector is of concern - the vector for the active + * row batch. + * Overflow: a write to a vector caused overflow. 
For all columns, + * the data buffer is shifted to a harvested vector, and a new, empty + * buffer is put into the active vector. + * Excess: a (small) column received values for the row that will --- End diff -- 'Excess' is the LOOK_AHEAD state, correct? I think it would be better if the comments use the same terminology as in the code.
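The three column states described in this comment block (with "Excess" renamed to the code's own LOOK_AHEAD term, as the review suggests) could be modeled as a plain enum. This is a hypothetical sketch, not Drill's actual implementation; the class and method names are illustrative:

```java
// Hypothetical sketch of the three column states described above, using the
// code's LOOK_AHEAD name for what the javadoc calls "Excess".
public class ColumnStateDemo {

    public enum State {
        NORMAL,     // one active vector, no overflow yet
        OVERFLOW,   // data buffer shifted to the harvested vector, fresh buffer active
        LOOK_AHEAD  // value for the overflow row already written; must be shifted
    }

    // Transition sketch: what happens to a column when the batch overflows.
    // 'wroteOverflowRow' is true for a column that already received the value
    // belonging to the row that triggered the overflow.
    public static State onBatchOverflow(boolean wroteOverflowRow) {
        return wroteOverflowRow ? State.LOOK_AHEAD : State.OVERFLOW;
    }
}
```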
[GitHub] drill pull request #866: DRILL-5657: Implement size-aware result set loader
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/866#discussion_r131554894 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/ResultSetLoaderImpl.java --- @@ -0,0 +1,412 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.physical.rowSet.impl; + +import java.util.Collection; + +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.memory.BufferAllocator; +import org.apache.drill.exec.physical.rowSet.ResultSetLoader; +import org.apache.drill.exec.physical.rowSet.TupleLoader; +import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode; +import org.apache.drill.exec.record.VectorContainer; +import org.apache.drill.exec.vector.ValueVector; + +/** + * Implementation of the result set loader. 
+ * @see {@link ResultSetLoader} + */ + +public class ResultSetLoaderImpl implements ResultSetLoader, WriterIndexImpl.WriterIndexListener { + + public static class ResultSetOptions { +public final int vectorSizeLimit; +public final int rowCountLimit; +public final boolean caseSensitive; +public final ResultVectorCache inventory; +private final Collection selection; + +public ResultSetOptions() { + vectorSizeLimit = ValueVector.MAX_BUFFER_SIZE; + rowCountLimit = ValueVector.MAX_ROW_COUNT; + caseSensitive = false; + selection = null; + inventory = null; +} + +public ResultSetOptions(OptionBuilder builder) { + this.vectorSizeLimit = builder.vectorSizeLimit; + this.rowCountLimit = builder.rowCountLimit; + this.caseSensitive = builder.caseSensitive; + this.selection = builder.selection; + this.inventory = builder.inventory; +} + } + + public static class OptionBuilder { +private int vectorSizeLimit; +private int rowCountLimit; +private boolean caseSensitive; +private Collection selection; +private ResultVectorCache inventory; + +public OptionBuilder() { + ResultSetOptions options = new ResultSetOptions(); + vectorSizeLimit = options.vectorSizeLimit; + rowCountLimit = options.rowCountLimit; + caseSensitive = options.caseSensitive; +} + +public OptionBuilder setCaseSensitive(boolean flag) { + caseSensitive = flag; + return this; +} + +public OptionBuilder setRowCountLimit(int limit) { + rowCountLimit = Math.min(limit, ValueVector.MAX_ROW_COUNT); + return this; +} + +public OptionBuilder setSelection(Collection selection) { + this.selection = selection; + return this; +} + +public OptionBuilder setVectorCache(ResultVectorCache inventory) { + this.inventory = inventory; + return this; +} + +// TODO: No setter for vector length yet: is hard-coded +// at present in the value vector. 
+ +public ResultSetOptions build() { + return new ResultSetOptions(this); +} + } + + public static class VectorContainerBuilder { +private final ResultSetLoaderImpl rowSetMutator; +private int lastUpdateVersion = -1; +private VectorContainer container; + +public VectorContainerBuilder(ResultSetLoaderImpl rowSetMutator) { + this.rowSetMutator = rowSetMutator; + container = new VectorContainer(rowSetMutator.allocator); +} + +public void update() { + if (lastUpdateVersion < rowSetMutator.schemaVersion()) { +rowSetMutator.rootTuple.buildContainer(this); +container.buildSchema(SelectionVectorMode.NONE); +lastUpdateVersion = rowSetMutator.schemaVersion(); + } +} + +public VectorContainer container() { return container; }
[GitHub] drill pull request #866: DRILL-5657: Implement size-aware result set loader
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/866#discussion_r131216670 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/ColumnLoaderImpl.java --- @@ -0,0 +1,31 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.physical.rowSet.impl; + +import org.apache.drill.exec.physical.rowSet.ColumnLoader; + +/** + * Implementation interface for a column loader. Adds to the public interface + * a number of methods needed to coordinate batch overflow. + */ + +public interface ColumnLoaderImpl extends ColumnLoader { --- End diff -- "Impl" in an interface name sounds odd.
[GitHub] drill pull request #866: DRILL-5657: Implement size-aware result set loader
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/866#discussion_r130429208 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/TupleSchema.java --- @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.physical.rowSet; + +import org.apache.drill.exec.physical.rowSet.impl.MaterializedSchema; +import org.apache.drill.exec.record.BatchSchema; +import org.apache.drill.exec.record.MaterializedField; + +/** + * Defines the schema of a tuple: either the top-level row or a nested + * "map" (really structure). A schema is a collection of columns (backed + * by vectors in the loader itself.) Columns are accessible by name or + * index. New columns may be added at any time; the new column takes the + * next available index. + */ + +public interface TupleSchema { + + public interface TupleColumnSchema { +MaterializedField schema(); + +/** + * Report if a column is selected. 
+ * @param colIndex index of the column to check + * @return true if the column is selected (data is collected), + * false if the column is unselected (data is discarded) + */ + +boolean isSelected(); --- End diff -- What does it mean for a column to be selected? Selected in the query?
[GitHub] drill pull request #866: DRILL-5657: Implement size-aware result set loader
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/866#discussion_r131564509 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/ResultVectorCache.java --- @@ -0,0 +1,181 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.physical.rowSet.impl; + +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.apache.drill.common.types.TypeProtos.MajorType; +import org.apache.drill.exec.expr.TypeHelper; +import org.apache.drill.exec.memory.BufferAllocator; +import org.apache.drill.exec.record.MaterializedField; +import org.apache.drill.exec.vector.ValueVector; + +/** + * Manages an inventory of value vectors used across row batch readers. + * Drill semantics for batches is complex. Each operator logically returns + * a batch of records on each call of the Drill Volcano iterator protocol + * next() operation. However, the batches "returned" are not + * separate objects. Instead, Drill enforces the following semantics: + * + * If a next() call returns OK then the set of vectors + * in the "returned" batch must be identical to those in the prior batch. 
Not + * just the same type; they must be the same ValueVector objects. + * (The buffers within the vectors will be different.) + * If the set of vectors changes in any way (add a vector, remove a + * vector, change the type of a vector), then the next() call + * must return OK_NEW_SCHEMA. + * + * These rules create interesting constraints for the scan operator. + * Conceptually, each batch is distinct. But, it must share vectors. The + * {@link ResultSetLoader} class handles this by managing the set of vectors + * used by a single reader. + * + * Readers are independent: each may read a distinct schema (as in JSON.) + * Yet, the Drill protocol requires minimizing spurious OK_NEW_SCHEMA + * events. As a result, two readers run by the same scan operator must + * share the same set of vectors, despite the fact that they may have + * different schemas and thus different ResultSetLoaders. + * + * The purpose of this inventory is to persist vectors across readers, even + * when, say, reader B does not use a vector that reader A created. + * + * The semantics supported by this class include: + * + * Ability to "pre-declare" columns based on columns that appear in + * an explicit select list. This ensures that the columns are known (but + * not their types). + * Ability to reuse a vector across readers if the column retains the same + * name and type (minor type and mode.) + * Ability to flush unused vectors for readers with changing schemas + * if a schema change occurs. + * Support schema "hysteresis"; that is, the a "sticky" schema that + * minimizes spurious changes. Once a vector is declared, it can be included + * in all subsequent batches (provided the column is nullable or an array.) + * + */ +public class ResultVectorCache { + + /** + * State of a projected vector. At first all we have is a name. + * Later, we'll discover the type. 
+ */ + + private static class VectorState { +protected final String name; +protected ValueVector vector; +protected boolean touched; + +public VectorState(String name) { + this.name = name; +} + +public boolean satisfies(MaterializedField colSchema) { + if (vector == null) { +return false; + } + MaterializedField vectorSchema = vector.getField(); + return vectorSchema.getType().equals(colSchema.getType()); +} + } + + private final BufferAllocator allocator; + private final Map<String, VectorState> vectors = new HashMap<>(); +
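The reuse rule this javadoc describes — a vector survives across readers only if the requested column keeps the same name and type — can be sketched in miniature. Everything here is illustrative: Drill's real cache keys on MaterializedField and MajorType, not on plain strings:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified sketch of the vector-reuse rule described above:
// an entry is reused only if the requested column has the same name and the
// same type; otherwise a fresh entry replaces it (a schema change).
public class VectorCacheSketch {

    public static final class Entry {
        final String name;
        final String type; // stands in for Drill's MajorType (minor type + mode)
        Entry(String name, String type) { this.name = name; this.type = type; }
    }

    private final Map<String, Entry> cache = new HashMap<>();

    // Returns true if an existing entry was reused, false if a new one was
    // created — the "false" case is what would force an OK_NEW_SCHEMA.
    public boolean addOrReuse(String name, String type) {
        Entry e = cache.get(name);
        if (e != null && e.type.equals(type)) {
            return true; // same name + type: reuse, no spurious schema change
        }
        cache.put(name, new Entry(name, type));
        return false;
    }
}
```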
[GitHub] drill pull request #866: DRILL-5657: Implement size-aware result set loader
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/866#discussion_r131284158 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/LogicalTupleLoader.java --- @@ -0,0 +1,204 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.physical.rowSet.impl; + +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +import org.apache.drill.exec.physical.rowSet.ColumnLoader; +import org.apache.drill.exec.physical.rowSet.TupleLoader; +import org.apache.drill.exec.physical.rowSet.TupleSchema; +import org.apache.drill.exec.physical.rowSet.TupleSchema.TupleColumnSchema; +import org.apache.drill.exec.record.BatchSchema; +import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode; +import org.apache.drill.exec.record.MaterializedField; + +/** + * Shim inserted between an actual tuple loader and the client to remove columns + * that are not projected from input to output. The underlying loader handles only + * the projected columns in order to improve efficiency. 
This class presents the + * full table schema, but returns null for the non-projected columns. This allows + * the reader to work with the table schema as defined by the data source, but + * skip those columns which are not projected. Skipping non-projected columns avoids + * creating value vectors which are immediately discarded. It may also save the reader + * from reading unwanted data. + */ +public class LogicalTupleLoader implements TupleLoader { + + public static final int UNMAPPED = -1; + + private static class MappedColumn implements TupleColumnSchema { + +private final MaterializedField schema; +private final int mapping; + +public MappedColumn(MaterializedField schema, int mapping) { + this.schema = schema; + this.mapping = mapping; +} + +@Override +public MaterializedField schema() { return schema; } + +@Override +public boolean isSelected() { return mapping != UNMAPPED; } + +@Override +public int vectorIndex() { return mapping; } + } + + /** + * Implementation of the tuple schema that describes the full data source + * schema. The underlying loader schema is a subset of these columns. Note + * that the columns appear in the same order in both schemas, but the loader + * schema is a subset of the table schema. 
+ */ + + private class LogicalTupleSchema implements TupleSchema { + +private final Set selection = new HashSet<>(); +private final TupleSchema physicalSchema; + +private LogicalTupleSchema(TupleSchema physicalSchema, Collection selection) { + this.physicalSchema = physicalSchema; + this.selection.addAll(selection); +} + +@Override +public int columnCount() { return logicalSchema.count(); } + +@Override +public int columnIndex(String colName) { + return logicalSchema.indexOf(rsLoader.toKey(colName)); +} + +@Override +public TupleColumnSchema metadata(int colIndex) { return logicalSchema.get(colIndex); } + +@Override +public MaterializedField column(int colIndex) { return logicalSchema.get(colIndex).schema(); } + +@Override +public TupleColumnSchema metadata(String colName) { return logicalSchema.get(colName); } + +@Override +public MaterializedField column(String colName) { return logicalSchema.get(colName).schema(); } + +@Override +public int addColumn(MaterializedField columnSchema) { + String key = rsLoader.toKey(columnSchema.getName()); + int pIndex; + if (selection.contains(key)) { --- End diff -- selection is already normalized
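The projection behavior described in this class — table columns mapped to loader indexes, with non-projected columns marked UNMAPPED — can be sketched as follows. Names are hypothetical, and per the review note the selection set is assumed to be already normalized, so no extra normalization happens here:

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the projection mapping described above: each table
// column either gets the next output vector index or UNMAPPED (-1).
public class ProjectionMapSketch {

    public static final int UNMAPPED = -1;

    // 'selection' is assumed already normalized (e.g. lower-cased keys).
    public static int[] mapColumns(List<String> tableColumns, Set<String> selection) {
        int[] mapping = new int[tableColumns.size()];
        int next = 0; // next physical (projected) index
        for (int i = 0; i < tableColumns.size(); i++) {
            mapping[i] = selection.contains(tableColumns.get(i)) ? next++ : UNMAPPED;
        }
        return mapping;
    }
}
```

A reader can then consult the mapping and skip writes for UNMAPPED columns, which is exactly the vector-creation savings the javadoc mentions.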
[GitHub] drill pull request #866: DRILL-5657: Implement size-aware result set loader
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/866#discussion_r130250994 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/ResultSetLoader.java --- @@ -0,0 +1,170 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.physical.rowSet; + +import org.apache.drill.exec.record.VectorContainer; + +/** + * Builds a result set (series of zero or more row sets) based on a defined + * schema which may + * evolve (expand) over time. Automatically rolls "overflow" rows over + * when a batch fills. + * + * Many of the methods in this interface are verify that the loader is --- End diff -- "to verify"
[GitHub] drill issue #875: DRILL-5671 Set secure ACLs (Access Control List) for Drill...
Github user bitblender commented on the issue: https://github.com/apache/drill/pull/875 Yes. There will be a commit addressing the issues raised here.
[GitHub] drill pull request #894: DRILL-5701: drill.connections.rpc.
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/894#discussion_r134553446 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/rpc/user/security/TestUserBitKerberos.java --- @@ -137,6 +142,41 @@ public Void run() throws Exception { test("SELECT * FROM cp.`region.json` LIMIT 5"); } + @Test + public void testUnecryptedConnectionCounter() throws Exception { +final Properties connectionProps = new Properties(); +connectionProps.setProperty(DrillProperties.SERVICE_PRINCIPAL, krbHelper.SERVER_PRINCIPAL); +connectionProps.setProperty(DrillProperties.KERBEROS_FROM_SUBJECT, "true"); +final Subject clientSubject = JaasKrbUtil.loginUsingKeytab(krbHelper.CLIENT_PRINCIPAL, +krbHelper.clientKeytab.getAbsoluteFile()); + +Subject.doAs(clientSubject, new PrivilegedExceptionAction() { + @Override + public Void run() throws Exception { +updateClient(connectionProps); +return null; + } +}); + +// Run a few queries using the new client +testBuilder() +.sqlQuery("SELECT session_user FROM (SELECT * FROM sys.drillbits LIMIT 1)") +.unOrdered() +.baselineColumns("session_user") +.baselineValues(krbHelper.CLIENT_SHORT_NAME) +.go(); + +// Check encrypted counters value --- End diff -- Please explain the logic behind the different count values in the assertion. Same for other tests as well.
[GitHub] drill pull request #840: DRILL-5517: Size-aware set methods in value vectors
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/840#discussion_r122526465 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -806,10 +998,32 @@ public void generateTestDataAlt(int size) { } <#-- type.width --> +/** + * Backfill missing offsets from the given last written position to the + * given current write position. Used by the "new" size-safe column + * writers to allow skipping values. The set() and setSafe() + * do not fill empties. See DRILL-5529 and DRILL-. + * @param lastWrite the position of the last valid write: the offset + * to be copied forward + * @param index the current write position filling occurs up to, + * but not including, this position + */ + +public void fillEmptiesBounded(int lastWrite, int index) +throws VectorOverflowException { + for (int i = lastWrite; i < index; i++) { +<#if type.width <= 8> --- End diff -- You can avoid an extra op in the loop by adjusting the bounds of the induction variable. The compiler's induction variable analysis might automatically figure this one out. public void fillEmptiesBounded(int lastWrite, int index) throws VectorOverflowException { for (int i = lastWrite + 1; i <= index; i++) { setSafe(i, (int) 0); // <-- one less addition in the loop and less register pressure } }
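The reviewer's claim, that shifting the loop bounds removes a per-iteration add without changing behavior, can be checked with a small standalone sketch. This assumes the original loop body writes at position `i + 1` (the quoted diff is cut off before the body); the `writes` vectors just record which positions would be filled, standing in for `setSafe`:

```cpp
#include <cassert>
#include <vector>

// Positions filled by a loop of the original shape, assuming the body
// writes at i + 1 (an assumption; the quoted diff omits the loop body).
std::vector<int> fillOriginalStyle(int lastWrite, int index) {
    std::vector<int> writes;
    for (int i = lastWrite; i < index; i++) {
        writes.push_back(i + 1);  // one add executed on every iteration
    }
    return writes;
}

// Positions filled by the reviewer's adjusted loop: the add is folded
// into the bounds once, so the body uses i directly.
std::vector<int> fillAdjustedStyle(int lastWrite, int index) {
    std::vector<int> writes;
    for (int i = lastWrite + 1; i <= index; i++) {
        writes.push_back(i);
    }
    return writes;
}
```

Both loops visit exactly the positions `lastWrite + 1` through `index`, so under that assumption the transformation is behavior-preserving, which is also why a compiler's induction-variable analysis may perform it automatically.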
[GitHub] drill pull request #840: DRILL-5517: Size-aware set methods in value vectors
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/840#discussion_r118806486 --- Diff: exec/memory/base/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java --- @@ -174,6 +175,40 @@ public ByteBuf setDouble(int index, double value) { return this; } + // Clone of the super class checkIndex, but this version returns a boolean rather + // than throwing an exception. + + protected boolean hasCapacity(int index, int fieldLength) { +if (fieldLength < 0) { --- End diff -- Change this to an assertion as per our discussion?
[GitHub] drill pull request #840: DRILL-5517: Size-aware set methods in value vectors
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/840#discussion_r122533121 --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java --- @@ -548,6 +567,23 @@ public void setSafe(int index, ByteBuffer bytes, int start, int length) { } } +public void setScalar(int index, DrillBuf bytes, int start, int length) throws VectorOverflowException { + assert index >= 0; + + if (index >= MAX_ROW_COUNT) { +throw new VectorOverflowException(); + } + int currentOffset = offsetVector.getAccessor().get(index); + final int newSize = currentOffset + length; + if (newSize > MAX_BUFFER_SIZE) { +throw new VectorOverflowException(); + } + while (! data.setBytesBounded(currentOffset, bytes, start, length)) { --- End diff -- indentation
[GitHub] drill pull request #840: DRILL-5517: Size-aware set methods in value vectors
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/840#discussion_r122541355 --- Diff: exec/memory/base/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java --- @@ -174,6 +175,40 @@ public ByteBuf setDouble(int index, double value) { return this; } + // Clone of the super class checkIndex, but this version returns a boolean rather + // than throwing an exception. + + protected boolean hasCapacity(int index, int fieldLength) { +if (fieldLength < 0) { +throw new IllegalArgumentException("length: " + fieldLength + " (expected: >= 0)"); +} +return (! (index < 0 || index > capacity() - fieldLength)); + } + + // Clone of the super class setBytes(), but with bounds checking done as a boolean, + // not assertion. + + public boolean setBytesBounded(int index, byte[] src, int srcIndex, int length) { +if (! hasCapacity(index, length)) { + return false; +} +if (length != 0) { --- End diff -- Remove length check
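The pattern under review, a bounds check that reports failure instead of throwing, can be sketched with a stand-in buffer class (the names mirror the diff, but this is not the Drill implementation). It also shows why the reviewer asks to drop the `length != 0` guard: a zero-length copy is already a no-op.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Stand-in buffer illustrating the boolean-returning bounds check from
// the diff: out-of-range writes report false instead of throwing, so
// callers can recover (e.g. end the batch) without exception overhead.
class BoundedBuf {
    std::vector<unsigned char> data;
public:
    explicit BoundedBuf(size_t cap) : data(cap) {}
    bool hasCapacity(size_t index, size_t fieldLength) const {
        return index + fieldLength <= data.size();
    }
    bool setBytesBounded(size_t index, const unsigned char* src, size_t length) {
        if (!hasCapacity(index, length)) {
            return false;  // caller decides how to recover
        }
        // No special case for length == 0: memcpy with a zero length is
        // defined to copy nothing, which is why the extra check can go.
        std::memcpy(data.data() + index, src, length);
        return true;
    }
};
```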
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r117412175 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -234,6 +233,37 @@ void DrillClientImpl::Close() { } /* + * Write bytesToWrite length data bytes pointed by dataPtr. It handles EINTR error + * occurred during write_some sys call and does a retry on that. + * + * Parameters: + * dataPtr - in param - Pointer to data bytes to write on socket. + * bytesToWrite - in param - Length of data bytes to write from dataPtr. + * errorCode- out param - Error code set by boost. + */ +void DrillClientImpl::doWriteToSocket(const char* dataPtr, size_t bytesToWrite, +boost::system::error_code& errorCode) { +if(0 == bytesToWrite) { --- End diff -- Should you check for a NULL dataPtr?
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r117412544 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -364,7 +395,41 @@ connectionStatus_t DrillClientImpl::recvHandshake(){ return CONN_SUCCESS; } -void DrillClientImpl::handleHandshake(ByteBuf_t _buf, +/* + * Read bytesToRead length data bytes from socket into inBuf. It handles EINTR error + * occurred during read_some sys call and does a retry on that. + * + * Parameters: + * inBuf- out param - Pointer to buffer to read data into from socket. + * bytesToRead - in param - Length of data bytes to read from socket. + * errorCode- out param - Error code set by boost. + */ +void DrillClientImpl::doReadFromSocket(ByteBuf_t inBuf, size_t bytesToRead, + boost::system::error_code& errorCode) { + +// Check if bytesToRead is zero +if(0 == bytesToRead) { --- End diff -- Does a NULL inBuf have to be handled?
[GitHub] drill pull request #837: DRILL-5514: Enhance VectorContainer to merge two ro...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/837#discussion_r122287096 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/record/TestVectorContainer.java --- @@ -110,13 +110,16 @@ public void testContainerMerge() { RowSet mergedRs = left.merge(right); comparison.verifyAndClear(mergedRs); -// Add a selection vector. Ensure the SV appears in the merged -// result. Test as a row set since container's don't actually -// carry the selection vector. +// Add a selection vector. Merging is forbidden. --- End diff -- Maybe this can be changed to "//Merging data with a selection vector is forbidden". As-is, the comment implies that we are adding a selection vector.
[GitHub] drill pull request #837: DRILL-5514: Enhance VectorContainer to merge two ro...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/837#discussion_r122287615 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/BatchSchema.java --- @@ -162,20 +162,22 @@ private boolean majorTypeEqual(MajorType t1, MajorType t2) { * Merge two schema to produce a new, merged schema. The caller is responsible * for ensuring that column names are unique. The order of the fields in the * new schema is the same as that of this schema, with the other schema's fields - * appended in the order defined in the other schema. The resulting selection - * vector mode is the same as this schema. (That is, this schema is assumed to - * be the main part of the batch, possibly with a selection vector, with the - * other schema representing additional, new columns.) + * appended in the order defined in the other schema. + * + * Merging data with selection vectors is unlikely to be useful, or work well. --- End diff -- Can you please leave a comment about why this is unlikely to be useful, or work well?
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r114442641 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -245,6 +264,64 @@ connectionStatus_t DrillClientImpl::sendSync(rpc::OutBoundRpcMessage& msg){ } } +/* + * Send handler for sending encrypted messages over wire. It encrypts the send buffer using wrap api provided by + * saslAuthenticatorImpl and then transmit the encrypted bytes over wire. + * + * Return: + * connectionStatus_t - CONN_SUCCESS - In case of successful send + * - CONN_FAILURE - In case of failure to send + */ +connectionStatus_t DrillClientImpl::sendSyncEncrypted() { + +boost::system::error_code ec; + +// Split the encoded message into the rawWrapSendSize and then encrypt each chunk. Each encrypted chunk along with +// its encrypted length in network order (added by Cyrus-SASL plugin) is sent over wire. +const int wrapChunkSize = m_encryptionCtxt.getRawWrapSendSize(); +int numChunks = ceil((double)m_wbuf.size() / wrapChunkSize); +int lengthToEncrypt = m_wbuf.size(); +int currentChunkLen = std::min(wrapChunkSize, lengthToEncrypt); +uint32_t startIndex = 0, wrappedLen = 0; +const char* wrappedChunk = NULL; +std::stringstream errorMsg; + +// Encrypt and send each chunk +while(numChunks > 0) { +int wrapResult = m_saslAuthenticator->wrap(reinterpret_cast<const char*>(m_wbuf.data() + startIndex), + currentChunkLen, &wrappedChunk, wrappedLen); +if(SASL_OK != wrapResult) { +errorMsg << "Sasl wrap failed while encrypting chunk of length: " << currentChunkLen << " , EncodeError: " + << wrapResult; +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::sendSyncEncrypted - " << errorMsg.str() + << " ,StartIndex: " << startIndex << ". ChunkNum: " << numChunks --- End diff -- Did you want to print the chunk number or chunks remaining? ChunkNum should be Total chunks - Chunks Remaining.
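The chunk-count arithmetic in `sendSyncEncrypted`, and the chunk numbering the reviewer suggests logging (total chunks minus chunks remaining), can be sketched independently of SASL. `chunkLengths` is a hypothetical helper written for illustration, not a function in the client:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Split totalLen bytes into chunks of at most chunkSize, mirroring the
// numChunks / currentChunkLen bookkeeping in sendSyncEncrypted. While
// sending, the index of the current chunk is totalChunks - numChunks
// (i.e. total minus remaining), which is what the reviewer suggests
// logging instead of the raw numChunks counter.
std::vector<int> chunkLengths(int totalLen, int chunkSize) {
    std::vector<int> lens;
    int numChunks = (totalLen + chunkSize - 1) / chunkSize;  // integer ceil
    int remaining = totalLen;
    while (numChunks > 0) {
        int currentChunkLen = std::min(chunkSize, remaining);
        lens.push_back(currentChunkLen);
        remaining -= currentChunkLen;
        numChunks--;
    }
    return lens;
}
```

Only the final chunk can be short, so every chunk except possibly the last has length `chunkSize`.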
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r115369295 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -854,75 +990,328 @@ void DrillClientImpl::waitForResults(){ } } -status_t DrillClientImpl::readMsg(ByteBuf_t _buf, -AllocatedBufferPtr* allocatedBuffer, +/* + * Decode the length of the message from bufWithLen and then read entire message from the socket. + * Parameters: + * bufWithLen - in param - buffer containing the bytes which has length of the RPC message/encrypted chunk + * bufferWithLenBytes - out param - buffer pointer which points to memory allocated in this function and has the + *entire one RPC message / encrypted chunk along with the length of the message + * lengthBytesLength - out param - bytes of bufWithLen which contains the length of the entire RPC message or + *encrypted chunk + * lengthDecodeHandler - in param - function pointer with length decoder to use. For encrypted chunk we use + *lengthDecode and for plain RPC message we use rpcLengthDecode. + * Return: + * status_t- QRY_SUCCESS- In case of success. + * - QRY_COMM_ERROR/QRY_INTERNAL_ERROR/QRY_CLIENT_OUTOFMEM - In cases of error. 
+ */ +status_t DrillClientImpl::readLenBytesFromSocket(ByteBuf_t bufWithLen, AllocatedBufferPtr* bufferWithLenBytes, + uint32_t& lengthBytesLength, lengthDecoder lengthDecodeHandler) { + +uint32_t rmsgLen = 0; +size_t bytes_read = 0; +size_t leftover = 0; +boost::system::error_code error; +*bufferWithLenBytes = NULL; +size_t bufferWithLenBytesSize = 0; + +bytes_read = (this->*lengthDecodeHandler)(bufWithLen, rmsgLen); +lengthBytesLength = bytes_read; + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Length bytes = " << bytes_read << std::endl;) +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Msg Length = " << rmsgLen << std::endl;) + +if(rmsgLen>0){ +leftover = LEN_PREFIX_BUFLEN - bytes_read; + +// Allocate a buffer for reading all the bytes in bufWithLen and length number of bytes + bufferWithLenBytesSize = rmsgLen + bytes_read; +*bufferWithLenBytes = new AllocatedBuffer(bufferWithLenBytesSize); + +if(*bufferWithLenBytes == NULL){ +return handleQryError(QRY_CLIENT_OUTOFMEM, getMessage(ERR_QRY_OUTOFMEM), NULL); +} + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::readLenBytesFromSocket: Allocated and locked buffer: [ " + << *bufferWithLenBytes << ", size = " << bufferWithLenBytesSize << " ]\n";) + +// Copy the memory of bufWithLen into bufferWithLenBytesSize +memcpy((*bufferWithLenBytes)->m_pBuffer, bufWithLen, LEN_PREFIX_BUFLEN); + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Copied bufWithLen into bufferWithLenBytes. " + << "Now reading data (rmsgLen - leftover) : " << (rmsgLen - leftover) + << std::endl;) + +// Read the entire data left from socket and copy to currentBuffer. +ByteBuf_t b = (*bufferWithLenBytes)->m_pBuffer + LEN_PREFIX_BUFLEN; +size_t bytesToRead = rmsgLen - leftover; + +while(1){ +bytes_read = this->m_socket.read_some(boost::asio::buffer(b, bytesToRead), error); +if(error) break; --- End diff -- Shouldn't EINTR (interrupted) be handled like a temporary failure, with a subsequent retry? 
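The retry the reviewer asks for can be sketched as follows. `readSome` is a stand-in for the boost `read_some` call, and treating EINTR as retryable is the intended fix under discussion, not the client's code as quoted:

```cpp
#include <cassert>
#include <cerrno>
#include <functional>

// Read exactly bytesToRead bytes via readSome, treating EINTR as a
// temporary condition to retry rather than a fatal error. Any other
// error aborts the loop and is reported through err.
size_t readFully(const std::function<size_t(size_t, int&)>& readSome,
                 size_t bytesToRead, int& err) {
    size_t total = 0;
    while (total < bytesToRead) {
        int e = 0;
        size_t n = readSome(bytesToRead - total, e);
        if (e == EINTR) {
            continue;  // interrupted system call: reissue the read
        }
        if (e != 0) {
            err = e;   // real failure: surface it to the caller
            break;
        }
        total += n;
    }
    return total;
}
```

Without the EINTR branch, a signal delivered mid-read would abort the whole message even though no data was lost, which is the failure mode the review comment is pointing at.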
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r114442866 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -245,6 +264,64 @@ connectionStatus_t DrillClientImpl::sendSync(rpc::OutBoundRpcMessage& msg){ } } +/* + * Send handler for sending encrypted messages over wire. It encrypts the send buffer using wrap api provided by + * saslAuthenticatorImpl and then transmit the encrypted bytes over wire. + * + * Return: + * connectionStatus_t - CONN_SUCCESS - In case of successful send + * - CONN_FAILURE - In case of failure to send + */ +connectionStatus_t DrillClientImpl::sendSyncEncrypted() { + +boost::system::error_code ec; + +// Split the encoded message into the rawWrapSendSize and then encrypt each chunk. Each encrypted chunk along with +// its encrypted length in network order (added by Cyrus-SASL plugin) is sent over wire. +const int wrapChunkSize = m_encryptionCtxt.getRawWrapSendSize(); +int numChunks = ceil((double)m_wbuf.size() / wrapChunkSize); +int lengthToEncrypt = m_wbuf.size(); +int currentChunkLen = std::min(wrapChunkSize, lengthToEncrypt); +uint32_t startIndex = 0, wrappedLen = 0; +const char* wrappedChunk = NULL; +std::stringstream errorMsg; + +// Encrypt and send each chunk +while(numChunks > 0) { +int wrapResult = m_saslAuthenticator->wrap(reinterpret_cast<const char*>(m_wbuf.data() + startIndex), + currentChunkLen, &wrappedChunk, wrappedLen); +if(SASL_OK != wrapResult) { +errorMsg << "Sasl wrap failed while encrypting chunk of length: " << currentChunkLen << " , EncodeError: " + << wrapResult; +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::sendSyncEncrypted - " << errorMsg.str() + << " ,StartIndex: " << startIndex << ". ChunkNum: " << numChunks + << ", Closing connection.";) +return handleConnError(CONN_FAILURE, getMessage(ERR_CONN_WFAIL, errorMsg.str().c_str())); +} + +// Send the encrypted chunk.
+size_t s = m_socket.write_some(boost::asio::buffer(wrappedChunk, wrappedLen), ec); + +if(ec || s==0){ +errorMsg << "Failure while sending encrypted chunk. Error: " << ec.message().c_str(); +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::sendSyncEncrypted - " << errorMsg.str() << " Chunk:" --- End diff -- Did you want to print the chunk number or chunks remaining? ChunkNum should be Total chunks - Chunks Remaining.
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r114442367 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -245,6 +264,64 @@ connectionStatus_t DrillClientImpl::sendSync(rpc::OutBoundRpcMessage& msg){ } } +/* + * Send handler for sending encrypted messages over wire. It encrypts the send buffer using wrap api provided by + * saslAuthenticatorImpl and then transmit the encrypted bytes over wire. + * + * Return: + * connectionStatus_t - CONN_SUCCESS - In case of successful send + * - CONN_FAILURE - In case of failure to send + */ +connectionStatus_t DrillClientImpl::sendSyncEncrypted() { + +boost::system::error_code ec; + +// Split the encoded message into the rawWrapSendSize and then encrypt each chunk. Each encrypted chunk along with +// its encrypted length in network order (added by Cyrus-SASL plugin) is sent over wire. +const int wrapChunkSize = m_encryptionCtxt.getRawWrapSendSize(); +int numChunks = ceil((double)m_wbuf.size() / wrapChunkSize); +int lengthToEncrypt = m_wbuf.size(); +int currentChunkLen = std::min(wrapChunkSize, lengthToEncrypt); +uint32_t startIndex = 0, wrappedLen = 0; --- End diff -- "currentChunkOffset" would be a better name than 'startIndex'.
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r115370795 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -854,75 +990,328 @@ void DrillClientImpl::waitForResults(){ } } -status_t DrillClientImpl::readMsg(ByteBuf_t _buf, -AllocatedBufferPtr* allocatedBuffer, +/* + * Decode the length of the message from bufWithLen and then read entire message from the socket. + * Parameters: + * bufWithLen - in param - buffer containing the bytes which has length of the RPC message/encrypted chunk + * bufferWithLenBytes - out param - buffer pointer which points to memory allocated in this function and has the + *entire one RPC message / encrypted chunk along with the length of the message + * lengthBytesLength - out param - bytes of bufWithLen which contains the length of the entire RPC message or + *encrypted chunk + * lengthDecodeHandler - in param - function pointer with length decoder to use. For encrypted chunk we use + *lengthDecode and for plain RPC message we use rpcLengthDecode. + * Return: + * status_t- QRY_SUCCESS- In case of success. + * - QRY_COMM_ERROR/QRY_INTERNAL_ERROR/QRY_CLIENT_OUTOFMEM - In cases of error. 
+ */ +status_t DrillClientImpl::readLenBytesFromSocket(ByteBuf_t bufWithLen, AllocatedBufferPtr* bufferWithLenBytes, + uint32_t& lengthBytesLength, lengthDecoder lengthDecodeHandler) { + +uint32_t rmsgLen = 0; +size_t bytes_read = 0; +size_t leftover = 0; +boost::system::error_code error; +*bufferWithLenBytes = NULL; +size_t bufferWithLenBytesSize = 0; + +bytes_read = (this->*lengthDecodeHandler)(bufWithLen, rmsgLen); +lengthBytesLength = bytes_read; + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Length bytes = " << bytes_read << std::endl;) +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Msg Length = " << rmsgLen << std::endl;) + +if(rmsgLen>0){ +leftover = LEN_PREFIX_BUFLEN - bytes_read; + +// Allocate a buffer for reading all the bytes in bufWithLen and length number of bytes + bufferWithLenBytesSize = rmsgLen + bytes_read; +*bufferWithLenBytes = new AllocatedBuffer(bufferWithLenBytesSize); + +if(*bufferWithLenBytes == NULL){ +return handleQryError(QRY_CLIENT_OUTOFMEM, getMessage(ERR_QRY_OUTOFMEM), NULL); +} + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::readLenBytesFromSocket: Allocated and locked buffer: [ " + << *bufferWithLenBytes << ", size = " << bufferWithLenBytesSize << " ]\n";) + +// Copy the memory of bufWithLen into bufferWithLenBytesSize +memcpy((*bufferWithLenBytes)->m_pBuffer, bufWithLen, LEN_PREFIX_BUFLEN); + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Copied bufWithLen into bufferWithLenBytes. " + << "Now reading data (rmsgLen - leftover) : " << (rmsgLen - leftover) + << std::endl;) + +// Read the entire data left from socket and copy to currentBuffer. 
+ByteBuf_t b = (*bufferWithLenBytes)->m_pBuffer + LEN_PREFIX_BUFLEN; +size_t bytesToRead = rmsgLen - leftover; + +while(1){ +bytes_read = this->m_socket.read_some(boost::asio::buffer(b, bytesToRead), error); +if(error) break; +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Data Message: actual bytes read = " << bytes_read << std::endl;) +if(bytes_read == bytesToRead) break; +bytesToRead -= bytes_read; +b += bytes_read; +} +} else { +return handleQryError(QRY_INTERNAL_ERROR, getMessage(ERR_QRY_INVREADLEN), NULL); +} + +return error ? handleQryError(QRY_COMM_ERROR, getMessage(ERR_QRY_COMMERR, error.message().c_str()), NULL) + : QRY_SUCCESS; +} + + +/* + * Function to read entire RPC message from socket and decode it to InboundRpcMessage + * Parameters: + * _buf- in param - Buffer containing the length bytes. + * allocatedBuffer - out param - Buffer containing the length bytes and entire RPC message bytes. + * msg - out param - Decoded InBound
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r114623038 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -854,75 +990,328 @@ void DrillClientImpl::waitForResults(){ } } -status_t DrillClientImpl::readMsg(ByteBuf_t _buf, -AllocatedBufferPtr* allocatedBuffer, +/* + * Decode the length of the message from bufWithLen and then read entire message from the socket. + * Parameters: + * bufWithLen - in param - buffer containing the bytes which has length of the RPC message/encrypted chunk + * bufferWithLenBytes - out param - buffer pointer which points to memory allocated in this function and has the + *entire one RPC message / encrypted chunk along with the length of the message + * lengthBytesLength - out param - bytes of bufWithLen which contains the length of the entire RPC message or + *encrypted chunk + * lengthDecodeHandler - in param - function pointer with length decoder to use. For encrypted chunk we use + *lengthDecode and for plain RPC message we use rpcLengthDecode. + * Return: + * status_t- QRY_SUCCESS- In case of success. + * - QRY_COMM_ERROR/QRY_INTERNAL_ERROR/QRY_CLIENT_OUTOFMEM - In cases of error. 
+ */ +status_t DrillClientImpl::readLenBytesFromSocket(ByteBuf_t bufWithLen, AllocatedBufferPtr* bufferWithLenBytes, + uint32_t& lengthBytesLength, lengthDecoder lengthDecodeHandler) { + +uint32_t rmsgLen = 0; +size_t bytes_read = 0; +size_t leftover = 0; +boost::system::error_code error; +*bufferWithLenBytes = NULL; +size_t bufferWithLenBytesSize = 0; + +bytes_read = (this->*lengthDecodeHandler)(bufWithLen, rmsgLen); +lengthBytesLength = bytes_read; + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Length bytes = " << bytes_read << std::endl;) +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Msg Length = " << rmsgLen << std::endl;) + +if(rmsgLen>0){ +leftover = LEN_PREFIX_BUFLEN - bytes_read; + +// Allocate a buffer for reading all the bytes in bufWithLen and length number of bytes + bufferWithLenBytesSize = rmsgLen + bytes_read; +*bufferWithLenBytes = new AllocatedBuffer(bufferWithLenBytesSize); + +if(*bufferWithLenBytes == NULL){ +return handleQryError(QRY_CLIENT_OUTOFMEM, getMessage(ERR_QRY_OUTOFMEM), NULL); +} + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::readLenBytesFromSocket: Allocated and locked buffer: [ " + << *bufferWithLenBytes << ", size = " << bufferWithLenBytesSize << " ]\n";) + +// Copy the memory of bufWithLen into bufferWithLenBytesSize +memcpy((*bufferWithLenBytes)->m_pBuffer, bufWithLen, LEN_PREFIX_BUFLEN); + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Copied bufWithLen into bufferWithLenBytes. " + << "Now reading data (rmsgLen - leftover) : " << (rmsgLen - leftover) + << std::endl;) + +// Read the entire data left from socket and copy to currentBuffer. 
+ByteBuf_t b = (*bufferWithLenBytes)->m_pBuffer + LEN_PREFIX_BUFLEN; +size_t bytesToRead = rmsgLen - leftover; + +while(1){ +bytes_read = this->m_socket.read_some(boost::asio::buffer(b, bytesToRead), error); +if(error) break; +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Data Message: actual bytes read = " << bytes_read << std::endl;) +if(bytes_read == bytesToRead) break; +bytesToRead -= bytes_read; +b += bytes_read; +} +} else { +return handleQryError(QRY_INTERNAL_ERROR, getMessage(ERR_QRY_INVREADLEN), NULL); +} + +return error ? handleQryError(QRY_COMM_ERROR, getMessage(ERR_QRY_COMMERR, error.message().c_str()), NULL) + : QRY_SUCCESS; +} + + +/* + * Function to read entire RPC message from socket and decode it to InboundRpcMessage + * Parameters: + * _buf- in param - Buffer containing the length bytes. + * allocatedBuffer - out param - Buffer containing the length bytes and entire RPC message bytes. + * msg - out param - Decoded InBound
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r114629588 --- Diff: contrib/native/client/src/clientlib/utils.cpp --- @@ -111,4 +111,52 @@ AllocatedBuffer::~AllocatedBuffer(){ m_bufSize = 0; } +EncryptionContext::EncryptionContext(const bool& encryptionReqd, const int& wrapChunkSize, const int& rawSendSize) { +this->m_bEncryptionReqd = encryptionReqd; +this->m_maxWrapChunkSize = wrapChunkSize; +this->m_rawWrapSendSize = rawSendSize; +} + +EncryptionContext::EncryptionContext() { +this->m_bEncryptionReqd = false; +// SASL Framework only allows 3 octet length field during negotiation so maximum wrap message +// length can be 16MB i.e. 0XFFFFFF --- End diff -- // so maximum wrap message length can be 16MB i.e. 0XFFFFFF Max maxWrapChunkSize has to be strictly less than 16MB. 0XFFFFFF = 16MB - 1
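The reviewer's arithmetic can be verified directly: a 3-octet length prefix caps the wrapped message at 2^24 - 1 bytes, one byte short of 16MB, which is why the maximum wrap chunk size must stay strictly below 16MB. `maxLenForOctets` is an illustrative helper, not part of the client:

```cpp
#include <cassert>

// Largest length representable in an n-octet big-endian length field,
// e.g. n = 3 for the SASL wrap length prefix discussed in the review.
unsigned long maxLenForOctets(int n) {
    return (1UL << (8 * n)) - 1;
}
```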
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r115371638 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -854,75 +990,328 @@ void DrillClientImpl::waitForResults(){ } } -status_t DrillClientImpl::readMsg(ByteBuf_t _buf, -AllocatedBufferPtr* allocatedBuffer, +/* + * Decode the length of the message from bufWithLen and then read entire message from the socket. + * Parameters: + * bufWithLen - in param - buffer containing the bytes which has length of the RPC message/encrypted chunk + * bufferWithLenBytes - out param - buffer pointer which points to memory allocated in this function and has the + *entire one RPC message / encrypted chunk along with the length of the message + * lengthBytesLength - out param - bytes of bufWithLen which contains the length of the entire RPC message or + *encrypted chunk + * lengthDecodeHandler - in param - function pointer with length decoder to use. For encrypted chunk we use + *lengthDecode and for plain RPC message we use rpcLengthDecode. + * Return: + * status_t- QRY_SUCCESS- In case of success. + * - QRY_COMM_ERROR/QRY_INTERNAL_ERROR/QRY_CLIENT_OUTOFMEM - In cases of error. 
+ */ +status_t DrillClientImpl::readLenBytesFromSocket(ByteBuf_t bufWithLen, AllocatedBufferPtr* bufferWithLenBytes, + uint32_t& lengthBytesLength, lengthDecoder lengthDecodeHandler) { + +uint32_t rmsgLen = 0; +size_t bytes_read = 0; +size_t leftover = 0; +boost::system::error_code error; +*bufferWithLenBytes = NULL; +size_t bufferWithLenBytesSize = 0; + +bytes_read = (this->*lengthDecodeHandler)(bufWithLen, rmsgLen); +lengthBytesLength = bytes_read; + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Length bytes = " << bytes_read << std::endl;) +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Msg Length = " << rmsgLen << std::endl;) + +if(rmsgLen>0){ +leftover = LEN_PREFIX_BUFLEN - bytes_read; + +// Allocate a buffer for reading all the bytes in bufWithLen and length number of bytes + bufferWithLenBytesSize = rmsgLen + bytes_read; +*bufferWithLenBytes = new AllocatedBuffer(bufferWithLenBytesSize); + +if(*bufferWithLenBytes == NULL){ +return handleQryError(QRY_CLIENT_OUTOFMEM, getMessage(ERR_QRY_OUTOFMEM), NULL); +} + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::readLenBytesFromSocket: Allocated and locked buffer: [ " + << *bufferWithLenBytes << ", size = " << bufferWithLenBytesSize << " ]\n";) + +// Copy the memory of bufWithLen into bufferWithLenBytesSize +memcpy((*bufferWithLenBytes)->m_pBuffer, bufWithLen, LEN_PREFIX_BUFLEN); + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Copied bufWithLen into bufferWithLenBytes. " + << "Now reading data (rmsgLen - leftover) : " << (rmsgLen - leftover) + << std::endl;) + +// Read the entire data left from socket and copy to currentBuffer. 
+ByteBuf_t b = (*bufferWithLenBytes)->m_pBuffer + LEN_PREFIX_BUFLEN; +size_t bytesToRead = rmsgLen - leftover; + +while(1){ +bytes_read = this->m_socket.read_some(boost::asio::buffer(b, bytesToRead), error); +if(error) break; +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Data Message: actual bytes read = " << bytes_read << std::endl;) +if(bytes_read == bytesToRead) break; +bytesToRead -= bytes_read; +b += bytes_read; +} +} else { +return handleQryError(QRY_INTERNAL_ERROR, getMessage(ERR_QRY_INVREADLEN), NULL); +} + +return error ? handleQryError(QRY_COMM_ERROR, getMessage(ERR_QRY_COMMERR, error.message().c_str()), NULL) + : QRY_SUCCESS; +} + + +/* + * Function to read entire RPC message from socket and decode it to InboundRpcMessage + * Parameters: + * _buf- in param - Buffer containing the length bytes. + * allocatedBuffer - out param - Buffer containing the length bytes and entire RPC message bytes. + * msg - out param - Decoded InBound
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r114828848 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -854,75 +990,328 @@ void DrillClientImpl::waitForResults(){ } } -status_t DrillClientImpl::readMsg(ByteBuf_t _buf, -AllocatedBufferPtr* allocatedBuffer, +/* + * Decode the length of the message from bufWithLen and then read entire message from the socket. + * Parameters: + * bufWithLen - in param - buffer containing the bytes which has length of the RPC message/encrypted chunk + * bufferWithLenBytes - out param - buffer pointer which points to memory allocated in this function and has the + *entire one RPC message / encrypted chunk along with the length of the message + * lengthBytesLength - out param - bytes of bufWithLen which contains the length of the entire RPC message or --- End diff -- How about 'bufWithLenSize'? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r115372156 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -854,75 +990,328 @@ void DrillClientImpl::waitForResults(){ } } -status_t DrillClientImpl::readMsg(ByteBuf_t _buf, -AllocatedBufferPtr* allocatedBuffer, +/* + * Decode the length of the message from bufWithLen and then read entire message from the socket. + * Parameters: + * bufWithLen - in param - buffer containing the bytes which has length of the RPC message/encrypted chunk + * bufferWithLenBytes - out param - buffer pointer which points to memory allocated in this function and has the + *entire one RPC message / encrypted chunk along with the length of the message + * lengthBytesLength - out param - bytes of bufWithLen which contains the length of the entire RPC message or + *encrypted chunk + * lengthDecodeHandler - in param - function pointer with length decoder to use. For encrypted chunk we use + *lengthDecode and for plain RPC message we use rpcLengthDecode. + * Return: + * status_t- QRY_SUCCESS- In case of success. + * - QRY_COMM_ERROR/QRY_INTERNAL_ERROR/QRY_CLIENT_OUTOFMEM - In cases of error. 
+ */ +status_t DrillClientImpl::readLenBytesFromSocket(ByteBuf_t bufWithLen, AllocatedBufferPtr* bufferWithLenBytes, + uint32_t& lengthBytesLength, lengthDecoder lengthDecodeHandler) { + +uint32_t rmsgLen = 0; +size_t bytes_read = 0; +size_t leftover = 0; +boost::system::error_code error; +*bufferWithLenBytes = NULL; +size_t bufferWithLenBytesSize = 0; + +bytes_read = (this->*lengthDecodeHandler)(bufWithLen, rmsgLen); +lengthBytesLength = bytes_read; + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Length bytes = " << bytes_read << std::endl;) +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Msg Length = " << rmsgLen << std::endl;) + +if(rmsgLen>0){ +leftover = LEN_PREFIX_BUFLEN - bytes_read; + +// Allocate a buffer for reading all the bytes in bufWithLen and length number of bytes + bufferWithLenBytesSize = rmsgLen + bytes_read; +*bufferWithLenBytes = new AllocatedBuffer(bufferWithLenBytesSize); + +if(*bufferWithLenBytes == NULL){ +return handleQryError(QRY_CLIENT_OUTOFMEM, getMessage(ERR_QRY_OUTOFMEM), NULL); +} + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::readLenBytesFromSocket: Allocated and locked buffer: [ " + << *bufferWithLenBytes << ", size = " << bufferWithLenBytesSize << " ]\n";) + +// Copy the memory of bufWithLen into bufferWithLenBytesSize +memcpy((*bufferWithLenBytes)->m_pBuffer, bufWithLen, LEN_PREFIX_BUFLEN); + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Copied bufWithLen into bufferWithLenBytes. " + << "Now reading data (rmsgLen - leftover) : " << (rmsgLen - leftover) + << std::endl;) + +// Read the entire data left from socket and copy to currentBuffer. 
+ByteBuf_t b = (*bufferWithLenBytes)->m_pBuffer + LEN_PREFIX_BUFLEN; +size_t bytesToRead = rmsgLen - leftover; + +while(1){ +bytes_read = this->m_socket.read_some(boost::asio::buffer(b, bytesToRead), error); +if(error) break; +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Data Message: actual bytes read = " << bytes_read << std::endl;) +if(bytes_read == bytesToRead) break; +bytesToRead -= bytes_read; +b += bytes_read; +} +} else { +return handleQryError(QRY_INTERNAL_ERROR, getMessage(ERR_QRY_INVREADLEN), NULL); +} + +return error ? handleQryError(QRY_COMM_ERROR, getMessage(ERR_QRY_COMMERR, error.message().c_str()), NULL) + : QRY_SUCCESS; +} + + +/* + * Function to read entire RPC message from socket and decode it to InboundRpcMessage + * Parameters: + * _buf- in param - Buffer containing the length bytes. + * allocatedBuffer - out param - Buffer containing the length bytes and entire RPC message bytes. + * msg - out param - Decoded InBound
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r11272 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -245,6 +264,64 @@ connectionStatus_t DrillClientImpl::sendSync(rpc::OutBoundRpcMessage& msg){ } } +/* + * Send handler for sending encrypted messages over wire. It encrypts the send buffer using wrap api provided by + * saslAuthenticatorImpl and then transmit the encrypted bytes over wire. + * + * Return: + * connectionStatus_t - CONN_SUCCESS - In case of successful send + * - CONN_FAILURE - In case of failure to send + */ +connectionStatus_t DrillClientImpl::sendSyncEncrypted() { + +boost::system::error_code ec; + +// Split the encoded message into the rawWrapSendSize and then encrypt each chunk. Each encrypted chunk along with +// its encrypted length in network order (added by Cyrus-SASL plugin) is sent over wire. +const int wrapChunkSize = m_encryptionCtxt.getRawWrapSendSize(); +int numChunks = ceil((double)m_wbuf.size() / wrapChunkSize); +int lengthToEncrypt = m_wbuf.size(); +int currentChunkLen = std::min(wrapChunkSize, lengthToEncrypt); +uint32_t startIndex = 0, wrappedLen = 0; +const char* wrappedChunk = NULL; +std::stringstream errorMsg; + +// Encrypt and send each chunk +while(numChunks > 0) { +int wrapResult = m_saslAuthenticator->wrap(reinterpret_cast<const char*>(m_wbuf.data() + startIndex), + currentChunkLen, &wrappedChunk, wrappedLen); +if(SASL_OK != wrapResult) { +errorMsg << "Sasl wrap failed while encrypting chunk of length: " << currentChunkLen << " , EncodeError: " + << wrapResult; +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::sendSyncEncrypted - " << errorMsg.str() + << " ,StartIndex: " << startIndex << ". ChunkNum: " << numChunks + << ", Closing connection.";) +return handleConnError(CONN_FAILURE, getMessage(ERR_CONN_WFAIL, errorMsg.str().c_str())); +} + +// Send the encrypted chunk. 
+size_t s = m_socket.write_some(boost::asio::buffer(wrappedChunk, wrappedLen), ec); + +if(ec || s==0){ +errorMsg << "Failure while sending encrypted chunk. Error: " << ec.message().c_str(); +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::sendSyncEncrypted - " << errorMsg.str() << " Chunk:" + << numChunks << ", Original Length: " << currentChunkLen + << ", StartIndex:" << startIndex << ", Closing connection.";) +return handleConnError(CONN_FAILURE, getMessage(ERR_CONN_WFAIL, errorMsg.str().c_str())); --- End diff -- What happens to the memory allocated for wrappedChunk when this path is taken? Does handleConnError() deal with it ? ---
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r114775522 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -370,6 +453,33 @@ void DrillClientImpl::handleHShakeReadTimeout(const boost::system::error_code & return; } +/* + * Check's if client has explicitly expressed interest in encrypted connections only. It looks for USERPROP_ENCRYPTION + * connection string property. If set to true then returns true else returns false + */ +bool DrillClientImpl::clientNeedsEncryption(const DrillUserProperties* userProperties) { +bool needsEncryption = false; +// check if userProperties is null +if(!userProperties) { +return needsEncryption; +} + +// Loop through the property to find USERPROP_ENCRYPTION and it's value +for (size_t i = 0; i < userProperties->size(); i++) { --- End diff -- Not related to your change: I wonder why DrillUserProperties was not implemented as a map? ---
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r115369469 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -854,75 +990,328 @@ void DrillClientImpl::waitForResults(){ } } -status_t DrillClientImpl::readMsg(ByteBuf_t _buf, -AllocatedBufferPtr* allocatedBuffer, +/* + * Decode the length of the message from bufWithLen and then read entire message from the socket. + * Parameters: + * bufWithLen - in param - buffer containing the bytes which has length of the RPC message/encrypted chunk + * bufferWithLenBytes - out param - buffer pointer which points to memory allocated in this function and has the + *entire one RPC message / encrypted chunk along with the length of the message + * lengthBytesLength - out param - bytes of bufWithLen which contains the length of the entire RPC message or + *encrypted chunk + * lengthDecodeHandler - in param - function pointer with length decoder to use. For encrypted chunk we use + *lengthDecode and for plain RPC message we use rpcLengthDecode. + * Return: + * status_t- QRY_SUCCESS- In case of success. + * - QRY_COMM_ERROR/QRY_INTERNAL_ERROR/QRY_CLIENT_OUTOFMEM - In cases of error. 
+ */ +status_t DrillClientImpl::readLenBytesFromSocket(ByteBuf_t bufWithLen, AllocatedBufferPtr* bufferWithLenBytes, + uint32_t& lengthBytesLength, lengthDecoder lengthDecodeHandler) { + +uint32_t rmsgLen = 0; +size_t bytes_read = 0; +size_t leftover = 0; +boost::system::error_code error; +*bufferWithLenBytes = NULL; +size_t bufferWithLenBytesSize = 0; + +bytes_read = (this->*lengthDecodeHandler)(bufWithLen, rmsgLen); +lengthBytesLength = bytes_read; + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Length bytes = " << bytes_read << std::endl;) +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Msg Length = " << rmsgLen << std::endl;) + +if(rmsgLen>0){ +leftover = LEN_PREFIX_BUFLEN - bytes_read; + +// Allocate a buffer for reading all the bytes in bufWithLen and length number of bytes + bufferWithLenBytesSize = rmsgLen + bytes_read; +*bufferWithLenBytes = new AllocatedBuffer(bufferWithLenBytesSize); + +if(*bufferWithLenBytes == NULL){ +return handleQryError(QRY_CLIENT_OUTOFMEM, getMessage(ERR_QRY_OUTOFMEM), NULL); +} + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "DrillClientImpl::readLenBytesFromSocket: Allocated and locked buffer: [ " + << *bufferWithLenBytes << ", size = " << bufferWithLenBytesSize << " ]\n";) + +// Copy the memory of bufWithLen into bufferWithLenBytesSize +memcpy((*bufferWithLenBytes)->m_pBuffer, bufWithLen, LEN_PREFIX_BUFLEN); + +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Copied bufWithLen into bufferWithLenBytes. " + << "Now reading data (rmsgLen - leftover) : " << (rmsgLen - leftover) + << std::endl;) + +// Read the entire data left from socket and copy to currentBuffer. 
+ByteBuf_t b = (*bufferWithLenBytes)->m_pBuffer + LEN_PREFIX_BUFLEN; +size_t bytesToRead = rmsgLen - leftover; + +while(1){ +bytes_read = this->m_socket.read_some(boost::asio::buffer(b, bytesToRead), error); +if(error) break; +DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Data Message: actual bytes read = " << bytes_read << std::endl;) +if(bytes_read == bytesToRead) break; +bytesToRead -= bytes_read; +b += bytes_read; +} +} else { +return handleQryError(QRY_INTERNAL_ERROR, getMessage(ERR_QRY_INVREADLEN), NULL); +} + +return error ? handleQryError(QRY_COMM_ERROR, getMessage(ERR_QRY_COMMERR, error.message().c_str()), NULL) --- End diff -- what happens to memory allocated for bufferWithLenBytes on error? ---
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r114828323 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -854,75 +990,328 @@ void DrillClientImpl::waitForResults(){ } } -status_t DrillClientImpl::readMsg(ByteBuf_t _buf, -AllocatedBufferPtr* allocatedBuffer, +/* + * Decode the length of the message from bufWithLen and then read entire message from the socket. + * Parameters: + * bufWithLen - in param - buffer containing the bytes which has length of the RPC message/encrypted chunk + * bufferWithLenBytes - out param - buffer pointer which points to memory allocated in this function and has the --- End diff -- 'bufferWithLenBytes' this name is confusing given that there is already a 'bufWithLen'. How about something like 'bufWithRPCMsg'? ---
[GitHub] drill pull request #809: Drill-4335: C++ client changes for supporting encry...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/809#discussion_r114818140 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -854,75 +990,328 @@ void DrillClientImpl::waitForResults(){ } } -status_t DrillClientImpl::readMsg(ByteBuf_t _buf, -AllocatedBufferPtr* allocatedBuffer, +/* + * Decode the length of the message from bufWithLen and then read entire message from the socket. + * Parameters: + * bufWithLen - in param - buffer containing the bytes which has length of the RPC message/encrypted chunk + * bufferWithLenBytes - out param - buffer pointer which points to memory allocated in this function and has the + *entire one RPC message / encrypted chunk along with the length of the message + * lengthBytesLength - out param - bytes of bufWithLen which contains the length of the entire RPC message or + *encrypted chunk + * lengthDecodeHandler - in param - function pointer with length decoder to use. For encrypted chunk we use + *lengthDecode and for plain RPC message we use rpcLengthDecode. + * Return: + * status_t- QRY_SUCCESS- In case of success. + * - QRY_COMM_ERROR/QRY_INTERNAL_ERROR/QRY_CLIENT_OUTOFMEM - In cases of error. + */ +status_t DrillClientImpl::readLenBytesFromSocket(ByteBuf_t bufWithLen, AllocatedBufferPtr* bufferWithLenBytes, + uint32_t& lengthBytesLength, lengthDecoder lengthDecodeHandler) { + +uint32_t rmsgLen = 0; +size_t bytes_read = 0; +size_t leftover = 0; +boost::system::error_code error; +*bufferWithLenBytes = NULL; +size_t bufferWithLenBytesSize = 0; + +bytes_read = (this->*lengthDecodeHandler)(bufWithLen, rmsgLen); --- End diff -- What does it mean to access a local using the 'this' ptr? Can't this be written as bytes_read = (lengthDecodeHandler)(bufWithLen, rmsgLen); ? ---
[GitHub] drill pull request #837: DRILL-5514: Enhance VectorContainer to merge two ro...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/837#discussion_r120198724 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/BatchSchema.java --- @@ -157,4 +158,26 @@ private boolean majorTypeEqual(MajorType t1, MajorType t2) { return true; } + /** + * Merge two schema to produce a new, merged schema. The caller is responsible + * for ensuring that column names are unique. The order of the fields in the + * new schema is the same as that of this schema, with the other schema's fields + * appended in the order defined in the other schema. The resulting selection + * vector mode is the same as this schema. (That is, this schema is assumed to + * be the main part of the batch, possibly with a selection vector, with the + * other schema representing additional, new columns.) + * @param otherSchema the schema to merge with this one + * @return the new, merged, schema + */ + + public BatchSchema merge(BatchSchema otherSchema) { +if (otherSchema.selectionVectorMode != SelectionVectorMode.NONE && +selectionVectorMode != otherSchema.selectionVectorMode) { + throw new IllegalArgumentException("Left schema must carry the selection vector mode"); +} +List<MaterializedField> mergedFields = new ArrayList<>(); --- End diff -- List<MaterializedField> mergedFields = new ArrayList<>(this.fields.size() + otherSchema.fields.size()) would avoid having to potentially grow the ArrayList twice. ---
[GitHub] drill pull request #837: DRILL-5514: Enhance VectorContainer to merge two ro...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/837#discussion_r118797793 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/BatchSchema.java --- @@ -157,4 +158,26 @@ private boolean majorTypeEqual(MajorType t1, MajorType t2) { return true; } + /** + * Merge two schema to produce a new, merged schema. The caller is responsible + * for ensuring that column names are unique. The order of the fields in the + * new schema is the same as that of this schema, with the other schema's fields + * appended in the order defined in the other schema. The resulting selection + * vector mode is the same as this schema. (That is, this schema is assumed to + * be the main part of the batch, possibly with a selection vector, with the + * other schema representing additional, new columns.) + * @param otherSchema the schema to merge with this one + * @return the new, merged, schema + */ + + public BatchSchema merge(BatchSchema otherSchema) { +if (otherSchema.selectionVectorMode != SelectionVectorMode.NONE && +selectionVectorMode != otherSchema.selectionVectorMode) { + throw new IllegalArgumentException("Left schema must carry the selection vector mode"); --- End diff -- "Left schema must carry the same selection vector mode" + "as the right schema"? ---
[GitHub] drill pull request #983: MD-2769: DRILL-5819: Default value of security.admi...
GitHub user bitblender opened a pull request: https://github.com/apache/drill/pull/983 MD-2769: DRILL-5819: Default value of security.admin.user_groups and security.admin.users is true The values for admin user/groups in the config file were incorrectly set to "true". These options have no meaningful static default values. At runtime, the process user/groups should be used as the default admin user/groups. New accessors have been added for these options and these have to be used instead of the usual option manager accessors. Dummy default strings are used in the config file, which, when returned by the usual options manager, indicate that the user has not changed these options. This change implements the above-mentioned items and also adds a UI section to see the current admin user/groups. This UI section is only shown to authenticated admin users. You can merge this pull request into a Git repository by running: $ git pull https://github.com/bitblender/drill DRILL-5819 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/983.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #983 commit 3ba5401a3f6780dbbffd36ef2a9f67b12089a9ef Author: karthik <kmanivan...@maprtech.com> Date: 2017-10-04T18:23:04Z MD-2769: DRILL-5819: Default value of security.admin.user_groups and security.admin.users is true ---
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r137851895 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/package-info.java --- @@ -0,0 +1,295 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Handles the details of the result set loader implementation.
+ *
+ * The primary purpose of this loader, and the most complex to understand and
+ * maintain, is overflow handling.
+ *
+ * Detailed Use Cases
+ *
+ * Let's examine it by considering a number of use cases.
+ *
+ *   Row    a  b  c  d  e  f  g  h
+ *   n-2    X  X  X  X     X  -  -
+ *   n-1    X  X  X  X        -  -
+ *   n      X  !  O     O     O
+ *
+ * Here:
+ *
+ *   - n-2, n-1, and n are rows. n is the overflow row.
+ *   - X indicates a value was written before overflow.
+ *   - Blank indicates no value was written in that row.
+ *   - ! indicates the value that triggered overflow.
+ *   - "-" indicates a column that did not exist prior to overflow.
+ *
+ * Column a is written before overflow occurs, b causes overflow, and all other
+ * columns either are not written, or are written after overflow.
+ *
+ * The scenarios, identified by column names above, are:
+ *
+ * a
+ *   a contains values for all three rows.
+ *   - Two values were written in the "main" batch, while a third was written to
+ *     what becomes the overflow row.
+ *   - When overflow occurs, the last write position is at n. It must be moved
+ *     back to n-1.
+ *   - Since data was written to the overflow row, it is copied to the look-ahead
+ *     batch.
+ *   - The last write position in the look-ahead batch is 0 (since data was
+ *     copied into the 0th row).
+ *   - When harvesting, no empty-filling is needed.
+ *   - When starting the next batch, the last write position must be set to 0 to
+ *     reflect the presence of the value for row n.
+ *
+ * b
+ *   b contains values for all three rows. The value for row n triggers
+ *   overflow.
+ *   - The last write position is at n-1, which is kept for the "main" vector.
+ *   - A new overflow vector is created and starts empty, with the last write
+ *     position at -1.
+ *   - Once created, b is immediately written to the overflow vector, advancing
+ *     the last write position to 0.
+ *   - Harvesting, and starting the next batch, for column b works the same as
+ *     column a.
+ *
+ * c
+ *   Column c has values for all rows.
+ *   - The value for row n is written after overflow.
+ *   - At overflow, the last write position is at n-1.
+ *   - At overflow, a new look-ahead vector is created with the last write
+ *     position at -1.
+ *   - The value of c is written to the look-ahead vector, advancing the last
+ *     write position to 0.
+ *   - Harvesting, and starting the next batch, for column c works the same as
+ *     column a.
+ *
+ * d
+ *   Column d writes values to the last two rows before overflow, but not to
+ *   the overflow row.
+ *   - The last write position for the main batch is at n-1.
+ *   - The last write position in the look-ahead batch remains at -1.
+ *   - Harvesting for column d requires filling an empty value for row n-1.
+ *   - When starting the next batch, the last write position must be set to -1,
+ *     indicating no data yet written.
+ *
+ * f
+ *   Column f has no data in the last position of the main batch, and no data
+ *   in the overflow row.
+ *   - The last write position is at n-2.
+ *   - An empty value must be written into position n-1 during harvest.
+ *   - On start of the next batch, the last write position starts at -1.
+ *
+ * g
+ *   Column g is added after overflow, and has a value written to the overflow
+ *   row.
+ *   - On harvest, column g is simply skipped.
+ *   - On start of the next batch, the last write position can be left unchanged
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r137852118 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/package-info.java --- @@ -0,0 +1,295 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Handles the details of the result set loader implementation.
+ *
+ * The primary purpose of this loader, and the most complex to understand and
+ * maintain, is overflow handling.
+ *
+ * Detailed Use Cases
+ *
+ * Let's examine it by considering a number of use cases.
+ *
+ *   Row    a  b  c  d  e  f  g  h
+ *   n-2    X  X  X  X     X  -  -
+ *   n-1    X  X  X  X        -  -
+ *   n      X  !  O     O     O
+ *
+ * Here:
+ *
+ *   - n-2, n-1, and n are rows. n is the overflow row.
+ *   - X indicates a value was written before overflow.
+ *   - Blank indicates no value was written in that row.
+ *   - ! indicates the value that triggered overflow.
+ *   - "-" indicates a column that did not exist prior to overflow. --- End diff -- What does an 'O' value mean in the diagram above? ---
[GitHub] drill pull request #983: DRILL-5819: Default value of security.admin.user_gr...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/983#discussion_r144143097 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/server/TestOptionsAuthEnabled.java --- @@ -115,4 +124,59 @@ private void setOptHelper() throws Exception { test(String.format("ALTER SYSTEM SET `%s` = %d;", ExecConstants.SLICE_TARGET, ExecConstants.SLICE_TARGET_DEFAULT)); } } + + @Test + public void testAdminUserOptions() throws Exception { +FixtureBuilder builder = ClusterFixture.builder(); + +try (ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + OptionManager optionManager = cluster.drillbit().getContext().getOptionManager(); + + // Admin Users Tests + // config file should have the 'fake' default admin user and it should be returned + // by the option manager if the option has not been set by the user + String configAdminUser = optionManager.getOption(ExecConstants.ADMIN_USERS_VALIDATOR);; + assertEquals(configAdminUser, ExecConstants.ADMIN_USERS_VALIDATOR.DEFAULT_ADMIN_USERS); + + // Option accessor should never return the 'fake' default from the config + String adminUser1 = ExecConstants.ADMIN_USERS_VALIDATOR.getAdminUsers(optionManager); + assertNotEquals(adminUser1, ExecConstants.ADMIN_USERS_VALIDATOR.DEFAULT_ADMIN_USERS); + + // Change TEST_ADMIN_USER if necessary + String TEST_ADMIN_USER = "ronswanson"; + if (adminUser1.equals(TEST_ADMIN_USER)) { +TEST_ADMIN_USER += "thefirst"; + } + // Check if the admin option accessor honors a user-supplied values + String sql = String.format("ALTER SYSTEM SET `%s`='%s'", ExecConstants.ADMIN_USERS_KEY, TEST_ADMIN_USER); + client.queryBuilder().sql(sql).run(); --- End diff -- Done. ---
[GitHub] drill pull request #983: DRILL-5819: Default value of security.admin.user_gr...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/983#discussion_r144143072 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/DrillRoot.java --- @@ -75,6 +80,29 @@ public ClusterInfo getClusterInfoJSON() { final DrillConfig config = dbContext.getConfig(); final boolean userEncryptionEnabled = config.getBoolean(ExecConstants.USER_ENCRYPTION_SASL_ENABLED); final boolean bitEncryptionEnabled = config.getBoolean(ExecConstants.BIT_ENCRYPTION_SASL_ENABLED); +// If the user is logged in and is admin user then show the admin user info +// For all other cases the user info need-not or should-not be displayed +OptionManager optionManager = work.getContext().getOptionManager(); +final boolean isUserLoggedIn = AuthDynamicFeature.isUserLoggedIn(sc); +String adminUsers = isUserLoggedIn ? + ExecConstants.ADMIN_USERS_VALIDATOR.getAdminUsers(optionManager) : null; +String adminUserGroups = isUserLoggedIn ? + ExecConstants.ADMIN_USER_GROUPS_VALIDATOR.getAdminUserGroups(optionManager) : null; + +// separate groups by comma + space +if (adminUsers != null) { + String[] groups = adminUsers.split(","); + adminUsers = DrillStringUtils.join(groups, ", "); +} + +// separate groups by comma + space +if (adminUserGroups != null) { + String[] groups = adminUserGroups.split(","); + adminUserGroups = DrillStringUtils.join(groups, ", "); --- End diff -- I'm pushing changes that handle ill-formatted user input ---
[GitHub] drill pull request #983: DRILL-5819: Default value of security.admin.user_gr...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/983#discussion_r144143087 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/server/TestOptionsAuthEnabled.java --- @@ -115,4 +124,59 @@ private void setOptHelper() throws Exception { test(String.format("ALTER SYSTEM SET `%s` = %d;", ExecConstants.SLICE_TARGET, ExecConstants.SLICE_TARGET_DEFAULT)); } } + + @Test + public void testAdminUserOptions() throws Exception { +FixtureBuilder builder = ClusterFixture.builder(); + +try (ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + OptionManager optionManager = cluster.drillbit().getContext().getOptionManager(); + + // Admin Users Tests + // config file should have the 'fake' default admin user and it should be returned + // by the option manager if the option has not been set by the user + String configAdminUser = optionManager.getOption(ExecConstants.ADMIN_USERS_VALIDATOR);; + assertEquals(configAdminUser, ExecConstants.ADMIN_USERS_VALIDATOR.DEFAULT_ADMIN_USERS); + + // Option accessor should never return the 'fake' default from the config + String adminUser1 = ExecConstants.ADMIN_USERS_VALIDATOR.getAdminUsers(optionManager); + assertNotEquals(adminUser1, ExecConstants.ADMIN_USERS_VALIDATOR.DEFAULT_ADMIN_USERS); + + // Change TEST_ADMIN_USER if necessary + String TEST_ADMIN_USER = "ronswanson"; + if (adminUser1.equals(TEST_ADMIN_USER)) { +TEST_ADMIN_USER += "thefirst"; + } + // Check if the admin option accessor honors a user-supplied values + String sql = String.format("ALTER SYSTEM SET `%s`='%s'", ExecConstants.ADMIN_USERS_KEY, TEST_ADMIN_USER); + client.queryBuilder().sql(sql).run(); + String adminUser2 = ExecConstants.ADMIN_USERS_VALIDATOR.getAdminUsers(optionManager); + assertEquals(adminUser2, TEST_ADMIN_USER); + + // Admin User Groups Tests + + // config file should have the 'fake' default admin user and it should be returned + // by the option manager if the 
option has not been set by the user + String configAdminUserGroups = optionManager.getOption(ExecConstants.ADMIN_USER_GROUPS_VALIDATOR); + assertEquals(configAdminUserGroups, ExecConstants.ADMIN_USER_GROUPS_VALIDATOR.DEFAULT_ADMIN_USER_GROUPS); + + // Option accessor should never return the 'fake' default from the config + String adminUserGroups1 = ExecConstants.ADMIN_USER_GROUPS_VALIDATOR.getAdminUserGroups(optionManager); + assertNotEquals(adminUserGroups1, ExecConstants.ADMIN_USER_GROUPS_VALIDATOR.DEFAULT_ADMIN_USER_GROUPS); + + // Change TEST_ADMIN_USER_GROUPS if necessary + String TEST_ADMIN_USER_GROUPS = "yakshavers"; + if (adminUserGroups1.equals(TEST_ADMIN_USER_GROUPS)) { +TEST_ADMIN_USER_GROUPS += ",wormracers"; --- End diff -- Done. It was a constant before I changed the code to handle the corner case. ---
[GitHub] drill pull request #983: DRILL-5819: Default value of security.admin.user_gr...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/983#discussion_r144143103 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/server/TestOptionsAuthEnabled.java --- @@ -115,4 +124,59 @@ private void setOptHelper() throws Exception { test(String.format("ALTER SYSTEM SET `%s` = %d;", ExecConstants.SLICE_TARGET, ExecConstants.SLICE_TARGET_DEFAULT)); } } + + @Test + public void testAdminUserOptions() throws Exception { +FixtureBuilder builder = ClusterFixture.builder(); + +try (ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + OptionManager optionManager = cluster.drillbit().getContext().getOptionManager(); + + // Admin Users Tests + // config file should have the 'fake' default admin user and it should be returned + // by the option manager if the option has not been set by the user + String configAdminUser = optionManager.getOption(ExecConstants.ADMIN_USERS_VALIDATOR);; --- End diff -- Done ---
[GitHub] drill pull request #983: DRILL-5819: Default value of security.admin.user_gr...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/983#discussion_r144143113 --- Diff: common/src/main/java/org/apache/drill/common/util/DrillStringUtils.java --- @@ -1,203 +1,258 @@ -/** - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package org.apache.drill.common.util; - -import io.netty.buffer.ByteBuf; - -import org.apache.commons.lang3.StringEscapeUtils; -import org.apache.commons.lang3.StringUtils; - -public class DrillStringUtils { - - /** - * Converts the long number into more human readable string. - */ - public static String readable(long bytes) { -int unit = 1024; -long absBytes = Math.abs(bytes); -if (absBytes < unit) { - return bytes + " B"; -} -int exp = (int) (Math.log(absBytes) / Math.log(unit)); -char pre = ("KMGTPE").charAt(exp-1); -return String.format("%s%.1f %ciB", (bytes == absBytes ? "" : "-"), absBytes / Math.pow(unit, exp), pre); - } - - - /** - * Unescapes any Java literals found in the {@code String}. - * For example, it will turn a sequence of {@code '\'} and - * {@code 'n'} into a newline character, unless the {@code '\'} - * is preceded by another {@code '\'}. 
- * - * @param input the {@code String} to unescape, may be null - * @return a new unescaped {@code String}, {@code null} if null string input - */ - public static final String unescapeJava(String input) { -return StringEscapeUtils.unescapeJava(input); - } - - /** - * Escapes the characters in a {@code String} according to Java string literal - * rules. - * - * Deals correctly with quotes and control-chars (tab, backslash, cr, ff, - * etc.) so, for example, a tab becomes the characters {@code '\\'} and - * {@code 't'}. - * - * Example: - * - * input string: He didn't say, "Stop!" - * output string: He didn't say, \"Stop!\" - * - * - * @param input String to escape values in, may be null - * @return String with escaped values, {@code null} if null string input - */ - public static final String escapeJava(String input) { -return StringEscapeUtils.escapeJava(input); - } - - public static final String escapeNewLines(String input) { -if (input == null) { - return null; -} -StringBuilder result = new StringBuilder(); -boolean sawNewline = false; -for (int i = 0; i < input.length(); i++) { - char curChar = input.charAt(i); - if (curChar == '\r' || curChar == '\n') { -if (sawNewline) { - continue; -} -sawNewline = true; -result.append("\\n"); - } else { -sawNewline = false; -result.append(curChar); - } -} -return result.toString(); - } - - /** - * Copied form commons-lang 2.x code as common-lang 3.x has this API removed. - * (http://commons.apache.org/proper/commons-lang/article3_0.html#StringEscapeUtils.escapeSql) - * @param str - * @return - */ - public static String escapeSql(String str) { -return (str == null) ? null : StringUtils.replace(str, "'", "''"); - } - - /** - * Return a printable representation of a byte buffer, escaping the non-printable - * bytes as '\\xNN' where NN is the hexadecimal representation of such bytes. - * - * This function does not modify the {@code readerIndex} and {@code writerIndex} - * of the byte buffer. 
- */ - public static String toBinaryString(ByteBuf buf, int strStart, int strEnd) { -StringBuilder result = new StringBuilder(); -for (int i = strStart; i < strEnd ; ++i) { - appendByte(result, buf.getByte(i));
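The `escapeNewLines` method quoted in the diff above collapses any run of `'\r'` and `'\n'` characters into a single literal `"\n"` (backslash plus `n`). A standalone restatement of that logic, for illustration only, mirroring the quoted method:

```java
public class EscapeNewLines {

  // Mirrors the quoted DrillStringUtils.escapeNewLines: a run of CR/LF
  // characters is replaced by one literal "\n"; all other characters pass
  // through unchanged.
  static String escapeNewLines(String input) {
    if (input == null) {
      return null;
    }
    StringBuilder result = new StringBuilder();
    boolean sawNewline = false;
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      if (c == '\r' || c == '\n') {
        if (!sawNewline) {
          sawNewline = true;
          result.append("\\n"); // literal backslash + n, once per run
        }
      } else {
        sawNewline = false;
        result.append(c);
      }
    }
    return result.toString();
  }

  public static void main(String[] args) {
    // "\r\n\n" is one run, so it becomes a single literal "\n".
    System.out.println(escapeNewLines("line1\r\n\nline2")); // prints line1\nline2
  }
}
```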
[GitHub] drill pull request #983: DRILL-5819: Default value of security.admin.user_gr...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/983#discussion_r144143076 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/server/TestOptionsAuthEnabled.java --- @@ -115,4 +124,59 @@ private void setOptHelper() throws Exception { test(String.format("ALTER SYSTEM SET `%s` = %d;", ExecConstants.SLICE_TARGET, ExecConstants.SLICE_TARGET_DEFAULT)); } } + + @Test + public void testAdminUserOptions() throws Exception { +FixtureBuilder builder = ClusterFixture.builder(); + +try (ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + OptionManager optionManager = cluster.drillbit().getContext().getOptionManager(); + + // Admin Users Tests + // config file should have the 'fake' default admin user and it should be returned + // by the option manager if the option has not been set by the user + String configAdminUser = optionManager.getOption(ExecConstants.ADMIN_USERS_VALIDATOR);; + assertEquals(configAdminUser, ExecConstants.ADMIN_USERS_VALIDATOR.DEFAULT_ADMIN_USERS); + + // Option accessor should never return the 'fake' default from the config + String adminUser1 = ExecConstants.ADMIN_USERS_VALIDATOR.getAdminUsers(optionManager); + assertNotEquals(adminUser1, ExecConstants.ADMIN_USERS_VALIDATOR.DEFAULT_ADMIN_USERS); + + // Change TEST_ADMIN_USER if necessary + String TEST_ADMIN_USER = "ronswanson"; + if (adminUser1.equals(TEST_ADMIN_USER)) { +TEST_ADMIN_USER += "thefirst"; + } + // Check if the admin option accessor honors a user-supplied values + String sql = String.format("ALTER SYSTEM SET `%s`='%s'", ExecConstants.ADMIN_USERS_KEY, TEST_ADMIN_USER); + client.queryBuilder().sql(sql).run(); + String adminUser2 = ExecConstants.ADMIN_USERS_VALIDATOR.getAdminUsers(optionManager); + assertEquals(adminUser2, TEST_ADMIN_USER); + + // Admin User Groups Tests + + // config file should have the 'fake' default admin user and it should be returned + // by the option manager if the 
option has not been set by the user + String configAdminUserGroups = optionManager.getOption(ExecConstants.ADMIN_USER_GROUPS_VALIDATOR); + assertEquals(configAdminUserGroups, ExecConstants.ADMIN_USER_GROUPS_VALIDATOR.DEFAULT_ADMIN_USER_GROUPS); + + // Option accessor should never return the 'fake' default from the config + String adminUserGroups1 = ExecConstants.ADMIN_USER_GROUPS_VALIDATOR.getAdminUserGroups(optionManager); + assertNotEquals(adminUserGroups1, ExecConstants.ADMIN_USER_GROUPS_VALIDATOR.DEFAULT_ADMIN_USER_GROUPS); + + // Change TEST_ADMIN_USER_GROUPS if necessary + String TEST_ADMIN_USER_GROUPS = "yakshavers"; + if (adminUserGroups1.equals(TEST_ADMIN_USER_GROUPS)) { +TEST_ADMIN_USER_GROUPS += ",wormracers"; + } + // Check if the admin option accessor honors a user-supplied values + sql = String.format("ALTER SYSTEM SET `%s`='%s'", ExecConstants.ADMIN_USER_GROUPS_KEY, TEST_ADMIN_USER_GROUPS); + client.queryBuilder().sql(sql).run(); + String adminUserGroups2 = ExecConstants.ADMIN_USER_GROUPS_VALIDATOR.getAdminUserGroups(optionManager); + assertEquals(adminUserGroups2, TEST_ADMIN_USER_GROUPS); --- End diff -- I am pushing changes which add these tests. ---
[GitHub] drill pull request #997: DRILL-5582: C++ Client: [Threat Modeling] Drillbit ...
GitHub user bitblender opened a pull request: https://github.com/apache/drill/pull/997 DRILL-5582: C++ Client: [Threat Modeling] Drillbit may be spoofed by … …an attacker and this may lead to data being written to the attacker's target instead of Drillbit You can merge this pull request into a Git repository by running: $ git pull https://github.com/bitblender/drill KM-DRILL-5582 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/997.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #997 commit 488ebefd4a2d096c9f02cbcdfd8c6984901b3444 Author: karthik <kmanivan...@maprtech.com> Date: 2017-10-17T23:18:45Z DRILL-5582: C++ Client: [Threat Modeling] Drillbit may be spoofed by an attacker and this may lead to data being written to the attacker's target instead of Drillbit ---
[GitHub] drill pull request #997: DRILL-5582: C++ Client: [Threat Modeling] Drillbit ...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/997#discussion_r145567205 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -595,6 +611,12 @@ connectionStatus_t DrillClientImpl::validateHandshake(DrillUserProperties* prope switch(this->m_handshakeStatus) { case exec::user::SUCCESS: +// Check if client needs auth/encryption and server is not requiring it +if(clientNeedsAuthentication(properties) || clientNeedsEncryption(properties)) { --- End diff -- - Externalized the messages to errmsgs.cpp - Changed the error message to "Client needs a secure connection but server does not support... " to account for the case where auth and/or enc is required by the client but missing on the server ---
[GitHub] drill pull request #997: DRILL-5582: C++ Client: [Threat Modeling] Drillbit ...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/997#discussion_r145568130 --- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp --- @@ -595,6 +611,12 @@ connectionStatus_t DrillClientImpl::validateHandshake(DrillUserProperties* prope switch(this->m_handshakeStatus) { case exec::user::SUCCESS: +// Check if client needs auth/encryption and server is not requiring it --- End diff -- Yes. The control flow goes through the AUTH_REQUIRED case when the server requires auth. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r151003788 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java --- @@ -157,10 +157,29 @@ public DrillConfig getConfig() { return context.getConfig(); } - public Collection getBits() { + public Collection getAvailableBits() { return coord.getAvailableEndpoints(); } + public Collection getBits() { +return coord.getOnlineEndPoints(); + } + + public boolean isOnline(DrillbitEndpoint endpoint) { return endpoint.getState().equals(DrillbitEndpoint.State.ONLINE); } + + public boolean isForemanOnline() { +DrillbitEndpoint foreman = getEndpoint(); +Collection dbs = getAvailableBits(); +for( DrillbitEndpoint db : dbs) { + if( db.getAddress().equals(foreman.getAddress()) && db.getUserPort() == foreman.getUserPort()) { --- End diff -- Can you please explain the logic in isForemanOnline(). Why do you have to get the list of endpoints from ZK and then check for foreman in that list before making the isOnline test ? Why can't it be done on the foreman object? Is this to make sure that the state is updated in ZK before refusing to take queries ? Why do you assume that the foreman is online if the foreman is not found in the list of endPoints? i.e. if it is not in the dbs list why do you return true in that case ? ---
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r140644571 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/rowSet/impl/TupleState.java --- @@ -0,0 +1,353 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.physical.rowSet.impl; + +import java.util.ArrayList; +import java.util.List; + +import org.apache.drill.exec.expr.TypeHelper; +import org.apache.drill.exec.physical.rowSet.impl.ColumnState.BaseMapColumnState; +import org.apache.drill.exec.physical.rowSet.impl.ColumnState.MapArrayColumnState; +import org.apache.drill.exec.physical.rowSet.impl.ColumnState.MapColumnState; +import org.apache.drill.exec.record.ColumnMetadata; +import org.apache.drill.exec.record.MaterializedField; +import org.apache.drill.exec.record.TupleMetadata; +import org.apache.drill.exec.record.TupleSchema; +import org.apache.drill.exec.record.TupleSchema.AbstractColumnMetadata; +import org.apache.drill.exec.vector.ValueVector; +import org.apache.drill.exec.vector.accessor.ObjectType; +import org.apache.drill.exec.vector.accessor.ObjectWriter; +import org.apache.drill.exec.vector.accessor.TupleWriter; +import org.apache.drill.exec.vector.accessor.TupleWriter.TupleWriterListener; +import org.apache.drill.exec.vector.accessor.impl.HierarchicalFormatter; +import org.apache.drill.exec.vector.accessor.writer.AbstractObjectWriter; +import org.apache.drill.exec.vector.accessor.writer.AbstractTupleWriter; +import org.apache.drill.exec.vector.accessor.writer.ColumnWriterFactory; +import org.apache.drill.exec.vector.complex.AbstractMapVector; + +public abstract class TupleState implements TupleWriterListener { + + public static class RowState extends TupleState { + +/** + * The row-level writer for stepping through rows as they are written, + * and for accessing top-level columns. 
+ */ + +private final RowSetLoaderImpl writer; + +public RowState(ResultSetLoaderImpl rsLoader) { + super(rsLoader, rsLoader.projectionSet); + writer = new RowSetLoaderImpl(rsLoader, schema); + writer.bindListener(this); +} + +public RowSetLoaderImpl rootWriter() { return writer; } + +@Override +public AbstractTupleWriter writer() { return writer; } + +@Override +public int innerCardinality() { return resultSetLoader.targetRowCount();} + } + + public static class MapState extends TupleState { + +protected final AbstractMapVector mapVector; +protected final BaseMapColumnState mapColumnState; +protected int outerCardinality; + +public MapState(ResultSetLoaderImpl rsLoader, +BaseMapColumnState mapColumnState, +AbstractMapVector mapVector, +ProjectionSet projectionSet) { + super(rsLoader, projectionSet); + this.mapVector = mapVector; + this.mapColumnState = mapColumnState; + mapColumnState.writer().bindListener(this); +} + +@Override +protected void columnAdded(ColumnState colState) { + @SuppressWarnings("resource") + ValueVector vector = colState.vector(); + + // Can't materialize the child if the map itself is + // not materialized. + + assert mapVector != null || vector == null; + if (vector != null) { +mapVector.putChild(vector.getField().getName(), vector); + } +} + +@Override +public AbstractTupleWriter writer() { + AbstractObjectWriter objWriter = mapColumnState.writer(); + TupleWriter tupleWriter; + if (objWriter.type() == ObjectType.ARRAY) { +tupleWriter = objWriter.array().tuple(); + } else { +tupleWriter = objWriter.tuple(); + } + return (AbstractTupleWriter) tupleWriter; +
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r151762286 --- Diff: exec/memory/base/src/main/java/io/netty/buffer/DrillBuf.java --- @@ -882,4 +882,71 @@ public void print(StringBuilder sb, int indent, Verbosity verbosity) { } } + // The "unsafe" methods are for use ONLY by code that does its own + // bounds checking. They are called "unsafe" for a reason: they will crash + // the JVM if values are addressed out of bounds. + + /** + * Write an integer to the buffer at the given byte index, without + * bounds checks. + * + * @param index byte (not int) index of the location to write + * @param value the value to write + */ + + public void unsafePutInt(int index, int value) { --- End diff -- The first argument in these unsafePutXXX methods is an offset right? Should the 'index' be changed to an 'offset' ? ---
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r151762298 --- Diff: exec/vector/src/main/codegen/templates/ColumnAccessors.java --- @@ -275,17 +273,17 @@ public boolean isNull() { final int offset = writeIndex(len); <#else> final int writeIndex = writeIndex(); - <#assign putAddr = "bufAddr + writeIndex * VALUE_WIDTH"> + <#assign putAddr = "writeIndex * VALUE_WIDTH"> --- End diff -- putAddr is not an address but an offset, right ? Should be renamed. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148868884 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -165,32 +176,60 @@ public DrillbitContext getContext() { * * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ - public void waitToExit() { + public void waitToExit(Drillbit bit, boolean forcceful_shutdown) { synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { + numOfRunningQueries = queries.size(); + numOfRunningFragments = runningFragments.size(); + if ( queries.isEmpty() && runningFragments.isEmpty()) { return; } - + logger.info("Draining " + queries +" queries and "+ runningFragments+" fragments."); exitLatch = new ExtendedLatch(); } - -// Wait for at most 5 seconds or until the latch is released. -exitLatch.awaitUninterruptibly(5000); +// Wait uninterruptibly until all the queries and running fragments on that drillbit goes down +// to zero +if( forcceful_shutdown) { + exitLatch.awaitUninterruptibly(5000); +} +else { --- End diff -- Inconsistent if-else formatting ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148871345 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java --- @@ -251,6 +252,11 @@ public void run() { final String originalName = currentThread.getName(); currentThread.setName(queryIdString + ":foreman"); +try { + checkForemanState(); +} catch (ForemanException e) { + addToEventQueue(QueryState.FAILED, new ForemanException("Query submission failed since foreman is shutting down")); --- End diff -- why are you not just adding 'e' to the eventQueue ? why do you have to create a new ForemanException object? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148674083 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/ClusterCoordinator.java --- @@ -60,7 +61,26 @@ */ public abstract Collection getAvailableEndpoints(); + /** + * Get a collection of ONLINE drillbit endpoints by excluding the drillbits + * that are in QUIESCENT state (drillbits that are shutting down). Primarily used by the planner + * to plan queries only on ONLINE drillbits and used by the client during initial connection + * phase to connect to a drillbit (foreman) + * @return A collection of ONLINE endpoints + */ + + public abstract Collection getOnlineEndPoints(); + + public abstract RegistrationHandle update(RegistrationHandle handle, State state); + public interface RegistrationHandle { +/** + * Get the drillbit endpoint associated with the registration handle + * @return drillbit endpoint + */ +public abstract DrillbitEndpoint getEndPoint(); + +public abstract void setEndPoint( DrillbitEndpoint endpoint); --- End diff -- spacing ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148874640 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java --- @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.test; + +import ch.qos.logback.classic.Level; +import org.apache.commons.io.FileUtils; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.exec.proto.UserBitShared.QueryResult.QueryState; +import org.apache.drill.exec.server.Drillbit; +import org.junit.AfterClass; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import org.omg.PortableServer.THREAD_POLICY_ID; + +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.HttpURLConnection; +import java.net.URL; +import java.util.Collection; +import java.util.Properties; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.fail; + +public class TestGracefulShutdown { + + @BeforeClass + public static void setUpTestData() { +for( int i = 0; i < 1000; i++) { + setupFile(i); +} + } + + + public static final Properties WEBSERVER_CONFIGURATION = new Properties() { +{ + put(ExecConstants.HTTP_ENABLE, true); +} + }; + + public FixtureBuilder enableWebServer(FixtureBuilder builder) { +Properties props = new Properties(); +props.putAll(WEBSERVER_CONFIGURATION); +builder.configBuilder.configProps(props); +return builder; + } + + + /* + Start multiple drillbits and then shutdown a drillbit. Query the online + endpoints and check if the drillbit still exists. 
+ */ + @Test + public void testOnlineEndPoints() throws Exception { + +String[] drillbits = {"db1" ,"db2","db3", "db4", "db5", "db6"}; +FixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + + Drillbit drillbit = cluster.drillbit("db2"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + new Thread(new Runnable() { +public void run() { + try { +cluster.close_drillbit("db2"); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + //wait for graceperiod + Thread.sleep(grace_period); + Collection drillbitEndpoints = cluster.drillbit().getContext() + .getClusterCoordinator() + .getOnlineEndPoints(); + Assert.assertFalse(drillbitEndpoints.contains(drillbitEndpoint)); +} + } + /* +Test if the drillbit transitions from ONLINE state when a shutdown +request is initiated + */ + @Test + public void testStateChange() throws Exception { + +String[] drillbits = {"db1" ,"db2", "db3", "db4", "db5", "db6"}; +FixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + Drillbit drillbit = cluster.drillbit("db2"); + int grace_period = drillbit.getContext().getConfig().getInt("dri
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148668138 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +272,52 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); - - endpoints = newDrillbitSet; + Set registeredBits = new HashSet<>(); + // Update the endpoints map with change in state of the endpoint or with the addition + // of new drillbit endpoints. Registered endpoints is set to newly live drillbit endpoints. + for ( DrillbitEndpoint endpoint : newDrillbitSet) { +if (endpointsMap.containsKey(new MultiKey(endpoint.getAddress(),endpoint.getUserPort( { + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} +else { + registeredBits.add(endpoint); + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} + } +// Iterator iterator = endpointsMap.keySet().iterator() ; --- End diff -- commented code ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148682167 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +272,52 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); - - endpoints = newDrillbitSet; + Set registeredBits = new HashSet<>(); + // Update the endpoints map with change in state of the endpoint or with the addition --- End diff -- loop below does not handle change in state, so the comment should be changed. Is the endpointsMap necessary because of port hunting ? ---
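The set-difference bookkeeping that the quoted code replaced (and that the review refers back to) computes newly dead endpoints as previous-minus-current and newly live endpoints as current-minus-previous. A minimal sketch, using plain `host:port` strings in place of `DrillbitEndpoint` objects for illustration:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class EndpointDiff {

  // Set difference: elements of a that are not in b.
  static Set<String> minus(Set<String> a, Set<String> b) {
    Set<String> result = new HashSet<>(a);
    result.removeAll(b);
    return result;
  }

  public static void main(String[] args) {
    Set<String> previous = new HashSet<>(Arrays.asList("db1:31010", "db2:31010"));
    Set<String> current = new HashSet<>(Arrays.asList("db2:31010", "db3:31010"));

    // Newly dead bits: registered before, absent from the new active set.
    System.out.println("unregistered = " + minus(previous, current));
    // Newly live bits: present in the new active set, not registered before.
    System.out.println("registered = " + minus(current, previous));
  }
}
```

The review's point is that the replacement loop keyed on (address, port) only handles additions, so the comment claiming it also handles state changes is misleading; a map keyed that way is plausibly needed because port hunting can re-register the same drillbit under a different port, but that is the question being asked, not something the diff establishes.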
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148872633 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/ClusterFixture.java --- @@ -471,6 +471,22 @@ public void close() throws Exception { } /** + * Shutdown the drillbit given the name of the drillbit. + */ + public void close_drillbit(final String drillbitname) throws Exception { +Exception ex = null; +for (Drillbit bit : drillbits()) +{ --- End diff -- placement of '{' ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148861255 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.server; +/* + State manager to manage the state of drillbit. + */ +public class DrillbitStateManager { + + + public DrillbitStateManager(DrillbitState currentState) { +this.currentState = currentState; + } + + public enum DrillbitState { +STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN + } + + public DrillbitState getState() { +return currentState; + } + + private DrillbitState currentState; --- End diff -- Drillbit.quiescentMode and Drillbit.forceful_shutdown are volatiles but currentState is not. Can you explain why this is so ? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148171637 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/local/LocalClusterCoordinator.java --- @@ -85,13 +88,62 @@ public void unregister(final RegistrationHandle handle) { endpoints.remove(handle); } + /** + * Update drillbit endpoint state. Drillbit advertises its + * state. State information is used during planning and initial + * client connection phases. + */ + @Override + public RegistrationHandle update(RegistrationHandle handle, State state) { +DrillbitEndpoint endpoint = handle.getEndPoint(); +endpoint = endpoint.toBuilder().setState(state).build(); +handle.setEndPoint(endpoint); +endpoints.put(handle,endpoint); +return handle; + } + @Override public Collection getAvailableEndpoints() { return endpoints.values(); } + /** + * Get a collection of ONLINE Drillbit endpoints by excluding the drillbits + * that are in QUIESCENT state (drillbits shutting down). Primarily used by the planner + * to plan queries only on ONLINE drillbits and used by the client during initial connection + * phase to connect to a drillbit (foreman) + * @return A collection of ONLINE endpoints + */ + @Override + public Collection getOnlineEndPoints() { +Collection runningEndPoints = new ArrayList<>(); +for (DrillbitEndpoint endpoint: endpoints.values()){ + if(!endpoint.hasState() || endpoint.getState().equals(State.ONLINE)) { +runningEndPoints.add(endpoint); + } +} +return runningEndPoints; + } + private class Handle implements RegistrationHandle { private final UUID id = UUID.randomUUID(); +private DrillbitEndpoint drillbitEndpoint; + +/** + * Get the drillbit endpoint associated with the registration handle + * @return drillbit endpoint + */ +public DrillbitEndpoint getEndPoint() { + return drillbitEndpoint; +} + +public void setEndPoint( DrillbitEndpoint endpoint) { --- End diff -- spacing ---
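The filtering logic in the quoted `getOnlineEndPoints()` can be sketched in a self-contained form. `Endpoint` and `State` below are simplified stand-ins for Drill's generated protobuf types (where `hasState()` can be false for drillbits that predate state reporting); the keep-or-drop rule matches the diff above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

// Sketch of the ONLINE-endpoint filter: keep endpoints that either do not
// advertise a state or advertise ONLINE; drop QUIESCENT (shutting-down) ones
// so the planner and client never pick them.
public class OnlineFilterSketch {
  enum State { ONLINE, QUIESCENT }

  static final class Endpoint {
    final String address;
    final State state; // null models hasState() == false
    Endpoint(String address, State state) { this.address = address; this.state = state; }
    boolean hasState() { return state != null; }
    State getState() { return state; }
  }

  static Collection<Endpoint> onlineEndpoints(Collection<Endpoint> all) {
    Collection<Endpoint> online = new ArrayList<>();
    for (Endpoint e : all) {
      if (!e.hasState() || e.getState() == State.ONLINE) {
        online.add(e);
      }
    }
    return online;
  }

  public static void main(String[] args) {
    Collection<Endpoint> all = Arrays.asList(
        new Endpoint("node1", State.ONLINE),
        new Endpoint("node2", State.QUIESCENT),
        new Endpoint("node3", null)); // no state advertised
    System.out.println(onlineEndpoints(all).size()); // prints 2
  }
}
```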
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148676672 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -200,11 +206,47 @@ public void unregister(RegistrationHandle handle) { } } + /** + * Update drillbit endpoint state. Drillbit advertises its + * state in Zookeeper when a shutdown request of drillbit is + * triggered. State information is used during planning and + * initial client connection phases. + */ + public RegistrationHandle update(RegistrationHandle handle, State state) { +ZKRegistrationHandle h = (ZKRegistrationHandle) handle; + try { +endpoint = h.endpoint.toBuilder().setState(state).build(); +ServiceInstance serviceInstance = ServiceInstance.builder().name(serviceName).id(h.id).payload(endpoint).build(); --- End diff -- suggestion: wrap this long line since you have already wrapped the other lines and comments ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148173100 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -70,7 +72,10 @@ private final CountDownLatch initialConnection = new CountDownLatch(1); private final TransientStoreFactory factory; private ServiceCache serviceCache; + private DrillbitEndpoint endpoint; +//private HashMap<MultiKey, DrillbitEndpoint> endpointsMap = new HashMap<MultiKey, DrillbitEndpoint>(); --- End diff -- commented code ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148682043 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +272,52 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); - - endpoints = newDrillbitSet; + Set registeredBits = new HashSet<>(); + // Update the endpoints map with change in state of the endpoint or with the addition + // of new drillbit endpoints. Registered endpoints is set to newly live drillbit endpoints. + for ( DrillbitEndpoint endpoint : newDrillbitSet) { +if (endpointsMap.containsKey(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()))) { + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} +else { + registeredBits.add(endpoint); + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} + } +// Iterator iterator = endpointsMap.keySet().iterator() ; --- End diff -- move the common code out of the if-else ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148686859 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/Drillbit.java --- @@ -69,14 +73,30 @@ public final static String SYSTEM_OPTIONS_NAME = "org.apache.drill.exec.server.Drillbit.system_options"; - private boolean isClosed = false; - private final ClusterCoordinator coord; private final ServiceEngine engine; private final PersistentStoreProvider storeProvider; private final WorkManager manager; private final BootStrapContext context; private final WebServer webServer; + private final int gracePeriod; + private DrillbitStateManager stateManager; + + public void setQuiescentMode(boolean quiescentMode) { +this.quiescentMode = quiescentMode; + } + + private volatile boolean quiescentMode; + + public void setForceful_shutdown(boolean forceful_shutdown) { +this.forceful_shutdown = forceful_shutdown; + } + + private volatile boolean forceful_shutdown = false; --- End diff -- any reason why you did not camel-case this variable? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148685835 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +272,52 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); - - endpoints = newDrillbitSet; + Set registeredBits = new HashSet<>(); + // Update the endpoints map with change in state of the endpoint or with the addition + // of new drillbit endpoints. Registered endpoints is set to newly live drillbit endpoints. + for ( DrillbitEndpoint endpoint : newDrillbitSet) { +if (endpointsMap.containsKey(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()))) { + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} +else { + registeredBits.add(endpoint); + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} + } +// Iterator iterator = endpointsMap.keySet().iterator() ; +// while(iterator.hasNext()) { +//MultiKey key = iterator.next(); +//if(!newDrillbitSet.contains(endpointsMap.get(key))) { +// unregisteredBits.add(endpointsMap.get(key)); +// iterator.remove(); +//} +// } + +// Remove all the endpoints that are newly dead + for ( MultiKey key: endpointsMap.keySet()) + { --- End diff -- open-brace placement for the for-loop is inconsistent with other instances ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148669660 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +272,52 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); - - endpoints = newDrillbitSet; + Set registeredBits = new HashSet<>(); + // Update the endpoints map with change in state of the endpoint or with the addition + // of new drillbit endpoints. Registered endpoints is set to newly live drillbit endpoints. + for ( DrillbitEndpoint endpoint : newDrillbitSet) { +if (endpointsMap.containsKey(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()))) { --- End diff -- I would recommend extracting endpoint.getAddress() and endpoint.getUserPort() into locals ---
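The two review suggestions on this hunk (extract the repeated `MultiKey` construction into a local, and hoist the `put()` that is common to both branches out of the if-else) can be sketched in a self-contained way. `Endpoint` is a simplified stand-in for `DrillbitEndpoint`, and a plain `String` key stands in for commons-collections' `MultiKey`.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the refactored endpointsMap update loop: one key per iteration,
// one put() per iteration, and the only branch left is the "newly live" test.
public class EndpointMapRefactor {
  static final class Endpoint {
    final String address;
    final int userPort;
    Endpoint(String address, int userPort) { this.address = address; this.userPort = userPort; }
  }

  // address:port pair plays the role of MultiKey(address, userPort)
  static String key(Endpoint e) { return e.address + ":" + e.userPort; }

  static Set<Endpoint> updateEndpoints(Map<String, Endpoint> endpointsMap,
                                       Collection<Endpoint> newDrillbitSet) {
    Set<Endpoint> registeredBits = new HashSet<>();
    for (Endpoint endpoint : newDrillbitSet) {
      String key = key(endpoint);            // extracted local, built once
      if (!endpointsMap.containsKey(key)) {  // only this test differs per branch
        registeredBits.add(endpoint);        // newly live drillbit
      }
      endpointsMap.put(key, endpoint);       // common to both branches, hoisted
    }
    return registeredBits;
  }
}
```

Hoisting the `put()` also makes it obvious that existing endpoints are refreshed (state changes included) on every sync.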
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148872536 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/ClusterFixture.java --- @@ -471,6 +471,22 @@ public void close() throws Exception { } /** + * Shutdown the drillbit given the name of the drillbit. + */ + public void close_drillbit(final String drillbitname) throws Exception { --- End diff -- camel case. ---
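With the naming feedback here and the brace-placement feedback on the same method applied, the helper might look like the sketch below. `Drillbit` is a trivial stand-in for Drill's class, and the `throws` clause is dropped because the stub's `close()` cannot fail; this is illustrative, not the actual `ClusterFixture` code.

```java
import java.util.List;

// Sketch of ClusterFixture.close_drillbit after review: camelCase method and
// parameter names, and the for-loop's opening brace on the same line.
public class CloseDrillbitSketch {
  static final class Drillbit {
    final String name;
    boolean closed;
    Drillbit(String name) { this.name = name; }
    void close() { closed = true; }
  }

  private final List<Drillbit> drillbits;

  CloseDrillbitSketch(List<Drillbit> drillbits) { this.drillbits = drillbits; }

  /** Shut down the drillbit with the given name. */
  public void closeDrillbit(final String drillbitName) {
    for (Drillbit bit : drillbits) { // brace placement matches the rest of the file
      if (bit.name.equals(drillbitName)) {
        bit.close();
      }
    }
  }
}
```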
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148684246 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -70,7 +72,10 @@ private final CountDownLatch initialConnection = new CountDownLatch(1); private final TransientStoreFactory factory; private ServiceCache serviceCache; + private DrillbitEndpoint endpoint; +//private HashMap<MultiKey, DrillbitEndpoint> endpointsMap = new HashMap<MultiKey, DrillbitEndpoint>(); + private ConcurrentHashMap<MultiKey, DrillbitEndpoint> endpointsMap = new ConcurrentHashMap<MultiKey,DrillbitEndpoint>(); --- End diff -- please add a comment about what this map contains. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148872381 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java --- @@ -348,6 +354,21 @@ public void run() { */ } + /* +Check if the foreman is ONLINE. If not dont accept any new queries. + */ + public void checkForemanState() throws ForemanException{ +DrillbitEndpoint foreman = drillbitContext.getEndpoint(); +Collection dbs = drillbitContext.getAvailableBits(); --- End diff -- I would recommend refactoring this into a generic boolean isOnline(DrillbitEndpoint endpoint) in the DrillbitEndpoint class. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r149544267 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java --- @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.test; + +import ch.qos.logback.classic.Level; +import org.apache.commons.io.FileUtils; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.exec.proto.UserBitShared.QueryResult.QueryState; +import org.apache.drill.exec.server.Drillbit; +import org.junit.AfterClass; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import org.omg.PortableServer.THREAD_POLICY_ID; + +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.HttpURLConnection; +import java.net.URL; +import java.util.Collection; +import java.util.Properties; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.fail; + +public class TestGracefulShutdown { + + @BeforeClass + public static void setUpTestData() { +for( int i = 0; i < 1000; i++) { + setupFile(i); +} + } + + + public static final Properties WEBSERVER_CONFIGURATION = new Properties() { +{ + put(ExecConstants.HTTP_ENABLE, true); +} + }; + + public FixtureBuilder enableWebServer(FixtureBuilder builder) { +Properties props = new Properties(); +props.putAll(WEBSERVER_CONFIGURATION); +builder.configBuilder.configProps(props); +return builder; + } + + + /* + Start multiple drillbits and then shutdown a drillbit. Query the online + endpoints and check if the drillbit still exists. 
+ */ + @Test + public void testOnlineEndPoints() throws Exception { + +String[] drillbits = {"db1" ,"db2","db3", "db4", "db5", "db6"}; +FixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + + Drillbit drillbit = cluster.drillbit("db2"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + new Thread(new Runnable() { +public void run() { + try { +cluster.close_drillbit("db2"); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + //wait for graceperiod + Thread.sleep(grace_period); + Collection drillbitEndpoints = cluster.drillbit().getContext() + .getClusterCoordinator() + .getOnlineEndPoints(); + Assert.assertFalse(drillbitEndpoints.contains(drillbitEndpoint)); +} + } + /* +Test if the drillbit transitions from ONLINE state when a shutdown +request is initiated + */ + @Test + public void testStateChange() throws Exception { + +String[] drillbits = {"db1" ,"db2", "db3", "db4", "db5", "db6"}; +FixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + Drillbit drillbit = cluster.drillbit("db2"); + int grace_period = drillbit.getContext().getConfig().getInt("dri
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r149542196 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java --- @@ -348,6 +354,21 @@ public void run() { */ } + /* +Check if the foreman is ONLINE. If not dont accept any new queries. + */ + public void checkForemanState() throws ForemanException{ +DrillbitEndpoint foreman = drillbitContext.getEndpoint(); +Collection dbs = drillbitContext.getAvailableBits(); --- End diff -- Maybe add it to the DrillbitContext class ? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r149541807 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.server; +/* + State manager to manage the state of drillbit. + */ +public class DrillbitStateManager { + + + public DrillbitStateManager(DrillbitState currentState) { +this.currentState = currentState; + } + + public enum DrillbitState { +STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN + } + + public DrillbitState getState() { +return currentState; + } + + private DrillbitState currentState; --- End diff -- I think Drillbit.quiescentMode and Drillbit.forceful_shutdown also need NOT be volatile given the way they are used now. You don't have to enforce happens-before (by preventing re-ordering) here, and even if these variables are volatile, the read of these variables in close() can still race with the setting of these variables in another thread doing a stop/gracefulShutdown. Let me know if I am missing anything. That said, adding volatiles can only make the code more correct (and slower).
Since this code is not critical you can let it be as it is. ---
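The distinction the reviewer is drawing can be sketched concretely: `volatile` guarantees a write becomes visible to other threads, but it does not order `close()` relative to a concurrent shutdown request, so a check-then-act on the flag can still race. If the shutdown decision must be made exactly once, a compare-and-set makes it atomic. The names below are illustrative, not Drill's actual fields.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: visibility vs. atomicity for shutdown flags. quiescentMode is
// visible across threads but a read can still interleave with a concurrent
// write; shutdownStarted's compareAndSet lets exactly one caller "win".
public class ShutdownFlagSketch {
  // visible, but "if (quiescentMode) { ... }" is still a racy check-then-act
  private volatile boolean quiescentMode;

  private final AtomicBoolean shutdownStarted = new AtomicBoolean(false);

  // Returns true only for the first caller; every later caller sees false.
  public boolean beginShutdown() {
    return shutdownStarted.compareAndSet(false, true);
  }

  public void setQuiescentMode(boolean quiescentMode) { this.quiescentMode = quiescentMode; }

  public boolean isQuiescentMode() { return quiescentMode; }
}
```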
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150047464 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java --- @@ -348,6 +354,21 @@ public void run() { */ } + /* +Check if the foreman is ONLINE. If not dont accept any new queries. + */ + public void checkForemanState() throws ForemanException{ +DrillbitEndpoint foreman = drillbitContext.getEndpoint(); +Collection dbs = drillbitContext.getAvailableBits(); --- End diff -- I was thinking of encapsulating code from lines 360 to 367 into a boolean isOnline(), since all the values in that code are derived from the current DrillbitContext. Then your code would be simplified to ` public void checkForemanState() throws ForemanException{ if (!drillbitContext.isOnline()) { throw new ForemanException("Query submission failed since Foreman is shutting down."); } } ` ---
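The refactor proposed in this thread can be sketched end to end: the state test moves into an `isOnline()` on a context-like class (since all of its inputs come from the context), and the Foreman-side check shrinks to a single guarded throw. `Context`, `Endpoint`, and `State` below are simplified stand-ins, not Drill's `DrillbitContext` or generated types.

```java
// Sketch of DrillbitContext.isOnline() plus the slimmed-down
// Foreman.checkForemanState() suggested above.
public class ForemanStateSketch {
  enum State { ONLINE, QUIESCENT }

  static final class Context {
    private final State foremanState;
    Context(State foremanState) { this.foremanState = foremanState; }
    // The context owns everything needed for the decision, so it owns the check.
    boolean isOnline() { return foremanState == State.ONLINE; }
  }

  static class ForemanException extends Exception {
    ForemanException(String message) { super(message); }
  }

  // Reject new queries unless this foreman is ONLINE.
  static void checkForemanState(Context ctx) throws ForemanException {
    if (!ctx.isOnline()) {
      throw new ForemanException("Query submission failed since Foreman is shutting down.");
    }
  }
}
```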
[GitHub] drill pull request #1064: Fix for SHUTDOWN button being visible for non Admi...
GitHub user bitblender opened a pull request: https://github.com/apache/drill/pull/1064 Fix for SHUTDOWN button being visible for non Admin users This fix repurposes an earlier change I had made to show admin user info, to selectively enable the SHUTDOWN button. You can merge this pull request into a Git repository by running: $ git pull https://github.com/bitblender/drill DRILL-6017 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1064.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1064 commit 3d310b23061461fe0e058018a35c1b95f8876f71 Author: karthik <kmanivan...@maprtech.com> Date: 2017-12-07T04:13:24Z Fix for SHUTDOWN button being visible for non Admin users ---
[GitHub] drill pull request #1064: DRILL-6017 Fix for SHUTDOWN button being visible f...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/1064#discussion_r155606076 --- Diff: exec/java-exec/src/main/resources/rest/index.ftl --- @@ -112,7 +114,7 @@ - <#if model.shouldShowUserInfo()> + <#if model.shouldShowAdminInfo()> --- End diff -- I have added the check around the shutdown JS code. ---