Time for 1.10 release

2017-02-22 Thread Jinfeng Ni
Hi Drillers,

It has been almost 3 months since we released Drill 1.9. We have
resolved plenty of fixes and improvements, closing around 88 JIRAs
[1]. I propose that we start the 1.10 release process and set
Wednesday 3/1 as the cutoff day for code check-in. After 3/1, we
should start building a release candidate.

Please reply in this email thread if you have something near
completion that you would like included in the 1.10 release.

I volunteer as the release manager, unless someone else comes forward.

Thanks,

Jinfeng

[1] https://issues.apache.org/jira/browse/DRILL/fixforversion/12338769


[GitHub] drill pull request #750: DRILL-5273: CompliantTextReader excessive memory us...

2017-02-22 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/750#discussion_r102650409
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/CompliantTextRecordReader.java
 ---
@@ -118,12 +118,21 @@ public boolean apply(@Nullable SchemaPath path) {
* @param outputMutator  Used to create the schema in the output record 
batch
* @throws ExecutionSetupException
*/
+  @SuppressWarnings("resource")
   @Override
   public void setup(OperatorContext context, OutputMutator outputMutator) 
throws ExecutionSetupException {
 
 oContext = context;
-readBuffer = context.getManagedBuffer(READ_BUFFER);
-whitespaceBuffer = context.getManagedBuffer(WHITE_SPACE_BUFFER);
+// Note: DO NOT use managed buffers here. They remain in existence
+// until the fragment is shut down. The buffers here are large.
--- End diff --

The reason is a bit different. The original call allocates a managed 
buffer: it is freed only when the fragment context shuts down at the end of 
query execution. But, if we read many files (5000 in one test case), then we 
leave 5000 buffers in existence for the whole query.

Instead, we want to take control over buffer lifetime. We allocate a 
regular (not managed) buffer ourselves, and then release it when this reader 
closes.

That way, instead of accumulating 5000 buffers of 1 MB each, we have only 
one 1 MB buffer in existence at any one time.

Of course, a further refinement would be to allocate the buffer on the 
ScanBatch and have all 5000 readers sequentially share that same buffer. But, I 
was not sure that any performance benefit was worth the cost in extra code 
complexity...
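The lifetime difference described above can be sketched with a toy allocator that tracks outstanding buffers (illustrative Java only; the class and method names here are hypothetical and are not Drill's BufferAllocator API):

```java
// Toy model of the buffer-lifetime issue: "managed" buffers live until the
// fragment shuts down, while reader-owned buffers are released per reader.
public class BufferLifetimeSketch {
  static class Allocator {
    int outstanding = 0; // buffers currently allocated
    int peak = 0;        // high-water mark of live buffers
    void allocate() { outstanding++; peak = Math.max(peak, outstanding); }
    void release()  { outstanding--; }
  }

  // Managed pattern: each reader allocates in setup(), but every buffer is
  // freed only at fragment shutdown, so all of them coexist.
  static int peakManaged(int readers) {
    Allocator a = new Allocator();
    for (int i = 0; i < readers; i++) { a.allocate(); } // setup() per reader
    for (int i = 0; i < readers; i++) { a.release(); }  // fragment shutdown
    return a.peak;
  }

  // Reader-owned pattern: allocate in setup(), release in close(), so at
  // most one buffer is live at any one time.
  static int peakReaderOwned(int readers) {
    Allocator a = new Allocator();
    for (int i = 0; i < readers; i++) { a.allocate(); a.release(); }
    return a.peak;
  }

  public static void main(String[] args) {
    System.out.println(peakManaged(5000));     // 5000
    System.out.println(peakReaderOwned(5000)); // 1
  }
}
```

With 5000 readers of a 1 MB buffer each, the managed pattern peaks at ~5 GB of live buffers while the reader-owned pattern peaks at 1 MB.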


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (DRILL-5295) Unable to query INFORMATION_SCHEMA.`TABLES` if MySql storage plugin enabled

2017-02-22 Thread Martina Ponca (JIRA)
Martina Ponca created DRILL-5295:


 Summary: Unable to query INFORMATION_SCHEMA.`TABLES` if MySql 
storage plugin enabled
 Key: DRILL-5295
 URL: https://issues.apache.org/jira/browse/DRILL-5295
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.9.0
 Environment: Drill 1.9. The error can be reproduced running Drill
locally on Windows and on Linux with ZooKeeper. I can reproduce it with two
MySQL servers.
Reporter: Martina Ponca


Impact: Unable to connect from Qlik Sense to Drill because the MySQL storage
plugin is enabled.

Steps to reproduce:
1. Create a new storage plugin for MySQL Community Edition 5.5.43. Enable it.
2. Run query: "select * from INFORMATION_SCHEMA.`TABLES`"
3. Error: 
{code}
Error: SYSTEM ERROR: NullPointerException: Error. Type information for table 
bistoremysql.information_schema.CHARACTER_SETS provided is null.
Fragment 0:0
[Error Id: 2717cfe1-413d-4330-ab3f-720ae92ebc50 on mycomputer.domain.lan:31010]

  (java.lang.NullPointerException) Error. Type information for table 
bistoremysql.information_schema.CHARACTER_SETS provided is null.
com.google.common.base.Preconditions.checkNotNull():250

org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator$Tables.visitTableWithType():314

org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator$Tables.visitTables():308

org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema():215

org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema():208

org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema():208

org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema():195
org.apache.drill.exec.store.ischema.InfoSchemaTableType.getRecordReader():58
org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch():36
org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch():30
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():148
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():171
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():128
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():171
org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():101
org.apache.drill.exec.physical.impl.ImplCreator.getExec():79
org.apache.drill.exec.work.fragment.FragmentExecutor.run():206
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():745 (state=,code=0)
{code}

The full query Qlik Sense runs:
{code:sql}
select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
INFORMATION_SCHEMA.`TABLES` WHERE TABLE_CATALOG LIKE 'DRILL' ESCAPE '\' AND 
TABLE_SCHEMA <> 'sys' AND TABLE_SCHEMA <> 'INFORMATION_SCHEMA' ORDER BY 
TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
{code}

If I disable the MySQL storage plugin, I can run the query and connect from
Qlik, but that is not a viable workaround.

This issue cannot be reproduced using Drill 1.5. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] drill pull request #757: DRILL-5290: Provide an option to build operator tab...

2017-02-22 Thread ppadma
Github user ppadma commented on a diff in the pull request:

https://github.com/apache/drill/pull/757#discussion_r102615118
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -413,4 +413,8 @@
 
   String DYNAMIC_UDF_SUPPORT_ENABLED = "exec.udf.enable_dynamic_support";
   BooleanValidator DYNAMIC_UDF_SUPPORT_ENABLED_VALIDATOR = new 
BooleanValidator(DYNAMIC_UDF_SUPPORT_ENABLED, true, true);
+
+  String USE_DYNAMIC_UDFS = "exec.udf.use_dynamic";
--- End diff --

Currently, we have one FunctionRegistryHolder (LocalFunctionRegistry) for
both static and dynamic functions, from which we register for each query. It
gets updated when we download new jars/functions from ZooKeeper, so it will
not be the same as it was at startup if dynamic UDF support is enabled and
later disabled. That means that if I use the table built during startup, it
will miss the dynamic UDFs added in between; to include them, you have to
rebuild the table. For that reason, I chose to add a new option.




[jira] [Created] (DRILL-5294) Managed External Sort throws an OOM during the merge and spill phase

2017-02-22 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-5294:


 Summary: Managed External Sort throws an OOM during the merge and 
spill phase
 Key: DRILL-5294
 URL: https://issues.apache.org/jira/browse/DRILL-5294
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Reporter: Rahul Challapalli


commit # : 38f816a45924654efd085bf7f1da7d97a4a51e38

The query below fails with the managed sort while it succeeds with the old sort:
{code}
select * from (select columns[433] col433, columns[0], 
columns[1],columns[2],columns[3],columns[4],columns[5],columns[6],columns[7],columns[8],columns[9],columns[10],columns[11]
 from dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by 
columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50])
 d where d.col433 = 'sjka skjf';
Error: RESOURCE ERROR: External Sort encountered an error while spilling to disk

Fragment 1:11

[Error Id: 0aa20284-cfcc-450f-89b3-645c280f33a4 on qa-node190.qa.lab:31010] 
(state=,code=0)
{code}

Environment:
{code}
No of Drillbits : 1
DRILL_MAX_DIRECT_MEMORY="32G"
DRILL_MAX_HEAP="4G"
{code}

Attached the logs and profile. The data is too large to attach to the JIRA.





[GitHub] drill pull request #757: DRILL-5290: Provide an option to build operator tab...

2017-02-22 Thread ppadma
Github user ppadma commented on a diff in the pull request:

https://github.com/apache/drill/pull/757#discussion_r102611281
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -413,4 +413,8 @@
 
   String DYNAMIC_UDF_SUPPORT_ENABLED = "exec.udf.enable_dynamic_support";
   BooleanValidator DYNAMIC_UDF_SUPPORT_ENABLED_VALIDATOR = new 
BooleanValidator(DYNAMIC_UDF_SUPPORT_ENABLED, true, true);
+
+  String USE_DYNAMIC_UDFS = "exec.udf.use_dynamic";
--- End diff --

I did not use the existing option "exec.udf.enable_dynamic_support" because
if that option is enabled and then disabled later, the expectation is that
dynamic UDFs added in that window continue to be available and working even
after it is disabled.

All other comments addressed. Please review new diffs.





[GitHub] drill pull request #578: DRILL-4280: Kerberos Authentication

2017-02-22 Thread sudheeshkatkam
Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r102572925
  
--- Diff: contrib/native/client/src/clientlib/saslAuthenticatorImpl.cpp ---
@@ -0,0 +1,211 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include 
+#include 
+#include "saslAuthenticatorImpl.hpp"
+
+#include "drillClientImpl.hpp"
+#include "logger.hpp"
+
+namespace Drill {
+
+#define DEFAULT_SERVICE_NAME "drill"
+
+#define KERBEROS_SIMPLE_NAME "kerberos"
--- End diff --

Addressed in another comment (for simpler names).




[GitHub] drill pull request #578: DRILL-4280: Kerberos Authentication

2017-02-22 Thread sudheeshkatkam
Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r102555281
  
--- Diff: contrib/native/client/src/clientlib/saslAuthenticatorImpl.cpp ---
@@ -0,0 +1,211 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include 
+#include 
+#include "saslAuthenticatorImpl.hpp"
+
+#include "drillClientImpl.hpp"
+#include "logger.hpp"
+
+namespace Drill {
+
+#define DEFAULT_SERVICE_NAME "drill"
+
+#define KERBEROS_SIMPLE_NAME "kerberos"
+#define KERBEROS_SASL_NAME "gssapi"
+#define PLAIN_NAME "plain"
+
+const std::map<std::string, std::string> 
SaslAuthenticatorImpl::MECHANISM_MAPPING = boost::assign::map_list_of
+(KERBEROS_SIMPLE_NAME, KERBEROS_SASL_NAME)
+(PLAIN_NAME, PLAIN_NAME)
+;
+
+boost::mutex SaslAuthenticatorImpl::s_mutex;
+bool SaslAuthenticatorImpl::s_initialized = false;
+
+SaslAuthenticatorImpl::SaslAuthenticatorImpl(const DrillUserProperties* 
const properties) :
+m_properties(properties), m_pConnection(NULL), m_pwd_secret(NULL) {
+
+if (!s_initialized) {
+boost::lock_guard<boost::mutex> 
lock(SaslAuthenticatorImpl::s_mutex);
+if (!s_initialized) {
+// set plugin path if provided
+if (DrillClientConfig::getSaslPluginPath()) {
+char *saslPluginPath = const_cast<char*>(DrillClientConfig::getSaslPluginPath());
+sasl_set_path(0, saslPluginPath);
+}
+
+// loads all the available mechanism and factories in the 
sasl_lib referenced by the path
+const int err = sasl_client_init(NULL);
+if (0 != err) {
+std::stringstream errMsg;
+errMsg << "Failed to load authentication libraries. code: 
" << err;
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << errMsg << std::endl;)
+throw std::runtime_error(errMsg.str().c_str());
+}
+{ // for debugging purposes
+const char **mechanisms = sasl_global_listmech();
+int i = 0;
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "SASL mechanisms 
available on client: " << std::endl;)
+while (mechanisms[i] != NULL) {
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << i << " : " << 
mechanisms[i] << std::endl;)
+i++;
+}
+}
+s_initialized = true;
+}
+}
+}
+
+SaslAuthenticatorImpl::~SaslAuthenticatorImpl() {
+if (m_pwd_secret) {
+free(m_pwd_secret);
+}
+// may be used to negotiate security layers before disposing in the 
future
+if (m_pConnection) {
+sasl_dispose(&m_pConnection);
+}
+m_pConnection = NULL;
+}
+
+typedef int (*sasl_callback_proc_t)(void); // see sasl_callback_ft
+
+int SaslAuthenticatorImpl::userNameCallback(void *context, int id, const 
char **result, unsigned *len) {
+const std::string* const username = static_cast<const std::string*>(context);
+
+if ((SASL_CB_USER == id || SASL_CB_AUTHNAME == id)
+&& username != NULL) {
+*result = username->c_str();
+// *len = (unsigned int) username->length();
+}
+return SASL_OK;
+}
+
+int SaslAuthenticatorImpl::passwordCallback(sasl_conn_t *conn, void 
*context, int id, sasl_secret_t **psecret) {
+const SaslAuthenticatorImpl* const authenticator = static_cast<const SaslAuthenticatorImpl*>(context);
+
+if (SASL_CB_PASS == id) {
+*psecret = authenticator->m_pwd_secret;
+}
+return SASL_OK;
+}
+
+int SaslAuthenticatorImpl::init(const std::vector<std::string>& 
mechanisms, exec::shared::SaslMessage& response) {
+// find and set parameters
+std::string authMechanismToUse;
+std::string serviceName;
+std::string serviceHost;
+for (size_t i = 0; i < 

[GitHub] drill pull request #578: DRILL-4280: Kerberos Authentication

2017-02-22 Thread sudheeshkatkam
Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r102572556
  
--- Diff: contrib/native/client/cmakeModules/FindSASL.cmake ---
@@ -0,0 +1,55 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# - Try to find Cyrus SASL
+
+if (MSVC)
--- End diff --

I will do this (as another commit that @bitblender worked on).




[GitHub] drill pull request #729: Drill 1328: Support table statistics for Parquet

2017-02-22 Thread gparai
Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/729#discussion_r102604228
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -390,4 +391,15 @@
 
   String DYNAMIC_UDF_SUPPORT_ENABLED = "exec.udf.enable_dynamic_support";
   BooleanValidator DYNAMIC_UDF_SUPPORT_ENABLED_VALIDATOR = new 
BooleanValidator(DYNAMIC_UDF_SUPPORT_ENABLED, true, true);
+
+  /**
+   * Option whose value is a long value representing the number of bits 
required for computing ndv (using HLL)
+   */
+  LongValidator NDV_MEMORY_LIMIT = new 
PositiveLongValidator("exec.statistics.ndv_memory_limit", 30, 20);
--- End diff --

We are not mixing different lengths during the same run. The session 
setting at the foreman is passed along in the plan fragment, so 
non-foreman fragments will use the same value. Also, we do not mix lengths 
across different runs. So this should not be an issue.
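For context on the bit-count option quoted above: with b index bits, a HyperLogLog (HLL) sketch uses m = 2^b registers, and its relative standard error is roughly 1.04/sqrt(m). A small sketch of the textbook formulas (these are the standard HLL figures, not measurements from Drill):

```java
public class HllSizing {
  // Number of HLL registers for b index bits: m = 2^b.
  static long registers(int b) { return 1L << b; }

  // Textbook HLL relative standard error: ~1.04 / sqrt(m).
  static double stdError(int b) { return 1.04 / Math.sqrt((double) (1L << b)); }

  public static void main(String[] args) {
    // Default b = 20 per the validator above: 2^20 = 1,048,576 registers,
    // giving roughly a 0.1% standard error on the NDV estimate.
    System.out.println(registers(20));         // 1048576
    System.out.printf("%.4f%n", stdError(20)); // 0.0010
  }
}
```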




[GitHub] drill pull request #729: Drill 1328: Support table statistics for Parquet

2017-02-22 Thread gparai
Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/729#discussion_r102602855
  
--- Diff: exec/java-exec/src/main/codegen/data/Parser.tdd ---
@@ -39,7 +39,13 @@
 "METADATA",
 "DATABASE",
 "IF",
-"JAR"
+"JAR",
+"ANALYZE",
+"COMPUTE",
+"ESTIMATE",
+"STATISTICS",
+"SAMPLE",
+"PERCENT"
--- End diff --

@sudheeshkatkam mentioned
> Something like this came up before where a list of non reserved keyword 
might result in some ambiguous queries. See DRILL-2116. Also DRILL-3875.

Hence, these keywords were not added to the non-reserved keyword list. 
Also, I am not sure how we can preserve backward compatibility here.




[GitHub] drill pull request #757: DRILL-5290: Provide an option to build operator tab...

2017-02-22 Thread sudheeshkatkam
Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/757#discussion_r102597457
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java 
---
@@ -91,6 +93,7 @@ public DrillbitContext(
 this.systemOptions = new SystemOptionManager(lpPersistence, provider);
 this.functionRegistry = new 
FunctionImplementationRegistry(context.getConfig(), classpathScan, 
systemOptions);
 this.compiler = new CodeCompiler(context.getConfig(), systemOptions);
+this.table = new DrillOperatorTable(this.functionRegistry, 
this.getOptionManager());
--- End diff --

Avoid calling methods in this class in ctor. Use

`this.table = new DrillOperatorTable(functionRegistry, systemOptions);`




[GitHub] drill pull request #757: DRILL-5290: Provide an option to build operator tab...

2017-02-22 Thread sudheeshkatkam
Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/757#discussion_r102598925
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java ---
@@ -91,7 +91,12 @@ public QueryContext(final UserSession session, final 
DrillbitContext drillbitCon
 executionControls = new ExecutionControls(queryOptions, 
drillbitContext.getEndpoint());
 plannerSettings = new PlannerSettings(queryOptions, 
getFunctionRegistry());
 plannerSettings.setNumEndPoints(drillbitContext.getBits().size());
-table = new DrillOperatorTable(getFunctionRegistry(), 
drillbitContext.getOptionManager());
+
+if (getOption(ExecConstants.USE_DYNAMIC_UDFS).bool_val) {
--- End diff --

+ `getOption` uses queryOptions. Is that intended?
+ If so, avoid calling methods in this class in ctor. Use 
`queryOptions.getOption(...)`
+ Change declaration to `BooleanValidator USE_DYNAMIC_UDFS_VALIDATOR ...` 
to avoid `.bool_val`




[GitHub] drill pull request #757: DRILL-5290: Provide an option to build operator tab...

2017-02-22 Thread sudheeshkatkam
Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/757#discussion_r102598198
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -413,4 +413,8 @@
 
   String DYNAMIC_UDF_SUPPORT_ENABLED = "exec.udf.enable_dynamic_support";
   BooleanValidator DYNAMIC_UDF_SUPPORT_ENABLED_VALIDATOR = new 
BooleanValidator(DYNAMIC_UDF_SUPPORT_ENABLED, true, true);
+
+  String USE_DYNAMIC_UDFS = "exec.udf.use_dynamic";
--- End diff --

+ It looks like the above option "enables" dynamic UDFs, and this one enables 
"using" dynamic UDFs. So why is this option required?
+ If this is intended to be a session option, and the above one is not, 
then add a check in QueryContext to ensure that either of the options is enabled.




[GitHub] drill pull request #757: DRILL-5290: Provide an option to build operator tab...

2017-02-22 Thread sudheeshkatkam
Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/757#discussion_r102597862
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java ---
@@ -91,7 +91,12 @@ public QueryContext(final UserSession session, final 
DrillbitContext drillbitCon
 executionControls = new ExecutionControls(queryOptions, 
drillbitContext.getEndpoint());
 plannerSettings = new PlannerSettings(queryOptions, 
getFunctionRegistry());
 plannerSettings.setNumEndPoints(drillbitContext.getBits().size());
-table = new DrillOperatorTable(getFunctionRegistry(), 
drillbitContext.getOptionManager());
+
+if (getOption(ExecConstants.USE_DYNAMIC_UDFS).bool_val) {
+  table = new DrillOperatorTable(getFunctionRegistry(), 
drillbitContext.getOptionManager());
--- End diff --

Avoid calling methods in this class in ctor. Use

`this.table = new 
DrillOperatorTable(drillbitContext.getFunctionImplementationRegistry(), 
drillbitContext.getOptionManager());`




[jira] [Created] (DRILL-5293) Poor performance of Hash Table due to same hash value as distribution below

2017-02-22 Thread Boaz Ben-Zvi (JIRA)
Boaz Ben-Zvi created DRILL-5293:
---

 Summary: Poor performance of Hash Table due to same hash value as 
distribution below
 Key: DRILL-5293
 URL: https://issues.apache.org/jira/browse/DRILL-5293
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Codegen
Affects Versions: 1.8.0
Reporter: Boaz Ben-Zvi
Assignee: Boaz Ben-Zvi


The computation of the hash value is basically the same whether for the Hash 
Table (used by Hash Agg, and Hash Join), or for distribution of rows at the 
exchange. As a result, a specific Hash Table (in a parallel minor fragment) 
gets only rows "filtered out" by the partition below ("upstream"), so the 
pattern of this filtering leads to a non-uniform usage of the hash buckets in 
the table.
  Here is a simplified example: An exchange partitions into TWO (minor 
fragments), each running a Hash Agg. So the partition sends rows of EVEN hash 
values to the first, and rows of ODD hash values to the second. Now the first 
recomputes the _same_ hash value for its Hash table -- and only the even 
buckets get used !!  (Or with a partition into EIGHT -- possibly only one 
eighth of the buckets would be used !! ) 

   This would lead to longer hash chains and thus a _poor performance_ !

A possible solution -- add a distribution function distFunc (only for 
partitioning) that takes the hash value and "scrambles" it so that the entropy 
in all the bits affects the low bits of the output. This function should be 
applied (in HashPrelUtil) over the generated code that produces the hash value, 
like:

   distFunc( hash32(field1, hash32(field2, hash32(field3, 0))) );
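One concrete candidate for such a scrambler is the MurmurHash3 32-bit finalizer, whose avalanche steps mix every input bit into the low bits. This is offered only as an illustration of the proposed distFunc, not as Drill's implementation:

```java
public class DistFuncSketch {
  // MurmurHash3 fmix32 finalizer: a bijective mixer with good avalanche,
  // so entropy from high bits reaches the low bits.
  static int distFunc(int h) {
    h ^= h >>> 16;
    h *= 0x85ebca6b;
    h ^= h >>> 13;
    h *= 0xc2b2ae35;
    h ^= h >>> 16;
    return h;
  }

  public static void main(String[] args) {
    // Feed in 512 EVEN hash values (low bit always 0), as a partitioned
    // Hash Agg would see them; after mixing, the low bit is roughly
    // evenly split between 0 and 1, so all buckets get used.
    int odd = 0;
    for (int h = 0; h < 1024; h += 2) {
      if ((distFunc(h) & 1) == 1) {
        odd++;
      }
    }
    System.out.println(odd); // close to 256, i.e. about half of 512
  }
}
```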

Tested with a huge hash aggregate (64 M rows) and a parallelism of 8 ( 
planner.width.max_per_node = 8 ); minor fragments 0 and 4 used only 1/8 of 
their buckets, the others used 1/4 of their buckets.  Maybe the reason for this 
variance is that distribution is using "hash32AsDouble" and hash agg is using 
"hash32".  





[GitHub] drill pull request #757: DRILL-5290: Provide an option to build operator tab...

2017-02-22 Thread ppadma
GitHub user ppadma opened a pull request:

https://github.com/apache/drill/pull/757

DRILL-5290: Provide an option to build operator table once for built-…

…in static functions and reuse it across queries.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppadma/drill DRILL-5290

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/757.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #757


commit 67ae503b0b820cd9f40ae3ef8d703e9135f90a74
Author: Padma Penumarthy 
Date:   2017-02-22T18:31:01Z

DRILL-5290: Provide an option to build operator table once for built-in 
static functions and reuse it across queries.






[jira] [Created] (DRILL-5292) Better Parquet handling of sparse columns

2017-02-22 Thread Nate Putnam (JIRA)
Nate Putnam created DRILL-5292:
--

 Summary: Better Parquet handling of sparse columns
 Key: DRILL-5292
 URL: https://issues.apache.org/jira/browse/DRILL-5292
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Parquet
Reporter: Nate Putnam


It appears the current implementation of ParquetRecordReader fills in missing 
columns between files as a NullableIntVector. It would be better if the code 
could determine whether that column is defined in a different file (and 
doesn't conflict) and use the defined data type.
 





[GitHub] drill pull request #756: DRILL-5195: Publish Operator and MajorFragment Stat...

2017-02-22 Thread kkhatua
GitHub user kkhatua opened a pull request:

https://github.com/apache/drill/pull/756

DRILL-5195: Publish Operator and MajorFragment Stats in Profile page

Improved UI
1. Introduction of Tooltips
2. Share of each operator as a percentage of the major fragment and of the 
query
  - This would help identify the most CPU intensive operators within a 
fragment and across the query
3. Rows emitted by each operator
4. For a running query, 'last update' and 'last progress' now show the 
elapsed time since then.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kkhatua/drill DRILL-5195

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/756.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #756


commit d044cf53098516ef2936cefaf1cd6907134f461f
Author: Kunal Khatua 
Date:   2017-02-22T07:06:54Z

DRILL-5190: Display planning and queued time for a query's profile page

Modified the UserBitShared protobuf for marking planning and wait-in-queue end 
times. This will allow for accurately reporting the planning, queued and actual 
execution times of a query.
Planning Time:
In the absence of the planning time's end, for older profiles, the root 
fragment's (i.e. SCREEN operator) start time is taken as the estimated end of 
planning time, and as the estimated start time of the execution phase. 
QueueWait Time:
We do not estimate the queue time if the planning end time is not 
available. 
Execution Time:
We calculate the execution time based on the availability of these two 
timestamps. The computation is done in the following way, reflecting a 
decreasing level of accuracy:
1. Execution time = endTime(Query) - end(QueueWait)
2. Execution time = endTime(Query) - end(Planning)
3. Execution time = endTime(Query) - start(rootFragment)  (estimated)
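The fallback chain above can be sketched as follows (illustrative only; the method name and the use of 0 to mean "not recorded" are assumptions for this sketch, not the actual Drill code):

```java
public class ExecTimeSketch {
  // Compute execution time with decreasing accuracy, preferring the
  // queue-wait end, then the planning end, then the (estimated) root
  // fragment start. All times are epoch millis; 0 means "not recorded".
  static long executionMillis(long queryEnd, long queueWaitEnd,
                              long planningEnd, long rootFragmentStart) {
    if (queueWaitEnd > 0) {
      return queryEnd - queueWaitEnd;      // exact
    }
    if (planningEnd > 0) {
      return queryEnd - planningEnd;       // exact
    }
    return queryEnd - rootFragmentStart;   // estimated (older profiles)
  }

  public static void main(String[] args) {
    System.out.println(executionMillis(10_000, 4_000, 0, 0));  // 6000
    System.out.println(executionMillis(10_000, 0, 3_000, 0));  // 7000
    System.out.println(executionMillis(10_000, 0, 0, 2_500));  // 7500
  }
}
```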

commit ffd684de09ab9eb586755dcf3a80fccb52ec6940
Author: Kunal Khatua 
Date:   2017-02-22T19:01:20Z

DRILL-5195: Publish Operator and MajorFragment Stats in Profile page

Improved UI
1. Introduction of Tooltips
2. Share of each operator as a percentage of the major fragment and of the 
query
  - This would help identify the most CPU intensive operators within a 
fragment and across the query
3. Rows emitted by each operator
4. For a running query, changes to 'last update' and 'last progress' now 
show the elapsed time since the last event.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #756: DRILL-5195: Publish Operator and MajorFragment Stats in Pr...

2017-02-22 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/756
  
@sudheeshkatkam This commit is rebased on top of another pending PR (#738): 
DRILL-5190. You'll need to apply that before applying this. 




[jira] [Created] (DRILL-5291) Parquet Reader produces low density batches - variable width fields

2017-02-22 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5291:
--

 Summary: Parquet Reader produces low density batches - variable 
width fields
 Key: DRILL-5291
 URL: https://issues.apache.org/jira/browse/DRILL-5291
 Project: Apache Drill
  Issue Type: Bug
Reporter: Paul Rogers
 Fix For: 1.8.0


See DRILL-5266 for background. That JIRA analyzed the problem of Parquet 
producing "low density" record batches, focusing on fixed-width fields: due to 
a bug, we overestimated the space taken.

Once that bug is fixed, Parquet continues to produce low density batches for 
variable-width fields. DRILL-5266 explains why.

This ticket covers the variable-width case so that we don't lose sight of it 
once the fixed-width case is fixed.
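As a rough illustration of what "low density" means for variable-width fields (this is not Drill's code, and the widths below are made-up numbers): if the reader reserves a worst-case width per value but typical values are much shorter, most of the allocated batch memory stays empty.

```java
// Illustration only: density of a batch when each variable-width value
// reserves a worst-case number of bytes but actual values are shorter.
public class VarWidthDensity {
  /** Fraction of allocated vector memory that actually holds row data. */
  public static double density(int rowCount, int avgActualWidth, int reservedWidth) {
    long used = (long) rowCount * avgActualWidth;       // bytes written
    long allocated = (long) rowCount * reservedWidth;   // bytes reserved
    return (double) used / allocated;
  }
}
```

With, say, 4096 rows averaging 8 bytes against a 50-byte reservation, only 16% of the batch holds data, which is the kind of low-density batch this ticket tracks.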



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (DRILL-5290) Provide an option to build operator table once for built-in static functions and reuse it across queries.

2017-02-22 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5290:
---

 Summary: Provide an option to build operator table once for 
built-in static functions and reuse it across queries.
 Key: DRILL-5290
 URL: https://issues.apache.org/jira/browse/DRILL-5290
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Padma Penumarthy
Assignee: Padma Penumarthy
 Fix For: 1.10


Currently, DrillOperatorTable, which contains standard SQL operators and 
functions as well as Drill User Defined Functions (UDFs) (built-in and 
dynamic), gets built for each query as part of creating the QueryContext. This 
is an expensive operation (~30 msec to build) and allocates ~2 MB on the heap 
for each query. For high-throughput, concurrent, low-latency operational 
queries, we quickly run out of heap memory, causing JVM hangs. Build the 
operator table once during startup for static built-in functions and save it 
in DrillbitContext, so we can reuse it across queries.
Provide a system/session option to disable dynamic UDFs so we can use the 
operator table saved in DrillbitContext and avoid rebuilding it for each 
query.
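The build-once-and-reuse pattern proposed here can be sketched as follows. This is a hypothetical cache, not Drill's actual DrillOperatorTable or DrillbitContext API; the class, method, and entry names are illustrative.

```java
// Hypothetical sketch of caching an expensive, effectively-immutable table
// at startup and sharing it across queries. Not Drill's actual API.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class OperatorTableCache {
  // Built once (the ~30 msec cost in the report) and shared read-only.
  private static volatile Map<String, String> staticTable;

  public static Map<String, String> get(boolean dynamicUdfsEnabled) {
    if (dynamicUdfsEnabled) {
      // Dynamic UDFs may change between queries, so rebuild per query.
      return build();
    }
    Map<String, String> t = staticTable;
    if (t == null) {
      synchronized (OperatorTableCache.class) {
        if (staticTable == null) {
          staticTable = build();   // first query (or startup) pays the cost
        }
        t = staticTable;
      }
    }
    return t;
  }

  private static Map<String, String> build() {
    Map<String, String> m = new ConcurrentHashMap<>();
    m.put("sum", "builtin");       // placeholder entries; the real table
    m.put("avg", "builtin");       // holds operator/function definitions
    return m;
  }
}
```

The `volatile` field plus double-checked locking gives safe one-time publication; with the option off, every query shares one instance instead of allocating a fresh table.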





[GitHub] drill pull request #747: DRILL-5257: Run-time control of query profiles

2017-02-22 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/747#discussion_r102541268
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java ---
@@ -178,6 +182,19 @@ public Foreman(final WorkerBee bee, final 
DrillbitContext drillbitContext,
 final QueryState initialState = queuingEnabled ? QueryState.ENQUEUED : 
QueryState.STARTING;
 recordNewState(initialState);
 enqueuedQueries.inc();
+
+profileOption = setProfileOption(queryContext.getOptions());
+  }
+
+  private ProfileOption setProfileOption(OptionManager options) {
+if (! options.getOption(ExecConstants.ENABLE_QUERY_PROFILE_VALIDATOR)) 
{
--- End diff --

I believe we do allow spacing around operators: a - b instead of a-b, etc.




[jira] [Created] (DRILL-5289) Drill should handle OOM due to insufficient heap type of errors more gracefully

2017-02-22 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-5289:


 Summary: Drill should handle OOM due to insufficient heap type of 
errors more gracefully
 Key: DRILL-5289
 URL: https://issues.apache.org/jira/browse/DRILL-5289
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow, Execution - RPC
Affects Versions: 1.10.0
Reporter: Rahul Challapalli


[Git Commit ID will be updated soon]

The below query which uses the managed sort causes an OOM error due to 
insufficient heap, which is a bug in itself. 
{code}
ALTER SESSION SET `exec.sort.disable_managed` = false;
+---+-+
|  ok   |   summary   |
+---+-+
| true  | exec.sort.disable_managed updated.  |
+---+-+
1 row selected (1.096 seconds)
0: jdbc:drill:zk=10.10.100.183:5181> alter session set 
`planner.memory.max_query_memory_per_node` = 14106127360;
+---++
|  ok   |  summary   |
+---++
| true  | planner.memory.max_query_memory_per_node updated.  |
+---++
1 row selected (0.253 seconds)
0: jdbc:drill:zk=10.10.100.183:5181> alter session set 
`planner.width.max_per_node` = 1;
+---+--+
|  ok   |   summary|
+---+--+
| true  | planner.width.max_per_node updated.  |
+---+--+
1 row selected (0.184 seconds)
0: jdbc:drill:zk=10.10.100.183:5181> select * from (select * from 
dfs.`/drill/testdata/resource-manager/250wide.tbl` order by columns[0])d where 
d.columns[0] = 'ljdfhwuehnoiueyf';
{code}
Once the OOM happens, chaos follows
{code}
1. Dangling fragments are left behind
2. Query fails but zookeeper thinks it's still running
3. Client connection timeouts
4. Profile page shows the same query as both running and failed.
{code}

We should be handling this situation more gracefully as this could be perceived 
as a drillbit stability issue. I attached the jstack. The logs and data set 
used are too big to upload here. Reach out to me if you need more information.





[GitHub] drill pull request #578: DRILL-4280: Kerberos Authentication

2017-02-22 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r102483298
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/AuthenticationOutcomeListener.java
 ---
@@ -0,0 +1,238 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.rpc.security;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.Maps;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.Internal.EnumLite;
+import com.google.protobuf.MessageLite;
+import io.netty.buffer.ByteBuf;
+import org.apache.drill.exec.proto.UserBitShared.SaslMessage;
+import org.apache.drill.exec.proto.UserBitShared.SaslStatus;
+import org.apache.drill.exec.rpc.BasicClient;
+import org.apache.drill.exec.rpc.ClientConnection;
+import org.apache.drill.exec.rpc.RpcException;
+import org.apache.drill.exec.rpc.RpcOutcomeListener;
+import org.apache.hadoop.security.UserGroupInformation;
+
+import javax.security.sasl.SaslClient;
+import javax.security.sasl.SaslException;
+import java.io.IOException;
+import java.lang.reflect.UndeclaredThrowableException;
+import java.security.PrivilegedExceptionAction;
+import java.util.EnumMap;
+import java.util.Map;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+
+public class AuthenticationOutcomeListener
+implements RpcOutcomeListener {
+  private static final org.slf4j.Logger logger =
+  
org.slf4j.LoggerFactory.getLogger(AuthenticationOutcomeListener.class);
+
+  private static final ImmutableMap 
CHALLENGE_PROCESSORS;
+  static {
+final Map map = new 
EnumMap<>(SaslStatus.class);
+map.put(SaslStatus.SASL_IN_PROGRESS, new SaslInProgressProcessor());
+map.put(SaslStatus.SASL_SUCCESS, new SaslSuccessProcessor());
+map.put(SaslStatus.SASL_FAILED, new SaslFailedProcessor());
+CHALLENGE_PROCESSORS = Maps.immutableEnumMap(map);
+  }
+
+  private final BasicClient 
client;
+  private final R connection;
+  private final T saslRpcType;
+  private final UserGroupInformation ugi;
+  private final RpcOutcomeListener rpcOutcomeListener;
+
+  public AuthenticationOutcomeListener(BasicClient client,
+   R connection, T saslRpcType, 
UserGroupInformation ugi,
+   RpcOutcomeListener 
rpcOutcomeListener) {
+this.client = client;
+this.connection = connection;
+this.saslRpcType = saslRpcType;
+this.ugi = ugi;
+this.rpcOutcomeListener = rpcOutcomeListener;
+  }
+
+  public void initiate(final String mechanismName) {
+logger.trace("Initiating SASL exchange.");
+try {
+  final ByteString responseData;
+  final SaslClient saslClient = connection.getSaslClient();
+  if (saslClient.hasInitialResponse()) {
+responseData = ByteString.copyFrom(evaluateChallenge(ugi, 
saslClient, new byte[0]));
+  } else {
+responseData = ByteString.EMPTY;
+  }
+  client.send(new AuthenticationOutcomeListener<>(client, connection, 
saslRpcType, ugi, rpcOutcomeListener),
+  connection,
+  saslRpcType,
+  SaslMessage.newBuilder()
+  .setMechanism(mechanismName)
+  .setStatus(SaslStatus.SASL_START)
+  .setData(responseData)
+  .build(),
+  SaslMessage.class,
+  true /** the connection will not be backed up at this point */);
+  logger.trace("Initiated SASL exchange.");
+} catch (final Exception e) {
+  rpcOutcomeListener.failed(RpcException.mapException(e));
+}
+  }
+
+  @Override
+  public void failed(RpcException 

[GitHub] drill pull request #578: DRILL-4280: Kerberos Authentication

2017-02-22 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r102483189
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/plain/PlainFactory.java
 ---
@@ -0,0 +1,166 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.rpc.security.plain;
+
+import org.apache.drill.common.config.DrillProperties;
+import org.apache.drill.exec.rpc.security.AuthenticatorFactory;
+import org.apache.drill.exec.rpc.security.FastSaslServerFactory;
+import org.apache.drill.exec.rpc.security.FastSaslClientFactory;
+import org.apache.drill.exec.rpc.user.security.UserAuthenticationException;
+import org.apache.drill.exec.rpc.user.security.UserAuthenticator;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.security.UserGroupInformation;
+
+import javax.security.auth.callback.Callback;
+import javax.security.auth.callback.CallbackHandler;
+import javax.security.auth.callback.NameCallback;
+import javax.security.auth.callback.PasswordCallback;
+import javax.security.auth.callback.UnsupportedCallbackException;
+import javax.security.auth.login.LoginException;
+import javax.security.sasl.AuthorizeCallback;
+import javax.security.sasl.SaslClient;
+import javax.security.sasl.SaslException;
+import javax.security.sasl.SaslServer;
+import java.io.IOException;
+import java.security.Security;
+import java.util.Map;
+
+public class PlainFactory implements AuthenticatorFactory {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(PlainFactory.class);
+
+  public static final String SIMPLE_NAME = PlainServer.MECHANISM_NAME;
+
+  static {
+Security.addProvider(new PlainServer.PlainServerProvider());
+  }
+
+  private final UserAuthenticator authenticator;
+
+  public PlainFactory() {
+this.authenticator = null;
+  }
+
+  public PlainFactory(final UserAuthenticator authenticator) {
+this.authenticator = authenticator;
+  }
+
+  @Override
+  public String getSimpleName() {
+return SIMPLE_NAME;
+  }
+
+  @Override
+  public UserGroupInformation createAndLoginUser(Map 
properties) throws IOException {
+final Configuration conf = new Configuration();
+UserGroupInformation.setConfiguration(conf);
+try {
+  return UserGroupInformation.getCurrentUser();
+} catch (final IOException e) {
+  logger.debug("Login failed.", e);
+  final Throwable cause = e.getCause();
+  if (cause instanceof LoginException) {
+throw new SaslException("Failed to login.", cause);
+  }
+  throw new SaslException("Unexpected failure trying to login. ", 
cause);
+}
+  }
+
+  @Override
+  public SaslServer createSaslServer(final UserGroupInformation ugi, final 
Map properties)
+  throws SaslException {
+return 
FastSaslServerFactory.getInstance().createSaslServer(SIMPLE_NAME, null /** 
protocol */,
+null /** serverName */, properties, new 
PlainServerCallbackHandler());
+  }
+
+  @Override
+  public SaslClient createSaslClient(final UserGroupInformation ugi, final 
Map properties)
+  throws SaslException {
+final String userName = (String) properties.get(DrillProperties.USER);
+final String password = (String) 
properties.get(DrillProperties.PASSWORD);
+
+return FastSaslClientFactory.getInstance().createSaslClient(new 
String[]{SIMPLE_NAME},
+null /** authorization ID */, null, null, properties, new 
CallbackHandler() {
+  @Override
+  public void handle(final Callback[] callbacks) throws 
IOException, UnsupportedCallbackException {
+for (final Callback callback : callbacks) {
+  if (callback instanceof NameCallback) {
+

[GitHub] drill pull request #578: DRILL-4280: Kerberos Authentication

2017-02-22 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r102475172
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/ControlConnectionConfig.java
 ---
@@ -0,0 +1,59 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.rpc.control;
+
+import org.apache.drill.exec.exception.DrillbitStartupException;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.rpc.BitConnectionConfig;
+import org.apache.drill.exec.server.BootStrapContext;
+import org.apache.drill.exec.work.batch.ControlMessageHandler;
+
+// package private
+class ControlConnectionConfig extends BitConnectionConfig {
+//  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(ControlConnectionConfig.class);
+
+  private final ControlMessageHandler handler;
+
+  private DrillbitEndpoint localEndpoint;
+
+  ControlConnectionConfig(BufferAllocator allocator, BootStrapContext 
context, ControlMessageHandler handler)
+  throws DrillbitStartupException {
+super(allocator, context);
+this.handler = handler;
+  }
+
+  @Override
+  public String getName() {
+return "control"; // unused
+  }
+
+  ControlMessageHandler getMessageHandler() {
+return handler;
+  }
+
+  void setLocalEndpoint(DrillbitEndpoint endpoint) {
--- End diff --

Previous approach was to directly inject the endpoint from the server into 
the registry I believe:

https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/ConnectionManagerRegistry.java#L66




[GitHub] drill pull request #578: DRILL-4280: Kerberos Authentication

2017-02-22 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r102473811
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserClient.java ---
@@ -88,22 +129,183 @@ public void submitQuery(UserResultsListener 
resultsListener, RunQuery query) {
 send(queryResultHandler.getWrappedListener(resultsListener), 
RpcType.RUN_QUERY, query, QueryId.class);
   }
 
-  public void connect(RpcConnectionHandler handler, 
DrillbitEndpoint endpoint,
-  UserProperties props, UserBitShared.UserCredentials 
credentials) {
+  public CheckedFuture connect(DrillbitEndpoint 
endpoint, DrillProperties parameters,
+   UserCredentials 
credentials) {
+final FutureHandler handler = new FutureHandler();
 UserToBitHandshake.Builder hsBuilder = UserToBitHandshake.newBuilder()
 .setRpcVersion(UserRpcConfig.RPC_VERSION)
 .setSupportListening(true)
 .setSupportComplexTypes(supportComplexTypes)
 .setSupportTimeout(true)
 .setCredentials(credentials)
-.setClientInfos(UserRpcUtils.getRpcEndpointInfos(clientName));
+.setClientInfos(UserRpcUtils.getRpcEndpointInfos(clientName))
+.setSaslSupport(SaslSupport.SASL_AUTH)
+.setProperties(parameters.serializeForServer());
+this.properties = parameters;
+
+
connectAsClient(queryResultHandler.getWrappedConnectionHandler(handler),
+hsBuilder.build(), endpoint.getAddress(), endpoint.getUserPort());
+return handler;
+  }
+
+  /**
+   * Check (after {@link #connect connecting}) if server requires 
authentication.
+   *
+   * @return true if server requires authentication
+   */
+  public boolean serverRequiresAuthentication() {
+return supportedAuthMechs != null;
+  }
+
+  /**
+   * Returns a list of supported authentication mechanism. If called 
before {@link #connect connecting},
+   * returns null. If called after {@link #connect connecting}, returns a 
list of supported mechanisms
+   * iff authentication is required.
+   *
+   * @return list of supported authentication mechanisms
+   */
+  public List getSupportedAuthenticationMechanisms() {
--- End diff --

Providing a callback for authentication seems a more robust approach 
compared to calling another method with a new set of properties...




[jira] [Created] (DRILL-5288) JSON data not read as string when store.json.all_text_mode=true

2017-02-22 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-5288:
-

 Summary: JSON data not read as string when 
store.json.all_text_mode=true
 Key: DRILL-5288
 URL: https://issues.apache.org/jira/browse/DRILL-5288
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Reporter: Khurram Faraaz


Setting all-text mode to true (at the system or session level) does not help in this 
case. Data from the JSON input file is not returned as strings.
Drill 1.10.0 git commit id: 300e9349

Data used in test
{noformat}
[root@centos-01 ~]# cat f2.json
{"key":"string", "key":123, "key":[1,2,3], "key":true, "key":false, "key":null, 
"key":{"key2":"b"}, "key":"2011-08-21"}
[root@centos-01 ~]#
{noformat}

{noformat}
0: jdbc:drill:schema=dfs.tmp> alter session set `store.json.all_text_mode`=true;
+---++
|  ok   |  summary   |
+---++
| true  | store.json.all_text_mode updated.  |
+---++
1 row selected (0.176 seconds)
0: jdbc:drill:schema=dfs.tmp> select key from `f2.json`;
Error: SYSTEM ERROR: IllegalStateException: You tried to start when you are 
using a ValueWriter of type NullableVarCharWriterImpl.

Fragment 0:0

[Error Id: 3cb5b806-53b6-49bf-ab5e-59c666259463 on centos-01.qa.lab:31010] 
(state=,code=0)
{noformat}

stack trace from drillbit.log

{noformat}
[Error Id: 3cb5b806-53b6-49bf-ab5e-59c666259463 on centos-01.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
IllegalStateException: You tried to start when you are using a ValueWriter of 
type NullableVarCharWriterImpl.

Fragment 0:0

[Error Id: 3cb5b806-53b6-49bf-ab5e-59c666259463 on centos-01.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
 ~[drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:293)
 [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
 [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:262)
 [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_91]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: java.lang.IllegalStateException: You tried to start when you are 
using a ValueWriter of type NullableVarCharWriterImpl.
at 
org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.startList(AbstractFieldWriter.java:108)
 ~[vector-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.impl.NullableVarCharWriterImpl.startList(NullableVarCharWriterImpl.java:88)
 ~[vector-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText(JsonReader.java:621)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataAllText(JsonReader.java:466)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataSwitch(JsonReader.java:319)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeToVector(JsonReader.java:262)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.write(JsonReader.java:217) 
~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.store.easy.json.JSONRecordReader.next(JSONRecordReader.java:206)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:179) 
~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at