[GitHub] drill issue #769: DRILL-5313: Fix compilation issue in C++ connector

2017-03-02 Thread sudheeshkatkam
Github user sudheeshkatkam commented on the issue:

https://github.com/apache/drill/pull/769
  
Built, and sanity tested using querySubmitter test application.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #769: DRILL-5313: Fix compilation issue in C++ connector

2017-03-02 Thread sudheeshkatkam
Github user sudheeshkatkam commented on the issue:

https://github.com/apache/drill/pull/769
  
+1




[GitHub] drill issue #770: DRILL-5311: Check handshake result in C++ connector

2017-03-02 Thread sudheeshkatkam
Github user sudheeshkatkam commented on the issue:

https://github.com/apache/drill/pull/770
  
+1




Re: Time for 1.10 release

2017-03-02 Thread Jinfeng Ni
It looks like the C++ client cannot be built successfully [1] after
the DRILL-5301 / DRILL-5167 changes.

This seems to be a blocking issue for 1.10.0, and I'll merge the patch
once it's verified/reviewed.

1. https://issues.apache.org/jira/browse/DRILL-5313

On Thu, Mar 2, 2017 at 5:37 PM, Jinfeng Ni  wrote:
> I missed 5208, because it did not show up in Paul's list when he replied to
> this thread.
>
> On Thu, Mar 2, 2017 at 2:58 PM Zelaine Fong  wrote:
>>
>> Jinfeng,
>>
>> I notice the following Jira has the ready-to-commit label but isn’t on
>> your list:
>>
>> DRILL-5208
>>
>> Was this one overlooked?
>>
>> -- Zelaine
>>
>> On 3/2/17, 1:04 PM, "Jinfeng Ni"  wrote:
>>
>> The following PRs have been merged to Apache master.
>>
>> DRILL-4994
>> DRILL-4730
>> DRILL-5301
>> DRILL-5167
>> DRILL-5221
>> DRILL-5258
>> DRILL-5034
>> DRILL-4963
>> DRILL-5252
>> DRILL-5266
>> DRILL-5284
>> DRILL-5304
>> DRILL-5290
>> DRILL-5287
>>
>> QA folks will run tests. If no issue found, I'll build a RC0 candidate
>> for 1.10 and start the vote.
>>
>> Thanks,
>>
>> Jinfeng
>>
>>
>>
>> On Thu, Mar 2, 2017 at 8:30 AM, Jinfeng Ni  wrote:
>> > I'm building a merge branch, and hopefully push to master branch
>> today
>> > if things go smoothly.
>> >
>> >
>> > On Wed, Mar 1, 2017 at 7:13 PM, Padma Penumarthy
>>  wrote:
>> >> Hi Jinfeng,
>> >>
>> >> Please include DRILL-5287, DRILL-5290 and DRILL-5304.
>> >>
>> >> Thanks,
>> >> Padma
>> >>
>> >>
>> >>> On Feb 22, 2017, at 11:16 PM, Jinfeng Ni  wrote:
>> >>>
>> >>> Hi Drillers,
>> >>>
>> >>> It has been almost 3 months since we release Drill 1.9. We have
>> >>> resolved plenty of fixes and improvements (closed around 88 JIRAs
>> >>> [1]). I propose that we start the 1.10 release process, and set
>> >>> Wednesday 3/1 as the cutoff day for code checkin. After 3/1, we
>> should
>> >>> start build a release candidate.
>> >>>
>> >>> Please reply in this email thread if you have something near
>> complete
>> >>> and you would like to include in 1.10 release.
>> >>>
>> >>> I volunteer as the release manager, unless someone else come
>> forward.
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Jinfeng
>> >>>
>> >>> [1]
>> https://issues.apache.org/jira/browse/DRILL/fixforversion/12338769
>> >>
>>
>>
>


[GitHub] drill pull request #770: DRILL-5311: Check handshake result in C++ connector

2017-03-02 Thread laurentgo
GitHub user laurentgo opened a pull request:

https://github.com/apache/drill/pull/770

DRILL-5311: Check handshake result in C++ connector

In the C++ client connector, DrillClientImpl::recvHandshake always
returns success, even on a connection error (such as a TCP
timeout). The error code is checked only on the WIN32 platform.

Remove the restriction to only check on WIN32, plus add some logging.
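
As a rough illustration of the change described above, the following is a minimal sketch, not the actual patch. `HandshakeStatus` and `checkHandshakeResult` are hypothetical names introduced here, and `std::error_code` stands in for whatever error type the connector's I/O layer reports. The point is only that the error code is inspected unconditionally, with no platform guard, and that the outcome is logged.

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Hypothetical status type for this sketch; the real connector defines its own.
enum class HandshakeStatus { Success, Failure };

// Inspect the I/O error code on every platform (no #ifdef WIN32 guard)
// and record a log line describing the outcome.
HandshakeStatus checkHandshakeResult(const std::error_code& ec, std::string* logLine) {
    assert(logLine != nullptr);  // the caller must supply a place for the log text
    if (ec) {
        *logLine = "Handshake failed: " + ec.message();
        return HandshakeStatus::Failure;
    }
    *logLine = "Handshake succeeded";
    return HandshakeStatus::Success;
}
```

With a check of this shape, a timed-out connection would surface as a failure on Linux and macOS as well, instead of being reported as a successful handshake.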

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/laurentgo/drill laurent/DRILL-5311

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/770.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #770


commit 23b7ca2fc082608b4d05faf8986644dc952e5830
Author: Laurent Goujon 
Date:   2017-03-03T04:38:05Z

DRILL-5311: Check handshake result in C++ connector

In C++ client connector, DrillClientImpl::recvHandshake always
return success, even in case of connection error (like a tcp
timeout issue). Only on WIN32 platform would the error code be
checked.

Remove the restriction to only check on WIN32, plus add some logging.






[GitHub] drill issue #768: DRILL-5313: Fix build failure in C++ client

2017-03-02 Thread sohami
Github user sohami commented on the issue:

https://github.com/apache/drill/pull/768
  
Thanks for the actual fix. I will close this pull request.




[GitHub] drill pull request #769: DRILL-5313: Fix compilation issue in C++ connector

2017-03-02 Thread laurentgo
GitHub user laurentgo opened a pull request:

https://github.com/apache/drill/pull/769

DRILL-5313: Fix compilation issue in C++ connector

DRILL-5301 and DRILL-5167 have conflicting changes, which prevent
the C++ connector from compiling: the static symbol for the search
escape string was removed because the server might use a different one.

Fix the issue by using the current search escape string (injected from the
meta to the internal drill client when querying metadata).
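
The idea of injecting the escape string can be sketched as follows. This is an illustration under stated assumptions, not the code in the pull request: `LikeFilter` here is a plain struct standing in for the generated `exec::user::LikeFilter` message, and `MetadataQueryContext` is a hypothetical class that carries the escape string obtained from the server metadata.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Simplified stand-in for the generated exec::user::LikeFilter message.
struct LikeFilter {
    std::string pattern;
    std::string escape;
};

// Hypothetical holder for the escape string obtained from the server metadata.
class MetadataQueryContext {
public:
    explicit MetadataQueryContext(std::string searchEscape)
        : m_searchEscape(std::move(searchEscape)) {
        // Sketch assumption: the server always reports a non-empty escape string.
        assert(!m_searchEscape.empty());
    }

    // Build a filter with the injected escape string rather than a static default.
    LikeFilter makeLikeFilter(const std::string& pattern) const {
        LikeFilter filter;
        filter.pattern = pattern;
        filter.escape = m_searchEscape;
        return filter;
    }

private:
    std::string m_searchEscape;
};
```

Because the escape string travels with the object that issues the metadata query, no static symbol is needed, and a server that reports a different escape character is handled without further changes.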

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/laurentgo/drill laurent/DRILL-5313

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/769.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #769


commit 795bdf53070ba2d20d65f23f0313b2777e334611
Author: Laurent Goujon 
Date:   2017-03-03T04:03:42Z

DRILL-5313: Fix compilation issue in C++ connector

DRILL-5301 and DRILL-5167 have conflicting changes, which causes
the C++ connector to not compile: the static symbol for the search
escape string has been removed as the server might use a different one.

Fix the issue by using the current search escape string (injected from the
meta to the internal drill client when querying metadata).






[GitHub] drill pull request #768: DRILL-5313: Fix build failure in C++ client

2017-03-02 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/768#discussion_r104085400
  
--- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp ---
@@ -662,8 +662,9 @@ DrillClientQueryResult* DrillClientImpl::ExecuteQuery(const PreparedStatement& p
 }
 
 static void updateLikeFilter(exec::user::LikeFilter& likeFilter, const std::string& pattern) {
-   likeFilter.set_pattern(pattern);
-   likeFilter.set_escape(meta::DrillMetadata::s_searchEscapeString);
+likeFilter.set_pattern(pattern);
+exec::user::ServerMeta srvrMetaData = meta::DrillMetadata::s_defaultServerMeta;
--- End diff --

It's a workaround, but it should use the actual escape character, not the
default one.




[GitHub] drill pull request #768: DRILL-5313: Fix build failure in C++ client

2017-03-02 Thread sohami
GitHub user sohami opened a pull request:

https://github.com/apache/drill/pull/768

DRILL-5313: Fix build failure in C++ client



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sohami/drill DRILL-5313

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/768.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #768


commit 40e81345d6406cc1ceb38dd1b036332726a40b1a
Author: Sorabh Hamirwasia 
Date:   2017-03-03T03:42:32Z

DRILL-5313: Fix build failure in C++ client






[jira] [Created] (DRILL-5313) C++ client build failure on linux

2017-03-02 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-5313:


 Summary: C++ client build failure on linux
 Key: DRILL-5313
 URL: https://issues.apache.org/jira/browse/DRILL-5313
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - C++
Affects Versions: 1.10
Reporter: Sorabh Hamirwasia
Assignee: Laurent Goujon


We are seeing the errors below while building the Drill C++ client on Linux:

[root@qa-node161 build]# make
[  6%] Built target y2038
[ 38%] Built target protomsgs
[ 41%] Building CXX object src/clientlib/CMakeFiles/drillClient.dir/drillClientImpl.cpp.o
/root/drill/drill/contrib/native/client/src/clientlib/drillClientImpl.cpp: In function ‘void Drill::updateLikeFilter(exec::user::LikeFilter&, const std::string&)’:
/root/drill/drill/contrib/native/client/src/clientlib/drillClientImpl.cpp:782: error: ‘s_searchEscapeString’ is not a member of ‘Drill::meta::DrillMetadata’
make[2]: *** [src/clientlib/CMakeFiles/drillClient.dir/drillClientImpl.cpp.o] Error 1
make[1]: *** [src/clientlib/CMakeFiles/drillClient.dir/all] Error 2
make: *** [all] Error 2

It appears to be related to one of Laurent's pull requests:
https://github.com/apache/drill/pull/712



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Time for 1.10 release

2017-03-02 Thread Jinfeng Ni
I missed 5208, because it did not show up in Paul's list when he replied to
this thread.

On Thu, Mar 2, 2017 at 2:58 PM Zelaine Fong  wrote:

> Jinfeng,
>
> I notice the following Jira has the ready-to-commit label but isn’t on
> your list:
>
> DRILL-5208
>
> Was this one overlooked?
>
> -- Zelaine
>
> On 3/2/17, 1:04 PM, "Jinfeng Ni"  wrote:
>
> The following PRs have been merged to Apache master.
>
> DRILL-4994
> DRILL-4730
> DRILL-5301
> DRILL-5167
> DRILL-5221
> DRILL-5258
> DRILL-5034
> DRILL-4963
> DRILL-5252
> DRILL-5266
> DRILL-5284
> DRILL-5304
> DRILL-5290
> DRILL-5287
>
> QA folks will run tests. If no issue found, I'll build a RC0 candidate
> for 1.10 and start the vote.
>
> Thanks,
>
> Jinfeng
>
>
>
> On Thu, Mar 2, 2017 at 8:30 AM, Jinfeng Ni  wrote:
> > I'm building a merge branch, and hopefully push to master branch
> today
> > if things go smoothly.
> >
> >
> > On Wed, Mar 1, 2017 at 7:13 PM, Padma Penumarthy <
> ppenumar...@mapr.com> wrote:
> >> Hi Jinfeng,
> >>
> >> Please include DRILL-5287, DRILL-5290 and DRILL-5304.
> >>
> >> Thanks,
> >> Padma
> >>
> >>
> >>> On Feb 22, 2017, at 11:16 PM, Jinfeng Ni  wrote:
> >>>
> >>> Hi Drillers,
> >>>
> >>> It has been almost 3 months since we release Drill 1.9. We have
> >>> resolved plenty of fixes and improvements (closed around 88 JIRAs
> >>> [1]). I propose that we start the 1.10 release process, and set
> >>> Wednesday 3/1 as the cutoff day for code checkin. After 3/1, we
> should
> >>> start build a release candidate.
> >>>
> >>> Please reply in this email thread if you have something near
> complete
> >>> and you would like to include in 1.10 release.
> >>>
> >>> I volunteer as the release manager, unless someone else come
> forward.
> >>>
> >>> Thanks,
> >>>
> >>> Jinfeng
> >>>
> >>> [1]
> https://issues.apache.org/jira/browse/DRILL/fixforversion/12338769
> >>
>
>
>


[GitHub] drill pull request #767: DRILL-5226: Managed external sort fixes

2017-03-02 Thread paul-rogers
GitHub user paul-rogers opened a pull request:

https://github.com/apache/drill/pull/767

DRILL-5226: Managed external sort fixes

* Memory leak in managed sort if OOM during sv2 allocation
* "Record batch sizer" does not include overhead for variable-sized
vectors
* Paranoid double-checking of merge batch sizes to prevent OOM when the
sizes differ from expectations
* Revised logging

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/paul-rogers/drill DRILL-5226

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/767.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #767


commit 57d99bdfdc5c7150d64d107b549dfb808f1c92a4
Author: Paul Rogers 
Date:   2017-03-03T00:09:01Z

DRILL-5226: Managed external sort fixes

* Memory leak in managed sort if OOM during sv2 allocation
* "Record batch sizer" does not include overhead for variable-sized
vectors
* Paranoid double-checking of merge batch sizes to prevent OOM when the
sizes differ from expectations
* Revised logging






[jira] [Resolved] (DRILL-4301) OOM : Unable to allocate sv2 for 1000 records, and not enough batchGroups to spill.

2017-03-02 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-4301.

   Resolution: Fixed
Fix Version/s: 1.10.0

> OOM : Unable to allocate sv2 for 1000 records, and not enough batchGroups to 
> spill.
> ---
>
> Key: DRILL-4301
> URL: https://issues.apache.org/jira/browse/DRILL-4301
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Execution - Flow
> Affects Versions: 1.5.0
> Environment: 4 node cluster
> Reporter: Khurram Faraaz
> Assignee: Paul Rogers
> Fix For: 1.10.0
>
>
> The query below, from the Functional tests, fails due to OOM:
> {code}
> select * from dfs.`/drill/testdata/metadata_caching/fewtypes_boolpartition` 
> where bool_col = true;
> {code}
> Drill version : drill-1.5.0
> JAVA_VERSION=1.8.0
> {noformat}
> version: 1.5.0-SNAPSHOT
> commit_id: 2f0e3f27e630d5ac15cdaef808564e01708c3c55
> commit_message: DRILL-4190 Don't hold on to batches from left side of merge join.
> commit_time: 20.01.2016 @ 22:30:26 UTC
> build_email: Unknown
> build_time: 20.01.2016 @ 23:48:33 UTC
> framework/framework/resources/Functional/metadata_caching/data/bool_partition1.q
>  (connection: 808078113)
> [#1378] Query failed: 
> oadd.org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: 
> One or more nodes ran out of memory while executing the query.
> Unable to allocate sv2 for 1000 records, and not enough batchGroups to spill.
> batchGroups.size 0
> spilledBatchGroups.size 0
> allocated memory 48326272
> allocator limit 46684427
> Fragment 0:0
> [Error Id: 97d58ea3-8aff-48cf-a25e-32363b8e0ecd on drill-demod2:31010]
>   at 
> oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119)
>   at 
> oadd.org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113)
>   at 
> oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46)
>   at 
> oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31)
>   at oadd.org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374)
>   at 
> oadd.org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252)
>   at 
> oadd.org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257)
>   at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>   at 
> oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> 

[jira] [Resolved] (DRILL-5294) Managed External Sort throws an OOM during the merge and spill phase

2017-03-02 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5294.

Resolution: Fixed

> Managed External Sort throws an OOM during the merge and spill phase
> 
>
> Key: DRILL-5294
> URL: https://issues.apache.org/jira/browse/DRILL-5294
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Execution - Relational Operators
> Reporter: Rahul Challapalli
> Assignee: Paul Rogers
> Fix For: 1.10.0
>
> Attachments: 2751ce6d-67e6-ae08-3b68-e33b29f9d2a3.sys.drill, 
> drillbit.log, drillbit_scenario2.log, drillbit_scenario3.log, 
> scenario2_profile.sys.drill, scenario3_profile.sys.drill
>
>
> commit # : 38f816a45924654efd085bf7f1da7d97a4a51e38
> The query below fails with the managed sort but succeeds with the old sort:
> {code}
> select * from (select columns[433] col433, columns[0], 
> columns[1],columns[2],columns[3],columns[4],columns[5],columns[6],columns[7],columns[8],columns[9],columns[10],columns[11]
>  from dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by 
> columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50])
>  d where d.col433 = 'sjka skjf';
> Error: RESOURCE ERROR: External Sort encountered an error while spilling to 
> disk
> Fragment 1:11
> [Error Id: 0aa20284-cfcc-450f-89b3-645c280f33a4 on qa-node190.qa.lab:31010] 
> (state=,code=0)
> {code}
> Env : 
> {code}
> No of Drillbits : 1
> DRILL_MAX_DIRECT_MEMORY="32G"
> DRILL_MAX_HEAP="4G"
> {code}
> The logs and profile are attached. The data is too large to attach to a JIRA.





[jira] [Resolved] (DRILL-5210) External Sort BatchGroup leaks memory if an OOM occurs during read

2017-03-02 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5210.

   Resolution: Fixed
Fix Version/s: 1.10.0

> External Sort BatchGroup leaks memory if an OOM occurs during read
> --
>
> Key: DRILL-5210
> URL: https://issues.apache.org/jira/browse/DRILL-5210
> Project: Apache Drill
> Issue Type: Sub-task
> Affects Versions: 1.9.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.10.0
>
>
> The External Sort operator (batch) can spill to disk when it runs out of 
> memory. To do so, it uses a class called {{BatchGroup}}. Later, when the sort 
> merges spilled runs, {{BatchGroup}} reads the run back into memory one batch 
> at a time.
> If an OOM error occurs during the read operation, the partially-read batches 
> leak: they are not released. The fragment executor then issues a memory leak 
> error while shutting down the query.
> This error has probably not been caught until now because the {{BatchGroup}} 
> code does not make use of the fault injector. Elsewhere in the external sort, 
> we use the fault injector to insert a (simulated) OOM exception so we can 
> determine if clean-up occurs properly. No such fault injection code exists in 
> {{BatchGroup}}.





[jira] [Resolved] (DRILL-5262) NPE in managed external sort while spilling to disk

2017-03-02 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5262.

Resolution: Fixed

> NPE in managed external sort while spilling to disk
> ---
>
> Key: DRILL-5262
> URL: https://issues.apache.org/jira/browse/DRILL-5262
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Execution - Relational Operators
> Reporter: Rahul Challapalli
> Assignee: Paul Rogers
> Priority: Critical
> Fix For: 1.10.0
>
> Attachments: 275da989-3005-1c5f-a40c-2415e6d4e89f.sys.drill, 
> drillbit.log, drill-env.sh, drill-override.conf
>
>
> git.commit.id.abbrev=300e934
> The Parquet data set used in the query below contains 1000 files that each
> contain a single row with one integer column, plus 1 large file of ~37 MB.
> The query fails during spilling:
> {code}
> alter session set `planner.memory.max_query_memory_per_node` = 37127360;
> alter session set `planner.width.max_per_node` = 1;
> select count(*) from (select * from small_large_parquet order by col1 desc) 
> d; 
> Error: RESOURCE ERROR: External Sort encountered an error while spilling to 
> disk
> Fragment 2:0
> [Error Id: 50859d9e-373c-4a97-b270-09f1aae74e3b on qa-node183.qa.lab:31010] 
> (state=,code=0)
> {code}
> Exception from the logs
> {code}
> 2017-02-13 17:01:06,430 [275da989-3005-1c5f-a40c-2415e6d4e89f:frag:2:0] INFO  
> o.a.d.e.p.i.x.m.ExternalSortBatch - User Error Occurred: External Sort 
> encountered an error while spilling to disk (null)
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: External 
> Sort encountered an error while spilling to disk
> [Error Id: 50859d9e-373c-4a97-b270-09f1aae74e3b ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
>  ~[drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1336)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:1266)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.consolidateBatches(ExternalSortBatch.java:1221)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeSpilledRuns(ExternalSortBatch.java:1122)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:626)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext(ExternalSortBatch.java:506)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) 
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:92)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) 
> 

[jira] [Resolved] (DRILL-5062) External sort refers to the deprecated HDFS fs.default.name param

2017-03-02 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5062.

Resolution: Fixed

> External sort refers to the deprecated HDFS fs.default.name param
> -
>
> Key: DRILL-5062
> URL: https://issues.apache.org/jira/browse/DRILL-5062
> Project: Apache Drill
> Issue Type: Sub-task
> Affects Versions: 1.8.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> Running a query that uses the external sort produces the following message in 
> the log file:
> {code}
> [org.apache.hadoop.conf.Configuration.deprecation] - fs.default.name is 
> deprecated. Instead, use fs.defaultFS
> {code}
> External sort has the following line in the {{ExternalSortBatch}} constructor:
> {code}
> conf.set("fs.default.name", 
> config.getString(ExecConstants.EXTERNAL_SORT_SPILL_FILESYSTEM));
> {code}
> The {{TestParquetScan}} class has the same deprecated parameter name.
> Looking elsewhere in Drill, it appears that the proper way to configure HDFS 
> is as follows:
> {code}
> fsConf.set(FileSystem.FS_DEFAULT_NAME_KEY, ...
> {code}
> That is, use the constant provided by HDFS.





[jira] [Resolved] (DRILL-5264) Managed External Sort fails with OOM

2017-03-02 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5264.

Resolution: Fixed

> Managed External Sort fails with OOM
> 
>
> Key: DRILL-5264
> URL: https://issues.apache.org/jira/browse/DRILL-5264
> Project: Apache Drill
> Issue Type: Sub-task
> Reporter: Rahul Challapalli
> Assignee: Paul Rogers
> Fix For: 1.10.0
>
> Attachments: 275c7003-0e06-42b8-6874-db28bca06d14.sys.drill, 
> drillbit.log, drill-env.sh, drill-override.conf
>
>
> git.commit.id.abbrev=300e934
> The query below fails with an OOM:
> {code}
> alter session set `planner.disable_exchanges` = true;
> alter session set `planner.memory.max_query_memory_per_node` = 104857600;
> alter session set `planner.width.max_per_node` = 1;
> select * from dfs.`/drill/testdata/md1362` order by c_email_address;
> Error: RESOURCE ERROR: One or more nodes ran out of memory while executing 
> the query.
> Unable to allocate buffer of size 1048576 due to memory limit. Current 
> allocation: 103972896
> Fragment 0:0
> [Error Id: ba3d1ea7-9bf6-498d-a62a-2ca4e742beea on qa-node190.qa.lab:31010] 
> (state=,code=0)
> {code}
> Exception from the logs
> {code}
> 2017-02-14 15:24:17,911 [275c7003-0e06-42b8-6874-db28bca06d14:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred: One or more nodes 
> ran out of memory while executing the query. (Unable to allocate buffer of 
> size 1048576 due to memory limit. Current allocation: 103972896)
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more 
> nodes ran out of memory while executing the query.
> Unable to allocate buffer of size 1048576 due to memory limit. Current 
> allocation: 103972896
> [Error Id: ba3d1ea7-9bf6-498d-a62a-2ca4e742beea ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
>  ~[drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:242)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_111]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to 
> allocate buffer of size 1048576 due to memory limit. Current allocation: 
> 103972896
> at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:217) 
> ~[drill-memory-base-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:192) 
> ~[drill-memory-base-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.VarCharVector.reAlloc(VarCharVector.java:401) 
> ~[vector-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.VarCharVector.copyFromSafe(VarCharVector.java:278)
>  ~[vector-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.NullableVarCharVector.copyFromSafe(NullableVarCharVector.java:355)
>  ~[vector-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.test.generated.PriorityQueueCopierGen4.doCopy0$(PriorityQueueCopierTemplate.java:290)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.PriorityQueueCopierGen4.doCopy(PriorityQueueCopierTemplate.java:280)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.PriorityQueueCopierGen4.next(PriorityQueueCopierTemplate.java:76)
>  ~[na:na]
> at 
> org.apache.drill.exec.physical.impl.xsort.managed.CopierHolder$BatchMerger.next(CopierHolder.java:232)
>  ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeSpilledRuns(ExternalSortBatch.java:1140)
>  ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:626)
>  ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext(ExternalSortBatch.java:506)
>  ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
>  ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> 

[jira] [Resolved] (DRILL-5017) Config param drill.exec.sort.external.batch.size is not used

2017-03-02 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5017.

Resolution: Fixed

> Config param drill.exec.sort.external.batch.size is not used
> 
>
> Key: DRILL-5017
> URL: https://issues.apache.org/jira/browse/DRILL-5017
> Project: Apache Drill
>  Issue Type: Sub-task
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The Drill config file defines the {{drill.exec.sort.external.batch.size}} 
> parameter, as does {{ExecConstants}}:
> {code}
>   String EXTERNAL_SORT_TARGET_BATCH_SIZE = 
> "drill.exec.sort.external.batch.size";
> {code}
> However, this parameter is never used. It seems to be a duplicate of:
> {code}
>   String EXTERNAL_SORT_TARGET_SPILL_BATCH_SIZE = 
> "drill.exec.sort.external.spill.batch.size";
> {code}
> Which, itself, is never used.
> Remove these parameters from {{ExecConstants}}, {{drill-module.conf}}, 
> {{drill-override-example.conf}} (if they appear in those files) and from the 
> documentation (if they appear in the docs.)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (DRILL-5285) Provide detailed, accurate estimate of size consumed by a record batch

2017-03-02 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5285.

Resolution: Fixed

> Provide detailed, accurate estimate of size consumed by a record batch
> --
>
> Key: DRILL-5285
> URL: https://issues.apache.org/jira/browse/DRILL-5285
> Project: Apache Drill
>  Issue Type: Sub-task
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.10.0
>
>
> DRILL-5080 introduced a {{RecordBatchSizer}} that estimates the space taken 
> by a record batch and determines batch "density."
> Drill provides a large variety of vectors, each with their own internal 
> structure and collections of vectors. For example, fixed vectors use just a 
> data vector. Nullable vectors add an "is set" vector. Variable length vectors 
> add an offset vector. Repeated vectors add a second offset vector.
> The original {{RecordBatchSizer}} attempted to compute sizes for all these 
> vector types. But, the complexity got to be out of hand. This ticket requests 
> to simply bite the bullet and move the calculations into each vector type so 
> that the {{RecordBatchSizer}} can simply use the results of the calculations.
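
To make the per-vector accounting above concrete, here is a minimal sketch of how each layout adds its own bookkeeping vectors on top of the data vector. The class `VectorSizeSketch`, the `Layout` enum, and the assumed widths (4 bytes per offset entry, 1 byte per "is set" entry) are illustrative only and are not taken from the actual `RecordBatchSizer` code.

```java
/** Hypothetical sketch: estimates the bytes used by one column vector in a batch. */
final class VectorSizeSketch {
    static final int OFFSET_WIDTH = 4; // assumed width of one offset entry
    static final int IS_SET_WIDTH = 1; // assumed width of one "is set" entry

    enum Layout { FIXED, NULLABLE_FIXED, VARIABLE, REPEATED_VARIABLE }

    /**
     * @param dataBytes    bytes of actual values held in the data vector
     * @param rowCount     number of rows in the batch
     * @param elementCount total elements across all rows (repeated vectors only)
     */
    static long estimate(Layout layout, long dataBytes, int rowCount, int elementCount) {
        switch (layout) {
            case FIXED:
                // Just the data vector.
                return dataBytes;
            case NULLABLE_FIXED:
                // Data vector plus one "is set" entry per row.
                return dataBytes + (long) rowCount * IS_SET_WIDTH;
            case VARIABLE:
                // Data vector plus one offset per row and a terminating entry.
                return dataBytes + (long) (rowCount + 1) * OFFSET_WIDTH;
            case REPEATED_VARIABLE:
                // Per-row offsets into the elements, plus per-element offsets into the data.
                return dataBytes
                    + (long) (rowCount + 1) * OFFSET_WIDTH
                    + (long) (elementCount + 1) * OFFSET_WIDTH;
            default:
                throw new IllegalArgumentException("unknown layout: " + layout);
        }
    }
}
```

Keeping the arithmetic next to each layout is the point of the ticket: a central sizer no longer needs to know how many helper vectors each type carries.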





[jira] [Resolved] (DRILL-5267) Managed external sort spills too often with Parquet data

2017-03-02 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5267.

Resolution: Fixed
Fix Version/s: 1.10.0 (was: 1.10)

> Managed external sort spills too often with Parquet data
> 
>
> Key: DRILL-5267
> URL: https://issues.apache.org/jira/browse/DRILL-5267
> Project: Apache Drill
>  Issue Type: Sub-task
>Affects Versions: 1.10
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.10.0
>
>
> DRILL-5266 describes how Parquet produces low-density record batches. The 
> result of these batches is that the external sort spills more frequently than 
> it should because it sizes spill files based on batch size, not data content 
> of the batch. Since Parquet batches are 95% empty space, the spill files end 
> up far too small.
> Adjust the spill calculations based on actual data content, not the size of 
> the overall record batch.
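
A small worked example of why the choice of measure matters. The sketch below assumes a hypothetical 16 MiB allocated batch and a 256 MiB target spill file; only the 5% density comes from the description above (batches that are 95% empty). The class and method names are made up for illustration.

```java
/** Hypothetical sketch: how many batches go into one spill file under two size measures. */
final class SpillSizingSketch {

    /** Actual data bytes in a batch, given its allocated size and its density in percent. */
    static long dataBytes(long allocatedBytes, int densityPercent) {
        return allocatedBytes * densityPercent / 100;
    }

    /** Number of batches needed to reach targetBytes when each batch counts as bytesPerBatch. */
    static int batchesPerSpill(long targetBytes, long bytesPerBatch) {
        if (bytesPerBatch <= 0) {
            throw new IllegalArgumentException("bytesPerBatch must be positive");
        }
        return (int) Math.max(1, targetBytes / bytesPerBatch);
    }

    public static void main(String[] args) {
        long allocated = 16L * 1024 * 1024;   // assumed allocated batch size
        long target = 256L * 1024 * 1024;     // assumed target spill file size
        long data = dataBytes(allocated, 5);  // 5% density, as in the ticket

        System.out.println("by allocated size: " + batchesPerSpill(target, allocated));
        System.out.println("by data content:   " + batchesPerSpill(target, data));
    }
}
```

Under these assumptions, counting allocated size closes the spill file after 16 batches, while counting data content allows 320, which is the 20x difference implied by 5% density.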





[jira] [Created] (DRILL-5312) "Record batch sizer" does not include overhead for variable-sized vectors

2017-03-02 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5312:
--

 Summary: "Record batch sizer" does not include overhead for 
variable-sized vectors
 Key: DRILL-5312
 URL: https://issues.apache.org/jira/browse/DRILL-5312
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: 1.10.0


The new "record batch sizer" computes the actual data size of a record given a 
batch of vectors. For most purposes, the record width must include the overhead 
of the offset vectors for variable-sized vectors. The initial code drop 
included only the character data, but not the offset vector size when computing 
row width.





Re: Time for 1.10 release

2017-03-02 Thread Zelaine Fong
Jinfeng,

I notice the following Jira has the ready-to-commit label but isn’t on your 
list:

DRILL-5208

Was this one overlooked?

-- Zelaine 

On 3/2/17, 1:04 PM, "Jinfeng Ni"  wrote:

The following PRs have been merged to Apache master.

DRILL-4994
DRILL-4730
DRILL-5301
DRILL-5167
DRILL-5221
DRILL-5258
DRILL-5034
DRILL-4963
DRILL-5252
DRILL-5266
DRILL-5284
DRILL-5304
DRILL-5290
DRILL-5287

QA folks will run tests. If no issues are found, I'll build an RC0 candidate
for 1.10 and start the vote.

Thanks,

Jinfeng



On Thu, Mar 2, 2017 at 8:30 AM, Jinfeng Ni  wrote:
> I'm building a merge branch, and hopefully push to master branch today
> if things go smoothly.
>
>
> On Wed, Mar 1, 2017 at 7:13 PM, Padma Penumarthy wrote:
>> Hi Jinfeng,
>>
>> Please include DRILL-5287, DRILL-5290 and DRILL-5304.
>>
>> Thanks,
>> Padma
>>
>>
>>> On Feb 22, 2017, at 11:16 PM, Jinfeng Ni  wrote:
>>>
>>> Hi Drillers,
>>>
>>> It has been almost 3 months since we released Drill 1.9. We have
>>> resolved plenty of fixes and improvements (closed around 88 JIRAs
>>> [1]). I propose that we start the 1.10 release process, and set
>>> Wednesday 3/1 as the cutoff day for code checkin. After 3/1, we should
>>> start building a release candidate.
>>>
>>> Please reply in this email thread if you have something near complete
>>> that you would like to include in the 1.10 release.
>>>
>>> I volunteer as the release manager, unless someone else comes forward.
>>>
>>> Thanks,
>>>
>>> Jinfeng
>>>
>>> [1] https://issues.apache.org/jira/browse/DRILL/fixforversion/12338769
>>




[GitHub] drill issue #456: DRILL-4566: TDigest, median, and quantile functions

2017-03-02 Thread StevenMPhillips
Github user StevenMPhillips commented on the issue:

https://github.com/apache/drill/pull/456
  
There was a brief discussion on the drill-dev mailing list a few days after 
this PR was posted. Unfortunately, the discussion did not culminate in any 
decision. The discussion was mainly about what syntax we should use for these 
functions, since they are actually approximate functions.

If you want to revive the discussion, or propose a resolution, feel free to 
pick up this PR and make sure it can rebase on latest drill master. I probably 
won't be getting to it for at least a few weeks. But if someone else takes it 
up and gets it into a mergeable state, and others in the community are in 
agreement, I think we can merge it.




[jira] [Resolved] (DRILL-5290) Provide an option to build operator table once for built-in static functions and reuse it across queries.

2017-03-02 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong resolved DRILL-5290.
-
Resolution: Fixed

> Provide an option to build operator table once for built-in static functions 
> and reuse it across queries.
> -
>
> Key: DRILL-5290
> URL: https://issues.apache.org/jira/browse/DRILL-5290
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.9.0
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.10.0
>
>
> Currently, DrillOperatorTable, which contains standard SQL operators and 
> functions and Drill User Defined Functions (UDFs) (built-in and dynamic), gets 
> built for each query as part of creating QueryContext. This is an expensive 
> operation (~30 msec to build) and allocates ~2M on heap for each query. For 
> high-throughput, low-latency operational queries, we quickly run out of heap 
> memory, causing JVM hangs. Build the operator table once during startup for 
> static built-in functions and save it in DrillbitContext, so we can reuse it 
> across queries.
> Provide a system/session option to not use dynamic UDFs so we can use the 
> operator table saved in DrillbitContext and avoid building it each time.
> *Please note: these changes add a new option, exec.udf.use_dynamic, which needs 
> to be documented.*
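
The build-once idea can be sketched as follows. `OperatorTableCacheSketch` and its nested `OperatorTable` are stand-ins invented for illustration; they are not Drill's `DrillOperatorTable` or `DrillbitContext`, and the boolean argument only mimics what an option such as `exec.udf.use_dynamic` would decide.

```java
/** Hypothetical sketch: reuse one operator table when dynamic UDFs are not needed. */
final class OperatorTableCacheSketch {

    /** Stand-in for an operator table that is expensive to build. */
    static final class OperatorTable {
        final boolean includesDynamicUdfs;

        OperatorTable(boolean includesDynamicUdfs) {
            this.includesDynamicUdfs = includesDynamicUdfs;
        }
    }

    private int builds = 0;
    private final OperatorTable staticTable;

    OperatorTableCacheSketch() {
        // Built once at startup, covering only the static built-in functions.
        this.staticTable = build(false);
    }

    private OperatorTable build(boolean includeDynamicUdfs) {
        builds++; // counts how often the expensive build runs
        return new OperatorTable(includeDynamicUdfs);
    }

    /** Returns the shared table unless the query asked for dynamic UDFs. */
    OperatorTable tableForQuery(boolean useDynamicUdfs) {
        return useDynamicUdfs ? build(true) : staticTable;
    }

    int builds() {
        return builds;
    }
}
```

With dynamic UDFs switched off, every query receives the same instance and the build counter stays at one; switching them on falls back to a per-query build.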





[jira] [Resolved] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-03-02 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong resolved DRILL-5287.
-
Resolution: Fixed

> Provide option to skip updates of ephemeral state changes in Zookeeper
> --
>
> Key: DRILL-5287
> URL: https://issues.apache.org/jira/browse/DRILL-5287
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.9.0
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.10.0
>
>
> We put transient profiles in ZooKeeper and update the state as the query 
> progresses and changes states. It is observed that this adds a latency of 
> ~45 msec for each update in the query execution path. This gets even worse 
> when a high number of concurrent queries are in progress. For concurrency=100, 
> the average query response time, even for short queries, is 8 sec vs 0.2 sec 
> with these updates disabled. For short-lived queries in a high-throughput 
> scenario, it is of no value to update state changes in ZooKeeper. We need an 
> option to disable these updates for short-running operational queries.





[jira] [Resolved] (DRILL-5252) A condition returns always true

2017-03-02 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong resolved DRILL-5252.
-
Resolution: Fixed
Fix Version/s: 1.10.0

> A condition returns always true
> ---
>
> Key: DRILL-5252
> URL: https://issues.apache.org/jira/browse/DRILL-5252
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: JC
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.10.0
>
>
> I've found the following code smell in a recent GitHub snapshot.
> Path: 
> exec/java-exec/src/main/java/org/apache/drill/exec/expr/EqualityVisitor.java
> {code:java}
> @Override
> public Boolean visitNullConstant(TypedNullConstant e, LogicalExpression value) throws RuntimeException {
>   if (!(value instanceof TypedNullConstant)) {
>     return false;
>   }
>   // Compares e's type with itself, so this is always true.
>   return e.getMajorType().equals(e.getMajorType());
> }
> Should it be like this?
> {code:java}
>   }
>   return value.getMajorType().equals(e.getMajorType());
> }
> {code}
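
The effect of the reported condition can be reproduced in isolation. The sketch below uses a made-up `TypedNull` stand-in with a string type name instead of Drill's `TypedNullConstant` and `MajorType`, so it only illustrates the logic, not the real classes.

```java
/** Hypothetical sketch: a self-comparison is always true, the proposed fix is not. */
final class NullConstantEqualitySketch {

    /** Stand-in for a typed null constant; carries only a type name. */
    static final class TypedNull {
        final String majorType;

        TypedNull(String majorType) {
            this.majorType = majorType;
        }
    }

    /** Mirrors the reported code: compares e's type with itself. */
    static boolean buggyEquals(TypedNull e, Object value) {
        if (!(value instanceof TypedNull)) {
            return false;
        }
        return e.majorType.equals(e.majorType);
    }

    /** Mirrors the proposed fix: compares value's type with e's type. */
    static boolean fixedEquals(TypedNull e, Object value) {
        if (!(value instanceof TypedNull)) {
            return false;
        }
        return ((TypedNull) value).majorType.equals(e.majorType);
    }
}
```

With an INT null and a VARCHAR null, `buggyEquals` reports them as equal while `fixedEquals` does not; both still reject values that are not typed nulls.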





[jira] [Created] (DRILL-5311) C++ connector connect doesn't wait for handshake to complete

2017-03-02 Thread Laurent Goujon (JIRA)
Laurent Goujon created DRILL-5311:
-

 Summary: C++ connector connect doesn't wait for handshake to 
complete
 Key: DRILL-5311
 URL: https://issues.apache.org/jira/browse/DRILL-5311
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - C++
Reporter: Laurent Goujon


The C++ connector's connect method returns okay as soon as the TCP connection is 
successfully established between client and server and the handshake message is 
sent. However, it doesn't wait for the handshake to complete.

The consequence is that if the handshake fails, the error is deferred to the first 
query, which the application might not expect.

I believe that the validateHandshake method in drillClientImpl should wait for the 
handshake to complete, as that seems saner...
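
The requested behaviour, blocking in connect until the handshake outcome is known, can be sketched as below. The connector in question is C++; this is a Java illustration of the pattern only, and `HandshakeWaitSketch`, its `Status` values, and the timeout parameter are all hypothetical rather than part of the Drill client API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: connect reports success only after the handshake result arrives. */
final class HandshakeWaitSketch {
    enum Status { OK, HANDSHAKE_FAILED, HANDSHAKE_TIMEOUT }

    /**
     * @param handshakeResult completed by the network layer with true if the server accepted
     * @param timeoutMillis   how long connect is willing to wait for that result
     */
    static Status connect(CompletableFuture<Boolean> handshakeResult, long timeoutMillis) {
        try {
            Boolean accepted = handshakeResult.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Boolean.TRUE.equals(accepted) ? Status.OK : Status.HANDSHAKE_FAILED;
        } catch (TimeoutException e) {
            return Status.HANDSHAKE_TIMEOUT;
        } catch (ExecutionException e) {
            // The handshake itself raised an error.
            return Status.HANDSHAKE_FAILED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return Status.HANDSHAKE_FAILED;
        }
    }
}
```

The application then sees a handshake failure at connect time instead of on its first query.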





[jira] [Created] (DRILL-5310) Memory leak in managed sort if OOM during sv2 allocation

2017-03-02 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5310:
--

 Summary: Memory leak in managed sort if OOM during sv2 allocation
 Key: DRILL-5310
 URL: https://issues.apache.org/jira/browse/DRILL-5310
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: 1.10.0


See the "identical1" test case in DRILL-5266. Due to misconfiguration, the sort 
was given too little memory to make progress. An OOM error occurred when 
allocating an SV2.

In this scenario, the "converted" record batch is leaked.

Normally, a converted batch is added to the list of in-memory batches, then 
released on {{close()}}. But, in this case, the batch is only a local variable, 
and so leaks.

The code must release this batch in this condition.
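
One way to close such a leak is to release the batch whenever it did not make it into the tracked list. The sketch below is only an illustration under assumptions: `BatchReleaseSketch`, `Batch`, and the simulated failure flag are invented names, and the real fix in the managed sort may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a batch held only in a local variable is released on failure. */
final class BatchReleaseSketch {

    /** Stand-in for a record batch that owns memory. */
    static final class Batch {
        boolean released;

        void release() {
            released = true;
        }
    }

    private final List<Batch> inMemory = new ArrayList<>();

    /** Tracks the converted batch, or releases it if the SV2 allocation fails first. */
    void addBatch(Batch converted, boolean sv2AllocationFails) {
        boolean tracked = false;
        try {
            if (sv2AllocationFails) {
                throw new IllegalStateException("simulated OOM while allocating SV2");
            }
            inMemory.add(converted);
            tracked = true;
        } finally {
            if (!tracked) {
                converted.release(); // nothing else references it, so free it here
            }
        }
    }

    /** Releases every tracked batch, as close() would. */
    void close() {
        for (Batch batch : inMemory) {
            batch.release();
        }
        inMemory.clear();
    }
}
```

The exception still propagates to the caller; the `finally` block only guarantees that ownership of the batch is resolved before it does.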





Re: Time for 1.10 release

2017-03-02 Thread Jinfeng Ni
The following PRs have been merged to Apache master.

DRILL-4994
DRILL-4730
DRILL-5301
DRILL-5167
DRILL-5221
DRILL-5258
DRILL-5034
DRILL-4963
DRILL-5252
DRILL-5266
DRILL-5284
DRILL-5304
DRILL-5290
DRILL-5287

QA folks will run tests. If no issues are found, I'll build an RC0 candidate
for 1.10 and start the vote.

Thanks,

Jinfeng



On Thu, Mar 2, 2017 at 8:30 AM, Jinfeng Ni  wrote:
> I'm building a merge branch, and hopefully push to master branch today
> if things go smoothly.
>
>
> On Wed, Mar 1, 2017 at 7:13 PM, Padma Penumarthy  wrote:
>> Hi Jinfeng,
>>
>> Please include DRILL-5287, DRILL-5290 and DRILL-5304.
>>
>> Thanks,
>> Padma
>>
>>
>>> On Feb 22, 2017, at 11:16 PM, Jinfeng Ni  wrote:
>>>
>>> Hi Drillers,
>>>
>>> It has been almost 3 months since we released Drill 1.9. We have
>>> resolved plenty of fixes and improvements (closed around 88 JIRAs
>>> [1]). I propose that we start the 1.10 release process, and set
>>> Wednesday 3/1 as the cutoff day for code checkin. After 3/1, we should
>>> start building a release candidate.
>>>
>>> Please reply in this email thread if you have something near complete
>>> that you would like to include in the 1.10 release.
>>>
>>> I volunteer as the release manager, unless someone else comes forward.
>>>
>>> Thanks,
>>>
>>> Jinfeng
>>>
>>> [1] https://issues.apache.org/jira/browse/DRILL/fixforversion/12338769
>>


[GitHub] drill pull request #752: DRILL-5258: Access mock data definition from SQL

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/752




[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/701




[GitHub] drill pull request #766: DRILL-5304: Queries fail intermittently when there ...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/766




[GitHub] drill pull request #712: DRILL-5167: Send escape character for metadata quer...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/712




[GitHub] drill pull request #761: DRILL-5284: Roll-up of final fixes for managed sort

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/761




[GitHub] drill pull request #749: DRILL-5266: Parquet returns low-density batches

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/749




[GitHub] drill pull request #745: DRILL-5252: Fix a condition that always returns tru...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/745




[GitHub] drill pull request #733: DRILL-5221: Send cancel message as soon as possible...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/733




[GitHub] drill pull request #613: DRILL-4730: Update JDBC DatabaseMetaData implementa...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/613




[GitHub] drill pull request #757: DRILL-5290: Provide an option to build operator tab...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/757




[GitHub] drill pull request #758: DRILL-5287: Provide option to skip updates of ephem...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/758




[GitHub] drill pull request #656: DRILL-5034: Select timestamp from hive generated pa...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/656




[GitHub] drill pull request #764: DRILL-5301: Server metadata API

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/764




[GitHub] drill issue #701: DRILL-4963: Fix issues with dynamically loaded overloaded ...

2017-03-02 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/701
  
@jinfengni , 

As it turns out, we do have a comprehensive design for the original feature 
and the MVCC revision. The key goal is that a function, once registered, is 
guaranteed to be available on all Drillbits once it is visible to any 
particular Foreman. Without this guarantee of consistency, DUDFs become 
non-deterministic and will cause customer problems.

We do have a "refresh" operation: registering a DUDF updates ZK which sends 
updates to each node. The problem is the race condition. I register a UDF foo() 
on node A. I run a query from that same node. If my query happens to hit node B 
before the ZK notification, the query will fail. Our goal is that such failure 
cannot happen, hence the need for a "pull" model to augment the ZK-based "push" 
model.

A manual "update" would have the same issue unless we synchronized the 
update across all nodes. Also, the only way to ensure that DUDFs are available 
is to issue an update after adding each DUDF. But, if we did that, we might as 
well make the DUDF registration itself synchronous across all nodes.

And, of course, the node synchronization does not handle the race condition 
in which a new node comes up right after a synchronization starts. We'd have to 
ensure that the new node reads the proper state from ZK. We can do that if we 
first update ZK, then do synchronization to all nodes, then update ZK with the 
fact that all nodes are aware of the DUDF. 

Without the "two-phase" process, our new node can come up, learn of the new 
DUDF and issue a query using the DUDF without some nodes having been notified 
of the synchronization.

Overall, this is a difficult area. Relying on the well-known semantics of 
MVCC makes the problems much easier to solve.

So, the question here is whether it is worth checking in this partial 
solution for 1.10, or just leave the problem open until a complete solution is 
available.
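
For readers following the push/pull argument, here is a minimal sketch of a version-checked "pull" on a function lookup miss. It assumes an immutable, versioned snapshot of the registry; `RegistrySyncSketch`, `RemoteRegistry` (standing in for the state kept in ZK), and `LocalRegistry` (one Drillbit's view) are invented for illustration and do not reflect the actual DUDF implementation.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: a node pulls a newer registry version when a lookup misses. */
final class RegistrySyncSketch {

    /** Immutable registry state tagged with a version number. */
    static final class Snapshot {
        final int version;
        final Set<String> functions;

        Snapshot(int version, Set<String> functions) {
            this.version = version;
            this.functions = Collections.unmodifiableSet(functions);
        }
    }

    /** Stand-in for the shared registry; each registration publishes a new version. */
    static final class RemoteRegistry {
        private volatile Snapshot current = new Snapshot(0, new HashSet<String>());

        synchronized void register(String name) {
            Set<String> next = new HashSet<>(current.functions);
            next.add(name);
            current = new Snapshot(current.version + 1, next);
        }

        Snapshot snapshot() {
            return current;
        }
    }

    /** Stand-in for one node's local copy, which may lag behind the shared registry. */
    static final class LocalRegistry {
        private final RemoteRegistry remote;
        private Snapshot local;

        LocalRegistry(RemoteRegistry remote) {
            this.remote = remote;
            this.local = remote.snapshot();
        }

        /** On a miss, compare versions and pull the newer snapshot before answering. */
        boolean hasFunction(String name) {
            if (!local.functions.contains(name)) {
                Snapshot latest = remote.snapshot();
                if (latest.version != local.version) {
                    local = latest;
                }
            }
            return local.functions.contains(name);
        }

        int version() {
            return local.version;
        }
    }
}
```

In this sketch a node that has not yet received the push notification still finds a newly registered function, because the miss triggers a version comparison and a pull.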




Re: Time for 1.10 release

2017-03-02 Thread Jinfeng Ni
I'm building a merge branch, and hopefully push to master branch today
if things go smoothly.


On Wed, Mar 1, 2017 at 7:13 PM, Padma Penumarthy  wrote:
> Hi Jinfeng,
>
> Please include DRILL-5287, DRILL-5290 and DRILL-5304.
>
> Thanks,
> Padma
>
>
>> On Feb 22, 2017, at 11:16 PM, Jinfeng Ni  wrote:
>>
>> Hi Drillers,
>>
>> It has been almost 3 months since we released Drill 1.9. We have
>> resolved plenty of fixes and improvements (closed around 88 JIRAs
>> [1]). I propose that we start the 1.10 release process, and set
>> Wednesday 3/1 as the cutoff day for code checkin. After 3/1, we should
>> start building a release candidate.
>>
>> Please reply in this email thread if you have something near complete
>> that you would like to include in the 1.10 release.
>>
>> I volunteer as the release manager, unless someone else comes forward.
>>
>> Thanks,
>>
>> Jinfeng
>>
>> [1] https://issues.apache.org/jira/browse/DRILL/fixforversion/12338769
>


[GitHub] drill issue #456: DRILL-4566: TDigest, median, and quantile functions

2017-03-02 Thread ko3ak
Github user ko3ak commented on the issue:

https://github.com/apache/drill/pull/456
  
Any idea when this pull request will be implemented in mainstream release 
1.10?




[GitHub] drill issue #701: DRILL-4963: Fix issues with dynamically loaded overloaded ...

2017-03-02 Thread jinfengni
Github user jinfengni commented on the issue:

https://github.com/apache/drill/pull/701
  
@arina-ielchiieva ,

Regarding your 3rd comment, we can probably discuss it further once you have 
the design. I would think we may process a "refresh function registry" command as 
a query (or more like CTAS, since it would update something); if one drillbit 
fails, fail the command with a proper error message. The user can decide what to do: 
either re-run the command after addressing the cause of the failure, or run the query 
knowing it might hit problems. 





[jira] [Created] (DRILL-5309) Error: Protocol message was too large. May be malicious.

2017-03-02 Thread Nikolaos Tsipas (JIRA)
Nikolaos Tsipas created DRILL-5309:
--

 Summary: Error: Protocol message was too large.  May be malicious.
 Key: DRILL-5309
 URL: https://issues.apache.org/jira/browse/DRILL-5309
 Project: Apache Drill
  Issue Type: Bug
Reporter: Nikolaos Tsipas


Hi,

I'm getting the following error when running 

{code}
create table logs(ip) as select columns[0] from 
dfs.`/Users/username/workspace/logs`;
{code}

in the embedded apache drill.

{code}
2017-03-02 14:45:37,003 [2747d020-2424-bada-3362-370f201e184b:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 420315 out of 
420315 using 16 threads. Time: 12459ms total, 0.471068ms avg, 227ms max.
2017-03-02 14:45:37,003 [2747d020-2424-bada-3362-370f201e184b:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 420315 out of 
420315 using 16 threads. Earliest start: 865.372000 μs, Latest start: 
12427432.35 μs, Average start: 6405853.135060 μs .
2017-03-02 14:46:11,581 [CONTROL-rpc-event-queue] ERROR 
o.a.d.exec.rpc.control.ControlServer - Failure while handling message.
org.apache.drill.exec.rpc.RpcException: Failure while decoding message with 
parser of type. null
at org.apache.drill.exec.rpc.RpcBus.get(RpcBus.java:319) 
~[drill-rpc-1.9.0.jar:1.9.0]
at 
org.apache.drill.exec.work.batch.ControlMessageHandler.handle(ControlMessageHandler.java:104)
 ~[drill-java-exec-1.9.0.jar:1.9.0]
at 
org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:63) 
~[drill-java-exec-1.9.0.jar:1.9.0]
at 
org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:38) 
~[drill-java-exec-1.9.0.jar:1.9.0]
at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:65) 
~[drill-rpc-1.9.0.jar:1.9.0]
at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:363) 
~[drill-rpc-1.9.0.jar:1.9.0]
at 
org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89)
 [drill-rpc-1.9.0.jar:1.9.0]
at 
org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:240) 
[drill-rpc-1.9.0.jar:1.9.0]
at 
org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123) 
[drill-rpc-1.9.0.jar:1.9.0]
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:274) 
[drill-rpc-1.9.0.jar:1.9.0]
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:245) 
[drill-rpc-1.9.0.jar:1.9.0]
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
 [netty-codec-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.handler.timeout.ReadTimeoutHandler.channelRead(ReadTimeoutHandler.java:150)
 [netty-handler-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
 [netty-codec-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
 [netty-codec-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at 

Re: Time for 1.10 release

2017-03-02 Thread Padma Penumarthy
Hi Jinfeng,

Please include DRILL-5287, DRILL-5290 and DRILL-5304.

Thanks,
Padma


> On Feb 22, 2017, at 11:16 PM, Jinfeng Ni  wrote:
> 
> Hi Drillers,
> 
> It has been almost 3 months since we released Drill 1.9. We have
> resolved plenty of fixes and improvements (closed around 88 JIRAs
> [1]). I propose that we start the 1.10 release process, and set
> Wednesday 3/1 as the cutoff day for code checkin. After 3/1, we should
> start building a release candidate.
> 
> Please reply in this email thread if you have something near complete
> that you would like to include in the 1.10 release.
> 
> I volunteer as the release manager, unless someone else comes forward.
> 
> Thanks,
> 
> Jinfeng
> 
> [1] https://issues.apache.org/jira/browse/DRILL/fixforversion/12338769



[GitHub] drill issue #701: DRILL-4963: Fix issues with dynamically loaded overloaded ...

2017-03-02 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/701
  
@jinfengni 

1. It depends on how often UDFs are added, and we don't expect that to happen 
often. But you are correct about the overhead for queries that do not use 
dynamic UDFs.
2. You are right, the function registry can be checked several times, which 
can slow down the entire query. It's hard to say how much performance will 
degrade, as it may depend on many factors (number of parallel queries, number 
of functions in the query without an exact match, ZK response time, and so on).
3. A refresh function registry command is being considered, but as part of 
MVCC. It could help with the current approach, but it still could not guarantee 
that after the refresh command is issued all drillbits have synced their local 
function registries with the remote one, unless the command waited for all 
drillbits to confirm that the sync was done. But what if one of the drillbits 
fails to sync: should the command have a retry mechanism or fail immediately, 
how long would the user have to wait for it to finish, etc.? With MVCC, the 
refresh command would only need to guarantee that the current drillbit is in 
sync, and all of the above questions go away (more in the MVCC doc).
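
To make the overhead in point 2 concrete, here is a rough sketch of a local registry that checks the remote registry's version whenever it has no exact match. This is not the code from the pull request: `RemoteRegistry`, `LocalRegistry`, `versionChecks` and the string signatures are all hypothetical names, and the in-memory map only stands in for the ZK-backed registry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the remote (ZK-backed) function registry.
class RemoteRegistry {
    private final Map<String, String> functions = new HashMap<>();
    private long version = 0;
    int versionChecks = 0; // counts simulated round trips to the remote registry

    synchronized void register(String signature, String implementation) {
        functions.put(signature, implementation);
        version++; // every registration bumps the version
    }

    synchronized long getVersion() {
        versionChecks++;
        return version;
    }

    synchronized Map<String, String> snapshot() {
        return new HashMap<>(functions);
    }
}

// Hypothetical local registry held by a single drillbit.
class LocalRegistry {
    private final RemoteRegistry remote;
    private Map<String, String> functions = new HashMap<>();
    private long syncedVersion = 0;

    LocalRegistry(RemoteRegistry remote) {
        this.remote = remote;
    }

    Optional<String> lookup(String signature) {
        String implementation = functions.get(signature);
        if (implementation != null) {
            return Optional.of(implementation); // exact match, no remote call
        }
        // No exact match: pay for a remote version check, and sync only
        // if the remote registry has changed since the last sync.
        // (A real implementation would read version and contents atomically.)
        long remoteVersion = remote.getVersion();
        if (remoteVersion != syncedVersion) {
            functions = remote.snapshot();
            syncedVersion = remoteVersion;
        }
        return Optional.ofNullable(functions.get(signature));
    }
}
```

In this sketch every lookup without an exact match costs one remote version check, even when nothing has changed remotely, which is the per-query overhead discussed above.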

Anyway, you are totally right that the current approach only covers the 
gap with function overloading, is not optimal, and may slow down queries. 
Having a refresh command might partially solve the problem as well, but it has 
open issues of its own. So it's better to dive into MVCC for the most optimal 
implementation. 

Regarding this pull request, I don't have strong feelings about whether it 
should be merged. Yes, it would solve the problem with function overloading, 
but it may impact performance, and it's hard to say how much since many factors 
may have an influence.

