[ https://issues.apache.org/jira/browse/DRILL-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Westin updated DRILL-2911:
--------------------------------
Fix Version/s: 1.3.0 (was: 1.2.0)
> Queries fail with connection error when some Drillbit processes are down
> ------------------------------------------------------------------------
>
> Key: DRILL-2911
> URL: https://issues.apache.org/jira/browse/DRILL-2911
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 0.9.0
> Reporter: Abhishek Girish
> Assignee: Chris Westin
> Fix For: 1.3.0
>
> Attachments: drillbit_node1.log, drillbit_node2.log,
> drillbit_node3.log, drillbit_node4.log
>
>
> Queries fail with a connection error even though the Drill web UI shows all
> Drillbits as up. However, on some nodes the process list does not show a
> Drillbit process, so the cluster appears to be in an inconsistent state.
> Queries with simple scans execute successfully:
> {code:sql}
> select i_item_sk from item limit 5;
> +------------+
> | i_item_sk  |
> +------------+
> | 1          |
> | 2          |
> | 3          |
> | 4          |
> | 5          |
> +------------+
> 5 rows selected (0.112 seconds)
> {code}
> Any query that may span multiple Drillbits fails with a connection error:
> {code:sql}
> SELECT
>   *
> FROM item i,
>      inventory inv
> WHERE inv.inv_item_sk = i.i_item_sk
> LIMIT 10;
> Query failed: CONNECTION ERROR: Exceeded timeout while waiting send
> intermediate work fragments to remote nodes. Sent 4 and only heard response
> back from 3 nodes.
> [5ada1a3e-d198-478b-941d-3c9bb917e494 on abhi7.qa.lab:31010]
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
> {code}
> The issue may be due to a previously failed query.
> I couldn't find the error code in the logs. Logs from all nodes are attached
> for reference.