[ 
https://issues.apache.org/jira/browse/DRILL-7364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923940#comment-16923940
 ] 

Abhishek Ravi commented on DRILL-7364:
--------------------------------------

In my experience, the "Failed to fetch messages" error on MapR Streams is 
hit when a "default stream" has not been configured in the storage plugin config.
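
A minimal sketch of the relevant part of a Kafka storage plugin config with a 
default stream set. The property name streams.consumer.default.stream is an 
assumption based on the MapR Streams consumer settings and should be verified 
against the MapR documentation for your version; the stream path is taken from 
the query in the description, and the other consumer properties already in 
the plugin config are omitted.

{code:json}
{
  "type": "kafka",
  "kafkaConsumerProps": {
    "streams.consumer.default.stream": "/sample-stream"
  },
  "enabled": true
}
{code}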

To answer your additional questions:

1) Yes, with 20 partitions the query will run with 20 minor fragments.

2) Each minor fragment (not each drillbit) is equivalent to a Kafka consumer 
and will be part of a consumer group.

> Timeout reading from Kafka topic using Kafka plugin
> ---------------------------------------------------
>
>                 Key: DRILL-7364
>                 URL: https://issues.apache.org/jira/browse/DRILL-7364
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Kafka
>    Affects Versions: 1.15.0, 1.16.0
>            Reporter: Aditya Allamraju
>            Priority: Major
>
> When we try to query a MapR Streams (similar to Apache Kafka) topic using the 
> Kafka plugin, we see the timeout below being thrown.
> {code:java}
> 0: jdbc:drill:drillbit=10.10.75.158:31010> 
> 0: jdbc:drill:drillbit=10.10.75.158:31010> select count(*) from 
> `/sample-stream:fast-messages` where k<100;
> Error: DATA_READ ERROR: Failed to fetch messages within 200 milliseconds. 
> Consider increasing the value of the property : store.kafka.poll.timeout
> DATA_READ ERROR: Failed to fetch messages within 200 milliseconds. Consider 
> increasing the value of the property : store.kafka.poll.timeout
> [Error Id: 27112f7b-afd8-43cb-9376-32f4c63ad2d8 ]
> Fragment 0:0
> [Error Id: 27112f7b-afd8-43cb-9376-32f4c63ad2d8 on 
> vm75-158.support.mapr.com:31010] (state=,code=0)
> 0: jdbc:drill:drillbit=10.10.75.158:31010> ALTER SYSTEM SET 
> `store.kafka.poll.timeout` = 50000;
> +-------+------------------------------------+
> |  ok   |              summary               |
> +-------+------------------------------------+
> | true  | store.kafka.poll.timeout updated.  |
> +-------+------------------------------------+
> 1 row selected (0.148 seconds)
> 0: jdbc:drill:drillbit=10.10.75.158:31010>
> {code}
> The other interesting behavior is that:
> 1) Even after increasing the timeout value to 50 secs, the Drill query failed 
> after the execution time crossed 50 secs (~51 secs).
> This pattern continued for whatever value we set. For example, after 
> increasing the timeout to 100 secs, the query failed with the above error 
> after 101 secs of execution time.
> The user is using Drill 1.15.
> I tried to reproduce this on my test cluster, but it did not reproduce 
> consistently there, whereas in the client's cluster we were able to 
> reproduce this behavior consistently. We collected the logs, but they 
> have very little info on what's happening.
> I believe it is now essential to know how (and why) the timeout parameter 
> "store.kafka.poll.timeout" is related to the query execution time in order 
> to understand this bug.
> I also have a few more questions for which I couldn't find much documentation.
> _1) Is there a one-to-one mapping between the number of partitions of a topic 
> and the minor fragments of a query? For example, if a given topic (t) has, 
> say, 20 partitions, will the query most likely have 20 minor fragments, other 
> parameters being fairly sized?_
> _2) Is each drillbit in the cluster equivalent to a Kafka consumer, and are 
> all such drillbits of a cluster treated as part of a consumer group?_
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
