Re: [Drill 1.10.0/1.12.0] Query Started Taking Time + frequent one or more node lost connectivity error

Anup Tiwari Wed, 14 Mar 2018 01:17:15 -0700

Also, I have observed one thing: the query which is taking time is creating
~30-40 fragments, and 99.99999% of the records are being written into only one
fragment.
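
To put a number on this kind of imbalance, here is a minimal sketch. It assumes the per-fragment record counts are copied by hand from the query profile. The second helper is a toy model of how a single dominant join key could send almost every row to one fragment if rows are routed by a hash of the key; that routing is an assumption and is not confirmed anywhere in this thread. The names fragment_skew and assign_fragments are hypothetical and are not part of Drill.

```python
from collections import Counter


def fragment_skew(records_per_fragment):
    """Return (share of records in the busiest fragment, ideal even share)."""
    total = sum(records_per_fragment)
    if total == 0:
        return 0.0, 0.0
    largest_share = max(records_per_fragment) / total
    even_share = 1 / len(records_per_fragment)
    return largest_share, even_share


def assign_fragments(keys, num_fragments):
    """Toy distribution (assumption): a row goes to fragment key % num_fragments."""
    counts = Counter(key % num_fragments for key in keys)
    return [counts.get(i, 0) for i in range(num_fragments)]


# Illustrative data only: one dominant key value (0) plus a few distinct keys.
keys = [0] * 1_000_000 + list(range(1, 40))
counts = assign_fragments(keys, 40)
largest, even = fragment_skew(counts)
print(f"busiest fragment holds {largest:.5%} of rows; an even split would be {even:.2%}")
```

If the busiest fragment's share is far above the even share, the work is effectively serialized on one fragment no matter how many fragments the plan creates.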





On Wed, Mar 14, 2018 1:37 PM, Anup Tiwari [email protected]  wrote:
Hi Padma,
Please find my answers to your questions inline:

Connection loss error can happen when zookeeper thinks that a node is dead
because it did not get a heartbeat from the node. It can be because the node is
busy or you have network problems.

Q) Did anything change in your network?
Answer : No. We also cross-verified the communication between the nodes and it
is working fine.

Q) Is the data static or are you adding new data?
Answer : Data is static.

Q) Do you have metadata caching enabled?
Answer : No.

Q) PARQUET_WRITER seems to indicate you are doing some kind of CTAS.
Answer : This is correct, we are doing CTAS.

Q) The block missing exception could possibly mean some problem with the name
node or bad disks on one of the nodes.
Answer : There is no bad disk. Also, when I checked that file with the hadoop ls
command, it is present, so can you tell me why Drill is showing the block as
missing here? You also mentioned "it could possibly mean problem with name
node"; I have checked, and the namenode is running fine. We are also executing
some Hive queries on the same cluster and those are running fine, so if it were
a namenode issue, I think it should affect all queries.





On Mon, Mar 12, 2018 11:24 PM, Padma Penumarthy [email protected]  wrote:
There can be a lot of issues here.

Connection loss error can happen when zookeeper thinks that a node is dead
because it did not get a heartbeat from the node. It can be because the node is
busy or you have network problems. Did anything change in your network?
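
The heartbeat reasoning above can be sketched as a toy model. This is not ZooKeeper's API; the function name, node names, and the 30-second timeout are hypothetical values chosen only for illustration. The point is that a node that is alive but too busy to send a heartbeat in time looks the same as a dead node.

```python
def expired_nodes(last_heartbeat, now, session_timeout):
    """Return nodes whose last heartbeat is older than session_timeout seconds."""
    return sorted(
        node
        for node, seen_at in last_heartbeat.items()
        if now - seen_at > session_timeout
    )


# Hypothetical timestamps in seconds: node-2 was busy and last reported at t=70.
last_heartbeat = {"node-1": 100.0, "node-2": 70.0}
print(expired_nodes(last_heartbeat, now=105.0, session_timeout=30.0))
```

In this model node-2 is treated as lost because 35 seconds have passed since its last heartbeat, even though nothing has actually crashed.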

Is the data static or are you adding new data? Do you have metadata caching
enabled?

PARQUET_WRITER seems to indicate you are doing some kind of CTAS.

The block missing exception could possibly mean some problem with the name node
or bad disks on one of the nodes.




Thanks

Padma

On Mar 12, 2018, at 1:27 AM, Anup Tiwari <[email protected]> wrote:

Hi All,

For the last couple of days I have been stuck on a problem. I have a query
which left joins 3 Drill tables (parquet). Every day it used to take around
15-20 mins, but for the last couple of days it has been taking more than 45
mins. When I tried to drill down, I could see in the operator profile that 40%
of the query time is going to PARQUET_WRITER and 28% to PARQUET_ROW_GROUP_SCAN.
I am not sure whether the stats were the same before this issue, as earlier the
query got executed in 15-20 mins max.

Also, on top of this table we used to create another table, which is now
showing the error below:

SYSTEM ERROR: BlockMissingException: Could not obtain block:
BP-1083556055-10.51.2.101-1481111327179:blk_1094763477_21022752

Also, in the last few days I have been getting frequent "one or more node lost
connectivity" errors.

I just upgraded to Drill 1.12.0 from 1.10.0, but the above issues are still
there.

Any help will be appreciated.

Regards,

Anup Tiwari










