[ https://issues.apache.org/jira/browse/IMPALA-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manish Maheshwari updated IMPALA-3289:
--------------------------------------
    Description: 
When disk performance degrades drastically during query execution, Impala 
does not recognize this and the query appears to "hang".

A threshold could be set for disk I/O performance; if performance falls below 
it, the query would be cancelled, advising the user that there is an issue.
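
One way to picture the proposed threshold is a small watchdog that tracks bytes 
completed over a sliding time window and flags the query once throughput stays 
below a configured floor. The sketch below is illustrative only: the class 
name, its parameters, and the numbers in simulate() are assumptions made for 
this example and are not existing Impala options or code.

```python
from collections import deque


class DiskThroughputWatchdog:
    """Hypothetical sketch: flag a query when disk throughput stays too low."""

    def __init__(self, min_bytes_per_sec, window_sec, start_time):
        self.min_bytes_per_sec = min_bytes_per_sec  # assumed threshold
        self.window_sec = window_sec                # assumed observation window
        self.start_time = start_time                # when monitoring began
        self._samples = deque()                     # (timestamp, bytes) pairs

    def record(self, now, num_bytes):
        """Record an I/O of num_bytes that completed at time `now` (seconds)."""
        self._samples.append((now, num_bytes))

    def throughput(self, now):
        """Average bytes per second over the most recent window."""
        cutoff = now - self.window_sec
        # Discard samples that have fallen out of the window.
        while self._samples and self._samples[0][0] <= cutoff:
            self._samples.popleft()
        return sum(n for _, n in self._samples) / self.window_sec

    def should_cancel(self, now):
        """True once a full window has elapsed and throughput is below the floor."""
        if now - self.start_time < self.window_sec:
            return False  # not enough history yet to judge
        return self.throughput(now) < self.min_bytes_per_sec


def simulate():
    """Ten healthy seconds followed by twenty seconds of degraded I/O."""
    wd = DiskThroughputWatchdog(
        min_bytes_per_sec=1_000_000, window_sec=10.0, start_time=0.0)
    for t in range(1, 11):
        wd.record(float(t), 8 * 1024 * 1024)  # 8 MiB completed each second
    healthy = wd.should_cancel(10.0)
    for t in range(11, 31):
        wd.record(float(t), 4096)             # only 4 KiB completed each second
    degraded = wd.should_cancel(30.0)
    return healthy, degraded
```

In this sketch a disk that completes no I/O at all also trips the check, because 
an empty window yields a throughput of zero. Waiting for a full window before 
judging avoids cancelling a query that has only just started.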

Some error messages:
{code:java}
E0226 07:27:41.546187 14795 tmp-file-mgr.cc:211] 0541dda3dc371844:c22b251700000000] Error for temporary file '/data2/impala/impalad/impala-scratch/0541dda3dc371844:c22b251700000000_6b169f50-3a00-4ce6-a19e-fe9360aaed87': Disk I/O error on gbrpsr000012838.intranet.barcapint.com:22000: open() failed for /data2/impala/impalad/impala-scratch/0541dda3dc371844:c22b251700000000_6b169f50-3a00-4ce6-a19e-fe9360aaed87. Disk level I/O error occured. errno=5

W1028 21:00:05.312568 56851 DfsClientShmManager.java:365] EndpointShmManager(DatanodeInfoWithStorage[22.50.92.142:1004,DS-4af8e8f7-c6b6-43e7-8a0a-19d445a7a32e,DISK], parent=ShortCircuitShmManager(2301f5f2)): error shutting down shm: got IOException calling shutdown(SHUT_RDWR)

impalad.WARNING:W0226 15:15:45.458577 25224 BlockReaderFactory.java:647] 0d43c912dd091557:ab21fb05000000f5] BlockReaderFactory(fileName=/warehouse/datalake/AAAAAAAAA.dat, block=BP-1018268685-35.49.40.158-1438950312819:blk_5003986013_3938223123): unknown response code ERROR while attempting to set up short-circuit access. RegisteredShm(62d9cfb1e2af3c6697ace97f93109c88): slot 125 is already in use.. Short-circuit read for DataNode DatanodeInfoWithStorage[22.50.92.142:1004,DS-ce9c7134-ad13-47fc-93c0-8cec6c3f3e7e,DISK] is disabled temporarily for 1 seconds based on dfs.domain.socket.disable.interval.seconds.
{code}

  was:
When disk performance is drastically degraded during query execution Impala 
will not recognize this and the query will appear to "hang".

A threshold could be set for disk IO performance below which the query will be 
cancelled thus advising the user there is an issue.


> Disk performance threshold to avoid "hang"
> ------------------------------------------
>
>                 Key: IMPALA-3289
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3289
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>    Affects Versions: Impala 2.3.0
>            Reporter: Thomas Scott
>            Priority: Minor
>              Labels: resource-management
