[ 
https://issues.apache.org/jira/browse/IMPALA-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631805#comment-17631805
 ] 

Michael Smith commented on IMPALA-11704:
----------------------------------------

Exhaustive testing turned out not to be sufficient. Running the whole test 
suite with the data cache enabled hit a fatal check failure:
{code}
F1109 03:43:19.240250  7283 hdfs-file-reader.cc:318] b14bc10d21ff3351:7e55cc7800000000] Check failed: exclusive_hdfs_fh_ != nullptr
*** Check failure stack trace: ***
    @          0x3753e0d  google::LogMessage::Fail()
    @          0x3755d44  google::LogMessage::SendToLog()
    @          0x37537ec  google::LogMessage::Flush()
    @          0x3756269  google::LogMessageFatal::~LogMessageFatal()
    @          0x1e763d2  impala::io::HdfsFileReader::CachedFile()
    @          0x1e6c904  impala::io::ScanRange::ReadFromCache()
    @          0x1e601fb  impala::io::RequestContext::TryReadFromCache()
    @          0x1e625e4  impala::io::RequestContext::GetNextUnstartedRange()
    @          0x1a4f376  impala::HdfsScanNode::GetNextScanRangeToRead()
    @          0x1962320  impala::HdfsScanNodeBase::StartNextScanRange()
    @          0x1a5393a  impala::HdfsScanNode::ScannerThread()
    @          0x1a543ce  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
    @          0x18b7182  impala::Thread::SuperviseThread()
    @          0x18b7f8b  boost::detail::thread_data<>::run()
    @          0x2380f77  thread_proxy
    @     0x7f7b49971ea5  start_thread
    @     0x7f7b468a6b0d  __clone
{code}

> Remote Ozone scans are slow even after data cache warmup
> --------------------------------------------------------
>
>                 Key: IMPALA-11704
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11704
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.1.1
>            Reporter: Michael Smith
>            Assignee: Michael Smith
>            Priority: Major
>             Fix For: Impala 4.2.0
>
>
> From [~drorke]:
> {quote}
> Running some basic performance sanity tests ... with Impala TPC-DS queries 
> against Ozone vs HDFS.  Impala appears to be using its data cache for both 
> Ozone and HDFS remote reads, but in the case of Ozone reads I'm still seeing 
> long scan times and high I/O wait times even after cache warmup. Excerpts 
> below from profiles of q90.  Note in both cases the Impala profiles show 100% 
> cache hit rates but for some reason the scan IO wait times are still much 
> longer for the Ozone scans.
> {noformat}
> HDFS:
> - TotalTime: 1s924ms
> - ScannerIoWaitTime: 52.037ms
> Ozone:
> - TotalTime: 8s917ms
> - ScannerIoWaitTime: 7s454ms{noformat}
> If I disable the local cache explicitly via query option I get the following 
> times for the same scan:
> {noformat}
> HDFS:
> - TotalTime: 7s792ms
> - ScannerIoWaitTime: 6s244ms
> Ozone:
> - TotalTime: 8s963ms
> - ScannerIoWaitTime: 7s464ms{noformat}
> {quote}
> Investigating a bit, [~joemcdonnell] noticed in the Ozone profile
> {noformat}
>  - ScannerIoWaitTime: 7s454ms
>  - TotalRawHdfsOpenFileTime: 5s782ms
> {noformat}
> Based on profile differences around {{TotalRawHdfsOpenFileTime=5s782ms}} (vs 
> {{0ms}} for HDFS), I believe this is a performance difference that appears 
> when the data cache is enabled but the file handle cache is disabled. That 
> traces back to an incomplete implementation of 
> [IMPALA-10147|https://issues.apache.org/jira/browse/IMPALA-10147].
> A data read proceeds as follows:
> 1. It [checks that it can open a file 
> handle|https://github.infra.cloudera.com/CDH/Impala/blob/CDWH-2022.0.10.1/be/src/runtime/io/scan-range.cc#L199].
>  When the file handle cache is enabled, this is a 
> [no-op|https://github.infra.cloudera.com/CDH/Impala/blob/CDWH-2022.0.10.1/be/src/runtime/io/hdfs-file-reader.cc#L67].
> 2. It then tries to read data. If the data cache is enabled, it first [tries to 
> read from the data 
> cache|https://github.infra.cloudera.com/CDH/Impala/blob/CDWH-2022.0.10.1/be/src/runtime/io/hdfs-file-reader.cc#L137].
> 3. If the data cache hits, that data is returned and any open file handles go 
> unused.
> When the file handle cache is disabled, opening the file handle [calls 
> hdfsOpenFile and 
> hdfsSeek|https://github.infra.cloudera.com/CDH/Impala/blob/CDWH-2022.0.10.1/be/src/runtime/io/hdfs-file-reader.cc#L70-L72].
>  {{hdfsOpenFile}} in particular is monitored and added to the profile as 
> {{TotalRawHdfsOpenFileTime}}. That time in the Ozone profile accounts for 
> most of the difference in performance between HDFS and Ozone in this case.
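
The read path described in the issue can be illustrated with a small toy model. This is only a sketch in Python, not Impala code: the class `ToyReader`, its fields, and the helper `opens_after_warmup` are hypothetical names invented for illustration, and the `open_calls` counter merely stands in for calls to {{hdfsOpenFile}}. It assumes, as the description states, that the open step is a no-op when the file handle cache is enabled and that a data cache hit leaves any opened handle unused.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReader:
    """Toy model of the described read path (hypothetical, not Impala code)."""
    file_handle_cache_enabled: bool
    data_cache: dict = field(default_factory=dict)  # (path, offset) -> bytes
    open_calls: int = 0    # stands in for hdfsOpenFile calls
    remote_reads: int = 0  # reads that missed the data cache

    def open(self, path):
        # Step 1: with the file handle cache enabled, opening is a no-op.
        if self.file_handle_cache_enabled:
            return None
        # Otherwise a handle is opened up front, before the cache is consulted.
        self.open_calls += 1
        return ("handle", path)

    def read(self, path, offset):
        self.open(path)  # the handle goes unused on a data cache hit
        key = (path, offset)
        # Step 2: try the data cache first.
        if key in self.data_cache:
            # Step 3: on a hit the cached bytes are returned directly.
            return self.data_cache[key]
        # Miss: simulate a remote read and populate the data cache.
        self.remote_reads += 1
        data = f"remote:{path}@{offset}".encode()
        self.data_cache[key] = data
        return data


def opens_after_warmup(reader, path, offsets):
    """Warm the data cache, rescan, and count opens during the rescan."""
    for off in offsets:
        reader.read(path, off)
    before = reader.open_calls
    for off in offsets:
        reader.read(path, off)
    return reader.open_calls - before
```

In this model, a warmed-up rescan with the file handle cache disabled still pays one open per read even though every read hits the data cache, while the same rescan with the file handle cache enabled pays none. In the quoted Ozone profile, {{TotalRawHdfsOpenFileTime}} (5s782ms) is roughly 78% of {{ScannerIoWaitTime}} (7s454ms).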


