Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24416 )

Change subject: IMPALA-15079: Cleanup Open/Close locking in SharedJdbcConnection
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/24416/1/be/src/exec/data-source-scan-node.cc
File be/src/exec/data-source-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/24416/1/be/src/exec/data-source-scan-node.cc@104
PS1, Line 104:   {
             :     std::unique_lock<std::mutex> lk(lock_);
             :     ++ref_count_;
             :     if (ref_count_ > 1) {
             :       // Subsequent callers: wait for the first caller to 
finish, then reuse.
             :       open_cv_.wait(lk, [this] { return open_done_; });
             :       result->__set_scan_handle(scan_handle_);
             :       return open_status_;
             :     }
             :   }
             :
             :   // First caller: initialize the executor and open the Java 
data source outside the lock.
             :   // All N C++ scanner threads share this single connection.
             :   Status s = executor_.Init(jar_path, class_name, api_version, 
init_string);
             :   if (s.ok()) s = executor_.Open(params, result);
             :   if (s.ok()) s = Status(result->status);
             :
             :   {
             :     std::unique_lock<std::mutex> lk(lock_);
             :     if (s.ok()) scan_handle_ = result->scan_handle;
             :     open_status_ = s;
             :     open_done_ = true;
             :     open_cv_.notify_all();
             :   }
> Why don't we just drop open_cv_? I don't think Open() is perf-critical, so
I can remove it and in this case it shouldn't matter, but generally I think 
that it is a better habit not to use locks on waiting something potentially 
slow.

std::mutex calls to sys futex quickly, but Impala also heavily uses SpinLock, 
and for that doing the same would waste some cpu while it tries to loop and 
repeatedly take the lock.

I am also looking forward to pulling in abseil and using its mutex heavily 
instead of std::mutex / boost::mutex / SpinLock. It provides a nice middle 
ground (spins a bit but less than SpinLock) and has a built in condition 
variable, so you can wait directly on the mutex, making code like this less 
error prone.
https://abseil.io/about/design/mutex

This change will add abseil to toolchain: 
https://gerrit.cloudera.org/#/c/24403/7/bin/bootstrap_toolchain.py



--
To view, visit http://gerrit.cloudera.org:8080/24416
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9f5c4ac71031b74b51bfa27ed04be10ed92e222
Gerrit-Change-Number: 24416
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Pranav Lodha <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Mon, 15 Jun 2026 09:00:11 +0000
Gerrit-HasComments: Yes

Reply via email to