[
https://issues.apache.org/jira/browse/IMPALA-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Ho updated IMPALA-5953:
-------------------------------
Issue Type: Task (was: Sub-task)
Parent: (was: IMPALA-5865)
> Consider moving wholesale to kudu::Thread/ThreadPool
> ----------------------------------------------------
>
> Key: IMPALA-5953
> URL: https://issues.apache.org/jira/browse/IMPALA-5953
> Project: IMPALA
> Issue Type: Task
> Components: Backend
> Reporter: Sailesh Mukil
> Priority: Major
> Labels: observability, refactor, util
>
> I went over the effort required to switch to kudu::Thread and
> kudu::ThreadPool completely and I've broken it down into some high level
> tasks, which are listed below.
> A rough estimate of time needed for this task with one person on it is 1 week
> without too many distractions.
> +Diffs between kudu::Thread and impala::Thread:+
> * kudu::Thread uses pthreads, whereas impala::Thread uses boost::thread.
> We need to measure for performance differences. Effort: High (for the perf
> evaluation)
> * kudu::Thread has no ThreadGroup equivalent. Effort: Easy
> * kudu::Thread needs to use a ThreadJoiner to join threads. Effort: Easy
> * kudu::Thread uses kudu::MetricEntity whereas impala::Thread uses
> impala::MetricGroup. Effort: Moderate
> * kudu::Thread does not understand impala::Webserver. Needs to be implemented
> inside Kudu. Effort: Moderate/High
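To make the "no ThreadGroup equivalent" item concrete, here is a minimal sketch of what such a helper could look like. It is only an illustration: std::thread stands in for kudu::Thread, and the class and method names (ThreadGroup, AddThread, JoinAll, Size) are hypothetical, not taken from either codebase.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical ThreadGroup: owns a set of threads and joins them together.
// Assumption: std::thread is used here in place of kudu::Thread.
class ThreadGroup {
 public:
  ThreadGroup() = default;
  ThreadGroup(const ThreadGroup&) = delete;
  ThreadGroup& operator=(const ThreadGroup&) = delete;

  // Join any threads that are still running so none outlives the group.
  ~ThreadGroup() { JoinAll(); }

  // Starts a new thread running 'fn' and tracks it in the group.
  void AddThread(std::function<void()> fn) {
    threads_.emplace_back(std::move(fn));
  }

  // Blocks until every tracked thread has finished, then forgets them.
  void JoinAll() {
    for (std::thread& t : threads_) {
      if (t.joinable()) t.join();
    }
    threads_.clear();
  }

  // Number of threads currently tracked (not yet joined).
  std::size_t Size() const { return threads_.size(); }

 private:
  std::vector<std::thread> threads_;
};
```

A caller would add threads with AddThread() and call JoinAll() once at shutdown. Because a helper like this only wraps thread creation and joining, it could be added without touching existing code.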
> +Diffs between kudu::ThreadPool and impala::ThreadPool:+
> kudu::ThreadPool has a very different API from impala::ThreadPool.
> * kudu::ThreadPool has no CallableThreadPool, but has SubmitFunc() which
> tries to achieve the same thing. Effort: Moderate
> * kudu::ThreadPool does not have a default worker task that we can assign at
> object creation time.
> * We need to convert everything to either Submit(Runnable) or SubmitFunc().
> Effort: High, and we need to check whether this affects performance due
> to the increased rate of passing around function pointers.
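One way to picture the conversion is a thin adapter that fixes the worker function at construction time and forwards each item as a closure. The sketch below is written under stated assumptions: FuncPool is a hypothetical stand-in for a pool that only accepts closures and is not the kudu::ThreadPool API, and WorkerPool, Offer and DrainAndShutdown are illustrative names. It needs C++14 for the init-capture.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical closure-only pool, in the style of SubmitFunc().
// Assumption: this is NOT the real kudu::ThreadPool interface.
class FuncPool {
 public:
  explicit FuncPool(std::size_t num_threads) {
    for (std::size_t i = 0; i < num_threads; ++i) {
      workers_.emplace_back([this] { WorkerLoop(); });
    }
  }
  ~FuncPool() { Shutdown(); }

  // Queues one closure. Returns false if the pool is already shut down.
  bool SubmitFunc(std::function<void()> fn) {
    {
      std::lock_guard<std::mutex> l(lock_);
      if (shutdown_) return false;
      queue_.push_back(std::move(fn));
    }
    cv_.notify_one();
    return true;
  }

  // Lets workers drain the queue, then joins them. Later calls are no-ops.
  void Shutdown() {
    {
      std::lock_guard<std::mutex> l(lock_);
      if (shutdown_) return;
      shutdown_ = true;
    }
    cv_.notify_all();
    for (std::thread& t : workers_) t.join();
  }

 private:
  void WorkerLoop() {
    for (;;) {
      std::function<void()> fn;
      {
        std::unique_lock<std::mutex> l(lock_);
        cv_.wait(l, [this] { return shutdown_ || !queue_.empty(); });
        if (queue_.empty()) return;  // Shut down and fully drained.
        fn = std::move(queue_.front());
        queue_.pop_front();
      }
      fn();  // Run the task outside the lock.
    }
  }

  std::mutex lock_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> queue_;
  bool shutdown_ = false;
  std::vector<std::thread> workers_;
};

// Hypothetical adapter: the worker function is assigned at object creation,
// and each offered item is wrapped in a closure for the underlying pool.
template <typename T>
class WorkerPool {
 public:
  using WorkFn = std::function<void(const T&)>;

  WorkerPool(std::size_t num_threads, WorkFn work_fn)
      : work_fn_(std::move(work_fn)), pool_(num_threads) {}

  // One closure is allocated per item; this is the overhead to measure.
  bool Offer(T item) {
    return pool_.SubmitFunc(
        [this, item = std::move(item)] { work_fn_(item); });
  }

  void DrainAndShutdown() { pool_.Shutdown(); }

 private:
  WorkFn work_fn_;  // Declared before pool_ so it outlives the workers.
  FuncPool pool_;
};
```

The per-item closure in Offer() is exactly where the extra passing around of function objects would show up, so a perf comparison would most naturally focus there.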
> ---------------
> Below, I list the things we'd need to add to the kudu::Thread code and
> differentiate between the possibility of having a shared utility repo for
> Impala and Kudu, versus just periodically updating the kudu utils code inside
> Impala:
> # Add a ThreadGroup class (Easy, non-invasive)
> # Add a ThreadPool type that takes a worker task on object creation. Only if
> needed. (Easy, non-invasive)
> # Add Impala-specific instrumentation functions for metrics. (Easy, but
> slightly invasive)
> # Add an Impala Webserver callback for kudu::Thread. (Moderate/High, quite
> invasive as we need to include an Impala header inside Kudu.) I already got
> about 80% of this working last night.
> An alternative would be to have a function that exports all thread
> information, and have another callback from the Impala code registered that
> uses this exported information to display on the webpage. That way, we avoid
> including Impala headers inside Kudu. But this would involve more work since
> we'd need to add a struct/class that's understood by both Impala and Kudu.
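The export-based alternative could be sketched roughly as follows. Everything here is hypothetical: ThreadInfo, ThreadRegistry and RenderThreads are illustrative names, and the renderer returns plain text instead of hooking into impala::Webserver. The point is only that the shared struct depends on neither project's headers.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical neutral record understood by both sides. It uses only
// standard types, so no Impala header is needed inside the thread library.
struct ThreadInfo {
  std::string category;
  std::string name;
  std::int64_t tid;
};

// Hypothetical registry living on the thread-library side.
class ThreadRegistry {
 public:
  void Add(ThreadInfo info) {
    std::lock_guard<std::mutex> l(lock_);
    threads_.push_back(std::move(info));
  }

  // Returns a copy so callers never render while holding the registry lock.
  std::vector<ThreadInfo> Snapshot() const {
    std::lock_guard<std::mutex> l(lock_);
    return threads_;
  }

 private:
  mutable std::mutex lock_;
  std::vector<ThreadInfo> threads_;
};

// Application-side callback: turns the exported snapshot into page content.
// Assumption: a real handler would be registered with the web server and
// emit its own format; plain text keeps the sketch self-contained.
std::string RenderThreads(const ThreadRegistry& registry) {
  std::ostringstream out;
  for (const ThreadInfo& t : registry.Snapshot()) {
    out << t.category << "/" << t.name << " tid=" << t.tid << "\n";
  }
  return out.str();
}
```

With this split, the thread library only exposes Snapshot(), and all web-page knowledge stays in the callback registered by the application.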
> Tasks 1 and 2 are only "adding" to the code and not modifying any existing
> code.
> * +Shared repo:+ If we're sharing a repo, it can just be added to that repo
> as new APIs.
> * +Periodic update:+ It will be easy to keep moving this to the newer copies
> since we can pretty much copy paste, unless the kudu::Thread APIs change,
> which I doubt they will.
> Task 3 is slightly invasive:
> * +Shared repo:+ If we're sharing a repo, then we will most likely need to
> standardize on other things as well, such as Metrics, in which case, this
> wouldn't be invasive any more, and we can just use Kudu's current
> instrumentation code.
> * +Periodic update:+ Even though it's slightly invasive, it wouldn't be hard
> to keep track of the diffs and move them to the newer copies.
> Task 4:
> * +Shared repo:+ If we're sharing a repo, then we will have to go with the
> alternative described above of exporting thread instrumentation.
> * +Periodic update:+ It wouldn't be hard to keep copying to the newer copies,
> since the only invasive part is the addition of the Impala header.