Hi folks,
I was doing some testing earlier this week and Enis's keen eye caught
something rather interesting.
When using YCSB to ingest data into a table with a secondary index using
8 threads and batch size of 1000 rows, the number of ExecService
coprocessor calls actually exceeded the number of Multi calls to write
the data (something like 21k ExecService calls to 18k Multi calls).
I dug into this some more and noticed that it's because each thread is
creating its own ServerCache to store the serialized IndexMetadata
before shipping the data table updates. So, when we have 8 threads all
writing mutations for the same data and index table, we have ~8x the
ServerCache entries being created than if we had just one thread.
Looking at the code, I completely understand why they're local to the
thread and not shared on the Connection (very tricky), but I'm curious
if anyone had noticed this before or if there are reasons to not try to
share these ServerCache(s) across threads. Looking at the data being put
into the ServerCache, it appears to be exactly the same for each of the
threads sending mutations. I'm thinking that we could do this safely by
tracking when we are loading (or have loaded) the data into the
ServerCache and doing some reference counting to determine when it's
actually safe to delete the ServerCache.
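To make the idea concrete, here's a rough sketch of what I have in mind
(all names here are mine, not actual Phoenix classes, and the real thing
would key on the connection/table and hang off the ConnectionQueryServices):
the first thread to acquire a key does the load, later threads just bump a
counter, and the server-side delete only happens when the last writer
releases it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one shared cache entry per key, loaded once,
// deleted only when the last writer releases it.
public class SharedServerCache {
    private static final class Entry {
        final AtomicInteger refCount = new AtomicInteger(0);
        boolean loaded = false; // guarded by synchronized(e)
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    /** Acquire the entry for this key; returns true if this caller
     *  is the one that should perform the load (ship the serialized
     *  IndexMetadata to the region servers). */
    public boolean acquire(String cacheKey) {
        Entry e = entries.computeIfAbsent(cacheKey, k -> new Entry());
        e.refCount.incrementAndGet();
        synchronized (e) {
            if (!e.loaded) {
                e.loaded = true;
                return true;  // this caller performs the load
            }
        }
        return false;         // another thread already loaded it
    }

    /** Release the entry; returns true if this caller should issue
     *  the server-side cache removal (refcount dropped to zero). */
    public boolean release(String cacheKey) {
        Entry e = entries.get(cacheKey);
        if (e != null && e.refCount.decrementAndGet() == 0) {
            entries.remove(cacheKey, e);
            return true;      // last writer out deletes the cache
        }
        return false;
    }
}
```

There's an acquire/remove race in this toy version (a thread could grab an
entry that's concurrently being removed) that a real patch would need to
close, but it shows the shape: with 8 YCSB threads, only the first acquire
per key pays the ExecService call instead of all 8.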
I hope to find/make some time to get a patch up, but thought I'd take a
moment to write it up if anyone has opinions/feedback.
Thanks!
- Josh