Heyo,

I'm attempting to use logical decoding with the streaming replication
protocol to perform change-data-capture on PostgreSQL 9.5.4.  I'm seeing
the replication stream "stall" for long periods of time where the walsender
process will be pinned at 100% CPU utilization, but no data is being sent
to my client.

The stalls occur unpredictably on my production system, but generally seem
to be correlated with schema operations.  My source database has about
100,000 tables; it's a one-schema-per-tenant multi-tenant SaaS system.

I've reproduced the same symptoms with two different approaches on my local
machine.  With both, I have a replication client connected via the
streaming protocol.

In reproduction approach 1, I've created a thread that inserts small sets
of data, and a thread that creates a schema w/ 500 tables and then drops
it.  This approach has pinned CPU usage, but data does come out of it --
just excruciatingly slow when compared to the same test without the schema
create & drop.

In reproduction approach 2, I've created a database w/ 100,000 tables on it
and performed a "vacuum ful".  The walsender goes to 100% CPU and no data
comes out of the replication stream for hours.

I've performed a CPU sampling with the OSX `sample` tool based upon
reproduction approach #1:
https://gist.github.com/mfenniak/366d7ed19b2d804f41180572dc1600d8  It
appears that most of the time is spent in the
RelfilenodeMapInvalidateCallback and CatalogCacheIdInvalidate cache
invalidation callbacks, both of which appear to be invalidating caches
based upon the cache value.

Has anyone else run into this kind of performance problem?  Any thoughts on
how it might be resolved?  I don't mind putting in the work if someone
could describe what is happening here, and have a discussion with me about
what kind of changes might be necessary to improve the performance.

Thanks all,


*Mathieu Fenniak* | Senior Software Architect | Phone 1-587-315-1185

*Replicon* | The leader in cloud time tracking applications - 7,800+
Customers - 70+ Countries - 1.5 Million Users
www.replicon.com | facebook <http://www.facebook.com/Replicon.inc> |
linkedin <http://www.linkedin.com/company/27961?trk=tyah> | twitter
<http://twitter.com/Replicon> | blog <http://www.replicon.com/blog/> | contact
us <http://www.replicon.com/about_replicon/contact_us.aspx>

*We are hiring!* | search jobs <http://www.replicon.com/careers>

Reply via email to