Re: Reg. Manifold Indexing performance

Karl Wright Wed, 17 Jul 2019 10:14:22 -0700

Hi Praveen,

If there is a broken query plan, it will show up in the ManifoldCF log; any
query that takes more than 60 seconds to run gets dumped and explained.  So
it should be possible to rule that out with low effort.
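
As a rough sketch of that check, something like the following could pull the slow queries out of a log. The message shape matched by the regular expression ("long-running query (N ms): [query]" on one line) is an assumption on my part, as are the function name and the 60000 ms default; compare it with a real line from your log and adjust before relying on it.

```python
import re

# ASSUMPTION: slow queries are logged on a single line that looks like
#   "... long-running query (75000 ms): [SELECT ...]"
# Check this against your own log and adjust the pattern if it differs.
LONG_QUERY = re.compile(r"long-running query \((\d+) ms\): \[(.*)\]")


def find_long_queries(lines, min_ms=60000):
    """Return (elapsed_ms, query) pairs at or above min_ms, slowest first."""
    hits = []
    for line in lines:
        match = LONG_QUERY.search(line)
        if match and int(match.group(1)) >= min_ms:
            hits.append((int(match.group(1)), match.group(2)))
    # Sort so the worst offenders come first.
    return sorted(hits, reverse=True)
```

Calling find_long_queries(open(path)) with the path to your ManifoldCF log should be enough; an empty result over the slow period would suggest there is no bad plan to chase.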

The kind of situation I have seen with very large document jobs is that
PostgreSQL performance gradually declines as you throw more documents
into the tables.  There doesn't seem to be any pathological reason why this
occurs.  If you can find nothing logged about long-running queries, this
may be the only explanation.

If there is a bad plan, it's probably the so-called "stuffer" query that is
not performant.  If this query takes long enough, worker threads become
idle while waiting for new documents to appear for them.  If this is what's
happening, the performance can be restored by analyzing the jobqueue
table.  MCF does this itself periodically, but this too takes more and more
time the larger the table gets.  How frequently the reanalysis is done can
be controlled with a properties.xml file configuration parameter.
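
If you want to try the reanalysis by hand and see how long it takes, here is a minimal sketch. It assumes you already have a Python DB-API connection to the ManifoldCF database (from whichever PostgreSQL driver you use); the function name is made up for illustration, and this is not how MCF itself triggers the analyze.

```python
import time


def analyze_jobqueue(conn, table="jobqueue"):
    """Run ANALYZE on one table over a DB-API connection; return seconds taken.

    ASSUMPTION: conn is an open DB-API 2.0 connection to the PostgreSQL
    database that ManifoldCF uses. The helper itself is hypothetical.
    """
    # ANALYZE takes no bind parameters, so validate the name before
    # building the statement text.
    if not table.isidentifier():
        raise ValueError("unexpected table name: %r" % (table,))
    started = time.monotonic()
    cursor = conn.cursor()
    try:
        cursor.execute("ANALYZE " + table)
    finally:
        cursor.close()
    conn.commit()
    return time.monotonic() - started
```

If throughput recovers right after a manual run, that points at stale statistics for the stuffer query, and the returned duration gives you a feel for how expensive a more frequent reanalysis would be.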

Karl

On Wed, Jul 17, 2019 at 1:00 PM Praveen Bejji <[email protected]>
wrote:

> Hi,
>
> We are trying to index close to one million documents using the Documentum
> connector. Indexing works, but we see a drop in indexing performance
> after the first day. The connector indexes 21k/hr on the first
> day, but it drops to 10k/hr after 24-28 hours. Although we don't see any
> errors and indexing completes without any issue, it takes a
> good 2-3 days to finish.
> As our higher environments have almost 4 times the data, we want to achieve
> a consistent indexing rate before going ahead.
>
> Our Manifold is running in the quick-start single-process model on a Linux
> server. The Linux server has an 8-core CPU with 16 GB of memory. We are using
> PostgreSQL as the DB.
> We are not sure if this is due to the size of the tables, as Manifold is
> inserting all 1 million document IDs into the PostgreSQL DB. I can see from the DB
> scripts that the appropriate indexes on the tables are already being created.
>
> Please let us know your thoughts on this.
>
> Thanks,
> Praveen
>
>
