Hi Mike,

Thanks for the reply.  We've just started a consulting job at this client
and are unravelling the various levels of the programming.  We suspected
that registering all of those queries and then hitting the system with
100,000 inserts was bound to bog down the system.

Their system has ~50 million items, with ~2,000 users having access to
certain groups of those 50 million based on a subscription file per user.
These subscription files sometimes have 20,000 entries.  It appears that
early on, they got stuck on how to approach this (we know that they are
generating some 'subscription queries' that have thousands of nested
cts:and queries, for instance). The solution at the time was to simply
register the monstrous queries.  This appears to have compounded the
issue by introducing another bottleneck (the internal maintenance of the
registered queries).  So when tuning the original queries, any gain in
performance was likely masked by the newer delay (registered queries).

Our approach now is likely to abandon their registered queries in favor
of a combination of (1) optimizing the original queries (it looks like the
terms can be boiled down to hundreds instead of thousands) and possibly
(2) generating our own 'smart caches' per user that could be updated in a
less intensive manner.
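
To illustrate the kind of reduction meant in (1), here is a minimal sketch
in Python rather than XQuery. It assumes the subscription queries are pure
conjunctions, so nested and-queries can be flattened and duplicate terms
dropped without changing what matches. The tuple representation and the
name flatten_and are hypothetical stand-ins, not part of the MarkLogic API.

```python
def flatten_and(query):
    """Flatten nested and-queries into one and-query with distinct terms.

    A query is either a term (a string) or a tuple ("and", [children]).
    This representation is a hypothetical stand-in for a real query tree.
    """
    seen = {}          # dict keeps first-seen order of the distinct terms
    stack = [query]    # explicit stack: safe for thousands of nesting levels
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            seen.setdefault(node, None)
            continue
        op, children = node
        if op != "and":
            # Only conjunctions can be merged this way.
            raise ValueError("cannot flatten operator: %r" % (op,))
        # Push children reversed so they are visited left to right.
        stack.extend(reversed(children))
    return ("and", list(seen))


nested = ("and", ["a", ("and", ["b", "a", ("and", ["c", "b"])]), "d"])
flat = flatten_and(nested)
```

Here the seven-node nested query collapses to a single and-query over the
four distinct terms. Whether the real subscription queries reduce this
cleanly depends on how they were generated, which we are still working out.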

Regards,
David Ennis


On 29 March 2014 14:58, Michael Blakeley <[email protected]> wrote:

> Registered queries are smart list-cache entries.
>
> You've already deduced that that implies extra work when updates happen,
> either immediately or when each registered query is next used. With a lot
> of registered queries it's probably more efficient to do that work with
> each update, but I haven't noticed that behavior myself.
>
> Why pre-register so many queries? As a rule of thumb it isn't worth
> registering a query unless it will be used 2-3 times. Maybe that should
> be 2-3 times before the next update, too.
>
> -- Mike
>
> On 28 Mar 2014, at 22:48 , David Ennis <[email protected]> wrote:
>
> > Hi.
> >
> > We have a client that has about 4,000 registered queries.  These are
> rather 'large' (taking about 30 minutes to register all of them).
> >
> > One of the tests yesterday seems to confirm that ingestion of new
> content is about half as fast when the queries are registered.
> Unregistering the queries increases ingestion throughput again.
> >
> > It should be noted that no queries are being run - they are just sitting
> registered.
> >
> > Can someone explain the inner workings of registered queries?  It seems
> to me that there is some level of maintenance of caches related to these
> registered queries as new documents are ingested - regardless of the query
> being used.
> >
> > Intuition says that this is likely the case, but I would like to be sure
> and cannot find enough information to truly support this theory.
> >
> > So, do registered queries do something that could be causing
> considerable overhead to internally maintain them while ingestion is
> happening?
> >
> > Kind Regards,
> > David
> >
> > _______________________________________________
> > General mailing list
> > [email protected]
> > http://developer.marklogic.com/mailman/listinfo/general