Re: Help getting a document by unique ID

Jack Krupansky Tue, 19 Mar 2013 13:54:11 -0700

60ms does seem excessive for the simplest possible access - lookup by the unique key field value. SOMETHING is clearly unacceptable at that level. Is this on decent hardware?

Try a query with &debugQuery=true and look at the "timing" section and see what component(s) are eating up the lion's share of that 60 ms. Is it the query component or something else like faceting or highlighting?
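
A minimal sketch of that check, assuming the JSON response writer (wt=json) and the usual layout of the debug output, where "timing" holds "prepare" and "process" sections with one entry per search component. The base URL, the unique key field name, and both helper names are illustrative assumptions, not something taken from this thread.

```python
from urllib.parse import urlencode


def debug_query_url(base_url, unique_key, doc_id):
    """Build a /select URL that looks up one document with debugQuery=true."""
    params = {"q": f'{unique_key}:"{doc_id}"', "debugQuery": "true", "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


def component_timings(response):
    """Return {component: prepare + process time in ms}, slowest first.

    `response` is the parsed JSON body of a debugQuery=true request.
    """
    timing = response.get("debug", {}).get("timing", {})
    totals = {}
    for phase in ("prepare", "process"):
        for name, value in timing.get(phase, {}).items():
            # Each phase also carries its own scalar "time" total; skip it.
            if isinstance(value, dict):
                totals[name] = totals.get(name, 0.0) + value.get("time", 0.0)
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))
```

Fetching the URL with any HTTP client and passing the decoded JSON to component_timings() should show at a glance whether the query component or something else dominates.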


Or, are you returning a lot of field values?

Or, are you using a lot of filters that are relatively unique (and hence frequently recomputed)?

Are you doing a lot of updating while querying (and hence invalidating caches)?


-- Jack Krupansky

-----Original Message-----
From: Brian Hurt
Sent: Tuesday, March 19, 2013 4:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Help getting a document by unique ID

On Mon, Mar 18, 2013 at 7:08 PM, Jack Krupansky <j...@basetechnology.com> wrote:

Hmmm... if query by your unique key field is killing your performance, maybe
you have some larger problem to address.


This is almost certainly true.  I'm well outside the use cases
targeted by Solr/Lucene, and it's a testament to the quality of the
product that it works at all.  Among other things, I'm implementing a
graph database on top of Solr (it being easier to build a graph
database on top of Solr than it is to implement Solr on top of a graph
database).

Which is the problem- you might think that 60ms unique key accesses
(what I'm seeing) is more than good enough- and for most use cases,
you'd be right.  But it's not unusual for a single web-page hit to
generate many dozens, if not low hundreds, of calls to get document by
id.  At which point, 60ms hits pile up fast.
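
To put rough numbers on that, a small sketch assuming the lookups are issued one after another and each costs the 60 ms reported above. The lookup counts are illustrative stand-ins for "many dozens" and "low hundreds", not measurements.

```python
PER_LOOKUP_MS = 60  # per-lookup latency reported in this thread


def sequential_latency_ms(lookups, per_lookup_ms=PER_LOOKUP_MS):
    """Total wall-clock time if the lookups run strictly one after another."""
    return lookups * per_lookup_ms


for lookups in (36, 100, 200):  # illustrative counts only
    seconds = sequential_latency_ms(lookups) / 1000
    print(f"{lookups} lookups -> {seconds} s")
```

At 100 sequential lookups that is 6 seconds for a single page.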

The current plan is to just cache the documents as files in the local
file system (or possibly other systems), and have the get document
calls go there instead, while complicated searches still go to Solr.
Fortunately, this isn't complicated.
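
A minimal sketch of that plan, assuming one JSON file per document and ids that are safe to use as file names (true for the numeric ids described in this thread). The class name, the directory layout, and the fetch callable standing in for the Solr lookup are all hypothetical, and nothing here handles invalidation when a document is updated in Solr beyond an explicit invalidate() call.

```python
import json
from pathlib import Path


class FileDocCache:
    """Read-through cache: one JSON file per document id."""

    def __init__(self, root, fetch):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        # Hypothetical callable: doc_id -> dict, e.g. a lookup against Solr.
        self.fetch = fetch

    def _path(self, doc_id):
        # Shard by the last two characters so no single directory grows huge.
        name = str(doc_id)
        return self.root / name[-2:] / f"{name}.json"

    def get(self, doc_id):
        path = self._path(doc_id)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
        doc = self.fetch(doc_id)  # cache miss: fall back to the slow lookup
        path.parent.mkdir(parents=True, exist_ok=True)
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_text(json.dumps(doc), encoding="utf-8")
        tmp.replace(path)  # rename into place so readers never see a partial file
        return doc

    def invalidate(self, doc_id):
        """Drop the cached copy, e.g. after the document is re-indexed."""
        self._path(doc_id).unlink(missing_ok=True)
```

Complicated searches would still go to Solr directly; only get-by-id calls would go through get().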

How bad is it? Are you using the
string field type? How long are your ids?


My ids start at 100 million and go up like a kite from there- thus the
string representation.


The only thing the real-time GET API gives you is more immediate access to
recently added, uncommitted data. Accessing older, committed data will be no
faster. But if accessing that recent data is what you are after, real-time
GET may do the trick.
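
For reference, a sketch of what a real-time GET request looks like, assuming the handler is registered at /get and the update log is enabled in solrconfig.xml. The helper name and base URL are illustrative; the handler takes a single id via the id parameter or several via a comma-separated ids parameter.

```python
from urllib.parse import urlencode


def realtime_get_url(base_url, doc_ids):
    """Build a real-time GET URL for one or more unique key values."""
    ids = [str(doc_id) for doc_id in doc_ids]
    if not ids:
        raise ValueError("at least one document id is required")
    # One id uses "id"; several ids go in a comma-separated "ids" parameter.
    params = {"id": ids[0]} if len(ids) == 1 else {"ids": ",".join(ids)}
    params["wt"] = "json"
    return f"{base_url}/get?{urlencode(params)}"
```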


OK, so this is good to know.  This answers question #1: GET isn't the
function I should be calling.  Thanks.

Brian
