Thanks for the reply, Shawn.

The Solr server currently has 8GB of RAM and the total size of the dataDir
is around 30GB.  I start Solr with a Java heap of up to 4GB, which leaves
4GB for the OS; there are no other services running on the box.  So from
what you're saying, we are way under the amount of RAM we would ideally
have.
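
To put rough numbers on "way under", here is a quick back-of-the-envelope
sketch in Java (the sizes are just the approximate figures above, not
measured values):

    public class CacheHeadroom {
        public static void main(String[] args) {
            double totalRamGb = 8.0;   // physical RAM in the box
            double javaHeapGb = 4.0;   // heap given to Solr's JVM (-Xmx)
            double indexSizeGb = 30.0; // rough size of the dataDir

            // Whatever the heap doesn't take is all the OS has left to use
            // as disk cache for the index files.
            double cacheGb = totalRamGb - javaHeapGb;
            System.out.printf("~%.0fGB of disk cache for a ~%.0fGB index (~%.0f%% coverage)%n",
                    cacheGb, indexSizeGb, 100.0 * cacheGb / indexSizeGb);
        }
    }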

Just trying to get a better understanding of this: wouldn't the index not
being in the disk cache make the queries themselves slow as well (high
QTime), not just the fetching of the results?
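
For reference, something like this rough SolrJ sketch is how I picture
separating the two (the URL, query, and rows count are placeholders, not
our actual code):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class QTimeVsWallClock {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            SolrQuery query = new SolrQuery("*:*");
            query.setRows(1000); // deliberately large page so the fetch cost shows up

            long start = System.currentTimeMillis();
            QueryResponse rsp = server.query(query);
            long wallClock = System.currentTimeMillis() - start;

            // QTime: time Solr spent finding the matching document IDs.
            // wallClock - QTime: roughly the cost of loading the stored fields,
            // serializing the response, and moving it over the wire.
            System.out.println("QTime      = " + rsp.getQTime() + " ms");
            System.out.println("wall clock = " + wallClock + " ms");
        }
    }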

We currently store all the fields that we index.  My reasoning is that
debugging the results we get from Solr without being able to see what is
actually stored in Solr would be near impossible (in my head, anyhow).
Generally our original source (MySQL) and Solr are consistent, but we've
had cases where some updates were missed for one reason or another.
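
If trimming what comes back is the way to go, I'm picturing something like
this on the query side (the field names are placeholders, not our real
schema), with the full record then looked up in MySQL by id only for the
documents someone actually opens:

    import org.apache.solr.client.solrj.SolrQuery;

    public class LeanGridQuery {
        // Return only the columns needed to render a results grid; the
        // large stored fields stay on disk and off the wire.
        public static SolrQuery build(String userInput) {
            SolrQuery query = new SolrQuery(userInput);
            query.setFields("id", "title"); // placeholder field names
            query.setRows(50);
            return query;
        }
    }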

So my options are: reduce the index size, add more RAM to the server, or
move to faster disks (SSDs)?

Thanks
Stephen

On Mon, Nov 21, 2011 at 11:33 PM, Shawn Heisey <s...@elyograg.org> wrote:

> On 11/21/2011 8:45 PM, Stephen Powis wrote:
>
>> I'm running Solr 1.4.1 with Jetty.  When I make requests against Solr that
>> have a large response (~1MB of data), I'm getting very slow transfer times
>> back to the client, and I'm hoping you can help shed some light on this
>> issue for me.
>>
>> Some more information about my setup:
>> - The QTime value in the response is generally very small, under 1 sec
>> (< 1000ms).
>> - The client making the request is on a gigabit LAN with the Solr server,
>> yet the transfer speed is only between 16KB and 30KB per second.
>> - If I make the same request against localhost on the Solr server, I see
>> the same slow speeds.  SCP and other transfers between the client and
>> server are all quick, so I'd like to think these tests eliminate any kind
>> of network pipe problem between the two servers.
>> - If I make the same query repeatedly, sometimes it will send the response
>> very quickly (6MB/sec or faster).
>> - While testing this, load on the box was basically idle.
>>
>> So I guess I'm hoping someone can help me understand what's going on here
>> and why I'm seeing this behavior, and perhaps suggest a possible solution?
>>
>> What exactly does QTime measure?  I assume it is the time it takes to
>> process the request and fetch the resulting rows.  It obviously does not
>> include the transfer time back to the client, but does it include pulling
>> the data from the index?  Is Solr slow to pull the data from the index and
>> drop it into the network pipe?
>>
>
> Your bottleneck is probably disk I/O and a lack of OS disk cache.  How big
> is your index, how much RAM do you have, and how much RAM is used by
> processes, especially the Java heap?  QTime measures the amount of time
> that Solr spent finding the document IDs.  It does not include time spent
> retrieving the requested fields or sending the response to the client.
>
> Solr is designed to work best when the entire index fits into the OS disk
> cache, i.e. the free memory not allocated to processes (such as the Java
> heap), which the operating system uses to cache data read from disk.
> Limiting the number of fields that Solr indexes (for searching) and stores
> (for data retrieval) keeps the index size down, so more of it fits in the
> disk cache.  When the index data is in RAM, Solr is very, very fast.  If it
> has to go to disk to search or retrieve, it is very slow.
>
> You should only index the fields absolutely required to get good search
> results, and you should only store the fields required to display a grid of
> search results.  When displaying full details for an individual item, go to
> the original data source using the identifier returned in the search
> results.  In typical search applications, you only need full details for a
> small subset of the results returned by a search, so don't retrieve
> megabytes of information that will never be used.
>
> Thanks,
> Shawn
>
>
