Re: is there any practice to load index into RAM to accelerate solr performance?

2012-02-08 Thread Ted Dunning
This is true with Lucene as it stands.  It would be much faster if there
were a specialized in-memory index such as is typically used with high
performance search engines.

On Tue, Feb 7, 2012 at 9:50 PM, Lance Norskog goks...@gmail.com wrote:

 Experience has shown that it is much faster to run Solr with a small
 amount of memory and let the rest of the RAM be used by the operating
 system disk cache. That is, the OS is very good at keeping the right
 disk blocks in memory, much better than Solr.

 How much RAM is in the server and how much RAM does the JVM get? How
 big are the documents, and how large is the term index for your
 searches? How many documents do you get with each search? And, do you
 use filter queries? These are very effective at narrowing searches.

 2012/2/7 James ljatreey...@163.com:
  Is there any established practice for loading the index into RAM to
  accelerate Solr performance?
  There are about 100 million documents in total, and search time is around
  100 ms. I am looking for ways to reduce Solr's response time.
  I have seen that some people use SSDs, but SSDs are expensive. I would like
  to know whether there is a way to load the index files into RAM, keep the
  RAM index and the on-disk index synchronized, and then search against the
  RAM index.



 --
 Lance Norskog
 goks...@gmail.com



Re: is there any practice to load index into RAM to accelerate solr performance?

2012-02-08 Thread Patrick Plaatje
A starting point might be to use a RAM disk for that. Mount it as a normal
disk and have the index files stored there. Have a read here:

http://en.wikipedia.org/wiki/RAM_disk
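
With Solr you would point the index data directory at that mount; at the
Lucene level the same idea looks roughly like the sketch below (assuming a
Linux tmpfs mounted at /mnt/ramdisk and the Lucene 3.x API - the paths and
the copy step are only illustrative). Keep in mind a RAM disk is volatile,
so the on-disk copy stays the authoritative one.

  import java.io.File;
  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.store.Directory;
  import org.apache.lucene.store.FSDirectory;

  public class RamDiskIndexExample {
    public static void main(String[] args) throws Exception {
      // Assumes the index was copied onto the RAM disk beforehand, e.g.:
      //   mount -t tmpfs -o size=16g tmpfs /mnt/ramdisk
      //   cp -r /var/solr/data/index /mnt/ramdisk/index
      Directory dir = FSDirectory.open(new File("/mnt/ramdisk/index"));
      IndexReader reader = IndexReader.open(dir);
      System.out.println("Docs in RAM-disk index: " + reader.numDocs());
      reader.close();
      dir.close();
    }
  }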

Cheers,

Patrick


2012/2/8 Ted Dunning ted.dunn...@gmail.com

 This is true with Lucene as it stands.  It would be much faster if there
 were a specialized in-memory index such as is typically used with high
 performance search engines.

 On Tue, Feb 7, 2012 at 9:50 PM, Lance Norskog goks...@gmail.com wrote:

  Experience has shown that it is much faster to run Solr with a small
  amount of memory and let the rest of the RAM be used by the operating
  system disk cache. That is, the OS is very good at keeping the right
  disk blocks in memory, much better than Solr.
 
  How much RAM is in the server and how much RAM does the JVM get? How
  big are the documents, and how large is the term index for your
  searches? How many documents do you get with each search? And, do you
  use filter queries? These are very effective at narrowing searches.
 
  2012/2/7 James ljatreey...@163.com:
   Is there any established practice for loading the index into RAM to
   accelerate Solr performance?
   There are about 100 million documents in total, and search time is around
   100 ms. I am looking for ways to reduce Solr's response time.
   I have seen that some people use SSDs, but SSDs are expensive. I would
   like to know whether there is a way to load the index files into RAM,
   keep the RAM index and the on-disk index synchronized, and then search
   against the RAM index.
 
 
 
  --
  Lance Norskog
  goks...@gmail.com
 




-- 
Patrick Plaatje
Senior Consultant
http://www.nmobile.nl/


Re: is there any practice to load index into RAM to accelerate solr performance?

2012-02-08 Thread Dmitry Kan
Hi,

This talk has some interesting details on setting up a Lucene index in RAM:

http://www.lucidimagination.com/devzone/events/conferences/revolution/2011/lucene-yelp


Would be great to hear your findings!

Dmitry

2012/2/8 James ljatreey...@163.com

 Is there any established practice for loading the index into RAM to
 accelerate Solr performance?
 There are about 100 million documents in total, and search time is around
 100 ms. I am looking for ways to reduce Solr's response time.
 I have seen that some people use SSDs, but SSDs are expensive. I would like
 to know whether there is a way to load the index files into RAM, keep the
 RAM index and the on-disk index synchronized, and then search against the
 RAM index.



Re: is there any practice to load index into RAM to accelerate solr performance?

2012-02-08 Thread Andrzej Bialecki

On 08/02/2012 09:17, Ted Dunning wrote:

This is true with Lucene as it stands.  It would be much faster if there
were a specialized in-memory index such as is typically used with high
performance search engines.


This could be implemented in Lucene trunk as a Codec. The challenge 
though is to come up with the right data structures.
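
Purely as an illustration, a rough sketch against the 4.x codec API could
delegate everything to the default codec and only swap in a RAM-resident,
FST-based postings format; the class name InMemoryCodec is made up here,
and whether MemoryPostingsFormat is the right building block is exactly the
open data-structure question:

  import org.apache.lucene.codecs.Codec;
  import org.apache.lucene.codecs.FilterCodec;
  import org.apache.lucene.codecs.PostingsFormat;
  import org.apache.lucene.codecs.memory.MemoryPostingsFormat;

  // Hypothetical codec that keeps all postings in RAM while reusing the
  // default codec for stored fields, norms, etc. It would still need to
  // be registered via SPI so readers can resolve it by name.
  public class InMemoryCodec extends FilterCodec {
    private final PostingsFormat postings = new MemoryPostingsFormat();

    public InMemoryCodec() {
      super("InMemory", Codec.getDefault());
    }

    @Override
    public PostingsFormat postingsFormat() {
      return postings;
    }
  }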


There has been some interesting research on optimizations for in-memory 
inverted indexes, but it usually involves changing the query evaluation 
algorithms as well - for reference:


http://digbib.ubka.uni-karlsruhe.de/volltexte/documents/1202502
http://www.siam.org/proceedings/alenex/2008/alx08_01transierf.pdf
http://research.google.com/pubs/archive/37365.pdf

--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: is there any practice to load index into RAM to accelerate solr performance?

2012-02-08 Thread Robert Stewart
I concur with this.  As long as the index segment files are cached in the OS
file cache, performance is about as good as it gets.  Pulling segment files
into RAM inside the JVM process may actually be slower, given Lucene's
existing data structures and algorithms for reading segment file data.  If
you have a very large index (much bigger than available RAM), it will only be
slow when accessing the disk for uncached segment files.  In that case you
might consider sharding the index across more than one server and using
distributed searching (possibly SolrCloud, etc.).
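
(For concreteness, "pulling segment files into RAM inside the JVM" usually
means something like the sketch below with Lucene 3.x's RAMDirectory. It
copies the on-disk index onto the Java heap, so the whole index must fit in
-Xmx, it is a static snapshot rather than something kept in sync with the
disk index, and it competes with Solr's own caches for heap space. The path
is only an example.)

  import java.io.File;
  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.store.FSDirectory;
  import org.apache.lucene.store.RAMDirectory;

  public class HeapLoadedIndexExample {
    public static void main(String[] args) throws Exception {
      // Copies every index file from disk into byte arrays on the heap.
      RAMDirectory ramDir =
          new RAMDirectory(FSDirectory.open(new File("/var/solr/data/index")));
      IndexReader reader = IndexReader.open(ramDir);
      System.out.println("Heap-resident docs: " + reader.numDocs());
      reader.close();
      ramDir.close();
    }
  }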

How large is your index in GB?  You can also try making the index files
smaller by removing indexed/stored fields you don't need, compressing large
stored fields, etc.  You can also turn off storing norms, term frequencies,
positions, and term vectors if you don't need them.
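
As a rough Lucene 3.x sketch of those per-field options (the field names are
invented for the example; in Solr the equivalent switches are the stored,
omitNorms, termVectors and omitTermFreqAndPositions attributes in
schema.xml):

  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Field;

  public class LeanDocumentExample {
    public static Document build(String id, String body) {
      Document doc = new Document();
      // Identifier: stored for retrieval, indexed as one token, no norms.
      doc.add(new Field("id", id, Field.Store.YES,
          Field.Index.NOT_ANALYZED_NO_NORMS));
      // Body text: indexed without norms or term vectors and not stored,
      // which keeps the index files noticeably smaller.
      doc.add(new Field("body", body, Field.Store.NO,
          Field.Index.ANALYZED_NO_NORMS, Field.TermVector.NO));
      return doc;
    }
  }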

On Feb 8, 2012, at 3:17 AM, Ted Dunning wrote:

 This is true with Lucene as it stands.  It would be much faster if there
 were a specialized in-memory index such as is typically used with high
 performance search engines.
 
 On Tue, Feb 7, 2012 at 9:50 PM, Lance Norskog goks...@gmail.com wrote:
 
 Experience has shown that it is much faster to run Solr with a small
 amount of memory and let the rest of the RAM be used by the operating
 system disk cache. That is, the OS is very good at keeping the right
 disk blocks in memory, much better than Solr.
 
 How much RAM is in the server and how much RAM does the JVM get? How
 big are the documents, and how large is the term index for your
 searches? How many documents do you get with each search? And, do you
 use filter queries? These are very effective at narrowing searches.
 
 2012/2/7 James ljatreey...@163.com:
 Is there any established practice for loading the index into RAM to
 accelerate Solr performance?
 There are about 100 million documents in total, and search time is around
 100 ms. I am looking for ways to reduce Solr's response time.
 I have seen that some people use SSDs, but SSDs are expensive. I would like
 to know whether there is a way to load the index files into RAM, keep the
 RAM index and the on-disk index synchronized, and then search against the
 RAM index.
 
 
 
 --
 Lance Norskog
 goks...@gmail.com
 



Re: is there any practice to load index into RAM to accelerate solr performance?

2012-02-08 Thread Ted Dunning
Add this as well:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.155.5030

On Wed, Feb 8, 2012 at 1:56 AM, Andrzej Bialecki a...@getopt.org wrote:

 On 08/02/2012 09:17, Ted Dunning wrote:

 This is true with Lucene as it stands.  It would be much faster if there
 were a specialized in-memory index such as is typically used with high
 performance search engines.


 This could be implemented in Lucene trunk as a Codec. The challenge though
 is to come up with the right data structures.

 There has been some interesting research on optimizations for in-memory
 inverted indexes, but it usually involves changing the query evaluation
 algorithms as well - for reference:

 http://digbib.ubka.uni-karlsruhe.de/volltexte/documents/1202502
 http://www.siam.org/proceedings/alenex/2008/alx08_01transierf.pdf
 http://research.google.com/pubs/archive/37365.pdf

 --
 Best regards,
 Andrzej Bialecki 
  ___. ___ ___ ___ _ _   __
 [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
 ___|||__||  \|  ||  |  Embedded Unix, System Integration
 http://www.sigram.com  Contact: info at sigram dot com




Re: is there any practice to load index into RAM to accelerate solr performance?

2012-02-07 Thread Lance Norskog
Experience has shown that it is much faster to run Solr with a small
amount of memory and let the rest of the RAM be used by the operating
system disk cache. That is, the OS is very good at keeping the right
disk blocks in memory, much better than Solr.

How much RAM is in the server and how much RAM does the JVM get? How
big are the documents, and how large is the term index for your
searches? How many documents do you get with each search? And, do you
use filter queries? These are very effective at narrowing searches.
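
For example, a filter query restricts the candidate set without affecting
scoring and is cached in Solr's filterCache, so repeated filters become
cheap. A small SolrJ sketch (the URL, field names and values are just
placeholders):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class FilterQueryExample {
    public static void main(String[] args) throws Exception {
      CommonsHttpSolrServer solr =
          new CommonsHttpSolrServer("http://localhost:8983/solr");

      SolrQuery query = new SolrQuery("title:lucene");
      // Each fq is cached independently in the filterCache, so a hot
      // filter turns into a cheap cached set intersection.
      query.addFilterQuery("category:books");
      query.addFilterQuery("published:[2010-01-01T00:00:00Z TO NOW]");
      query.setRows(10);

      QueryResponse rsp = solr.query(query);
      System.out.println("Hits: " + rsp.getResults().getNumFound());
    }
  }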

2012/2/7 James ljatreey...@163.com:
 Is there any established practice for loading the index into RAM to
 accelerate Solr performance?
 There are about 100 million documents in total, and search time is around
 100 ms. I am looking for ways to reduce Solr's response time.
 I have seen that some people use SSDs, but SSDs are expensive. I would like
 to know whether there is a way to load the index files into RAM, keep the
 RAM index and the on-disk index synchronized, and then search against the
 RAM index.



-- 
Lance Norskog
goks...@gmail.com