Re: AW: 170G index, 1.5 billion documents, out of memory on query

2013-02-28 Thread Erick Erickson
Personally I've never seen any single node support 1.5B documents. I advise biting the bullet and sharding. Even if you do get the simple keyword search working, the first time you sort I expect it to blow up. Then you'll try to facet and it'll blow up. Then you'll start using filter queries and
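For anyone following that sharding advice, a rough sketch of creating a multi-shard SolrCloud collection via the Solr 4.x Collections API (the collection name, shard count and host below are placeholders, not values from this thread):

  # hypothetical example: spread the documents across 8 shards
  curl "http://host1:8983/solr/admin/collections?action=CREATE&name=bigindex&numShards=8&replicationFactor=1"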

Re: 170G index, 1.5 billion documents, out of memory on query

2013-02-26 Thread zqzuk
Hi, the full stack trace is below. - SEVERE: Unable to create core: collection1 org.apache.solr.common.SolrException: Error opening new searcher at org.apache.solr.core.SolrCore.&lt;init&gt;(SolrCore.java:794) at org.apache.solr.core.SolrCore.&lt;init&gt;(SolrCore.java:607)

AW: 170G index, 1.5 billion documents, out of memory on query

2013-02-26 Thread André Widhani
@lucene.apache.org Subject: Re: 170G index, 1.5 billion documents, out of memory on query Hi, the full stack trace is below. - SEVERE: Unable to create core: collection1 org.apache.solr.common.SolrException: Error opening new searcher at org.apache.solr.core.SolrCore.&lt;init&gt;

Re: AW: 170G index, 1.5 billion documents, out of memory on query

2013-02-26 Thread zqzuk
Hi, sorry I couldn't do this directly... the way I do this is by subscribing to a cluster of computers in our organisation and submitting the job with the required memory. It gets randomly allocated to a node (one single server in the cluster) once executed, and it is not possible to connect to that specific

Re: AW: 170G index, 1.5 billion documents, out of memory on query

2013-02-26 Thread Michael McCandless
It really should be unlimited: this setting has nothing to do with how much RAM is on the computer. See http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html Mike McCandless http://blog.mikemccandless.com On Tue, Feb 26, 2013 at 12:18 PM, zqzuk ziqizh...@hotmail.co.uk wrote:
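The setting Mike is referring to appears to be the per-process virtual memory limit, which MMapDirectory needs to be effectively unbounded; a sketch of checking and lifting it on Linux, assuming Solr is started from a shell you control:

  # show the current virtual memory limit for new processes
  ulimit -v
  # remove the limit in this shell, then launch Solr from it
  ulimit -v unlimited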

170G index, 1.5 billion documents, out of memory on query

2013-02-25 Thread zqzuk
Hi I am really frustrated by this problem. I have built an index of 1.5 billion data records, with a size of about 170GB. It's been optimised and has 12 separate files in the index directory, looking like below: _2.fdt --- 58G _2.fdx --- 80M _2.fnm --- 900 bytes _2.si --- 380 bytes

Re: 170G index, 1.5 billion documents, out of memory on query

2013-02-25 Thread Artem OXSEED
Hello, adding my 5 cents here as well: it seems that we experienced a similar problem that was supposed to be fixed, or not appear at all, on 64-bit systems. Our current solution is a custom build of Solr with DEFAULT_READ_CHUNK_SIZE set to 10MB in the FSDirectory class. This fix was done however not
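For context, the custom build described here amounts to patching one constant in Lucene's FSDirectory before rebuilding; a rough sketch of what that patched line might look like (the surrounding class is omitted, and the 10MB value is Artem's choice, not a recommendation):

  // org.apache.lucene.store.FSDirectory, patched in a custom build:
  // lower the read chunk size from the 64-bit default (Integer.MAX_VALUE) to 10 MB
  public static final int DEFAULT_READ_CHUNK_SIZE = 10 * 1024 * 1024;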

Re: 170G index, 1.5 billion documents, out of memory on query

2013-02-25 Thread Shawn Heisey
On 2/25/2013 4:06 AM, zqzuk wrote: Hi I am really frustrated by this problem. I have built an index of 1.5 billion data records, with a size of about 170GB. It's been optimised and has 12 separate files in the index directory, looking like below: _2.fdt --- 58G _2.fdx --- 80M _2.fnm ---

Re: 170G index, 1.5 billion documents, out of memory on query

2013-02-25 Thread zqzuk
Hi, thanks for your advice! I have deliberately allocated 32G to the JVM, with the command java -Xmx32000m -jar start.jar etc. I am using our server, which I think has a total of 48G. However, it still crashes with that error when I specify any keywords in my query. The only query that worked, as

Re: 170G index, 1.5 billion documents, out of memory on query

2013-02-25 Thread Timothy Potter
The other issue you need to worry about is long full GC pauses with -Xmx32000m. Maybe try reducing your JVM heap considerably (e.g. -Xmx8g) and switching to MMapDirectory - see: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html In solrconfig.xml, this would be:
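The solrconfig.xml fragment Tim refers to is cut off in this snippet; a minimal sketch of the directoryFactory element for Solr 4.x would be:

  <directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory"/>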

Re: 170G index, 1.5 billion documents, out of memory on query

2013-02-25 Thread Shawn Heisey
On 2/25/2013 11:05 AM, zqzuk wrote: I have deliberately allocated 32G to JVM, with the command java -Xmx32000m -jar start.jar etc. I am using our server which I think has a total of 48G. However it still crashes because of that error when I specify any keywords in my query. The only query that

Re: 170G index, 1.5 billion documents, out of memory on query

2013-02-25 Thread zqzuk
Thanks again for your kind input! I followed Tim's advice and tried to use MMapDirectory. Then I got an out-of-memory error on Solr startup (tried giving only 8G, then 4G, to the JVM). I guess this truly indicates that there isn't sufficient memory for such a huge index. On another thread I posted days before,

Re: 170G index, 1.5 billion documents, out of memory on query

2013-02-25 Thread Michael Della Bitta
Hello Zqzuk, It's true that this index is probably too big for a single shard, but make sure you heed Shawn's advice and use a 64-bit JVM in any case! Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271
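A quick way to verify the JVM bitness (assuming an Oracle or OpenJDK build, which names the VM in its version banner):

  java -version
  # a 64-bit JVM prints something like "64-Bit Server VM" on the last line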

Re: 170G index, 1.5 billion documents, out of memory on query

2013-02-25 Thread Timothy Potter
Do you have the stack trace for the OOM during startup when using MMapDirectory? That would be interesting to know. Cheers, Tim On Mon, Feb 25, 2013 at 1:15 PM, zqzuk ziqizh...@hotmail.co.uk wrote: Hi Michael, yes I have double-checked and am pretty sure it's 64-bit Java. Thanks