Re: How to Manage RAM Usage at Heavy Indexing

2013-09-09 Thread P Williams
Hi,

I've been seeing the same thing on CentOS: high physical memory use
with low JVM memory use.  I came to the conclusion that this is expected
behaviour.  Using top I noticed that my solr user's java process has
virtual memory allocated of about twice the size of the index, while actual
use is within the limits I set when Jetty starts.  I infer from this that
98% of physical memory is being used to cache the index.  Walter, Erick and
others are constantly reminding people on the list to have RAM the size of
the index available -- I think 98% physical memory use is exactly why.
Here is an excerpt from Uwe Schindler's well-written piece, which explains
this in greater detail:

*"Basically mmap does the same like handling the Lucene index as a swap
file. The mmap() syscall tells the O/S kernel to virtually map our whole
index files into the previously described virtual address space, and make
them look like RAM available to our Lucene process. We can then access our
index file on disk just like it would be a large byte[] array (in Java this
is encapsulated by a ByteBuffer interface to make it safe for use by Java
code). If we access this virtual address space from the Lucene code we
don’t need to do any syscalls, the processor’s MMU and TLB handles all the
mapping for us. If the data is only on disk, the MMU will cause an
interrupt and the O/S kernel will load the data into file system cache. If
it is already in cache, MMU/TLB map it directly to the physical memory in
file system cache. It is now just a native memory access, nothing more! We
don’t have to take care of paging in/out of buffers, all this is managed by
the O/S kernel. Furthermore, we have no concurrency issue, the only
overhead over a standard byte[] array is some wrapping caused by Java’s
ByteBuffer interface (it is still slower than a real byte[] array, but that
is the
only way to use mmap from Java and is much faster than all other directory
implementations shipped with Lucene). We also waste no physical memory, as
we operate directly on the O/S cache, avoiding all Java GC issues described
before."*
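As a rough illustration of the mechanism the excerpt describes, here is a minimal sketch that memory-maps a file and reads from it like a byte array. It uses Python's standard `mmap` module rather than the Java ByteBuffer path that Lucene takes, and the temporary file is only a hypothetical stand-in for an index file, so treat it as a demonstration of the OS facility and not of Lucene's code.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, offset, length):
    """Return `length` bytes starting at `offset`, read through a memory map."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing reads from the mapping. The kernel pages the data in
            # from disk, or serves it from its file system cache, on demand.
            return mapped[offset:offset + length]


# Hypothetical stand-in for an index file (not a real Lucene segment).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"header" + b"\x00" * 4096 + b"payload")
    demo_path = tmp.name

try:
    tail = read_via_mmap(demo_path, 6 + 4096, 7)
    print(tail)
finally:
    os.remove(demo_path)
```

The mapped region counts toward the process's virtual size in top even though the pages live in the OS cache, which is one reason virtual memory can look much larger than the heap.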
Is it odd that my index is ~16GB but top shows 30GB in virtual memory?
 Would the extra be for the field and filter caches I've increased in size?

I went through a few Java tuning steps relating to OutOfMemoryErrors when
using DataImportHandler with Solr.  The first issue is that
FileListEntityProcessor creates an entry for every file to be indexed and
holds it in heap before any indexing actually occurs.  When I started
pointing it at very large directories I started running out of heap.  One
work-around is to divide the job into smaller batches, but I was able to
allocate more memory so that everything fit.  The next issue was that, with
more memory allocated, the limiting factor became too many open files.
After allowing the solr user to open more files I got past this as well.
There was a sweet spot where indexing with just enough memory was slow
enough that I didn't hit the too-many-open-files error, but why go slow?
Now I'm able to index ~4M documents (newspaper articles and fulltext
monographs) in about 7 hours.
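The batching work-around mentioned above could look something like the sketch below. It is not DataImportHandler code: `iter_files` and `batched` are hypothetical helper names, and the sketch only shows the idea of walking a directory lazily and handing out fixed-size groups so that the full file list never has to sit in memory at once.

```python
import os
from itertools import islice


def iter_files(root):
    """Lazily yield file paths under `root` instead of building a full list."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def batched(paths, batch_size):
    """Group an iterable of paths into lists of at most `batch_size` items."""
    it = iter(paths)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Synthetic paths stand in for a real directory walk in this demo.
demo_paths = (f"doc-{i}.xml" for i in range(10))
batch_sizes = [len(batch) for batch in batched(demo_paths, 4)]
print(batch_sizes)
```

In real use the generator from `iter_files` would be passed to `batched`, and each batch would be submitted as its own import run; how a batch is submitted depends on the setup and is left out here.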

I hope someone will correct me if I'm wrong about anything I've said here
and especially if there is a better way to do things.

Best of luck,
Tricia



On Wed, Aug 28, 2013 at 12:12 PM, Dan Davis wrote:

> This could be an operating system problem rather than a Solr problem.
> CentOS 6.4 (Linux kernel 2.6.32) may have some issues with page flushing,
> and I would read up on that.
> The VM parameters can be tuned in /etc/sysctl.conf.

Re: How to Manage RAM Usage at Heavy Indexing

2013-09-09 Thread Shawn Heisey

On 9/9/2013 10:35 AM, P Williams wrote:

> Is it odd that my index is ~16GB but top shows 30GB in virtual memory?
> Would the extra be for the field and filter caches I've increased in size?


This should probably be a new thread, but it might have some 
applicability here, so I'm replying.


I have noticed some inconsistencies in memory reporting on Linux with 
regard to Solr.  Here's a screenshot of top on one of my production 
systems, sorted by memory:


https://www.dropbox.com/s/ylxm0qlcegithzc/prod-top-sort-mem.png

The virtual memory size for the top process is right in line with my 
index size, plus a few gig for the java heap.  Something to note as you 
ponder these numbers: My java heap is only 6GB.  Java has allocated the 
entire 6GB.  The other two java processes are homegrown Solr-related 
applications.


What's odd is the resident and shared memory sizes.  I have pretty much 
convinced myself that the shared memory size is misreported.  If you add 
up the numbers for cached and free, you get a total of 53659264 ... 
about 11GB shy of the 64GB total memory.


If the reported resident memory for the Solr java process (17GB) were 
accurate, this would exceed total physical memory by several gigabytes, 
and there would be swap in use, but as you can see, there is no swap in use.
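The arithmetic behind that argument can be checked with a short sketch. The figures are the ones quoted above; the unit of the 53659264 total is an assumption (top normally reports kilobytes), so the sketch tries both the binary and the decimal reading. `memory_gap` is a hypothetical helper name.

```python
def memory_gap(cached_plus_free_kb, total_gb, resident_gb, kb_per_gb):
    """Return (gap_gb, overshoot_gb) for the given unit convention.

    gap_gb is the memory not accounted for by cached + free;
    overshoot_gb is how far the reported resident size exceeds that gap.
    """
    accounted_gb = cached_plus_free_kb / kb_per_gb
    gap_gb = total_gb - accounted_gb
    return gap_gb, resident_gb - gap_gb


# Assumption: the summed counter is in kilobytes; both conventions are shown.
conventions = {"binary": 1024 * 1024, "decimal": 1000 * 1000}
results = {
    label: memory_gap(53659264, 64, 17, kb_per_gb)
    for label, kb_per_gb in conventions.items()
}
for label, (gap, overshoot) in results.items():
    print(f"{label}: gap {gap:.1f} GB, resident exceeds it by {overshoot:.1f} GB")
```

Under either reading the gap is in the 10 to 13 GB range and a 17GB resident size would not fit in it, which is the inconsistency being pointed out.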


Recently I overheard a conversation between Lucene committers in a 
Lucene IRC channel that seemed to be discussing this phenomenon.  There 
is apparently some issue with certain mmap modes that results in the 
operating system shared memory number going up even though no actual 
memory is being consumed.


Thanks,
Shawn



Re: How to Manage RAM Usage at Heavy Indexing

2013-09-09 Thread Furkan KAMACI
Is there anything written about that bug?


2013/8/28 Dan Davis 

> This could be an operating system problem rather than a Solr problem.
> CentOS 6.4 (Linux kernel 2.6.32) may have some issues with page flushing,
> and I would read up on that.
> The VM parameters can be tuned in /etc/sysctl.conf.

Re: How to Manage RAM Usage at Heavy Indexing

2013-08-28 Thread Dan Davis
This could be an operating system problem rather than a Solr problem.
CentOS 6.4 (Linux kernel 2.6.32) may have some issues with page flushing,
and I would read up on that.
The VM parameters can be tuned in /etc/sysctl.conf.
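For orientation, here is a minimal sketch that parses sysctl.conf-style text and lists the vm.* entries. The message above does not say which parameters are meant; `vm.dirty_background_ratio`, `vm.dirty_ratio` and `vm.swappiness` are real kernel settings related to page writeback and swapping, but picking them here is an assumption, and the values are illustrative only, not tuning advice. `parse_sysctl` is a hypothetical helper name.

```python
def parse_sysctl(text):
    """Parse sysctl.conf-style text into a dict of key -> value strings."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Illustrative values only, not recommendations.
SAMPLE = """
# Page-cache writeback settings
vm.dirty_background_ratio = 5
vm.dirty_ratio = 20
vm.swappiness = 10
"""

vm_settings = {k: v for k, v in parse_sysctl(SAMPLE).items() if k.startswith("vm.")}
for key in sorted(vm_settings):
    print(f"{key} = {vm_settings[key]}")
```

On a real system the same function could be pointed at the contents of /etc/sysctl.conf to see which VM settings have been overridden from their defaults.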



Re: How to Manage RAM Usage at Heavy Indexing

2013-08-25 Thread Furkan KAMACI
Hi Erick;

I wanted a quick answer, which is why I asked my question that way.

Error is as follows:

INFO  - 2013-08-21 22:01:30.978;
org.apache.solr.update.processor.LogUpdateProcessor; [collection1]
webapp=/solr path=/update params={wt=javabin&version=2}
{add=[com.deviantart.reachmehere:http/gallery/,
com.deviantart.reachstereo:http/,
com.deviantart.reachstereo:http/art/SE-mods-313298903,
com.deviantart.reachtheclouds:http/, com.deviantart.reachthegoddess:http/,
com.deviantart.reachthegoddess:http/art/retouched-160219962,
com.deviantart.reachthegoddess:http/badges/,
com.deviantart.reachthegoddess:http/favourites/,
com.deviantart.reachthetop:http/art/Blue-Jean-Baby-82204657 (1444006227844530177),
com.deviantart.reachurdreams:http/, ... (163 adds)]} 0 38790
ERROR - 2013-08-21 22:01:30.979; org.apache.solr.common.SolrException;
java.lang.RuntimeException: [was class org.eclipse.jetty.io.EofException] early EOF
    at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
    at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
    at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
    at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
    at org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:393)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:245)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1812)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:365)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:937)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:998)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:948)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.eclipse.jetty.io.EofException: early EOF
    at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:65)
    at java.io.InputStream.read(InputStream.java:101)
    at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:365)
    at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:110)
    at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
    at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
    at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
    at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:992)
    at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4628)
    at com.ctc.wstx.sr.BasicStreamReader.readCoalescedText(BasicStreamReader.java:4126)
    at com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3701)
    at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicS

Re: How to Manage RAM Usage at Heavy Indexing

2013-08-24 Thread Erick Erickson
This is sounding like an XY problem. What are you measuring
when you say RAM usage is 99%? Is this virtual memory? See:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

What errors are you seeing when you say "my node stops receiving
documents"?

How are you sending 10M documents? All at once in a huge packet
or some smaller number at a time? From where? How?

And what does Hadoop have to do with anything? Are you putting
the Solr index on Hadoop? How? The recent contrib?

In short, you haven't provided very many details. You've been around
long enough that I'm surprised you're saying "it doesn't work, how can
I fix it?" without providing much in the way of details to help us help
you.

Best
Erick



On Sat, Aug 24, 2013 at 1:52 PM, Furkan KAMACI wrote:

> I am running a test on my SolrCloud. I am trying to send 100 million
> documents via Hadoop to a node that has no replica. When the number of
> documents sent to that node reaches around 30 million, RAM usage on the
> machine becomes 99% (Solr heap usage is not 99%; it uses just 3GB - 4GB of
> RAM). Some time later my node stops receiving documents to index and the
> indexer job fails as well.
>
> How can I force the OS cache to be cleared (if it is the OS cache that is
> blocking me), or what else should I do (maybe send 10 million documents,
> wait a little, and so on)? What do others do in heavy indexing situations?
>


How to Manage RAM Usage at Heavy Indexing

2013-08-24 Thread Furkan KAMACI
I am running a test on my SolrCloud. I am trying to send 100 million
documents via Hadoop to a node that has no replica. When the number of
documents sent to that node reaches around 30 million, RAM usage on the
machine becomes 99% (Solr heap usage is not 99%; it uses just 3GB - 4GB of
RAM). Some time later my node stops receiving documents to index and the
indexer job fails as well.

How can I force the OS cache to be cleared (if it is the OS cache that is
blocking me), or what else should I do (maybe send 10 million documents,
wait a little, and so on)? What do others do in heavy indexing situations?