Re: Re: Config for massive inserts into Solr master
> That is not correct as of version 4.0.
>
> The only kind of update I've run into that cannot proceed at the same
> time as an optimize is a deleteByQuery operation. If you do that, then
> it will block until the optimize is done, and I think it will also block
> any update you do after it.

As our UPDATEs, INSERTs and DELETEs always contain the ID field, I can
always perform an OPTIMIZE? That sounds great. As the disk is double the
size of the index, there should be enough disk space left. So I can run a
weekly OPTIMIZE on the master via crontab without limitations (apart from
deleteByQuery)?

__
Sent with Maills.de - more than just freemail
www.maills.de
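A weekly optimize from cron could look like the sketch below. This is only
an illustration: the core name "myshop" is taken from the index path quoted
later in this thread, and host/port assume a default Solr install.

```shell
#!/bin/sh
# Sketch: build the optimize URL for the core discussed in this thread.
# "myshop" and localhost:8983 are assumptions, not confirmed settings.
CORE="myshop"
URL="http://localhost:8983/solr/${CORE}/update?optimize=true&waitSearcher=false"

# The crontab entry itself (Sundays at 03:00); wire it in with `crontab -e`:
printf '0 3 * * 0  curl -s "%s"\n' "$URL"
```

waitSearcher=false lets curl return before the merge finishes; per the
warning quoted above, a deleteByQuery issued while the optimize runs will
still block until it completes.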
Re: Re: Config for massive inserts into Solr master
> That's considerably larger than you initially indicated. In just one
> index, you've got almost 300 million docs taking up well over 200GB.
> About half of them have been deleted, but they are still there. Those
> deleted docs *DO* affect operation and memory usage.
>
> Getting rid of deleted docs would go a long way towards reducing memory
> usage. The only effective way to get rid of them is to optimize the
> index ... but I will warn you that with an index of that size, the time

It really seems to be a matter of size :) We've extended the servers' RAM
from 64GB to 128GB and raised heap space from 32GB to 64GB, and now the
ETL processes have been running for three days without interruption. That
does not satisfy me, but it's a solution that keeps the business running
for now.

Is my assumption correct that an OPTIMIZE of the index would block all
inserts, so that all processes would have to pause when I start an
hours-long OPTIMIZE? If so, this would also be no option for the moment.
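The "about half of them have been deleted" figure can be checked directly
from the Num Docs / Max Doc values posted earlier in this thread, since
maxDoc counts live plus deleted documents while numDocs counts only live
ones:

```shell
#!/bin/sh
# Core stats quoted elsewhere in this thread (the largest core):
NUM_DOCS=148652589   # live documents (numDocs)
MAX_DOC=298367634    # live + deleted documents (maxDoc)
DELETED=$((MAX_DOC - NUM_DOCS))
PCT=$((DELETED * 100 / MAX_DOC))
echo "deleted docs: $DELETED (${PCT}% of the index)"
```

That works out to roughly 150 million deleted documents, about half the
index, which is why an optimize frees so much disk space and memory here.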
Re: Re: Re: Config for massive inserts into Solr master
> Just a sanity check. That directory mentioned, what kind of file system
> is that on? NFS, NAS, RAID?

I'm using ext4 with the options "noatime,nodiratime,barrier=0" on a
hardware RAID10 with 4 SSD disks.
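For reference, the mount options described correspond to an /etc/fstab
entry like the sketch below; the device and mount point are placeholders,
not values taken from this thread.

```shell
#!/bin/sh
# Placeholder fstab entry; /dev/sda3 and /var/solr are assumptions.
# Note: on Linux, noatime already implies nodiratime, so listing both is
# redundant. barrier=0 disables write barriers and is only safe when the
# RAID controller has a battery- or flash-backed write cache.
FSTAB_LINE='/dev/sda3  /var/solr  ext4  noatime,nodiratime,barrier=0  0  2'
echo "$FSTAB_LINE"
```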
Re: Re: Config for massive inserts into Solr master
> What I have been hoping to see is the exact text of an OutOfMemoryError
> in solr.log so I can tell whether it's happening because of heap space
> or some other problem, like stack space. The stacktrace on such an
> error might be helpful too.

Hi, I did understand what you need. I'm a newbie to Solr and Java, but not
to Linux. So please believe me, the only data that gets written on OOM is:

# solr_oom_killer-8983-2016-10-10_00_02_46.log
Running OOM killer script for process 116987 for Solr on port 8983
Killed process 116987

# solr.log

# solr_gc_log_20161010_0810
2016-10-10T00:02:26.331+0200: 25715,434: [GC (CMS Initial Mark) [1 CMS-initial-mark: 12114331K(16777216K)] 12163041K(30758272K), 0,0036996 secs] [Times: user=0,01 sys=0,00, real=0,01 secs]
2016-10-10T00:02:26.335+0200: 25715,438: Total time for which application threads were stopped: 0,1433940 seconds, Stopping threads took: 0,1393095 seconds
2016-10-10T00:02:26.335+0200: 25715,438: [CMS-concurrent-mark-start]
2016-10-10T00:02:26.594+0200: 25715,697: Total time for which application threads were stopped: 0,1226207 seconds, Stopping threads took: 0,1223130 seconds
2016-10-10T00:02:26.901+0200: 25716,003: Total time for which application threads were stopped: 0,3050990 seconds, Stopping threads took: 0,3047010 seconds
2016-10-10T00:02:26.960+0200: 25716,063: Total time for which application threads were stopped: 0,0043570 seconds, Stopping threads took: 0,0040603 seconds
2016-10-10T00:02:26.960+0200: 25716,063: Total time for which application threads were stopped: 0,0002308 seconds, Stopping threads took: 0,567 seconds
2016-10-10T00:02:28.324+0200: 25717,426: Total time for which application threads were stopped: 0,0061071 seconds, Stopping threads took: 0,0059022 seconds
2016-10-10T00:02:29.325+0200: 25718,428: Total time for which application threads were stopped: 0,0003410 seconds, Stopping threads took: 0,0001455 seconds
2016-10-10T00:02:31.056+0200: 25720,158: [CMS-concurrent-mark: 4,280/4,721 secs] [Times: user=21,41 sys=0,44, real=4,72 secs]
2016-10-10T00:02:31.056+0200: 25720,158: [CMS-concurrent-preclean-start]
2016-10-10T00:02:31.332+0200: 25720,434: Total time for which application threads were stopped: 0,0029760 seconds, Stopping threads took: 0,0027661 seconds
2016-10-10T00:02:32.187+0200: 25721,290: [CMS-concurrent-preclean: 0,965/1,131 secs] [Times: user=5,92 sys=0,09, real=1,13 secs]
2016-10-10T00:02:32.187+0200: 25721,290: [CMS-concurrent-abortable-preclean-start]
2016-10-10T00:02:32.546+0200: 25721,648: Total time for which application threads were stopped: 0,0006876 seconds, Stopping threads took: 0,0001229 seconds
2016-10-10T00:02:32.964+0200: 25722,066: Total time for which application threads were stopped: 0,0020954 seconds, Stopping threads took: 0,0018945 seconds
2016-10-10T00:02:33.794+0200: 25722,896: [CMS-concurrent-abortable-preclean: 1,573/1,607 secs] [Times: user=9,04 sys=0,12, real=1,61 secs]
2016-10-10T00:02:33.926+0200: 25723,028: [GC (CMS Final Remark) [YG occupancy: 4396269 K (13981056 K)]
{Heap before GC invocations=478 (full 6):
 par new generation   total 13981056K, used 4396269K [0x7fa0d800, 0x7fa4d800, 0x7fa4d800)
  eden space 11184896K,  39% used [0x7fa0d800, 0x7fa1e453b598, 0x7fa382ac)
  from space 2796160K,   0% used [0x7fa42d56, 0x7fa42d56, 0x7fa4d800)
  to   space 2796160K,   0% used [0x7fa382ac, 0x7fa382ac, 0x7fa42d56)
 concurrent mark-sweep generation total 16777216K, used 15699719K [0x7fa4d800, 0x7fa8d800, 0x7fa8d800)
 Metaspace       used 36740K, capacity 37441K, committed 37684K, reserved 38912K
2016-10-10T00:02:33.926+0200: 25723,028: [GC (CMS Final Remark) 2016-10-10T00:02:33.926+0200: 25723,029: [ParNew: 4396269K->4396269K(13981056K), 0,212 secs] 20095988K->20095988K(30758272K), 0,0006292 secs] [Times: user=0,00 sys=0,00, real=0,00 secs]
Heap after GC invocations=479 (full 6):
 par new generation   total 13981056K, used 4396269K [0x7fa0d800, 0x7fa4d800, 0x7fa4d800)
  eden space 11184896K,  39% used [0x7fa0d800, 0x7fa1e453b598, 0x7fa382ac)
  from space 2796160K,   0% used [0x7fa42d56, 0x7fa42d56, 0x7fa4d800)
  to   space 2796160K,   0% used [0x7fa382ac, 0x7fa382ac, 0x7fa42d56)
 concurrent mark-sweep generation total 16777216K, used 15699719K [0x7fa4d800, 0x7fa8d800, 0x7fa8d800)
 Metaspace       used 36740K, capacity 37441K, committed 37684K, reserved 38912K
}
2016-10-10T00:02:33.926+0200: 25723,029: [Rescan (parallel) , 1,4594466 secs]
2016-10-10T00:02:35.386+0200: 25724,488: [weak refs processing, 0,0065564 secs]
2016-10-10T00:02:35.392+0200: 25724,495: [class unloading, 0,0089755 secs]
2016-10-10T00:02:35.401+0200: 25724,504: [scrub symbol table, 0,0044581 secs]
2016-10-10T00:02:35.406+0200: 25724,508: [scrub s
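Since the OOM killer script leaves so little behind, one option is to have
the JVM itself record the failure. This is a sketch of additions to the
solr.in.sh that the service installer places under /etc/default/ (the dump
path is an assumption; a heap dump from a 32GB heap will itself be tens of
GB):

```shell
#!/bin/sh
# Sketch: make the JVM dump the heap at the moment of the OutOfMemoryError,
# so it can be inspected later (e.g. with jhat or Eclipse MAT).
SOLR_OPTS="$SOLR_OPTS -XX:+HeapDumpOnOutOfMemoryError"
SOLR_OPTS="$SOLR_OPTS -XX:HeapDumpPath=/var/solr/logs"
echo "SOLR_OPTS=$SOLR_OPTS"
```

The .hprof file, plus the one-line error message the JVM prints when the
dump is written, would answer the heap-space-versus-stack-space question
asked above.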
Re: Re: Config for massive inserts into Solr master
Just a sanity check. That directory mentioned, what kind of file system is
that on? NFS, NAS, RAID?

Regards,
   Alex

On 10 Oct 2016 1:09 AM, "Reinhard Budenstecher" wrote:
> > That's considerably larger than you initially indicated. In just one
> > index, you've got almost 300 million docs taking up well over 200GB.
> > About half of them have been deleted, but they are still there. Those
> > deleted docs *DO* affect operation and memory usage.
>
> Yes, that's larger than I expected. Two days ago the index was at the
> size I've written. This huge increase happens because of the running ETL.
>
> > usage. The only effective way to get rid of them is to optimize the
> > index ... but I will warn you that with an index of that size, the time
> > required for an optimize can reach into multiple hours, and will
> > temporarily require considerable additional disk space. The fact that
>
> Three days ago we upgraded from Solr 5.5.3 to 6.2.1. Before upgrading I
> had already optimized this index and yes, it took some hours. So when two
> days of ETL cause such an increase in index size, running a daily
> optimize is not an option.
>
> > You don't need to create it. Stacktraces are logged by Solr, in a file
> > named solr.log, whenever most errors occur.
>
> Really, there is nothing in solr.log. I did not change any option
> related to this in the config.
Solr died again some hours ago and the last entry is:

2016-10-09 22:02:31.051 WARN  (qtp225493257-1097) [   ] o.a.s.h.a.LukeRequestHandler Error getting file length for [segments_9102]
java.nio.file.NoSuchFileException: /var/solr/data/myshop/data/index/segments_9102
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
        at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
        at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
        at java.nio.file.Files.readAttributes(Files.java:1737)
        at java.nio.file.Files.size(Files.java:2332)
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243)
        at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:128)
        at org.apache.solr.handler.admin.LukeRequestHandler.getFileLength(LukeRequestHandler.java:597)
        at org.apache.solr.handler.admin.LukeRequestHandler.getIndexInfo(LukeRequestHandler.java:585)
        at org.apache.solr.handler.admin.CoreAdminOperation.getCoreStatus(CoreAdminOperation.java:1007)
        at org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$3(CoreAdminOperation.java:170)
        at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:1056)
        at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:365)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:156)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:154)
        at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:658)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:440)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:518)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
        at org.ecl
Re: Re: Config for massive inserts into Solr master
> That's considerably larger than you initially indicated. In just one
> index, you've got almost 300 million docs taking up well over 200GB.
> About half of them have been deleted, but they are still there. Those
> deleted docs *DO* affect operation and memory usage.

Yes, that's larger than I expected. Two days ago the index was at the size
I've written. This huge increase happens because of the running ETL.

> usage. The only effective way to get rid of them is to optimize the
> index ... but I will warn you that with an index of that size, the time
> required for an optimize can reach into multiple hours, and will
> temporarily require considerable additional disk space. The fact that

Three days ago we upgraded from Solr 5.5.3 to 6.2.1. Before upgrading I
had already optimized this index and yes, it took some hours. So when two
days of ETL cause such an increase in index size, running a daily optimize
is not an option.

> You don't need to create it. Stacktraces are logged by Solr, in a file
> named solr.log, whenever most errors occur.

Really, there is nothing in solr.log. I did not change any option related
to this in the config.
Solr died again some hours ago and the last entry is:

2016-10-09 22:02:31.051 WARN  (qtp225493257-1097) [   ] o.a.s.h.a.LukeRequestHandler Error getting file length for [segments_9102]
java.nio.file.NoSuchFileException: /var/solr/data/myshop/data/index/segments_9102
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
        at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
        at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
        at java.nio.file.Files.readAttributes(Files.java:1737)
        at java.nio.file.Files.size(Files.java:2332)
        at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243)
        at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:128)
        at org.apache.solr.handler.admin.LukeRequestHandler.getFileLength(LukeRequestHandler.java:597)
        at org.apache.solr.handler.admin.LukeRequestHandler.getIndexInfo(LukeRequestHandler.java:585)
        at org.apache.solr.handler.admin.CoreAdminOperation.getCoreStatus(CoreAdminOperation.java:1007)
        at org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$3(CoreAdminOperation.java:170)
        at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:1056)
        at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:365)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:156)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:154)
        at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:658)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:440)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:518)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
Re: Re: Config for massive inserts into Solr master
> What version of Solr? How has it been installed and started?

Solr 6.2.1 on Debian Jessie, installed with:

apt-get install openjdk-8-jre-headless openjdk-8-jdk-headless
wget "http://www.eu.apache.org/dist/lucene/solr/6.2.1/solr-6.2.1.tgz"
tar xvfz solr-*.tgz
./solr-*/bin/install_solr_service.sh solr-*.tgz

and started with "service solr start".

> Is this a single index core with 150 million docs and 140GB index
> directory size, or is that the sum total of all the indexes on the
> machine?

Actually, there are three cores and the UI gives me the following info:

Num Docs: 148652589, Max Doc: 298367634, Size: 219.92 GB
Num Docs: 37396140, Max Doc: 38926989, Size: 28.81 GB
Num Docs: 8601222, Max Doc: 9111004, Size: 6.26 GB

but the last two cores are not important.

> It seems unlikely to me that you would see OOM errors when indexing with
> a 32GB heap and no queries. You might try dropping the max heap to 31GB
> instead of 32GB, so your Java pointer sizes are cut in half. You might
> actually see a net increase in the amount of memory that Solr can
> utilize with that change.

Actually, the Solr server dies nearly once a day. On the next shutdown,
I'll reduce the heap size.

> Whether the errors continue or not, can you copy the full error from
> your log with stacktrace(s) so we can see it?

How do I create such a stack trace? I have no more log information than
what I've already posted.
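The 31GB suggestion above is about compressed ordinary object pointers:
below roughly 32GB the JVM can use 32-bit object references, so a slightly
smaller heap can hold more objects. As a sketch, the change in the
installer's /etc/default/solr.in.sh would be:

```shell
#!/bin/sh
# Sketch: cap the heap just under the compressed-oops threshold.
# SOLR_HEAP is the variable the Solr 6 start script reads; the exact
# cutoff is JVM-dependent, so 31g leaves a safety margin.
SOLR_HEAP="31g"
echo "SOLR_HEAP=$SOLR_HEAP"
```

After restarting, whether compressed oops are actually in use can be
confirmed in the JVM arguments shown on the Solr admin dashboard.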