Re: Re: Config for massive inserts into Solr master

2016-10-12 Thread Reinhard Budenstecher
> That is not correct as of version 4.0.
>
> The only kind of update I've run into that cannot proceed at the same
> time as an optimize is a deleteByQuery operation.  If you do that, then
> it will block until the optimize is done, and I think it will also block
> any update you do after it.
>

As our UPDATEs, INSERTs and DELETEs always address documents by the ID field, I can always 
perform an OPTIMIZE!? That sounds great. Since the disk is twice the size of the index, there 
should be enough disk space left. So I can run a weekly OPTIMIZE on the MASTER via crontab 
without limitations (apart from deleteByQuery)?
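
For the crontab I would call the optimize through the update handler, roughly like this (the schedule, host and the maxSegments/waitSearcher parameters are just my assumptions; "myshop" is our big core):

# run every Sunday at 03:00, merge down to a single segment
0 3 * * 0  curl -s "http://localhost:8983/solr/myshop/update?optimize=true&maxSegments=1&waitSearcher=false" > /dev/null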

__
Sent with Maills.de - more than just free email www.maills.de




Re: Re: Config for massive inserts into Solr master

2016-10-11 Thread Reinhard Budenstecher
>
> That's considerably larger than you initially indicated.  In just one
> index, you've got almost 300 million docs taking up well over 200GB.
> About half of them have been deleted, but they are still there.  Those
> deleted docs *DO* affect operation and memory usage.
>
> Getting rid of deleted docs would go a long way towards reducing memory
> usage.  The only effective way to get rid of them is to optimize the
> index ... but I will warn you that with an index of that size, the time

It really seems to be a matter of size :) We've extended the server's RAM from 64GB 
to 128GB and raised the heap from 32GB to 64GB, and the ETL processes have now been 
running for 3 days without interruption. That does not satisfy me, but it keeps the 
business running for now.
Is my assumption correct that an OPTIMIZE of the index would block all inserts, so 
that all processes would have to pause when I start an OPTIMIZE that runs for hours? 
If so, this would also be no option for the moment.

__
Sent with Maills.de - more than just free email www.maills.de




Re: Re: Re: Config for massive inserts into Solr master

2016-10-10 Thread Reinhard Budenstecher

>
> Just a sanity check. That directory mentioned, what kind of file system is 
> that on? NFS, NAS, RAID?

I'm using ext4 with the options "noatime,nodiratime,barrier=0" on a hardware RAID10 
with 4 SSDs.
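
For reference, the corresponding /etc/fstab entry looks roughly like this (the device name is a placeholder for our RAID volume; the mount point is where /var/solr lives):

/dev/sdX1  /var/solr  ext4  noatime,nodiratime,barrier=0  0  2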


__
Sent with Maills.de - more than just free email www.maills.de




Re: Re: Config for massive inserts into Solr master

2016-10-10 Thread Reinhard Budenstecher

>
> What I have been hoping to see is the exact text of an OutOfMemoryError
> in solr.log so I can tell whether it's happening because of heap space
> or some other problem, like stack space.  The stacktrace on such an
> error might be helpfultoo.
>

Hi,

I did understand what you need; I'm a newbie to Solr and Java, but not to Linux. 
So please believe me, the only data that gets written on OOM is:

# solr_oom_killer-8983-2016-10-10_00_02_46.log
Running OOM killer script for process 116987 for Solr on port 8983
Killed process 116987


# solr.log
(no new entries)

# solr_gc_log_20161010_0810
2016-10-10T00:02:26.331+0200: 25715,434: [GC (CMS Initial Mark) [1 
CMS-initial-mark: 12114331K(16777216K)] 12163041K(30758272K), 0,0036996 secs] 
[Times: user=0,01 sys=0,00, real=0,01 secs]
2016-10-10T00:02:26.335+0200: 25715,438: Total time for which application 
threads were stopped: 0,1433940 seconds, Stopping threads took: 0,1393095 
seconds
2016-10-10T00:02:26.335+0200: 25715,438: [CMS-concurrent-mark-start]
2016-10-10T00:02:26.594+0200: 25715,697: Total time for which application 
threads were stopped: 0,1226207 seconds, Stopping threads took: 0,1223130 
seconds
2016-10-10T00:02:26.901+0200: 25716,003: Total time for which application 
threads were stopped: 0,3050990 seconds, Stopping threads took: 0,3047010 
seconds
2016-10-10T00:02:26.960+0200: 25716,063: Total time for which application 
threads were stopped: 0,0043570 seconds, Stopping threads took: 0,0040603 
seconds
2016-10-10T00:02:26.960+0200: 25716,063: Total time for which application 
threads were stopped: 0,0002308 seconds, Stopping threads took: 0,567 
seconds
2016-10-10T00:02:28.324+0200: 25717,426: Total time for which application 
threads were stopped: 0,0061071 seconds, Stopping threads took: 0,0059022 
seconds
2016-10-10T00:02:29.325+0200: 25718,428: Total time for which application 
threads were stopped: 0,0003410 seconds, Stopping threads took: 0,0001455 
seconds
2016-10-10T00:02:31.056+0200: 25720,158: [CMS-concurrent-mark: 4,280/4,721 
secs] [Times: user=21,41 sys=0,44, real=4,72 secs]
2016-10-10T00:02:31.056+0200: 25720,158: [CMS-concurrent-preclean-start]
2016-10-10T00:02:31.332+0200: 25720,434: Total time for which application 
threads were stopped: 0,0029760 seconds, Stopping threads took: 0,0027661 
seconds
2016-10-10T00:02:32.187+0200: 25721,290: [CMS-concurrent-preclean: 0,965/1,131 
secs] [Times: user=5,92 sys=0,09, real=1,13 secs]
2016-10-10T00:02:32.187+0200: 25721,290: 
[CMS-concurrent-abortable-preclean-start]
2016-10-10T00:02:32.546+0200: 25721,648: Total time for which application 
threads were stopped: 0,0006876 seconds, Stopping threads took: 0,0001229 
seconds
2016-10-10T00:02:32.964+0200: 25722,066: Total time for which application 
threads were stopped: 0,0020954 seconds, Stopping threads took: 0,0018945 
seconds
2016-10-10T00:02:33.794+0200: 25722,896: [CMS-concurrent-abortable-preclean: 
1,573/1,607 secs] [Times: user=9,04 sys=0,12, real=1,61 secs]
2016-10-10T00:02:33.926+0200: 25723,028: [GC (CMS Final Remark) [YG occupancy: 
4396269 K (13981056 K)]{Heap before GC invocations=478 (full 6):
 par new generation   total 13981056K, used 4396269K [0x7fa0d800, 
0x7fa4d800, 0x7fa4d800)
  eden space 11184896K,  39% used [0x7fa0d800, 0x7fa1e453b598, 
0x7fa382ac)
  from space 2796160K,   0% used [0x7fa42d56, 0x7fa42d56, 
0x7fa4d800)
  to   space 2796160K,   0% used [0x7fa382ac, 0x7fa382ac, 
0x7fa42d56)
 concurrent mark-sweep generation total 16777216K, used 15699719K 
[0x7fa4d800, 0x7fa8d800, 0x7fa8d800)
 Metaspace   used 36740K, capacity 37441K, committed 37684K, reserved 38912K
2016-10-10T00:02:33.926+0200: 25723,028: [GC (CMS Final Remark) 
2016-10-10T00:02:33.926+0200: 25723,029: [ParNew: 
4396269K->4396269K(13981056K), 0,212 secs] 20095988K->20095988K(30758272K), 0,0006292 secs] [Times: user=0,00 sys=0,00, real=0,00 secs]
Heap after GC invocations=479 (full 6):
 par new generation   total 13981056K, used 4396269K [0x7fa0d800, 
0x7fa4d800, 0x7fa4d800)
  eden space 11184896K,  39% used [0x7fa0d800, 0x7fa1e453b598, 
0x7fa382ac)
  from space 2796160K,   0% used [0x7fa42d56, 0x7fa42d56, 
0x7fa4d800)
  to   space 2796160K,   0% used [0x7fa382ac, 0x7fa382ac, 
0x7fa42d56)
 concurrent mark-sweep generation total 16777216K, used 15699719K 
[0x7fa4d800, 0x7fa8d800, 0x7fa8d800)
 Metaspace   used 36740K, capacity 37441K, committed 37684K, reserved 38912K
}
2016-10-10T00:02:33.926+0200: 25723,029: [Rescan (parallel) , 1,4594466 
secs]2016-10-10T00:02:35.386+0200: 25724,488: [weak refs processing, 0,0065564 
secs]2016-10-10T00:02:35.392+0200: 25724,495: [class unloading, 0,0089755 
secs]2016-10-10T00:02:35.401+0200: 25724,504: [scrub symbol table, 0,0044581 
secs]2016-10-10T00:02:35.406+0200: 25724,508: [scrub s

Re: Re: Config for massive inserts into Solr master

2016-10-10 Thread Alexandre Rafalovitch
Just a sanity check. That directory mentioned, what kind of file system is
that on? NFS, NAS, RAID?

Regards,
Alex

On 10 Oct 2016 1:09 AM, "Reinhard Budenstecher"  wrote:

> [quoted message clipped; the original appears in full below]

Re: Re: Config for massive inserts into Solr master

2016-10-09 Thread Reinhard Budenstecher
>
> That's considerably larger than you initially indicated.  In just one
> index, you've got almost 300 million docs taking up well over 200GB.
> About half of them have been deleted, but they are still there.  Those
> deleted docs *DO* affect operation and memory usage.
>

Yes, that's larger than I expected. Two days ago the index was at the size I wrote. 
This huge increase comes from the running ETL.

>
> usage.  The only effective way to get rid of them is to optimize the
> index ... but I will warn you that with an index of that size, the time
> required for an optimize can reach into multiple hours, and will
> temporarily require considerable additional disk space.  The fact that

Three days ago we upgraded from Solr 5.5.3 to 6.2.1. Before upgrading I had already 
optimized this index, and yes, it took some hours. So when two days of ETL cause such 
an increase in index size, running a daily optimize is not an option.

>
> You don't need to create it.  Stacktraces are logged by Solr, in a file
> named solr.log, whenever most errors occur.
>

Really, there is nothing in solr.log. I did not change any logging-related option in 
the config. Solr died again some hours ago and the last entry is:

2016-10-09 22:02:31.051 WARN  (qtp225493257-1097) [   ] 
o.a.s.h.a.LukeRequestHandler Error getting file length for [segments_9102]
java.nio.file.NoSuchFileException: /var/solr/data/myshop/data/index/segments_9102
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
    at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
    at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
    at java.nio.file.Files.readAttributes(Files.java:1737)
    at java.nio.file.Files.size(Files.java:2332)
    at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243)
    at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:128)
    at org.apache.solr.handler.admin.LukeRequestHandler.getFileLength(LukeRequestHandler.java:597)
    at org.apache.solr.handler.admin.LukeRequestHandler.getIndexInfo(LukeRequestHandler.java:585)
    at org.apache.solr.handler.admin.CoreAdminOperation.getCoreStatus(CoreAdminOperation.java:1007)
    at org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$3(CoreAdminOperation.java:170)
    at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:1056)
    at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:365)
    at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:156)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:154)
    at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:658)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:440)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.eclipse.jetty.server.Server.handle(Server.java:518)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)

Re: Re: Config for massive inserts into Solr master

2016-10-09 Thread Reinhard Budenstecher

> What version of Solr?  How has it been installed and started?
>
Solr 6.2.1 on Debian Jessie, installed with:

apt-get install openjdk-8-jre-headless openjdk-8-jdk-headless
wget "http://www.eu.apache.org/dist/lucene/solr/6.2.1/solr-6.2.1.tgz"; && tar 
xvfz solr-*.tgz
./solr-*/bin/install_solr_service.sh solr-*.tgz

started with "service solr start"

> Is this a single index core with 150 million docs and 140GB index
> directory size, or is that the sum total of all the indexes on the machine?
>

Actually, there are three cores, and the admin UI gives me the following info:

Num Docs: 148652589, Max Doc: 298367634, Size: 219.92 GB
Num Docs: 37396140, Max Doc: 38926989, Size: 28.81 GB
Num Docs: 8601222, Max Doc: 9111004, Size: 6.26 GB

but the last two cores are not important.
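(If I read the numbers correctly, the big core alone still holds 298367634 - 148652589 = 149715045 deleted documents that have not been merged away yet, which is about half of its maxDoc.)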

>
> It seems unlikely to me that you would see OOM errors when indexing with
> a 32GB heap and no queries.  You might try dropping the max heap to 31GB
> instead of 32GB, so your Java pointer sizes are cut in half.  You might
> actually see a net increase in the amount of memory that Solr can
> utilize with that change.

Actually, the Solr server dies nearly once a day; on the next shutdown, I'll reduce 
the heap size.
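
If it matters: the heap is set via SOLR_HEAP in /etc/default/solr.in.sh (the file created by install_solr_service.sh; the exact value is still open), so the change would simply be

SOLR_HEAP="31g"

followed by a "service solr restart".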

>
> Whether the errors continue or not, can you copy the full error from
> your log with stacktrace(s) so we can see it?


How do I create such a stack trace? I have no more log information than what I've 
already posted.

__
Sent with Maills.de - more than just free email www.maills.de