Re: background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-14 Thread Ralf Matulat

I checked the index and no problems were found.
So it is not obvious what is going wrong here, since the index itself
looks okay.
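
(For anyone who wants to reproduce the check from code instead of the CLI: a
minimal sketch, assuming Lucene 4.4's CheckIndex API; the index path is the
one from this thread.)

// Minimal sketch: programmatic equivalent of the CheckIndex CLI invocation
// shown further down. Assumes Lucene 4.4 on the classpath; the index path is
// the one from this thread.
import java.io.File;

import org.apache.lucene.index.CheckIndex;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class CheckIndexSketch {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/opt/solr/core-archiv/data/index/"));
        try {
            CheckIndex checker = new CheckIndex(dir);
            checker.setInfoStream(System.out);   // per-segment details, like the CLI output
            CheckIndex.Status status = checker.checkIndex();
            System.out.println(status.clean ? "Index is clean" : "Index has problems");
        } finally {
            dir.close();
        }
    }
}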


Next step, updating Java, is work in progress.
I will come back once I have ruled the Java version in or out as the cause
of the failing optimize.


The checkIndex output:

# java -cp lucene-core-4.4.0.jar -ea:org.apache.lucene... 
org.apache.lucene.index.CheckIndex /opt/solr/core-archiv/data/index/


Opening index @ /opt/solr/core-archiv/data/index/

Segments file=segments_44u numSegments=7 version=4.4 format= 
userData={commitTimeMSec=1389680580721}

  1 of 7: name=_dc8 docCount=9876366
codec=Lucene42
compound=false
numFiles=13
size (MB)=39,878.536
diagnostics = {os.arch=amd64, os=Linux, java.vendor=IBM 
Corporation, timestamp=1389390082688, mergeFactor=6, 
mergeMaxNumSegments=1, source=merge, lucene.version=4.4.0 1504776 - 
sarowe - 2013-07-19 02:53:42, java.version=1.6.0, 
os.version=2.6.27.19-5-default}

has deletions [delGen=50]
test: open reader.OK [2355 deleted docs]
test: fields..OK [18 fields]
test: field norms.OK [11 fields]
test: terms, freq, prox...OK [67539365 terms; 2886174773 terms/docs 
pairs; 3908230597 tokens]
test (ignoring deletes): terms, freq, prox...OK [67545837 terms; 
2886362782 terms/docs pairs; 3908385036 tokens]
test: stored fields...OK [129471633 total field count; avg 
13.112 fields per doc]
test: term vectors........OK [29360029 total vector count; avg
2.974 term/freq vector fields per doc]

test: docvalues...OK [0 total doc count; 0 docvalues fields]

  2 of 7: name=_e8u docCount=4250
codec=Lucene42
compound=false
numFiles=13
size (MB)=10.137
diagnostics = {os.arch=amd64, os=Linux, java.vendor=IBM 
Corporation, timestamp=1389469381574, mergeFactor=3, 
mergeMaxNumSegments=-1, source=merge, lucene.version=4.4.0 1504776 - 
sarowe - 2013-07-19 02:53:42, java.version=1.6.0, 
os.version=2.6.27.19-5-default}

has deletions [delGen=1]
test: open reader.OK [7 deleted docs]
test: fields..OK [18 fields]
test: field norms.OK [11 fields]
test: terms, freq, prox...OK [106619 terms; 736990 terms/docs 
pairs; 899712 tokens]
test (ignoring deletes): terms, freq, prox...OK [106849 terms; 
739599 terms/docs pairs; 903504 tokens]
test: stored fields...OK [56355 total field count; avg 13.282 
fields per doc]
test: term vectors........OK [11933 total vector count; avg 2.812
term/freq vector fields per doc]

test: docvalues...OK [0 total doc count; 0 docvalues fields]

  3 of 7: name=_fv0 docCount=4603
codec=Lucene42
compound=false
numFiles=12
size (MB)=15.763
diagnostics = {os.arch=amd64, os=Linux, java.vendor=IBM 
Corporation, timestamp=1389592952262, mergeFactor=3, 
mergeMaxNumSegments=1, source=merge, lucene.version=4.4.0 1504776 - 
sarowe - 2013-07-19 02:53:42, java.version=1.6.0, 
os.version=2.6.27.19-5-default}

no deletions
test: open reader.OK
test: fields..OK [18 fields]
test: field norms.OK [11 fields]
test: terms, freq, prox...OK [142338 terms; 1095467 terms/docs 
pairs; 1467387 tokens]
test: stored fields...OK [61486 total field count; avg 13.358 
fields per doc]
test: term vectors........OK [13686 total vector count; avg 2.973
term/freq vector fields per doc]

test: docvalues...OK [0 total doc count; 0 docvalues fields]

  4 of 7: name=_g6z docCount=2853
codec=Lucene42
compound=false
numFiles=12
size (MB)=8.187
diagnostics = {os.arch=amd64, os=Linux, java.vendor=IBM 
Corporation, timestamp=1389610041349, mergeFactor=3, 
mergeMaxNumSegments=1, source=merge, lucene.version=4.4.0 1504776 - 
sarowe - 2013-07-19 02:53:42, java.version=1.6.0, 
os.version=2.6.27.19-5-default}

no deletions
test: open reader.OK
test: fields..OK [18 fields]
test: field norms.OK [11 fields]
test: terms, freq, prox...OK [97344 terms; 549425 terms/docs pairs; 
705964 tokens]
test: stored fields...OK [38013 total field count; avg 13.324 
fields per doc]
test: term vectors........OK [8059 total vector count; avg 2.825
term/freq vector fields per doc]

test: docvalues...OK [0 total doc count; 0 docvalues fields]

  5 of 7: name=_gzx docCount=3647
codec=Lucene42
compound=false
numFiles=12
size (MB)=13.878
diagnostics = {os.arch=amd64, os=Linux, java.vendor=IBM 
Corporation, timestamp=1389640939123, mergeFactor=3, 
mergeMaxNumSegments=-1, source=merge, lucene.version=4.4.0 1504776 - 
sarowe - 2013-07-19 02:53:42, java.version=1.6.0, 
os.version=2.6.27.19-5-default}

no deletions
test: open reader.OK
test: fields..OK [18 fields]
test: field norms.OK [11 fields]
test: terms, freq, prox...OK [133598 terms; 923902 terms/docs 
pairs; 1243419 

Re: background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-14 Thread Ralf Matulat

It just keeps getting spookier.

The optimize run this night was successful.
Yesterday I did two things:

1. Checked the index; no problems were found.
2. Ran a commit with expungeDeletes on the mentioned index (see the sketch below).
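
(For reference, a commit with expungeDeletes can be issued like this; a
minimal SolrJ sketch, not our exact client code, and the core URL is
illustrative. Plain HTTP with commit=true&expungeDeletes=true works just as
well.)

// Hedged sketch of a commit with expungeDeletes=true via SolrJ 4.4. The core
// URL is illustrative, not the real deployment.
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.UpdateRequest;

public class ExpungeDeletesSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr/core-archiv");
        try {
            UpdateRequest req = new UpdateRequest();
            // commit, waiting for flush and a new searcher
            req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
            // ask the commit to merge away the segments' deleted documents
            req.setParam("expungeDeletes", "true");
            req.process(server);
        } finally {
            server.shutdown();
        }
    }
}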

So I have no idea what is going on here.
Btw: the Java version is still the old 1.6.0 from 2008.

Best regards
Ralf

On 13.01.14 21:15, Michael McCandless wrote:

I have trouble understanding J9's version strings ... but, is it
really from 2008?  You could be hitting a JVM bug; can you test
upgrading?

I don't have much experience with Solr faceting on optimized vs
unoptimized indices; maybe someone else can answer your question.

Lucene's facet module (not yet exposed through Solr) performance
shouldn't change much for optimized vs unoptimized indices.

Mike McCandless

http://blog.mikemccandless.com


On Mon, Jan 13, 2014 at 10:09 AM, Ralf Matulat
ralf.matu...@bundestag.de wrote:

java -version

java version 1.6.0
Java(TM) SE Runtime Environment (build
pxa6460sr3ifix-20090218_02(SR3+IZ43791+IZ43798))
IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 Linux amd64-64
jvmxa6460-20081105_25433 (JIT enabled, AOT enabled)
J9VM - 20081105_025433_LHdSMr
JIT  - r9_20081031_1330
GC   - 20081027_AB)
JCL  - 20090218_01

A question regarding optimizing the index:
As of SOLR 3.X we saw massive performance improvements with faceted queries
after optimizing an index, so at some point we started optimizing the indexes
on a daily basis.
With SOLR 4.X and the new index format, is that no longer the case?

Btw: the CheckIndex run failed with 'java.io.FileNotFoundException:', I guess
because I did not stop Tomcat while checking, so SOLR created, merged
and deleted some segments during the check. I will rerun the check after
stopping SOLR.

Kind regards
Ralf Matulat




Which version of Java are you using?

That root cause exception is somewhat spooky: it's in the
ByteBufferIndexInput code that handles a BufferUnderflowException, i.e. when a
small (maybe a few hundred bytes) read happens to span the 1 GB page
boundary; the exception happens specifically on the final read
(curBuf.get(b, offset, len)).  Such page-spanning reads are very rare.

The code looks fine to me though, and it's hard to explain how an NPE
(b = null) could happen: that byte array is allocated in the
Lucene41PostingsReader.BlockDocsAndPositionsEnum class's ctor: encoded
= new byte[MAX_ENCODED_SIZE].

Separately, you really should not have to optimize daily, if ever.

Mike McCandless

http://blog.mikemccandless.com






--
Ralf Matulat
Deutscher Bundestag
Platz der Republik 1
11011 Berlin
Referat IT 1 - Anwendungsadministration
ralf.matu...@bundestag.de
Tel.: 030 - 227 34260



Re: background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-13 Thread Michael McCandless
Which version of Java are you using?

That root cause exception is somewhat spooky: it's in the
ByteBufferIndexInput code that handles a BufferUnderflowException, i.e. when a
small (maybe a few hundred bytes) read happens to span the 1 GB page
boundary; the exception happens specifically on the final read
(curBuf.get(b, offset, len)).  Such page-spanning reads are very rare.

The code looks fine to me though, and it's hard to explain how an NPE
(b = null) could happen: that byte array is allocated in the
Lucene41PostingsReader.BlockDocsAndPositionsEnum class's ctor: encoded
= new byte[MAX_ENCODED_SIZE].
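
(To make the code path concrete: the buffer-backed input first tries the whole
read against the current mapped page and only falls back to a multi-buffer
path when that read underflows. A simplified sketch of the idea, not Lucene's
actual ByteBufferIndexInput:)

// Simplified sketch (not Lucene's ByteBufferIndexInput) of a read that may
// span two mapped ~1 GB pages: try the current buffer first and fall back to
// copying across the page boundary when it underflows.
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

class PagedReader {
    private final ByteBuffer[] buffers; // one mapped page each
    private int curIndex;
    private ByteBuffer curBuf;

    PagedReader(ByteBuffer[] buffers) {
        this.buffers = buffers;
        this.curBuf = buffers[0];
    }

    void readBytes(byte[] b, int offset, int len) {
        try {
            curBuf.get(b, offset, len);            // common case: fits in the current page
        } catch (BufferUnderflowException e) {     // rare case: read spans a page boundary
            while (len > 0) {
                int chunk = Math.min(curBuf.remaining(), len);
                curBuf.get(b, offset, chunk);      // an NPE here would mean b == null
                offset += chunk;
                len -= chunk;
                if (len > 0) {
                    curBuf = buffers[++curIndex];  // advance to the next page
                    curBuf.position(0);
                }
            }
        }
    }
}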

Separately, you really should not have to optimize daily, if ever.

Mike McCandless

http://blog.mikemccandless.com


On Mon, Jan 13, 2014 at 2:06 AM, Ralf Matulat ralf.matu...@bundestag.de wrote:
 Hi,
 I am currently running into merge issues while optimizing an index.

 To give you some information:

 We are running 4 SOLR servers with identical OS, VM hardware, RAM, etc.
 Only one server is having issues so far; the others are fine.

 We are running SOLR 4.4.0 with Tomcat 6.0.
 It had been running since October without any problems.
 The problems first occurred after a minor change to synonyms.txt, but
 I guess that was just a coincidence.

 We added `ulimit -v unlimited` to our tomcat init-script years ago.

 We have 4 cores running on each SOLR server; the configuration and index
 sizes of all 4 servers are identical (we distribute configs via git).

 We did a rebuild of the index twice: First time without removing the old
 index files, second time deleting the data dir and starting from scratch.

 We are working with DIH, getting data from a MySQL DB.
 After an initial complete index run, the optimize works. The optimize
 fails one or two days later.

 We do one optimize run a day; the index contains ~10 million
 documents, and the index size on disk is ~39 GB, with 127 GB of free disk
 space.

 We have a mergeFactor of 3.

 The solr.log says:

 ERROR - 2014-01-12 22:47:11.062; org.apache.solr.common.SolrException;
 java.io.IOException: background merge hit exception: _dc8(4.4):C9876366/1327
 _e8u(4.4):C4250/7 _f4a(4.4):C1553/13 _fj6(4.4
 ):C1903/15 _ep3(4.4):C1217/42 _fle(4.4):C256/7 _flf(4.4):C11 into _flg
 [maxNumSegments=1]
 at
 org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1714)
 at
 org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1650)
 at
 org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:530)
 at
 org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
 at
 org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
 at
 org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1235)
 at
 org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1219)
 at
 org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
 at
 org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
 at
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
 at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
 at
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
 at
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
 at
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
 at
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:735)
 Caused by: java.lang.NullPointerException
 at java.nio.ByteBuffer.get(ByteBuffer.java:661)
  

Re: background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-13 Thread Ralf Matulat

 java -version
java version 1.6.0
Java(TM) SE Runtime Environment (build 
pxa6460sr3ifix-20090218_02(SR3+IZ43791+IZ43798))
IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 Linux amd64-64 
jvmxa6460-20081105_25433 (JIT enabled, AOT enabled)

J9VM - 20081105_025433_LHdSMr
JIT  - r9_20081031_1330
GC   - 20081027_AB)
JCL  - 20090218_01

A question regarding optimizing the index:
As of SOLR 3.X we saw massive performance improvements with faceted queries
after optimizing an index, so at some point we started optimizing the indexes
on a daily basis.

With SOLR 4.X and the new index format, is that no longer the case?

Btw: the CheckIndex run failed with 'java.io.FileNotFoundException:', I
guess because I did not stop Tomcat while checking, so SOLR created, merged
and deleted some segments during the check. I will rerun the check after
stopping SOLR.


Kind regards
Ralf Matulat



Which version of Java are you using?

That root cause exception is somewhat spooky: it's in the
ByteBufferIndexInput code that handles a BufferUnderflowException, i.e. when a
small (maybe a few hundred bytes) read happens to span the 1 GB page
boundary; the exception happens specifically on the final read
(curBuf.get(b, offset, len)).  Such page-spanning reads are very rare.

The code looks fine to me though, and it's hard to explain how an NPE
(b = null) could happen: that byte array is allocated in the
Lucene41PostingsReader.BlockDocsAndPositionsEnum class's ctor: encoded
= new byte[MAX_ENCODED_SIZE].

Separately, you really should not have to optimize daily, if ever.

Mike McCandless

http://blog.mikemccandless.com





Re: background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-13 Thread Michael McCandless
I have trouble understanding J9's version strings ... but, is it
really from 2008?  You could be hitting a JVM bug; can you test
upgrading?

I don't have much experience with Solr faceting on optimized vs
unoptimized indices; maybe someone else can answer your question.

Lucene's facet module (not yet exposed through Solr) performance
shouldn't change much for optimized vs unoptimized indices.

Mike McCandless

http://blog.mikemccandless.com


On Mon, Jan 13, 2014 at 10:09 AM, Ralf Matulat
ralf.matu...@bundestag.de wrote:
 java -version
 java version 1.6.0
 Java(TM) SE Runtime Environment (build
 pxa6460sr3ifix-20090218_02(SR3+IZ43791+IZ43798))
 IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 Linux amd64-64
 jvmxa6460-20081105_25433 (JIT enabled, AOT enabled)
 J9VM - 20081105_025433_LHdSMr
 JIT  - r9_20081031_1330
 GC   - 20081027_AB)
 JCL  - 20090218_01

 A question regarding optimizing the index:
 As of SOLR 3.X we saw massive performance improvements with faceted queries
 after optimizing an index, so at some point we started optimizing the indexes
 on a daily basis.
 With SOLR 4.X and the new index format, is that no longer the case?

 Btw: the CheckIndex run failed with 'java.io.FileNotFoundException:', I guess
 because I did not stop Tomcat while checking, so SOLR created, merged
 and deleted some segments during the check. I will rerun the check after
 stopping SOLR.

 Kind regards
 Ralf Matulat



 Which version of Java are you using?

 That root cause exception is somewhat spooky: it's in the
 ByteBufferIndexInput code that handles a BufferUnderflowException, i.e. when a
 small (maybe a few hundred bytes) read happens to span the 1 GB page
 boundary; the exception happens specifically on the final read
 (curBuf.get(b, offset, len)).  Such page-spanning reads are very rare.

 The code looks fine to me though, and it's hard to explain how an NPE
 (b = null) could happen: that byte array is allocated in the
 Lucene41PostingsReader.BlockDocsAndPositionsEnum class's ctor: encoded
 = new byte[MAX_ENCODED_SIZE].

 Separately, you really should not have to optimize daily, if ever.

 Mike McCandless

 http://blog.mikemccandless.com






Re: background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-13 Thread Ralf Matulat

Upgrading the Java environment sounds like the obvious next step, so we will
go on with that.

We are updating our index almost every second, so over time it accumulates
a lot of segment files (up to 32 in our case). So optimizing was always a
good idea in the SOLR 3.X world.
We noticed that in SOLR 4.4.0 performance does not suffer that badly if the
index is not optimized for a period of time. But we would like to avoid
accumulating too many segments at once.
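
(For context on those knobs: in SOLR 4.x the default merge policy is
TieredMergePolicy, and mergeFactor roughly maps onto its
maxMergeAtOnce/segmentsPerTier settings, normally configured in
solrconfig.xml. A hedged Lucene-level sketch, values illustrative only:)

// Hedged sketch of the Lucene-level merge settings behind Solr's mergeFactor.
// In Solr these live in solrconfig.xml; shown via IndexWriterConfig purely for
// illustration, with values matching this thread (mergeFactor=3).
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.util.Version;

public class MergeTuningSketch {
    public static IndexWriterConfig mergeConfig() {
        TieredMergePolicy mp = new TieredMergePolicy();
        mp.setMaxMergeAtOnce(3);            // how many segments merge in one go
        mp.setSegmentsPerTier(3.0);         // target segment count per tier before merging
        mp.setMaxMergedSegmentMB(5 * 1024); // cap merged segment size (illustrative value)
        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_44,
                new StandardAnalyzer(Version.LUCENE_44));
        iwc.setMergePolicy(mp);
        return iwc;
    }
}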


Today I will do one more CheckIndex run against the now shut-down SOLR
instance.

Maybe there will be some enlightenment then.

Best regards
Ralf Matulat

On 13.01.14 21:15, Michael McCandless wrote:

I have trouble understanding J9's version strings ... but, is it
really from 2008?  You could be hitting a JVM bug; can you test
upgrading?

I don't have much experience with Solr faceting on optimized vs
unoptimized indices; maybe someone else can answer your question.

Lucene's facet module (not yet exposed through Solr) performance
shouldn't change much for optimized vs unoptimized indices.

Mike McCandless

http://blog.mikemccandless.com





background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-12 Thread Ralf Matulat

Hi,
I am currently running into merge issues while optimizing an index.

To give you some information:

We are running 4 SOLR servers with identical OS, VM hardware, RAM, etc.
Only one server is having issues so far; the others are fine.

We are running SOLR 4.4.0 with Tomcat 6.0.
It had been running since October without any problems.
The problems first occurred after a minor change to synonyms.txt, but I
guess that was just a coincidence.


We added `ulimit -v unlimited` to our tomcat init-script years ago.

We have 4 cores running on each SOLR server; the configuration and index
sizes of all 4 servers are identical (we distribute configs via git).


We did a rebuild of the index twice: First time without removing the old 
index files, second time deleting the data dir and starting from scratch.


We are working with DIH, getting data from a MySQL DB.
After an initial complete index run, the optimize works. The optimize
fails one or two days later.


We do one optimize run a day; the index contains ~10 million documents,
and the index size on disk is ~39 GB, with 127 GB of free disk space.


We have a mergeFactor of 3.
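
(For context, an optimize in SOLR is an update request with optimize=true,
which ends up calling Lucene's IndexWriter.forceMerge(1), the call at the top
of the stack trace below. A minimal Lucene-level sketch of that call; the
analyzer choice is illustrative only:)

// Hedged sketch of the Lucene call behind Solr's optimize: forceMerge(1)
// rewrites the index down to a single segment, which is the merge that fails
// in the stack trace below. The analyzer choice is illustrative only.
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class OptimizeSketch {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/opt/solr/core-archiv/data/index/"));
        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_44,
                new StandardAnalyzer(Version.LUCENE_44));
        IndexWriter writer = new IndexWriter(dir, iwc);
        try {
            writer.forceMerge(1);   // "optimize": merge everything down to one segment
            writer.commit();
        } finally {
            writer.close();
            dir.close();
        }
    }
}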

The solr.log says:

ERROR - 2014-01-12 22:47:11.062; org.apache.solr.common.SolrException; 
java.io.IOException: background merge hit exception: 
_dc8(4.4):C9876366/1327 _e8u(4.4):C4250/7 _f4a(4.4):C1553/13 _fj6(4.4
):C1903/15 _ep3(4.4):C1217/42 _fle(4.4):C256/7 _flf(4.4):C11 into _flg 
[maxNumSegments=1]
at 
org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1714)
at 
org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1650)
at 
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:530)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1235)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1219)
at 
org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
at 
org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)

at java.lang.Thread.run(Thread.java:735)
Caused by: java.lang.NullPointerException
at java.nio.ByteBuffer.get(ByteBuffer.java:661)
at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:245)
at 
org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:107)
at 
org.apache.lucene.codecs.lucene41.ForUtil.readBlock(ForUtil.java:197)
at 
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.refillDocs(Lucene41PostingsReader.java:748)
at 
org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextDoc(Lucene41PostingsReader.java:813)
at 
org.apache.lucene.codecs.MappingMultiDocsAndPositionsEnum.nextDoc(MappingMultiDocsAndPositionsEnum.java:104)
at 
org.apache.lucene.codecs.PostingsConsumer.merge(PostingsConsumer.java:101)
at 
org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:164)
at