Re: How can I get started for investigating the source code of Lucene ?
Hi Jeff, You can buy a book about Lucene, like Lucene in Action. lizhi

On 2010-11-1 13:43, Jeff Zhang wrote:
Hi all, I'd like to study the source code of Lucene, but I found there are not many documents about the internal structure of Lucene, and the classes are so big that they are not very readable. Could anyone give me a suggestion on how to get started investigating the source code of Lucene? Any document or blog post would be good. Thanks
[jira] Commented: (LUCENE-2729) Index corruption after 'read past EOF' under heavy update load and snapshot export
[ https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926858#action_12926858 ]

Nico Krijnen commented on LUCENE-2729:

{code}
jteb:assetIndex jteb$ ls -la
total 41550832
drwxr-xr-x  2 jteb jteb       4862  1 nov 08:52 .
drwxr-xr-x  4 jteb jteb        238 29 okt 14:10 ..
-rw-r--r--@ 1 jteb jteb      21508  1 nov 08:52 .DS_Store
-rw-r--r--  1 jteb jteb  969134416 18 okt 16:41 _2q.fdt
-rw-r--r--  1 jteb jteb      36652 18 okt 16:41 _2q.fdx
-rw-r--r--  1 jteb jteb        276 18 okt 16:41 _2q.fnm
-rw-r--r--  1 jteb jteb    4685726 18 okt 16:41 _2q.frq
-rw-r--r--  1 jteb jteb       9166 18 okt 16:41 _2q.nrm
-rw-r--r--  1 jteb jteb  393230403 18 okt 16:42 _2q.prx
-rw-r--r--  1 jteb jteb       7447 18 okt 16:42 _2q.tii
-rw-r--r--  1 jteb jteb     746299 18 okt 16:42 _2q.tis
-rw-r--r--  1 jteb jteb       8394 18 okt 16:42 _2q.tvd
-rw-r--r--  1 jteb jteb  599185081 18 okt 16:42 _2q.tvf
-rw-r--r--  1 jteb jteb      73300 18 okt 16:42 _2q.tvx
-rw-r--r--  1 jteb jteb 1595882722 18 okt 16:45 _3u.fdt
-rw-r--r--  1 jteb jteb      63692 18 okt 16:45 _3u.fdx
-rw-r--r--  1 jteb jteb        330 18 okt 16:45 _3u.fnm
-rw-r--r--  1 jteb jteb    8001869 18 okt 16:45 _3u.frq
-rw-r--r--  1 jteb jteb      15926 18 okt 16:45 _3u.nrm
-rw-r--r--  1 jteb jteb  647374863 18 okt 16:45 _3u.prx
-rw-r--r--  1 jteb jteb      11319 18 okt 16:45 _3u.tii
-rw-r--r--  1 jteb jteb    1168399 18 okt 16:45 _3u.tis
-rw-r--r--  1 jteb jteb      14209 18 okt 16:45 _3u.tvd
-rw-r--r--  1 jteb jteb  986370136 18 okt 16:46 _3u.tvf
-rw-r--r--  1 jteb jteb     127380 18 okt 16:46 _3u.tvx
-rw-r--r--  1 jteb jteb 2691565961 18 okt 16:49 _4c.fdt
-rw-r--r--  1 jteb jteb      39572 18 okt 16:49 _4c.fdx
-rw-r--r--  1 jteb jteb        276 18 okt 16:49 _4c.fnm
-rw-r--r--  1 jteb jteb   18724620 18 okt 16:49 _4c.frq
-rw-r--r--  1 jteb jteb       9896 18 okt 16:49 _4c.nrm
-rw-r--r--  1 jteb jteb  590255960 18 okt 16:50 _4c.prx
-rw-r--r--  1 jteb jteb     141243 18 okt 16:50 _4c.tii
-rw-r--r--  1 jteb jteb   12185869 18 okt 16:50 _4c.tis
-rw-r--r--  1 jteb jteb       9894 18 okt 16:50 _4c.tvd
-rw-r--r--  1 jteb jteb  932649779 18 okt 16:51 _4c.tvf
-rw-r--r--  1 jteb jteb      79140 18 okt 16:51 _4c.tvx
-rw-r--r--  1 jteb jteb 2398908136 18 okt 16:52 _4d.fdt
-rw-r--r--  1 jteb jteb        548 18 okt 16:52 _4d.fdx
-rw-r--r--  1 jteb jteb        354 18 okt 16:52 _4d.fnm
-rw-r--r--  1 jteb jteb   24581614 18 okt 16:52 _4d.frq
-rw-r--r--  1 jteb jteb        140 18 okt 16:52 _4d.nrm
-rw-r--r--  1 jteb jteb  158243133 18 okt 16:52 _4d.prx
-rw-r--r--  1 jteb jteb     141948 18 okt 16:52 _4d.tii
-rw-r--r--  1 jteb jteb   12259425 18 okt 16:52 _4d.tis
-rw-r--r--  1 jteb jteb        140 18 okt 16:52 _4d.tvd
-rw-r--r--  1 jteb jteb  303769970 18 okt 16:53 _4d.tvf
-rw-r--r--  1 jteb jteb       1092 18 okt 16:53 _4d.tvx
-rw-r--r--  1 jteb jteb 4118409126 29 okt 16:26 _6g.fdt
-rw-r--r--  1 jteb jteb       1484 29 okt 16:26 _6g.fdx
-rw-r--r--  1 jteb jteb        384 29 okt 16:17 _6g.fnm
-rw-r--r--  1 jteb jteb   35294399 29 okt 16:27 _6g.frq
-rw-r--r--  1 jteb jteb        374 29 okt 16:27 _6g.nrm
-rw-r--r--  1 jteb jteb  230791431 29 okt 16:27 _6g.prx
-rw-r--r--  1 jteb jteb     143860 29 okt 16:27 _6g.tii
-rw-r--r--  1 jteb jteb   12491845 29 okt 16:27 _6g.tis
-rw-r--r--  1 jteb jteb        295 29 okt 16:28 _6g.tvd
-rw-r--r--  1 jteb jteb  444939185 29 okt 16:28 _6g.tvf
-rw-r--r--  1 jteb jteb       2964 29 okt 16:28 _6g.tvx
-rw-r--r--  1 jteb jteb 2758122671 29 okt 16:31 _6h.fdt
-rw-r--r--  1 jteb jteb      96388 29 okt 16:31 _6h.fdx
-rw-r--r--  1 jteb jteb        723 29 okt 16:29 _6h.fnm
-rw-r--r--  1 jteb jteb   51142700 29 okt 16:31 _6h.frq
-rw-r--r--  1 jteb jteb      24100 29 okt 16:31 _6h.nrm
-rw-r--r--  1 jteb jteb  189178767 29 okt 16:31 _6h.prx
-rw-r--r--  1 jteb jteb     270472 29 okt 16:31 _6h.tii
-rw-r--r--  1 jteb jteb   21710405 29 okt 16:31 _6h.tis
-rw-r--r--  1 jteb jteb      23873 29 okt 16:31 _6h.tvd
-rw-r--r--  1 jteb jteb  394088075 29 okt 16:31 _6h.tvf
-rw-r--r--  1 jteb jteb     192772 29 okt 16:31 _6h.tvx
-rw-r--r--  1 jteb jteb          0 29 okt 20:22 _8b.fnm
-rw-r--r--  1 jteb jteb          0 29 okt 20:26 _8b.tvd
-rw-r--r--  1 jteb jteb          0 29 okt 20:26 _8b.tvf
-rw-r--r--  1 jteb jteb          0 29 okt 20:22 _8c.fdt
-rw-r--r--  1 jteb jteb          0 29 okt 20:22 _8c.fdx
-rw-r--r--  1 jteb jteb          0 29 okt 20:26 _8c.frq
-rw-r--r--  1 jteb jteb          0 29 okt 20:24 _8c.tii
-rw-r--r--  1 jteb jteb          0 29 okt 20:24 _8c.tis
-rw-r--r--  1 jteb jteb          0 29 okt 20:28 _8c.tvf
-rw-r--r--  1 jteb jteb          0 29 okt 20:30 _8c.tvx
-rw-r--r--  1 jteb jteb          0 29 okt 20:24 _8d.fdt
-rw-r--r--  1 jteb jteb          0 29 okt 20:25 _8d.fdx
{code}
Solr-trunk - Build # 1299 - Still Failing
Build: http://hudson.zones.apache.org/hudson/job/Solr-trunk/1299/
All tests passed
Build Log (for compile errors): [...truncated 16288 lines...]
Lucene-Solr-tests-only-3.x - Build # 832 - Failure
Build: http://hudson.zones.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/832/
1 tests failed.
REGRESSION: org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads
Error Message: IndexFileDeleter doesn't know about file _3i.tvx
Stack Trace:
junit.framework.AssertionFailedError: IndexFileDeleter doesn't know about file _3i.tvx
  at org.apache.lucene.index.IndexWriter.filesExist(IndexWriter.java:4336)
  at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4383)
  at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:3159)
  at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3232)
  at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3203)
  at org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:200)
  at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:253)
Build Log (for compile errors): [...truncated 8588 lines...]
[jira] Commented: (LUCENE-2729) Index corruption after 'read past EOF' under heavy update load and snapshot export
[ https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926859#action_12926859 ]

Nico Krijnen commented on LUCENE-2729:

A second file listing from another test run, same result: read past EOF

{code}
jteb:assetIndex jteb$ ls -la
total 38739848
drwxr-xr-x  2 jteb jteb       4964 26 okt 11:51 .
drwxr-xr-x  3 jteb jteb        204 22 okt 11:42 ..
-rw-r--r--  1 jteb jteb  969134416 18 okt 16:41 _2q.fdt
-rw-r--r--  1 jteb jteb      36652 18 okt 16:41 _2q.fdx
-rw-r--r--  1 jteb jteb        276 18 okt 16:41 _2q.fnm
-rw-r--r--  1 jteb jteb    4685726 18 okt 16:41 _2q.frq
-rw-r--r--  1 jteb jteb       9166 18 okt 16:41 _2q.nrm
-rw-r--r--  1 jteb jteb  393230403 18 okt 16:42 _2q.prx
-rw-r--r--  1 jteb jteb       7447 18 okt 16:42 _2q.tii
-rw-r--r--  1 jteb jteb     746299 18 okt 16:42 _2q.tis
-rw-r--r--  1 jteb jteb       8394 18 okt 16:42 _2q.tvd
-rw-r--r--  1 jteb jteb  599185081 18 okt 16:42 _2q.tvf
-rw-r--r--  1 jteb jteb      73300 18 okt 16:42 _2q.tvx
-rw-r--r--  1 jteb jteb 2061261675 18 okt 16:44 _39.fdt
-rw-r--r--  1 jteb jteb       1012 18 okt 16:44 _39.fdx
-rw-r--r--  1 jteb jteb        276 18 okt 16:44 _39.fnm
-rw-r--r--  1 jteb jteb   17754579 18 okt 16:44 _39.frq
-rw-r--r--  1 jteb jteb        256 18 okt 16:44 _39.nrm
-rw-r--r--  1 jteb jteb  121067407 18 okt 16:44 _39.prx
-rw-r--r--  1 jteb jteb     137511 18 okt 16:44 _39.tii
-rw-r--r--  1 jteb jteb   11726653 18 okt 16:44 _39.tis
-rw-r--r--  1 jteb jteb        185 18 okt 16:44 _39.tvd
-rw-r--r--  1 jteb jteb  233037042 18 okt 16:44 _39.tvf
-rw-r--r--  1 jteb jteb       2020 18 okt 16:44 _39.tvx
-rw-r--r--  1 jteb jteb 1595882722 18 okt 16:45 _3u.fdt
-rw-r--r--  1 jteb jteb      63692 18 okt 16:45 _3u.fdx
-rw-r--r--  1 jteb jteb        330 18 okt 16:45 _3u.fnm
-rw-r--r--  1 jteb jteb    8001869 18 okt 16:45 _3u.frq
-rw-r--r--  1 jteb jteb      15926 18 okt 16:45 _3u.nrm
-rw-r--r--  1 jteb jteb  647374863 18 okt 16:45 _3u.prx
-rw-r--r--  1 jteb jteb      11319 18 okt 16:45 _3u.tii
-rw-r--r--  1 jteb jteb    1168399 18 okt 16:45 _3u.tis
-rw-r--r--  1 jteb jteb      14209 18 okt 16:45 _3u.tvd
-rw-r--r--  1 jteb jteb  986370136 18 okt 16:46 _3u.tvf
-rw-r--r--  1 jteb jteb     127380 18 okt 16:46 _3u.tvx
-rw-r--r--  1 jteb jteb 2057147455 18 okt 16:47 _3v.fdt
-rw-r--r--  1 jteb jteb        476 18 okt 16:47 _3v.fdx
-rw-r--r--  1 jteb jteb        384 18 okt 16:47 _3v.fnm
-rw-r--r--  1 jteb jteb       1520 18 okt 16:47 _3v.frq
-rw-r--r--  1 jteb jteb        122 18 okt 16:47 _3v.nrm
-rw-r--r--  1 jteb jteb  109724024 18 okt 16:47 _3v.prx
-rw-r--r--  1 jteb jteb     132491 18 okt 16:47 _3v.tii
-rw-r--r--  1 jteb jteb   11457688 18 okt 16:47 _3v.tis
-rw-r--r--  1 jteb jteb        114 18 okt 16:47 _3v.tvd
-rw-r--r--  1 jteb jteb  211902147 18 okt 16:48 _3v.tvf
-rw-r--r--  1 jteb jteb        948 18 okt 16:48 _3v.tvx
-rw-r--r--  1 jteb jteb 2691565961 18 okt 16:49 _4c.fdt
-rw-r--r--  1 jteb jteb      39572 18 okt 16:49 _4c.fdx
-rw-r--r--  1 jteb jteb        276 18 okt 16:49 _4c.fnm
-rw-r--r--  1 jteb jteb   18724620 18 okt 16:49 _4c.frq
-rw-r--r--  1 jteb jteb       9896 18 okt 16:49 _4c.nrm
-rw-r--r--  1 jteb jteb  590255960 18 okt 16:50 _4c.prx
-rw-r--r--  1 jteb jteb     141243 18 okt 16:50 _4c.tii
-rw-r--r--  1 jteb jteb   12185869 18 okt 16:50 _4c.tis
-rw-r--r--  1 jteb jteb       9894 18 okt 16:50 _4c.tvd
-rw-r--r--  1 jteb jteb  932649779 18 okt 16:51 _4c.tvf
-rw-r--r--  1 jteb jteb      79140 18 okt 16:51 _4c.tvx
-rw-r--r--  1 jteb jteb 2398908136 18 okt 16:52 _4d.fdt
-rw-r--r--  1 jteb jteb        548 18 okt 16:52 _4d.fdx
-rw-r--r--  1 jteb jteb        354 18 okt 16:52 _4d.fnm
-rw-r--r--  1 jteb jteb   24581614 18 okt 16:52 _4d.frq
-rw-r--r--  1 jteb jteb        140 18 okt 16:52 _4d.nrm
-rw-r--r--  1 jteb jteb  158243133 18 okt 16:52 _4d.prx
-rw-r--r--  1 jteb jteb     141948 18 okt 16:52 _4d.tii
-rw-r--r--  1 jteb jteb   12259425 18 okt 16:52 _4d.tis
-rw-r--r--  1 jteb jteb        140 18 okt 16:52 _4d.tvd
-rw-r--r--  1 jteb jteb  303769970 18 okt 16:53 _4d.tvf
-rw-r--r--  1 jteb jteb       1092 18 okt 16:53 _4d.tvx
-rw-r--r--  1 jteb jteb 1081212027 18 okt 16:53 _4p.fdt
-rw-r--r--  1 jteb jteb        212 18 okt 16:53 _4p.fdx
-rw-r--r--  1 jteb jteb        354 18 okt 16:53 _4p.fnm
-rw-r--r--  1 jteb jteb    8294102 18 okt 16:53 _4p.frq
-rw-r--r--  1 jteb jteb         56 18 okt 16:53 _4p.nrm
-rw-r--r--  1 jteb jteb   60513257 18 okt 16:53 _4p.prx
-rw-r--r--  1 jteb jteb     134898 18 okt 16:53 _4p.tii
-rw-r--r--  1 jteb jteb   11376730 18 okt 16:53 _4p.tis
-rw-r--r--  1 jteb jteb         56 18 okt 16:53 _4p.tvd
-rw-r--r--  1 jteb jteb  116715012 18 okt 16:53 _4p.tvf
-rw-r--r--  1 jteb jteb        420 18 okt 16:53 _4p.tvx
-rw-r--r--  1 jteb jteb  787581180 18 okt 16:54
{code}
[jira] Commented: (LUCENE-2729) Index corruption after 'read past EOF' under heavy update load and snapshot export
[ https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926907#action_12926907 ]

Michael McCandless commented on LUCENE-2729:

That long string of length-0 files is very bizarre. Was there no original root cause here? E.g. disk full? Or is the read past EOF on closing an IndexReader w/ pending deletes really the first exception you see? Does zoie somehow touch the index files?

Taking a backup is fundamentally a read-only op on the index, so that process shouldn't by itself truncate index files. Something is somehow reaching in and zeroing out these files. I don't think Lucene itself would do this. For example, take the series of _6i.XXX zeroed files... Lucene writes these files roughly in sequence, so if something bad happened in writing the postings, then the .nrm file should not even exist. So we need to figure out who is truncating these files...
[jira] Updated: (LUCENE-2729) Index corruption after 'read past EOF' under heavy update load and snapshot export
[ https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nico Krijnen updated LUCENE-2729:

Description:
We have a system running Lucene and zoie. We use Lucene as a content store for a CMS/DAM system. We use the hot-backup feature of zoie to make scheduled backups of the index. This works fine for small indexes and when there are not a lot of changes to the index when the backup is made.

On large indexes (about 5 GB to 19 GB), when a backup is made while the index is being changed a lot (lots of document additions and/or deletions), we almost always get a 'read past EOF' at some point, followed by lots of 'Lock obtain timed out'. At that point we get lots of 0 kb files in the index, data gets lost, and the index is unusable. When we stop our server, remove the 0 kb files and restart our server, the index is operational again, but data has been lost.

I'm not sure if this is a zoie or a Lucene issue, so I'm posting it to both. Hopefully someone has some ideas where to look to fix this. Some more details...

Stack trace of the read past EOF and the following Lock obtain timed out:

{code}
78307 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@31ca5085] ERROR proj.zoie.impl.indexing.internal.BaseSearchIndex - read past EOF
java.io.IOException: read past EOF
  at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:154)
  at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
  at org.apache.lucene.store.ChecksumIndexInput.readByte(ChecksumIndexInput.java:37)
  at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:69)
  at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:245)
  at org.apache.lucene.index.IndexFileDeleter.init(IndexFileDeleter.java:166)
  at org.apache.lucene.index.DirectoryReader.doCommit(DirectoryReader.java:725)
  at org.apache.lucene.index.IndexReader.commit(IndexReader.java:987)
  at org.apache.lucene.index.IndexReader.commit(IndexReader.java:973)
  at org.apache.lucene.index.IndexReader.decRef(IndexReader.java:162)
  at org.apache.lucene.index.IndexReader.close(IndexReader.java:1003)
  at proj.zoie.impl.indexing.internal.BaseSearchIndex.deleteDocs(BaseSearchIndex.java:203)
  at proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:223)
  at proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
  at proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
  at proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:171)
  at proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:373)

579336 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@31ca5085] ERROR proj.zoie.impl.indexing.internal.LuceneIndexDataLoader - Problem copying segments: Lock obtain timed out: org.apache.lucene.store.singleinstancel...@5ad0b895: write.lock
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: org.apache.lucene.store.singleinstancel...@5ad0b895: write.lock
  at org.apache.lucene.store.Lock.obtain(Lock.java:84)
  at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1060)
  at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:957)
  at proj.zoie.impl.indexing.internal.DiskSearchIndex.openIndexWriter(DiskSearchIndex.java:176)
  at proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:228)
  at proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
  at proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
  at proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:171)
  at proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:373)
{code}

We get exactly the same behaviour on both OS X and on Windows. On both, zoie is using a SimpleFSDirectory. We also use a SingleInstanceLockFactory (since our process is the only one working with the index), but we get the same behaviour with a NativeFSLock. The snapshot backup is being made by calling: *proj.zoie.impl.indexing.ZoieSystem.exportSnapshot(WritableByteChannel)* Same issue in zoie JIRA: http://snaprojects.jira.com/browse/ZOIE-51
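For context on what such a hot backup involves at the Lucene level: the usual approach (and roughly what ZoieIndexDeletionPolicy/exportSnapshot wrap) is a SnapshotDeletionPolicy, which pins the files of a commit point so they cannot be deleted while they are copied. A minimal sketch against the Lucene 3.0 API - the directory path and analyzer are illustrative assumptions, and the actual file copy is left as a comment:

{code}
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy;
import org.apache.lucene.index.SnapshotDeletionPolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.SimpleFSDirectory;
import org.apache.lucene.util.Version;

public class HotBackupSketch {
    public static void main(String[] args) throws IOException {
        Directory dir = new SimpleFSDirectory(new File("/path/to/index")); // illustrative path
        SnapshotDeletionPolicy snapshotter =
            new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
        IndexWriter writer = new IndexWriter(dir,
            new StandardAnalyzer(Version.LUCENE_30), snapshotter,
            IndexWriter.MaxFieldLength.UNLIMITED);
        try {
            // Pin the files of the current commit point; the writer may keep
            // indexing and committing concurrently, but these files will not
            // be deleted until release() is called.
            IndexCommit commit = snapshotter.snapshot();
            try {
                for (String fileName : commit.getFileNames()) {
                    // copy fileName from dir to the backup location here
                }
            } finally {
                snapshotter.release();
            }
        } finally {
            writer.close();
        }
    }
}
{code}

While the snapshot is held, the copy sees a consistent point-in-time index; the backup itself never writes to the index directory.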
[jira] Commented: (LUCENE-2729) Index corruption after 'read past EOF' under heavy update load and snapshot export
[ https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926908#action_12926908 ]

Nico Krijnen commented on LUCENE-2729:

bq. Was there no original root cause here? Eg disk full?

This was one of the first things I thought of, but the disk has more than enough free space: 200 GB. Also, for this test we write the backup to a different disk - both for better performance and to prevent the disk with the index on it from running out of free space.

bq. Or is the read past EOF on closing an IndexReader w/ pending deletes really the first exception you see?

It is the first exception we see. We turned on quite a bit of additional logging but we have not been able to find anything weird happening before this error. I do expect something weird must have happened to cause the 'read past EOF'. Do you have any clues as to what we could look for? - that might narrow the search. We are able to consistently reproduce this in our test environment, so if you have clues to specific debug logging that should be turned on, we can do another test run.

bq. Does zoie somehow touch the index files?

We'll try to find out. As far as I can see, the basic backup procedure is to grab the last 'commit snapshot', prevent it from being deleted (ZoieIndexDeletionPolicy), and write all the files mentioned in the commit snapshot to a NIO WritableByteChannel.
[jira] Issue Comment Edited: (LUCENE-2729) Index corruption after 'read past EOF' under heavy update load and snapshot export
[ https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926908#action_12926908 ]

Nico Krijnen edited comment on LUCENE-2729 at 11/1/10 6:52 AM:

bq. Was there no original root cause here? Eg disk full?

This was one of the first things I thought of, but the disk has more than enough free space: 200 GB. Also, for this test we write the backup to a different disk - both for better performance and to prevent the disk with the index on it from running out of free space.

bq. Or is the read past EOF on closing an IndexReader w/ pending deletes really the first exception you see?

It is the first exception we see. We turned on quite a bit of additional logging but we have not been able to find anything weird happening before this error. I do expect something weird must have happened to cause the 'read past EOF'. Do you have any clues as to what we could look for? - that might narrow the search. We are able to consistently reproduce this in our test environment, so if you have clues to specific debug logging that should be turned on, we can do another test run.

bq. Does zoie somehow touch the index files?

We'll try to find out. As far as I can see, the basic backup procedure is to grab the last 'commit snapshot', prevent it from being deleted (ZoieIndexDeletionPolicy), and write all the files mentioned in the commit snapshot to a NIO WritableByteChannel (proj.zoie.impl.indexing.internal.DiskIndexSnapshot#writeTo) - we call proj.zoie.impl.indexing.ZoieSystem.exportSnapshot(WritableByteChannel) ourselves.
Re: How can I get started for investigating the source code of Lucene ?
Here's a rough overview I mapped out as a sequence diagram for the search side of things some time ago: http://goo.gl/lE6a

----- Original Message -----
From: Jeff Zhang zjf...@gmail.com
To: dev@lucene.apache.org
Sent: Mon, 1 November, 2010 5:43:08
Subject: How can I get started for investigating the source code of Lucene ?

Hi all, I'd like to study the source code of Lucene, but I found there are not many documents about the internal structure of Lucene, and the classes are so big that they are not very readable. Could anyone give me a suggestion on how to get started investigating the source code of Lucene? Any document or blog post would be good. Thanks

-- Best Regards, Jeff Zhang
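One practical way in, for anyone following this thread: write the smallest possible index-then-search program and step through it in a debugger, since it touches the central classes (IndexWriter, IndexSearcher, QueryParser) that the rest of the codebase hangs off. A minimal sketch against the 3.0-era API:

{code}
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class LuceneWalkthrough {
    public static void main(String[] args) throws IOException, ParseException {
        Directory dir = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);

        // Indexing side: a breakpoint in IndexWriter.addDocument leads into
        // DocumentsWriter and the postings/stored-fields writers.
        IndexWriter writer = new IndexWriter(dir, analyzer, true,
            IndexWriter.MaxFieldLength.UNLIMITED);
        Document doc = new Document();
        doc.add(new Field("body", "hello lucene internals",
            Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        writer.close();

        // Search side: a breakpoint in IndexSearcher.search leads into
        // Query.createWeight, the Scorer, and the posting enumerations.
        IndexSearcher searcher = new IndexSearcher(dir, true);
        Query query = new QueryParser(Version.LUCENE_30, "body", analyzer)
            .parse("internals");
        TopDocs hits = searcher.search(query, 10);
        System.out.println("hits: " + hits.totalHits);
        searcher.close();
    }
}
{code}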
[jira] Commented: (SOLR-2202) Money FieldType
[ https://issues.apache.org/jira/browse/SOLR-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926974#action_12926974 ]

Robert Muir commented on SOLR-2202:

Greg, one more nitpick: I think the reloadCurrencyConfig could be improved:
# It seems to use the resource loader to read the xml file into a String line-by-line, but then concats all these lines and converts them back into a byte array, just to get an input stream.
# It uses a charset of UTF8 (should be UTF-8).

I think it would be easier/safer to just get an InputStream directly from the resource loader (ResourceLoader.openResource), without this encoding conversion.

Money FieldType
Key: SOLR-2202
URL: https://issues.apache.org/jira/browse/SOLR-2202
Project: Solr
Issue Type: New Feature
Components: Schema and Analysis
Affects Versions: 1.5
Reporter: Greg Fodor
Attachments: SOLR-2022-solr-3.patch, SOLR-2202-lucene-1.patch, SOLR-2202-solr-1.patch, SOLR-2202-solr-2.patch, SOLR-2202-solr-4.patch, SOLR-2202-solr-5.patch

Attached please find patches to add support for monetary values to Solr/Lucene with query-time currency conversion. The following features are supported:
- Point queries (ex: price:4.00USD)
- Range queries (ex: price:[$5.00 TO $10.00])
- Sorting.
- Currency parsing by either currency code or symbol.
- Symmetric & asymmetric exchange rates. (Asymmetric exchange rates are useful if there are fees associated with exchanging the currency.)

At indexing time, money fields can be indexed in a native currency. For example, if a product on an e-commerce site is listed in Euros, indexing the price field as 10.00EUR will index it appropriately. By altering the currency.xml file, the sorting and querying against Solr can take into account fluctuations in currency exchange rates without having to re-index the documents.

The new money field type is a polyfield which indexes two fields, one which contains the amount of the value and another which contains the currency code or symbol. The currency metadata (names, symbols, codes, and exchange rates) are expected to be in an xml file which is pointed to by the field type declaration in the schema.xml.

The current patch is factored such that Money utility functions and configuration metadata lie in Lucene (see MoneyUtil and CurrencyConfig), while the MoneyType and MoneyValueSource lie in Solr. This was meant to mirror the work being done on the spatial field types.

This patch has not yet been deployed to production but will be getting used to power the international search capabilities of the search engine at Etsy.
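To make Robert's suggestion concrete: a minimal sketch of the stream-based loading he describes, assuming the field type has access to a ResourceLoader (the class and method names below are illustrative, not the actual patch code):

{code}
import java.io.IOException;
import java.io.InputStream;

import javax.xml.parsers.DocumentBuilderFactory;

import org.apache.solr.common.ResourceLoader;
import org.w3c.dom.Document;

public class CurrencyConfigLoader {
    /**
     * Load currency.xml as a raw InputStream and let the XML parser honor
     * the encoding declared in the document itself, instead of reading
     * lines into a String and re-encoding them back to bytes.
     */
    public static Document loadCurrencyConfig(ResourceLoader loader, String resourceName)
            throws Exception {
        InputStream is = loader.openResource(resourceName);
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(is);
        } finally {
            is.close();
        }
    }
}
{code}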
[jira] Commented: (LUCENE-2725) Bengali Analyzer for Lucene has been Developed
[ https://issues.apache.org/jira/browse/LUCENE-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926981#action_12926981 ]

Ahmed Chisty commented on LUCENE-2725:

I have extended the Analyzer class and used some rules of the Bengali language's grammar. I first tokenized Bengali strings, then removed the stopwords, and finally filtered them. I have worked with this analyzer in some projects. It works fine. How can I contribute it to the Lucene API? Can anyone give me a solution/way?

Bengali Analyzer for Lucene has been Developed
Key: LUCENE-2725
URL: https://issues.apache.org/jira/browse/LUCENE-2725
Project: Lucene - Java
Issue Type: New Feature
Components: contrib/analyzers
Affects Versions: 3.0.1
Environment: Environment Independent
Reporter: Ahmed Chisty
Fix For: 3.1

Hi everyone, I am a CSE student of SUST, Sylhet (http://www.sust.edu/). I have noticed that there is no Bengali Analyzer in Lucene for Bengali text search and highlighting. I have used the StandardAnalyzer and others but they do not give good results. So, I have developed a Bengali Analyzer. I have tested it on 50 thousand documents, and it is being used in the Ekushe Finance Search Engine (http://efinance.com.bd/). Please give me some instructions so that I can contribute this analyzer to Lucene. Thanks.
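For context, an analyzer of the shape Ahmed describes would, in the 3.0-era API, chain a tokenizer with a stopword filter; a hypothetical skeleton (the tokenizer choice and stopword set are placeholders, not the actual implementation being contributed):

{code}
import java.io.Reader;
import java.util.Set;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.util.Version;

public class BengaliAnalyzerSkeleton extends Analyzer {
    private final Set<?> stopWords;

    public BengaliAnalyzerSkeleton(Set<?> stopWords) {
        this.stopWords = stopWords; // a Bengali stopword set would go here
    }

    @Override
    public TokenStream tokenStream(String fieldName, Reader reader) {
        // Tokenize first (a real Bengali analyzer would use language-specific
        // segmentation rules here), then remove stopwords; further grammar-based
        // filters would be chained on in the same way.
        TokenStream stream = new StandardTokenizer(Version.LUCENE_30, reader);
        stream = new StopFilter(true, stream, stopWords);
        return stream;
    }
}
{code}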
[jira] Commented: (LUCENE-2725) Bengali Analyzer for Lucene has been Developed
[ https://issues.apache.org/jira/browse/LUCENE-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926985#action_12926985 ]

Robert Muir commented on LUCENE-2725:

Ahmed: please see http://wiki.apache.org/lucene-java/HowToContribute
Lucene-Solr-tests-only-trunk - Build # 858 - Failure
Build: http://hudson.zones.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/858/
1 tests failed.
REGRESSION: org.apache.solr.TestDistributedSearch.testDistribSearch
Error Message: Some threads threw uncaught exceptions!
Stack Trace:
junit.framework.AssertionFailedError: Some threads threw uncaught exceptions!
  at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:878)
  at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:844)
  at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:437)
  at org.apache.solr.SolrTestCaseJ4.tearDown(SolrTestCaseJ4.java:78)
  at org.apache.solr.BaseDistributedSearchTestCase.tearDown(BaseDistributedSearchTestCase.java:144)
Build Log (for compile errors): [...truncated 8709 lines...]
Add MERGEINDEXES action to CoreAdmin wiki page?
Shouldn't the MERGEINDEXES action be listed on the http://wiki.apache.org/solr/CoreAdmin wiki page? With maybe a link back to http://wiki.apache.org/solr/MergingSolrIndexes#Merging_Through_CoreAdmin? I'd be happy to make the edit...

Eric

-----
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
Co-Author: Solr 1.4 Enterprise Search Server available from http://www.packtpub.com/solr-1-4-enterprise-search-server
Free/Busy: http://tinyurl.com/eric-cal
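For reference, the wiki entry could show the kind of invocation already documented on the MergingSolrIndexes page, along these lines (the core name and index paths are made-up examples):

{code}
http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&indexDir=/opt/solr/core1/data/index&indexDir=/opt/solr/core2/data/index
{code}

The target core must already exist; the given source indexes are merged into it, and a commit on the target core afterwards makes the merged documents visible.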
[jira] Commented: (LUCENE-2729) Index corruption after 'read past EOF' under heavy update load and snapshot export
[ https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927031#action_12927031 ]

Jason Rutherglen commented on LUCENE-2729:

Using Solr 1.4.2, on disk full, .del files were being written with a file length of zero; however, that is supposed to be fixed by https://issues.apache.org/jira/browse/LUCENE-2593. This doesn't appear to be similar, because more than the .del files are of zero length.
jQuery and tabs in example
All: I recently had occasion to work with the Solr example code and VrW, and figured out how to put in a tabbed display by letting jQuery do all the work, but that needed a more recent jQuery (I used 1.4.x). Since I'm fresh off that experience and can maybe remember what I just finished doing, do folks think it's worth a Jira or two (that I'd immediately take) for:
1> Upgrading the example code to jQuery 1.4.3
2> Using the tabbing capabilities of 1.4 to display the simple, spatial and group-by links in a tabbed page to demonstrate?

Let me know,
Erick
[jira] Commented: (LUCENE-2729) Index corruption after 'read past EOF' under heavy update load and snapshot export
[ https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927040#action_12927040 ]

Nico Krijnen commented on LUCENE-2729:

In the meantime, we also did a test with a checkout of the latest lucene_3_0 branch (@2010-11-01), which should include the fix that Jason mentions. It does not seem to make a difference though; we still get a 'read past EOF'.

On the last run we did get a slightly different stack trace. This time the 'read past EOF' happens when the zoie RAM index is written to the zoie disk index. Last time it occurred a little earlier, in BaseSearchIndex#loadFromIndex, while committing deletes to the disk IndexReader. This could be just a coincidence though. My feeling is still that the 'read past EOF' is just a result/symptom of something else that happened just before it - still trying to figure out what that could be...

{code}
15:25:03,453 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@3d9e7719] ERROR proj.zoie.impl.indexing.internal.LuceneIndexDataLoader - Problem copying segments: read past EOF
java.io.IOException: read past EOF
  at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:154)
  at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
  at org.apache.lucene.store.ChecksumIndexInput.readByte(ChecksumIndexInput.java:37)
  at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:69)
  at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:245)
  at org.apache.lucene.index.IndexFileDeleter.init(IndexFileDeleter.java:170)
  at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1127)
  at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:960)
  at proj.zoie.impl.indexing.internal.DiskSearchIndex.openIndexWriter(DiskSearchIndex.java:176)
  at proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:228)
  at proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
  at proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
  at proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:172)
  at proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:377)
{code}
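One way to see what state the index is actually in after the failure: run CheckIndex against a copy of the broken index. Without -fix it is read-only, and it reports exactly which segments and files are unreadable. E.g. (the jar version and path are illustrative):

{code}
java -cp lucene-core-3.0.2.jar org.apache.lucene.index.CheckIndex /backups/assetIndex-copy
{code}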
[jira] Issue Comment Edited: (LUCENE-2729) Index corruption after 'read past EOF' under heavy update load and snapshot export
[ https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927040#action_12927040 ]

Nico Krijnen edited comment on LUCENE-2729 at 11/1/10 12:53 PM:

In the meantime, we also did a test with a checkout of the latest lucene_3_0 branch (@2010-11-01), which should include the fix that Jason mentions. It does not seem to make a difference though; we still get a 'read past EOF'.

On the last run we did get a slightly different stack trace. This time the 'read past EOF' happens when the zoie RAM index is written to the zoie disk index. Last time it occurred a little earlier, in BaseSearchIndex#loadFromIndex, while committing deletes to the disk IndexReader. This could be just a coincidence though. My feeling is still that the 'read past EOF' is just a result/symptom of something else that happened just before it - still trying to figure out what that could be... any suggestions are welcome.

{code}
15:25:03,453 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@3d9e7719] ERROR proj.zoie.impl.indexing.internal.LuceneIndexDataLoader - Problem copying segments: read past EOF
java.io.IOException: read past EOF
  at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:154)
  at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
  at org.apache.lucene.store.ChecksumIndexInput.readByte(ChecksumIndexInput.java:37)
  at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:69)
  at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:245)
  at org.apache.lucene.index.IndexFileDeleter.init(IndexFileDeleter.java:170)
  at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1127)
  at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:960)
  at proj.zoie.impl.indexing.internal.DiskSearchIndex.openIndexWriter(DiskSearchIndex.java:176)
  at proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:228)
  at proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
  at proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
  at proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:172)
  at proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:377)
{code}
[jira] Commented: (LUCENE-2729) Index corruption after 'read past EOF' under heavy update load and snapshot export
[ https://issues.apache.org/jira/browse/LUCENE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927043#action_12927043 ] Michael McCandless commented on LUCENE-2729: Somehow we have to locate the event that causes the truncation of the files. Can you enable IndexWriter's infoStream, then get the corruption to happen, and post the results? Index corruption after 'read past EOF' under heavy update load and snapshot export -- Key: LUCENE-2729 URL: https://issues.apache.org/jira/browse/LUCENE-2729 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 3.0.1, 3.0.2 Environment: Happens on both OS X 10.6 and Windows 2008 Server. Integrated with zoie (using a zoie snapshot from 2010-08-06: zoie-2.0.0-snapshot-20100806.jar). Reporter: Nico Krijnen We have a system running lucene and zoie. We use lucene as a content store for a CMS/DAM system, and we use the hot-backup feature of zoie to make scheduled backups of the index. This works fine for small indexes and when there are not a lot of changes to the index while the backup is made. On large indexes (about 5 GB to 19 GB), when a backup is made while the index is being changed a lot (lots of document additions and/or deletions), we almost always get a 'read past EOF' at some point, followed by lots of 'Lock obtain timed out'. At that point we get lots of 0 kb files in the index, data gets lost, and the index is unusable. When we stop our server, remove the 0 kb files and restart our server, the index is operational again, but data has been lost. I'm not sure if this is a zoie or a lucene issue, so I'm posting it to both. Hopefully someone has some ideas where to look to fix this. Some more details... Stack trace of the 'read past EOF' and the 'Lock obtain timed out' that follows it:
{code}
78307 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@31ca5085] ERROR proj.zoie.impl.indexing.internal.BaseSearchIndex - read past EOF
java.io.IOException: read past EOF
    at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:154)
    at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:39)
    at org.apache.lucene.store.ChecksumIndexInput.readByte(ChecksumIndexInput.java:37)
    at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:69)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:245)
    at org.apache.lucene.index.IndexFileDeleter.init(IndexFileDeleter.java:166)
    at org.apache.lucene.index.DirectoryReader.doCommit(DirectoryReader.java:725)
    at org.apache.lucene.index.IndexReader.commit(IndexReader.java:987)
    at org.apache.lucene.index.IndexReader.commit(IndexReader.java:973)
    at org.apache.lucene.index.IndexReader.decRef(IndexReader.java:162)
    at org.apache.lucene.index.IndexReader.close(IndexReader.java:1003)
    at proj.zoie.impl.indexing.internal.BaseSearchIndex.deleteDocs(BaseSearchIndex.java:203)
    at proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:223)
    at proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
    at proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
    at proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:171)
    at proj.zoie.impl.indexing.internal.BatchedIndexDataLoader$LoaderThread.run(BatchedIndexDataLoader.java:373)
579336 [proj.zoie.impl.indexing.internal.realtimeindexdataloa...@31ca5085] ERROR proj.zoie.impl.indexing.internal.LuceneIndexDataLoader - Problem copying segments: Lock obtain timed out: org.apache.lucene.store.singleinstancel...@5ad0b895: write.lock
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: org.apache.lucene.store.singleinstancel...@5ad0b895: write.lock
    at org.apache.lucene.store.Lock.obtain(Lock.java:84)
    at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1060)
    at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:957)
    at proj.zoie.impl.indexing.internal.DiskSearchIndex.openIndexWriter(DiskSearchIndex.java:176)
    at proj.zoie.impl.indexing.internal.BaseSearchIndex.loadFromIndex(BaseSearchIndex.java:228)
    at proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.loadFromIndex(LuceneIndexDataLoader.java:153)
    at proj.zoie.impl.indexing.internal.DiskLuceneIndexDataLoader.loadFromIndex(DiskLuceneIndexDataLoader.java:134)
    at proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.processBatch(RealtimeIndexDataLoader.java:171)
    at
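For reference, turning on infoStream on the Lucene 3.0.x API that this report runs on looks roughly like this. A minimal sketch with placeholder paths; in the setup above zoie constructs the IndexWriter, so the call belongs wherever the writer is created:
{code}
import java.io.File;
import java.io.PrintStream;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class InfoStreamExample {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter(
            FSDirectory.open(new File("/path/to/assetIndex")), // placeholder index path
            new StandardAnalyzer(Version.LUCENE_30),
            IndexWriter.MaxFieldLength.UNLIMITED);
        // Route IndexWriter's diagnostics to a log file: every flush, merge and
        // commit is reported, which is what should pin down the truncation event.
        writer.setInfoStream(new PrintStream(new File("iw-infostream.log")));
        // ... run the heavy update load / snapshot export here ...
        writer.close();
    }
}
{code}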
Annoying message when building Solr
It's been several weeks since I built Solr, so I removed all the trunk code, did a checkout and tried an ant build. The build starts out by giving a bunch of annoying warnings about not being able to find c:\ant\lib\xbean.jar, xercesImpl.jar, serializer.jar and others (yes, some of us are forever destined to work on Windows boxes). I'm also getting some test failures. I know there have been emails flying back and forth about Maven etc. but I haven't paid much attention. I can start tracking these down, but wanted to know what's expected. The 'how to contribute' page might need to be updated, which I'll do if that's what should be done. And my Mac starts out the build by not being able to find some hsqldb jars, but at least the tests succeed there. All of these are very possibly issues with half-baked machine setups, which is the first thing I'll check this afternoon. Mostly I wanted to know: (1) whether these are experienced by someone else; (2) whether I should have paid more attention to the Maven emails and that's the preferred way of doing things now; (3) whether the wiki is just really out of date and I should update it as I work through the issues. In any case, the 'how to contribute' page on the wiki doesn't list any prerequisites, which may be way more important on Windows boxes than Macs... Erick
jQuery and tabs in example (sorry for double posting)
Got the old list in the 'to' field the first time, sorry. All: I recently had occasion to work with the Solr example code and VrW and figured out how to put in a tabbed display by letting jQuery do all the work, but that needed a more recent jQuery (I used 1.4.x). Since I'm fresh off that experience and can maybe remember what I just finished doing, do folks think it's worth a Jira or two (that I'd immediately take) for (1) upgrading the example code to jQuery 1.4.3, and (2) using the tabbing capabilities of 1.4 to display the simple, spatial and group-by links in a tabbed page to demonstrate? Let me know. Erick
[jira] Created: (SOLR-2210) Provide solr FilterFactory for Lucene ICUTokenizer
Provide solr FilterFactory for Lucene ICUTokenizer -- Key: SOLR-2210 URL: https://issues.apache.org/jira/browse/SOLR-2210 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Tom Burton-West Priority: Minor The Lucene ICUTokenizer provides many benefits for multilingual tokenizing. There should be an ICUFilterFactory so that it can be used from Solr. There are probably some issues in terms of passing configuration parameters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
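Although the summary says FilterFactory, ICUTokenizer is a Tokenizer, so the wrapper would presumably be a TokenizerFactory and a thin one at that. A minimal sketch, assuming the ICUTokenizer from the Lucene icu contrib and Solr's BaseTokenizerFactory base class (both names taken from that era's source; an eventual patch may differ):
{code}
import java.io.Reader;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.icu.segmentation.ICUTokenizer;
import org.apache.solr.analysis.BaseTokenizerFactory;

public class ICUTokenizerFactory extends BaseTokenizerFactory {
  @Override
  public Tokenizer create(Reader input) {
    // Uses the tokenizer's default config: UAX#29 word-break rules,
    // with per-script customization handled by ICU.
    return new ICUTokenizer(input);
  }
}
{code}
The configuration parameters the description mentions (for example, custom break rules per script) would presumably be read in the factory's init map and passed to a non-default tokenizer config.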
[jira] Commented: (SOLR-2210) Provide solr FilterFactory for Lucene ICUTokenizer
[ https://issues.apache.org/jira/browse/SOLR-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927048#action_12927048 ] Robert Muir commented on SOLR-2210: --- Thanks for opening this, Tom. I've got some barebones filters for some of this stuff on my computer. Because the ICU jar file is large, I was trying to see if I could solve LUCENE-2510 first, but this would only fix the problem for 4.0 anyway. I think we should just make an icu contrib for now, and put the factories (Tokenizer, Normalizer, Folding, Transliterator, Collation) and the jar file in there. Provide solr FilterFactory for Lucene ICUTokenizer -- Key: SOLR-2210 URL: https://issues.apache.org/jira/browse/SOLR-2210 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Tom Burton-West Priority: Minor The Lucene ICUTokenizer provides many benefits for multilingual tokenizing. There should be an ICUFilterFactory so that it can be used from Solr. There are probably some issues in terms of passing configuration parameters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
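The filter factories the comment lists would follow the same thin-wrapper pattern; a sketch only, assuming ICUFoldingFilter from the icu contrib and Solr's BaseTokenFilterFactory:
{code}
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.icu.ICUFoldingFilter;
import org.apache.solr.analysis.BaseTokenFilterFactory;

public class ICUFoldingFilterFactory extends BaseTokenFilterFactory {
  @Override
  public TokenStream create(TokenStream input) {
    // Unicode case folding plus accent/diacritic folding in a single filter.
    return new ICUFoldingFilter(input);
  }
}
{code}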
[jira] Created: (SOLR-2211) Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support
Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support --- Key: SOLR-2211 URL: https://issues.apache.org/jira/browse/SOLR-2211 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Tom Burton-West Priority: Minor The Lucene 3.x StandardTokenizer with UAX#29 support provides benefits for non-English tokenizing. Presently it can be invoked by using the StandardTokenizerFactory and setting the Version to 3.1. However, it would be useful to be able to use the improved unicode processing without necessarily including the ip address and email address processing of StandardAnalyzer. A FilterFactory that allowed the use of the StandardTokenizer with UAX#29 support on its own would be useful. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
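As with SOLR-2210, the factory itself would likely be a thin wrapper. A hypothetical sketch, assuming the standalone UAX29Tokenizer class on trunk at the time (the exact class and package names may differ):
{code}
import java.io.Reader;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.standard.UAX29Tokenizer;
import org.apache.solr.analysis.BaseTokenizerFactory;

public class UAX29TokenizerFactory extends BaseTokenizerFactory {
  @Override
  public Tokenizer create(Reader input) {
    // Pure UAX#29 word-break segmentation, without StandardTokenizer's
    // extra URL/email/IP recognition.
    return new UAX29Tokenizer(input);
  }
}
{code}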
[jira] Commented: (SOLR-2211) Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support
[ https://issues.apache.org/jira/browse/SOLR-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927052#action_12927052 ] Robert Muir commented on SOLR-2211: --- Tom, for this one we just want to wrap org.apache.lucene.analysis.standard.UAX29Tokenizer, care to make a patch? Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support --- Key: SOLR-2211 URL: https://issues.apache.org/jira/browse/SOLR-2211 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Tom Burton-West Priority: Minor The Lucene 3.x StandardTokenizer with UAX#29 support provides benefits for non-English tokenizing. Presently it can be invoked by using the StandardTokenizerFactory and setting the Version to 3.1. However, it would be useful to be able to use the improved unicode processing without necessarily including the ip address and email address processing of StandardAnalyzer. A FilterFactory that allowed the use of the StandardTokenizer with UAX#29 support on its own would be useful. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2210) Provide solr FilterFactory for Lucene ICUTokenizer
[ https://issues.apache.org/jira/browse/SOLR-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927053#action_12927053 ] Robert Muir commented on SOLR-2210: --- Actually, another idea would be to just make an 'extraAnalyzers' contrib. Then we could also add factories for smart Chinese, Polish, etc., without creating a ton of contribs. I think this would be a good solution to expose all the Lucene analyzers to Solr, since to me LUCENE-2510 seems tricky. Provide solr FilterFactory for Lucene ICUTokenizer -- Key: SOLR-2210 URL: https://issues.apache.org/jira/browse/SOLR-2210 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Tom Burton-West Priority: Minor The Lucene ICUTokenizer provides many benefits for multilingual tokenizing. There should be an ICUFilterFactory so that it can be used from Solr. There are probably some issues in terms of passing configuration parameters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-2680) Improve how IndexWriter flushes deletes against existing segments
[ https://issues.apache.org/jira/browse/LUCENE-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-2680: - Attachment: LUCENE-2680.patch The general approach is to reuse BufferedDeletes, but place them into a segment-info-keyed map for those segments generated after lastSegmentIndex, as per what has been discussed at https://issues.apache.org/jira/browse/LUCENE-2655?focusedCommentId=12922894&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12922894 and below.
* lastSegmentIndex is added to IW.
* DW segmentDeletes is a map of segment info -> buffered deletes. In the apply-deletes method, buffered deletes are pulled for a given segment info if they exist; otherwise they're taken from deletesFlushedLastSeg.
* I'm not entirely sure what pushDeletes should do now; probably the same thing as currently, only the name should change slightly in that it's pushing deletes only for the RAM buffer docs.
* There need to be tests to ensure the docid-upto logic is working correctly.
* I'm not sure what to do with DW hasDeletes (its usage is commented out).
* Does there need to be a separate set of deletes for the RAM buffer vis-à-vis the (0 - lastSegmentIndex) deletes?
* The memory accounting will now get interesting, as we'll need to track the RAM usage of terms/queries across multiple maps.
* In commitMerge, DW verifySegmentDeletes removes the unused info -> deletes mappings.
* testDeletes deletes a doc in segment 1, then merges segments 1 and 2. We then test to ensure the deletes were in fact applied only to segments 1 and 2.
* testInitLastSegmentIndex ensures that on IW init, lastSegmentIndex is in fact set.
Improve how IndexWriter flushes deletes against existing segments - Key: LUCENE-2680 URL: https://issues.apache.org/jira/browse/LUCENE-2680 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Fix For: 4.0 Attachments: LUCENE-2680.patch IndexWriter buffers up all deletes (by Term and Query) and only applies them if 1) commit or NRT getReader() is called, or 2) a merge is about to kick off. We do this because, for a large index, it's very costly to open a SegmentReader for every segment in the index. So we defer as long as we can. We do it just before merge so that the merge can eliminate the deleted docs. But, most merges are small, yet in a big index we apply deletes to all of the segments, which is really very wasteful. Instead, we should only apply the buffered deletes to the segments that are about to be merged, and keep the buffer around for the remaining segments. I think it's not so hard to do; we'd have to have generations of pending deletions, because the newly merged segment doesn't need the same buffered deletions applied again. So every time a merge kicks off, we pinch off the current set of buffered deletions, open a new set (the next generation), and record which segment was created as of which generation. This should be a very sizable gain for large indices that mix deletes, though less so in flex, since opening the terms index is much faster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
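To make the "pinch off a generation" bookkeeping concrete, here is a purely illustrative sketch of the per-segment delete buffering described above. All names are stand-ins (segments keyed by String rather than SegmentInfo, Terms only); the actual patch works inside IndexWriter/DocumentsWriter and also tracks docid-upto limits and RAM usage:
{code}
import java.util.HashMap;
import java.util.Map;

class PerSegmentDeletes {
    // term -> docid-upto; the real BufferedDeletes buffers Queries as well.
    static class BufferedDeletes {
        final Map<String, Integer> terms = new HashMap<String, Integer>();
    }

    private final Map<String, BufferedDeletes> bySegment =
        new HashMap<String, BufferedDeletes>();
    private BufferedDeletes current = new BufferedDeletes();

    /** Buffer a delete-by-term against the current generation. */
    void bufferDelete(String term, int docIdUpto) {
        current.terms.put(term, docIdUpto);
    }

    /** When a merge kicks off: pinch off the current generation, key it by
     *  the newest segment it applies to, and start the next generation. */
    void pinchOff(String lastSegmentName) {
        bySegment.put(lastSegmentName, current);
        current = new BufferedDeletes();
    }

    /** Deletes for one segment being merged; segments without their own
     *  entry fall back to the most recent (still-open) generation. */
    BufferedDeletes deletesFor(String segmentName) {
        BufferedDeletes d = bySegment.get(segmentName);
        return d != null ? d : current;
    }
}
{code}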
[jira] Commented: (SOLR-2211) Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support
[ https://issues.apache.org/jira/browse/SOLR-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927067#action_12927067 ] Tom Burton-West commented on SOLR-2211: --- Sure, I'll give it a try. I've got a large Monday-morning backlog in my todo list today, so it will probably be towards the middle of the week. Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support --- Key: SOLR-2211 URL: https://issues.apache.org/jira/browse/SOLR-2211 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Tom Burton-West Priority: Minor The Lucene 3.x StandardTokenizer with UAX#29 support provides benefits for non-English tokenizing. Presently it can be invoked by using the StandardTokenizerFactory and setting the Version to 3.1. However, it would be useful to be able to use the improved unicode processing without necessarily including the ip address and email address processing of StandardAnalyzer. A FilterFactory that allowed the use of the StandardTokenizer with UAX#29 support on its own would be useful. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2211) Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support
[ https://issues.apache.org/jira/browse/SOLR-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927069#action_12927069 ] Robert Muir commented on SOLR-2211: --- Sounds great. This one has no external dependencies, so it can just go in with the rest of the factories. I'll look at starting on the ant/build-system stuff for SOLR-2210. Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support --- Key: SOLR-2211 URL: https://issues.apache.org/jira/browse/SOLR-2211 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Tom Burton-West Priority: Minor The Lucene 3.x StandardTokenizer with UAX#29 support provides benefits for non-English tokenizing. Presently it can be invoked by using the StandardTokenizerFactory and setting the Version to 3.1. However, it would be useful to be able to use the improved unicode processing without necessarily including the ip address and email address processing of StandardAnalyzer. A FilterFactory that allowed the use of the StandardTokenizer with UAX#29 support on its own would be useful. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Build problems (sorry for the second double-post)
On Mon, Nov 1, 2010 at 1:13 PM, Erick Erickson erickerick...@gmail.com wrote: Sorry, sent the original to the old dev list. Shows you how long it's been since I originated a mail. It's been several weeks since I built Solr, so I removed all the trunk code, did a checkout and tried an ant build. The build starts out by giving a bunch of annoying warnings about not being able to find c:\ant\lib\xbean.jar, xercesImpl.jar, serializer.jar and others (yes, some of us are forever destined to work on Windows boxes). These might be warnings somehow related to ant's classpath; I think I get these... What version of ant, by the way? I'm also getting some test failures Which ones? Can you provide the information it gives you back, specifically the 'reproduce-with' command line? I know there have been emails flying back and forth about Maven etc. but I haven't paid much attention. Wait, are you using ant, or maven?! - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: Build problems (sorry for the second double-post)
Sorry, sent the original to the old dev list. Shows you how long it's been since I originated a mail. It's been several weeks since I built Solr, so I removed all the trunk code, did a checkout and tried an ant build. The build starts out by giving a bunch of annoying warnings about not being able to find c:\ant\lib\xbean.jar, xercesImpl.jar, serializer.jar and others (yes, some of us are forever destined to work on Windows boxes). These might be warnings somehow related to ant's classpath; I think I get these... What version of ant, by the way? These warnings don't indicate that anything is broken. It happens on ant 1.7.0 and ant 1.7.1 if you have multiple lib folders in ~/.ant/lib and/or you specified a folder with -lib on the command line. This is an ant bug; not sure where it comes from, but it tries to build a path from the jar file names of one folder together with the name of another folder and adds it to the classpath, producing incorrect path/file combinations. Javac simply complains about those incorrect entries. This breaks nothing. On Lucene's Hudson builds (it's FreeBSD) we have exactly the same problem, since we have a ~/.ant/lib on the machine. But it works fine, so no need to react to it :-) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Lucene PMC update
Daniel Naber left the PMC in May 2010, but is still listed on the website (and in committee-info.txt) as being a member of the PMC. Also, Robert Muir was added to the PMC recently, but is not in the LDAP PMC group; the PMC chair needs to run the following command on people, please: modify_committee.pl lucene -add=rmuir Thanks. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene PMC update
I'll take care of it. On Nov 1, 2010, at 10:30 AM, sebb wrote: Daniel Naber left the PMC in May 2010, but is still listed on the website (and in committee-info.txt) as being a member of the PMC. Also, Robert Muir was added to the PMC recently, but is not in the LDAP PMC group; the PMC chair needs to run the following command on people, please: modify_committee.pl lucene -add=rmuir Thanks. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Lucene-Solr-tests-only-trunk - Build # 867 - Failure
Build: http://hudson.zones.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/867/ 1 tests failed. REGRESSION: org.apache.solr.TestDistributedSearch.testDistribSearch Error Message: Some threads threw uncaught exceptions! Stack Trace: junit.framework.AssertionFailedError: Some threads threw uncaught exceptions! at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:878) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:844) at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:437) at org.apache.solr.SolrTestCaseJ4.tearDown(SolrTestCaseJ4.java:78) at org.apache.solr.BaseDistributedSearchTestCase.tearDown(BaseDistributedSearchTestCase.java:144) Build Log (for compile errors): [...truncated 8709 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Lucene-Solr-tests-only-trunk - Build # 869 - Failure
Build: http://hudson.zones.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/869/ 1 tests failed. REGRESSION: org.apache.solr.TestDistributedSearch.testDistribSearch Error Message: Some threads threw uncaught exceptions! Stack Trace: junit.framework.AssertionFailedError: Some threads threw uncaught exceptions! at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:878) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:844) at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:437) at org.apache.solr.SolrTestCaseJ4.tearDown(SolrTestCaseJ4.java:78) at org.apache.solr.BaseDistributedSearchTestCase.tearDown(BaseDistributedSearchTestCase.java:144) Build Log (for compile errors): [...truncated 8709 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Build problems (sorry for the second double-post)
On Mon, Nov 1, 2010 at 7:29 PM, Erick Erickson erickerick...@gmail.com wrote: Uwe: Thanks, I'll update the how to contribute page with your comments. Robert: I'm using ant. I could have been clearer about that. There is no mention of maven at all on the how to contribute page, and I'm playing the naive user role here because it's a natural role for me well, this is helpful, here is some explanation for each unique issue you have (I have Windows too, so some of it I see)
[junit] Testsuite: org.apache.solr.client.solrj.response.TestSpellCheckResponse
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.346 sec
[junit]
[junit] - Standard Error -
[junit] WARNING: best effort to remove C:\apache_trunk\trunk\solr\build\test-results\temp\6\solrtest-TestSpellCheckResponse-1288644520922\spellchecker\_4.cfs FAILED !
[junit] WARNING: best effort to remove C:\apache_trunk\trunk\solr\build\test-results\temp\6\solrtest-TestSpellCheckResponse-1288644520922\spellchecker FAILED !
[junit] WARNING: best effort to remove C:\apache_trunk\trunk\solr\build\test-results\temp\6\solrtest-TestSpellCheckResponse-1288644520922 FAILED !
[junit] - ---
This is just a warning; no test failed, only the best-effort removal of the spellchecker index failed. So, the problem here is https://issues.apache.org/jira/browse/SOLR-1877 (unclosed spellcheck reader). The solr base test classes (AbstractSolrTestCase, SolrTestCaseJ4) check that they can remove their temporary directories completely... on Windows, because the reader isn't closed, they can't do this, so they emit these warnings. In Lucene, we have a similar check in LuceneTestCase, except the test will actually fail, and it uses MockDirectoryWrapper so that the tests always act like they are on Windows regardless of the OS. It might be a good idea to make a MockDirectoryWrapperFactory, and use it for all Solr tests for these reasons (we can disable this pickiness for the two Solr tests, but at least it would be consistent on Windows and Linux). It's also handy if you want to emulate things like disk-full in tests...
* This looks more promising:
[junit] Testsuite: org.apache.solr.cloud.CloudStateUpdateTest
[junit] Testcase: testCoreRegistration(org.apache.solr.cloud.CloudStateUpdateTest): FAILED
[junit]
[junit] junit.framework.AssertionFailedError:
[junit] at org.apache.solr.cloud.CloudStateUpdateTest.testCoreRegistration(CloudStateUpdateTest.java:170)
This is a real test failure; it fails often in hudson too. This looks like https://issues.apache.org/jira/browse/SOLR-2159
**
[junit] Testsuite: org.apache.solr.velocity.VelocityResponseWriterTest
[junit] Testcase: testTemplateName(org.apache.solr.velocity.VelocityResponseWriterTest): Caused an ERROR
[junit] org.apache.log4j.Logger.setAdditivity(Z)V
[junit] java.lang.NoSuchMethodError: org.apache.log4j.Logger.setAdditivity(Z)V
Hmm, are you sure you got a clean checkout? A NoSuchMethodError is weird to see here; I don't see it. Other people have seen this, and somehow fixed it... we should get to the bottom of this / document whatever the fix is, at least!
*
[junit]
[junit] Testsuite: org.apache.solr.handler.TestReplicationHandler
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 46.863 sec
[junit]
[junit] - Standard Error -
[junit] 01/11/2010 06:48:51 ? org.apache.solr.handler.SnapPuller fetchLatestIndex
[junit] SEVERE: Master at: http://localhost:51343/solr/replication is not available. Index fetch failed. Exception: Connection refused: connect
This is just a noisy/crazy test and it often logs scary/severe errors for me. But as you see, it didn't fail. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
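The MockDirectoryWrapperFactory floated above might look roughly like this. A sketch under stated assumptions only: that Solr's DirectoryFactory extension point has open(String) as its abstract method (as in the 1.4/3.x code) and that MockDirectoryWrapper takes (Random, Directory) as in the Lucene test framework of the time:
{code}
import java.io.File;
import java.io.IOException;
import java.util.Random;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.MockDirectoryWrapper;
import org.apache.solr.core.DirectoryFactory;

// Hypothetical factory: every Directory the tests open behaves "like Windows"
// (files that are still open cannot be deleted), regardless of the actual OS.
public class MockDirectoryWrapperFactory extends DirectoryFactory {
  private final Random random = new Random();

  @Override
  public Directory open(String path) throws IOException {
    return new MockDirectoryWrapper(random, FSDirectory.open(new File(path)));
  }
}
{code}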
Re: Build problems (sorry for the second double-post)
Robert: Thanks for the time you put into this. I'll make a clean checkout in the morning and see if that one error goes away. I'll see if I can get to the bottom of some of these. After how everything just worked on my Mac, it's disconcerting to see these failures-that-aren't-failures on my Windows box... and training oneself to ignore warnings is just asking for trouble. Thanks again, Erick On Mon, Nov 1, 2010 at 8:23 PM, Robert Muir rcm...@gmail.com wrote: On Mon, Nov 1, 2010 at 7:29 PM, Erick Erickson erickerick...@gmail.com wrote: Uwe: Thanks, I'll update the how to contribute page with your comments. Robert: I'm using ant. I could have been clearer about that. There is no mention of maven at all on the how to contribute page, and I'm playing the naive user role here because it's a natural role for me well, this is helpful, here is some explanation for each unique issue you have (I have Windows too, so some of it I see)
[junit] Testsuite: org.apache.solr.client.solrj.response.TestSpellCheckResponse
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.346 sec
[junit]
[junit] - Standard Error -
[junit] WARNING: best effort to remove C:\apache_trunk\trunk\solr\build\test-results\temp\6\solrtest-TestSpellCheckResponse-1288644520922\spellchecker\_4.cfs FAILED !
[junit] WARNING: best effort to remove C:\apache_trunk\trunk\solr\build\test-results\temp\6\solrtest-TestSpellCheckResponse-1288644520922\spellchecker FAILED !
[junit] WARNING: best effort to remove C:\apache_trunk\trunk\solr\build\test-results\temp\6\solrtest-TestSpellCheckResponse-1288644520922 FAILED !
[junit] - ---
This is just a warning; no test failed, only the best-effort removal of the spellchecker index failed. So, the problem here is https://issues.apache.org/jira/browse/SOLR-1877 (unclosed spellcheck reader). The solr base test classes (AbstractSolrTestCase, SolrTestCaseJ4) check that they can remove their temporary directories completely... on Windows, because the reader isn't closed, they can't do this, so they emit these warnings. In Lucene, we have a similar check in LuceneTestCase, except the test will actually fail, and it uses MockDirectoryWrapper so that the tests always act like they are on Windows regardless of the OS. It might be a good idea to make a MockDirectoryWrapperFactory, and use it for all Solr tests for these reasons (we can disable this pickiness for the two Solr tests, but at least it would be consistent on Windows and Linux). It's also handy if you want to emulate things like disk-full in tests...
* This looks more promising:
[junit] Testsuite: org.apache.solr.cloud.CloudStateUpdateTest
[junit] Testcase: testCoreRegistration(org.apache.solr.cloud.CloudStateUpdateTest): FAILED
[junit]
[junit] junit.framework.AssertionFailedError:
[junit] at org.apache.solr.cloud.CloudStateUpdateTest.testCoreRegistration(CloudStateUpdateTest.java:170)
This is a real test failure; it fails often in hudson too. This looks like https://issues.apache.org/jira/browse/SOLR-2159
**
[junit] Testsuite: org.apache.solr.velocity.VelocityResponseWriterTest
[junit] Testcase: testTemplateName(org.apache.solr.velocity.VelocityResponseWriterTest): Caused an ERROR
[junit] org.apache.log4j.Logger.setAdditivity(Z)V
[junit] java.lang.NoSuchMethodError: org.apache.log4j.Logger.setAdditivity(Z)V
Hmm, are you sure you got a clean checkout? A NoSuchMethodError is weird to see here; I don't see it. Other people have seen this, and somehow fixed it... we should get to the bottom of this / document whatever the fix is, at least!
*
[junit]
[junit] Testsuite: org.apache.solr.handler.TestReplicationHandler
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 46.863 sec
[junit]
[junit] - Standard Error -
[junit] 01/11/2010 06:48:51 ? org.apache.solr.handler.SnapPuller fetchLatestIndex
[junit] SEVERE: Master at: http://localhost:51343/solr/replication is not available. Index fetch failed. Exception: Connection refused: connect
This is just a noisy/crazy test and it often logs scary/severe errors for me. But as you see, it didn't fail. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: jQuery and tabs in example (sorry for double posting)
Absotively! On Mon, Nov 1, 2010 at 10:12 AM, Erick Erickson erickerick...@gmail.com wrote: Got the old list in the 'to' field the first time, sorry. All: I recently had occasion to work with the Solr example code and VrW and figured out how to put in a tabbed display by letting jQuery do all the work, but that needed a more recent jQuery (I used 1.4.x). Since I'm fresh off that experience and can maybe remember what I just finished doing, do folks think it's worth a Jira or two (that I'd immediately take) for (1) upgrading the example code to jQuery 1.4.3, and (2) using the tabbing capabilities of 1.4 to display the simple, spatial and group-by links in a tabbed page to demonstrate? Let me know. Erick -- Lance Norskog goks...@gmail.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: jQuery and tabs in example
On Nov 1, 2010, at 12:33, Erick Erickson wrote: All: I recently had occasion to work with the Solr example code and VrW and figured out how to put in a tabbed display by letting jQuery do all the work, but that needed a more recent jQuery (I used 1.4.x). Since I'm fresh off that experience and can maybe remember what I just finished doing, do folks think it's worth a Jira or two (that I'd immediately take) for (1) Upgrading the example code to jQuery 1.4.3 +1 (2) Using the tabbing capabilities of 1.4 to display the simple, spatial and group-by links in a tabbed page to demonstrate? I'm not too fond of the tabbed way to demonstrate these features. Rather, we could create separate layouts and/or browse.vm'ish templates for each different example. Otherwise, we end up with a UI that demonstrates everything all at once and is too cluttered to be fun to show off. Check out how it looks on trunk with the work Grant has done (good work, but getting a bit cluttered and needing some streamlining, IMO). Erik - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (SOLR-2210) Provide solr FilterFactory for Lucene ICUTokenizer
[ https://issues.apache.org/jira/browse/SOLR-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated SOLR-2210: -- Attachment: SOLR-2210.patch Here's a start: it makes an analysis-extras contrib with all the build logic, and factories for the ICU filters. Still todo: add support for custom normalization and custom tokenizer config, plus filters for smart Chinese and Stempel. But I think it's ok to commit this as-is and improve it in svn. Provide solr FilterFactory for Lucene ICUTokenizer -- Key: SOLR-2210 URL: https://issues.apache.org/jira/browse/SOLR-2210 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Tom Burton-West Priority: Minor Attachments: SOLR-2210.patch The Lucene ICUTokenizer provides many benefits for multilingual tokenizing. There should be an ICUFilterFactory so that it can be used from Solr. There are probably some issues in terms of passing configuration parameters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2212) NoMergePolicy class does not load
[ https://issues.apache.org/jira/browse/SOLR-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927309#action_12927309 ] Lance Norskog commented on SOLR-2212: - I tested this in trunk/solr/example and branch_3x/solr/example. I set the MergePolicy in solrconfig.xml to the NoMergePolicy class with this line: {code} <mergePolicy class="org.apache.lucene.index.NoMergePolicy"/> {code} When I start Solr I get the following stack trace:
{code}
Nov 1, 2010 10:43:40 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error instantiating class: 'org.apache.lucene.index.NoMergePolicy'
    at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:432)
    at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:83)
    at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:197)
    at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:399)
    at org.apache.solr.core.SolrCore.init(SolrCore.java:550)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:660)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:412)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:294)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:243)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:86)
    at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:662)
    at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
    at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1250)
    at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:517)
    at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:467)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
    at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
    at org.mortbay.jetty.Server.doStart(Server.java:224)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.mortbay.start.Main.invokeMain(Main.java:194)
    at org.mortbay.start.Main.start(Main.java:534)
    at org.mortbay.start.Main.start(Main.java:441)
    at org.mortbay.start.Main.main(Main.java:119)
Caused by: java.lang.InstantiationException: org.apache.lucene.index.NoMergePolicy
    at java.lang.Class.newInstance0(Class.java:340)
    at java.lang.Class.newInstance(Class.java:308)
    at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:429)
    ... 34 more
{code}
NoMergePolicy class does not load - Key: SOLR-2212 URL: https://issues.apache.org/jira/browse/SOLR-2212 Project: Solr Issue Type: Bug Components: multicore Affects Versions: 3.1, 4.0 Reporter: Lance Norskog Solr cannot use the Lucene NoMergePolicy class. It will not instantiate correctly when loading the core. Other MergePolicy classes work, including the BalancedSegmentMergePolicy. This is in trunk and 3.x. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
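The 'Caused by: java.lang.InstantiationException' line points at the likely cause: SolrResourceLoader instantiates the configured class reflectively, which requires a public no-arg constructor, while NoMergePolicy (as of Lucene 3.x) only exposes static singleton instances (NO_COMPOUND_FILES, COMPOUND_FILES) behind a private constructor. A minimal sketch of that failure mode, outside Solr:
{code}
// Requires only a Lucene 3.x/4.0 jar on the classpath.
public class NoMergePolicyRepro {
    public static void main(String[] args) throws Exception {
        Class<?> c = Class.forName("org.apache.lucene.index.NoMergePolicy");
        // NoMergePolicy has no nullary constructor, so newInstance() throws
        // java.lang.InstantiationException - the same root cause as above.
        Object mp = c.newInstance();
        System.out.println(mp);
    }
}
{code}
A fix on the Solr side would presumably need to special-case policies that expose singletons, or ship a trivially instantiable wrapper policy.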