Re: [VOTE] release 3.3
+1 Tommaso

2011/6/23 Robert Muir rcm...@gmail.com:
> Artifacts here: http://s.apache.org/lusolr33rc0
> Working release notes here:
> http://wiki.apache.org/lucene-java/ReleaseNote33
> http://wiki.apache.org/solr/ReleaseNote33
> I ran the automated release test script in trunk/dev-tools/scripts/smokeTestRelease.py, and ran 'ant test' at the top level 50 times on Windows. Here is my +1

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [VOTE] release 3.3
+1

I did the following:
- compared the Solr and Lucene binary .zip and .tgz archives' contents for any differences (other than line endings)
- skimmed Changes.html for generation problems
- looked at random pages from each module's javadocs
- ran the Lucene demo, indexed and searched
- ran the Solr example server, indexed and searched
- eyeballed all modules' Maven artifacts and sanity-checked their POMs
- ran all tests from the Solr and Lucene source tarballs, separately

Two non-release-blocking nits:

1. In the Solr source tarball, solr/example/README.txt recommends running './post.sh *.xml' from solr/example/exampledocs/, but post.sh does not have executable permissions. In the binary tarball, however, post.sh does have executable permissions.

2. I checked the source for references to older versions, and I found the following; I think these just point to a missing item in the release todo (post-branching), and should not block the release:

./lucene/contrib/analyzers/common/src/java/org/apache/lucene/analysis/fr/FrenchStemFilter.java: @Deprecated // TODO remove in 3.2 (present twice in this file)
./lucene/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: /** @deprecated remove all this test mode code in lucene 3.2! */
./lucene/contrib/analyzers/common/src/java/org/apache/lucene/analysis/br/BrazilianAnalyzer.java: // TODO make this private in 3.1 (present twice in this file)
./lucene/contrib/demo/src/java/org/apache/lucene/demo/IndexFiles.java: /** Index all text files under a directory. See http://lucene.apache.org/java/3_1/demo.html. */
./lucene/contrib/demo/src/java/org/apache/lucene/demo/IndexFiles.java: + See http://lucene.apache.org/java/3_1/demo.html for details.

Steve

-----Original Message-----
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Thursday, June 23, 2011 4:18 PM
To: dev@lucene.apache.org
Subject: [VOTE] release 3.3

> [...]
[jira] [Created] (SOLR-2619) two sfields in geospatial search
two sfields in geospatial search

Key: SOLR-2619
URL: https://issues.apache.org/jira/browse/SOLR-2619
Project: Solr
Issue Type: Wish
Components: clients - php
Affects Versions: 3.2
Environment: Using with Drupal
Reporter: jose rodriguez
Fix For: 3.2

Is it possible to create a query with two sfield parameters (geospatial search)? That is, with two different pt and d values, one for each field. If I need a from - to search, then I need fields around the 'from' coordinate and around the 'to' coordinate. Thanks.

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
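For what it's worth, Solr's spatial filter accepts its parameters as local params, so two geofilt filter queries, each with its own sfield, pt, and d, can be combined in one request. A rough sketch of what the question seems to be asking for (the field names from_coord/to_coord and the coordinates are made up for illustration):

```text
q=*:*
  &fq={!geofilt sfield=from_coord pt=40.27,-74.00 d=10}
  &fq={!geofilt sfield=to_coord   pt=41.50,-73.20 d=10}
```

Each fq is evaluated independently, so documents must fall within d km of both points to match.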
[jira] [Commented] (SOLR-2382) DIH Cache Improvements
[ https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054263#comment-13054263 ]

Noble Paul commented on SOLR-2382:
----------------------------------

bq. cacheInit() in EntityProcessorBase specifically passes only the parameters that apply to the current situation

It doesn't matter. It can use any params that are relevant to it. Anyway, you can't define what params are required for a future DIHCache impl. Look at a Transformer implementation: it can read anything it wants. The cache should be initialized the same way.

DIH Cache Improvements
----------------------

Key: SOLR-2382
URL: https://issues.apache.org/jira/browse/SOLR-2382
Project: Solr
Issue Type: New Feature
Components: contrib - DataImportHandler
Reporter: James Dyer
Priority: Minor
Attachments: SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch

Functionality:
1. Provide a pluggable caching framework for DIH so that users can choose a cache implementation that best suits their data and application.
2. Provide a means to temporarily cache a child entity's data without needing to create a special cached implementation of the entity processor (such as CachedSqlEntityProcessor).
3. Provide a means to write the final (root entity) DIH output to a cache rather than to Solr. Then provide a way for a subsequent DIH call to use the cache as an entity input. Also provide the ability to do delta updates on such persistent caches.
4. Provide the ability to partition data across multiple caches that can then be fed back into DIH and indexed either to varying Solr shards, or to the same core in parallel.

Use Cases:
1. We needed a flexible, scalable way to temporarily cache child-entity data prior to joining to parent entities.
   - Using SqlEntityProcessor with child entities can cause an n+1 select problem.
   - CachedSqlEntityProcessor only supports an in-memory HashMap as a caching mechanism and does not scale.
   - There is no way to cache non-SQL inputs (e.g. flat files, XML, etc.).
2. We needed the ability to gather data from long-running entities with a process that runs separately from our main indexing process.
3. We wanted the ability to do a delta import of only the entities that changed.
   - Lucene/Solr requires entire documents to be re-indexed, even if only a few fields changed.
   - Our data comes from 50+ complex SQL queries and/or flat files.
   - We do not want to incur the overhead of re-gathering all of this data if only one entity's data changed.
   - Persistent DIH caches solve this problem.
4. We want the ability to index several documents in parallel (using 1.4.1, which did not have the threads parameter).
5. In the future, we may need to use shards, creating a need to easily partition our source data into shards.

Implementation Details:
1. De-couple EntityProcessorBase from caching.
   - Created a new interface, DIHCache, with two implementations:
     - SortedMapBackedCache - an in-memory cache, used as the default with CachedSqlEntityProcessor (now deprecated).
     - BerkleyBackedCache - a disk-backed cache, dependent on bdb-je, tested with je-4.1.6.jar.
       - NOTE: the existing Lucene contrib db project uses je-3.3.93.jar. I believe this may be incompatible due to generics usage.
       - NOTE: I did not modify the ant script to automatically get this jar, so to use or evaluate this patch, download bdb-je from http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html
2. Allow entity processors to take a cacheImpl parameter to cause the entity data to be cached (see EntityProcessorBase and DIHCacheProperties).
3. Partially de-couple SolrWriter from DocBuilder.
   - Created a new interface, DIHWriter, with two implementations:
     - SolrWriter (refactored)
     - DIHCacheWriter (allows DIH to write ultimately to a cache).
4. Create a new entity processor, DIHCacheProcessor, which reads a persistent cache as DIH entity input.
5. Support a partition parameter with both DIHCacheWriter and DIHCacheProcessor to allow for easy partitioning of source entity data.
6. Change the semantics of entity.destroy().
   - Previously, it was called on each iteration of DocBuilder.buildDocument().
   - Now it does one-time cleanup tasks (like closing or deleting a disk-backed cache) once the entity processor is completed.
   - The only out-of-the-box entity processor that previously implemented destroy() was LineEntityProcessor, so this is not a very invasive change.

General Notes:
We are near completion in converting our search functionality from a legacy search engine to Solr. However, I found that DIH did not support caching to the level of
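To make the "pluggable cache" idea above concrete, here is a minimal, self-contained sketch of what such an interface and an in-memory implementation could look like. This is a hypothetical illustration, not the actual interface from the SOLR-2382 patch, which may differ in names and signatures.

```java
import java.util.*;

// Hypothetical sketch of a pluggable DIH cache; the real DIHCache
// interface in the SOLR-2382 patch may look different.
interface DIHCache {
    void add(Object key, Map<String, Object> row);  // cache one entity row under a key
    List<Map<String, Object>> lookup(Object key);   // retrieve all rows cached for a key
    void close();                                   // one-time cleanup (e.g. delete disk files)
}

// In-memory variant, analogous in spirit to SortedMapBackedCache.
class SortedMapCacheSketch implements DIHCache {
    private final SortedMap<Object, List<Map<String, Object>>> data = new TreeMap<>();

    public void add(Object key, Map<String, Object> row) {
        // Group rows by key, as a child entity can yield many rows per parent key.
        data.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
    }

    public List<Map<String, Object>> lookup(Object key) {
        return data.getOrDefault(key, Collections.emptyList());
    }

    public void close() {
        data.clear();
    }
}
```

A disk-backed implementation (like the BerkleyBackedCache described above) would implement the same interface, which is what lets entity processors stay unaware of the storage mechanism.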
[jira] [Issue Comment Edited] (SOLR-2382) DIH Cache Improvements
[ https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054263#comment-13054263 ]

Noble Paul edited comment on SOLR-2382 at 6/24/11 6:42 AM:
----------------------------------------------------------

bq. cacheInit() in EntityProcessorBase specifically passes only the parameters that apply to the current situation

It doesn't matter. It can use any params that are relevant to it. Anyway, you can't define what params are required for a future DIHCache impl. Look at a Transformer implementation: it can read anything it wants. The cache should be initialized the same way. Why should the DocBuilder even be aware of DIHCache? Should it not be kept local to the EntityProcessor?

was (Author: noble.paul): [...]

DIH Cache Improvements
----------------------

Key: SOLR-2382
URL: https://issues.apache.org/jira/browse/SOLR-2382
Project: Solr
Issue Type: New Feature
Components: contrib - DataImportHandler
Reporter: James Dyer
Priority: Minor
Re: [VOTE] release 3.3
I checked the clustering contrib and went through the Solr example (on Ubuntu). One thing I noticed: we have a duplicated log4j*.jar in the distribution; the one in the clustering contrib is not needed, in fact, because we use slf4j for logging anyway (and that jar is picked up from the war's WEB-INF/lib). I'll file an issue to remove it.

Dawid

On Fri, Jun 24, 2011 at 8:15 AM, Steven A Rowe sar...@syr.edu wrote:
> +1 I did the following:
> - compared the Solr and Lucene binary .zip and .tgz archives' contents for any differences (other than line endings)
> [...]
> - ran all tests from the Solr and Lucene source tarballs, separately
> [...]
[jira] [Created] (SOLR-2620) Remove log4j jar from the clustering contrib (uses slf4j).
Remove log4j jar from the clustering contrib (uses slf4j).
----------------------------------------------------------

Key: SOLR-2620
URL: https://issues.apache.org/jira/browse/SOLR-2620
Project: Solr
Issue Type: Improvement
Components: contrib - Clustering
Affects Versions: 3.3
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
Fix For: 3.3, 4.0
Re: [VOTE] release 3.3
Is there a code freeze on 3x or can I apply SOLR-2620 to it?

Dawid

On Fri, Jun 24, 2011 at 8:51 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:
> I checked the clustering contrib, went through the Solr example (on ubuntu). One thing I noticed: we have duplicated log4j*.jar in the distribution [...] I'll file an issue to remove it.
[jira] [Updated] (SOLR-2620) Remove log4j jar from the clustering contrib (uses slf4j).
[ https://issues.apache.org/jira/browse/SOLR-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss updated SOLR-2620:
------------------------------
Fix Version/s: (was: 4.0)

Remove log4j jar from the clustering contrib (uses slf4j).
Key: SOLR-2620
URL: https://issues.apache.org/jira/browse/SOLR-2620
Project: Solr
Issue Type: Improvement
Components: contrib - Clustering
Affects Versions: 3.3
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
Fix For: 3.3
[jira] [Commented] (LUCENE-3229) Overlaped SpanNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054270#comment-13054270 ]

Paul Elschot commented on LUCENE-3229:
--------------------------------------

Try and set 3.4 as the fix version for this; 3.3 is already on the way out. It might also help to add some text for a CHANGES.txt entry.

Overlapped SpanNearQuery
------------------------

Key: LUCENE-3229
URL: https://issues.apache.org/jira/browse/LUCENE-3229
Project: Lucene - Java
Issue Type: Bug
Components: core/search
Affects Versions: 3.1
Environment: Windows XP, Java 1.6
Reporter: ludovic Boutros
Priority: Minor
Attachments: LUCENE-3229.patch, LUCENE-3229.patch, SpanOverlap.diff, SpanOverlap2.diff, SpanOverlapTestUnit.diff

While using span queries, I think I've found a little bug. With a document like this (from the TestNearSpansOrdered unit test):

w1 w2 w3 w4 w5

if I search with this span query:

spanNear([spanNear([field:w3, field:w5], 1, true), field:w4], 0, true)

the above document is returned, and I think it should not be, because 'w4' is not after 'w5'. The two spans are not ordered, because there is an overlap. I will add a test patch to the TestNearSpansOrdered unit test, and a patch to solve this issue too. Basically, it modifies the two docSpansOrdered functions to make sure that the spans do not overlap.
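The report boils down to which ordering predicate is used on span boundaries. Below is a standalone sketch of the two checks, using [start, end) token positions; Lucene's actual NearSpansOrdered.docSpansOrdered implementation differs in detail, so treat this as an illustration of the bug, not the real code.

```java
// Illustrative sketch of span ordering checks; not the actual Lucene code.
final class SpanOrderSketch {
    // Lenient check: the first span merely has to start no later than the
    // second. Overlapping spans can still pass, which is the reported bug.
    static boolean orderedLenient(int start1, int end1, int start2, int end2) {
        return start1 < start2 || (start1 == start2 && end1 < end2);
    }

    // Strict check: the first span must end before the second begins,
    // so overlapping spans are rejected (the behavior the patch aims for).
    static boolean orderedStrict(int start1, int end1, int start2, int end2) {
        return end1 <= start2;
    }
}
```

For the document "w1 w2 w3 w4 w5", the inner spanNear(w3, w5) match covers positions [2, 5) and w4 covers [3, 4): the lenient check accepts that pair, while the strict check rejects it.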
Re: [VOTE] release 3.3
There is no code freeze anywhere... in my opinion, if you find little things to fix, just commit! (And backport also to http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_3/.)

I named the RC with the revision number; if we decide to go with it anyway, we just use that rev for svn tagging. But if lots of good things are found/fixed, then 'svn update' + 'ant prepare-release' + sftp to create a second release candidate is no big deal.

On Fri, Jun 24, 2011 at 2:55 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:
> Is there a code freeze on 3x or can I apply SOLR-2620 to it?
>
> Dawid
> [...]
Re: [VOTE] release 3.3
Ok, thanks Robert. This is a trivial correction. I didn't find log4j references anywhere under contrib/clustering (just to make sure), so I think it won't break anything.

Dawid

On Fri, Jun 24, 2011 at 9:03 AM, Robert Muir rcm...@gmail.com wrote:
> there is no code freeze anywhere... in my opinion, if you find little
> things to fix, just commit! (and backport also to
> http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_3/)
> [...]
[jira] [Resolved] (SOLR-2620) Remove log4j jar from the clustering contrib (uses slf4j).
[ https://issues.apache.org/jira/browse/SOLR-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss resolved SOLR-2620.
-------------------------------
Resolution: Fixed

Applied to the 3x and 3_3 branches.

Remove log4j jar from the clustering contrib (uses slf4j).
Key: SOLR-2620
URL: https://issues.apache.org/jira/browse/SOLR-2620
Project: Solr
Issue Type: Improvement
Components: contrib - Clustering
Affects Versions: 3.3
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
Fix For: 3.3
[jira] [Updated] (LUCENE-3206) FST package API refactoring
[ https://issues.apache.org/jira/browse/LUCENE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss updated LUCENE-3206:
--------------------------------
Fix Version/s: (was: 3.3)
               3.4

Moving to 3.4, if at all :)

FST package API refactoring
---------------------------

Key: LUCENE-3206
URL: https://issues.apache.org/jira/browse/LUCENE-3206
Project: Lucene - Java
Issue Type: Improvement
Components: core/FSTs
Affects Versions: 3.2
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
Fix For: 3.4, 4.0
Attachments: LUCENE-3206.patch

The current API is still marked @experimental, so I think there's still time to fiddle with it. I've been using the current API for some time and I do have some ideas for improvement. This is a placeholder for these -- I'll post a patch once I have a working proof of concept.
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 9031 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9031/

1 tests failed.

REGRESSION: org.apache.lucene.index.TestIndexReaderReopen.testThreadSafety

Error Message:
Error occurred in thread Thread-95: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/test/7/test4767747692tmp/_e_4.frq (Too many open files in system)

Stack Trace:
junit.framework.AssertionFailedError: Error occurred in thread Thread-95: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/test/7/test4767747692tmp/_e_4.frq (Too many open files in system)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1425)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1343)
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/test/7/test4767747692tmp/_e_4.frq (Too many open files in system)
    at org.apache.lucene.index.TestIndexReaderReopen.testThreadSafety(TestIndexReaderReopen.java:822)

Build Log (for compile errors):
[...truncated 3546 lines...]
Re: [VOTE] release 3.3
On Fri, Jun 24, 2011 at 2:15 AM, Steven A Rowe sar...@syr.edu wrote: +1 I did the following: - compared the Solr and Lucene binary .zip and .tgz archives' contents for any differences (other than line endings) - skimmed Changes.html for generation problems - looked at random pages from each module's javadocs - ran the Lucene demo, indexed and searched - ran the Solr example server, indexed and searched - eyeballed all modules' Maven artifacts and sanity-checked their POMs - ran all tests from the Solr and Lucene source tarballs, separately

Thanks Steven, these look like good checks, and I think it would be great if we could add as many of these as possible to the 'smokeTestRelease.py' script in dev-tools/scripts (e.g. start up Solr, index the example docs, and do a search). I could also imagine that sometime soon we might even want to have this release tester run against nightly builds or something, so we catch problems continuously.

Two non-release-blocking nits: 1. In the Solr source tarball, solr/example/README.txt recommends using the command ./post.sh *.xml from solr/example/exampledocs/, but post.sh does not have executable permissions. In the binary tarball, however, post.sh has executable permissions. 2. I checked the source for references to older versions, and I found the following; I think these just point to a missing item in the release todo (post-branching), and should not block the release:

I took care of these: the deprecations are already nuked in trunk, and I don't think we achieve much by nuking them in a 3.x minor release. As for the demo links, they were completely broken, so I replaced them with a description of what the code is doing (which seems more useful).
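As an aside, the executable-bit nit above is exactly the kind of thing a release smoke tester could check automatically. A minimal sketch in Java of such a check (a hypothetical helper, not part of the actual smokeTestRelease.py, which is a Python script):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Set;

/** Sketch of an extra release smoke-test check: verify that a shipped
 *  script, such as solr/example/exampledocs/post.sh, carries the
 *  owner-execute bit. Hypothetical helper, not existing release tooling. */
public class SmokeCheck {
    /** Pure check on a rwxrwxrwx-style permission string, e.g. "rw-r--r--". */
    static boolean ownerCanExecute(String unixPerms) {
        // position 2 of "rwx......" is the owner execute flag
        return unixPerms.length() >= 3 && unixPerms.charAt(2) == 'x';
    }

    /** Convenience wrapper for a real file on a POSIX file system. */
    static boolean ownerCanExecute(Path script) throws Exception {
        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(script);
        return perms.contains(PosixFilePermission.OWNER_EXECUTE);
    }
}
```

A smoke tester would run the check once against the extracted source tarball and once against the binary tarball, flagging any disagreement.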
[jira] [Commented] (SOLR-2617) Support git.
[ https://issues.apache.org/jira/browse/SOLR-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054302#comment-13054302 ] Stefan Trcek commented on SOLR-2617: For .gitignore, I preferred to generate it automatically, assuming the git repo is git-svn based; however, that didn't work. As a git mirror is sufficient for making patches, I suggest adding .gitignore to the repo, as this enables the use of a git mirror without git-svn. Support git. Key: SOLR-2617 URL: https://issues.apache.org/jira/browse/SOLR-2617 Project: Solr Issue Type: New Feature Components: Build Reporter: David Smiley Apache has git mirrors of Lucene/Solr, as well as many other projects. Presently, if git is used to check out Lucene/Solr, there are only a couple of small problems to address, but it otherwise works fine. * a .gitignore is needed. * empty directories need to be dealt with (git doesn't support them)
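For illustration, a minimal .gitignore of the kind being proposed would cover the ant build output and common IDE metadata. The entries below are assumptions for the sake of example, not the list actually committed for SOLR-2617:

```gitignore
# build output of the ant build (illustrative entries)
build/
dist/
*.class
# IDE metadata
.classpath
.project
.settings/
*.iml
```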
Re: [VOTE] release 3.3
On Fri, Jun 24, 2011 at 1:48 AM, Robert Muir rcm...@gmail.com wrote: Artifacts here: http://s.apache.org/lusolr33rc0 working release notes here: http://wiki.apache.org/lucene-java/ReleaseNote33 http://wiki.apache.org/solr/ReleaseNote33 Thanks for leading this release Robert. I compared SolrJ's pom.xml with past releases and I noticed that since v3.1.0, SolrJ has a dependency on lucene-core. I'm not sure why SolrJ needs that dependency. Also in v3.1, SolrJ had a test dependency on solr-test-framework but it was removed in v3.2. I have been missing from the action since Solr v1.4 so I'm not sure if those changes were intentional or mistakes. -- Regards, Shalin Shekhar Mangar.
Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 9031 - Failure
I fixed this test; I had inadvertently allowed nightly runs to give a large multiplier to its 'n', which is not just the number of docs but also controls the number of threads/readers used in this test... that's why it's been creating so many open files on Hudson lately. On Fri, Jun 24, 2011 at 3:39 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9031/ 1 tests failed. REGRESSION: org.apache.lucene.index.TestIndexReaderReopen.testThreadSafety
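The fix described above amounts to decoupling what the multiplier scales: document counts may grow freely under nightly runs, but the thread/reader count must stay bounded or the OS runs out of file handles. A sketch of the idea (hypothetical base values and cap, not LuceneTestCase's actual code):

```java
/** Sketch of scaling test size under a nightly multiplier: docs scale
 *  linearly, but the number of threads/readers is capped so the test
 *  cannot exhaust OS file descriptors. Values are illustrative. */
public class NightlyScaling {
    static final int MAX_THREADS = 8; // assumed cap on concurrent readers

    static int numDocs(int base, int multiplier) {
        return base * multiplier; // more docs only costs time, not handles
    }

    static int numThreads(int base, int multiplier) {
        // each thread holds open index files, so cap this dimension
        return Math.min(base * multiplier, MAX_THREADS);
    }
}
```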
[jira] [Commented] (SOLR-2610) Add an option to delete index through CoreAdmin UNLOAD action
[ https://issues.apache.org/jira/browse/SOLR-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054316#comment-13054316 ] Shalin Shekhar Mangar commented on SOLR-2610: - bq. But you might want to (in fact, I do this). If you are really done with a core, if you really want to remove it, what do you need the config files around for anymore? I was approaching this particular issue more from the angle of making it useful for SolrCloud. I can see how deleting configs can be useful to some people, but is it worth introducing such an inconsistency, i.e. you can delete config but cannot add it back? Anyway, it is best handled via a separate issue. Add an option to delete index through CoreAdmin UNLOAD action - Key: SOLR-2610 URL: https://issues.apache.org/jira/browse/SOLR-2610 Project: Solr Issue Type: Improvement Components: multicore Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 3.3, 4.0 Attachments: SOLR-2610-branch3x.patch, SOLR-2610.patch Right now, one can unload a Solr Core but the index files are left behind and consume disk space. We should have an option to delete the index when unloading a core.
[jira] [Commented] (SOLR-2610) Add an option to delete index through CoreAdmin UNLOAD action
[ https://issues.apache.org/jira/browse/SOLR-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054320#comment-13054320 ] Shalin Shekhar Mangar commented on SOLR-2610: - {quote} I can think of a corollary core action I'd like to see – the ability on a core RELOAD to entirely delete the index from a core and replace it with a fresh empty index that will start building at segment _0. I would do this to my build core before using it, and later after swapping it with the live core and ensuring it's good, to free up disk space. {quote} Shawn, that is not a use-case for RELOAD. The idea behind it is to reload an existing core's index with updated configuration changes and swap it with the existing core without causing downtime. It seems like your use-case is handled well with the stock CREATE, SWAP and UNLOAD+deleteIndex?
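The UNLOAD+deleteIndex semantics under discussion can be sketched as follows. This is a hypothetical helper, not Solr's actual CoreAdminHandler code; it also makes visible the asymmetry raised above: only the index directory is reclaimed, while config files are never touched.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of UNLOAD+deleteIndex: given a core's file listing, decide which
 *  paths should be removed on unload. Hypothetical helper for illustration. */
public class CoreUnload {
    static List<String> pathsToDelete(List<String> coreFiles, boolean deleteIndex) {
        List<String> doomed = new ArrayList<>();
        if (deleteIndex) {
            for (String f : coreFiles) {
                if (f.startsWith("data/index/")) {
                    doomed.add(f); // index files are removed to reclaim disk
                }
            }
        }
        return doomed; // conf/ is never touched -- the asymmetry noted above
    }
}
```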
[jira] [Commented] (LUCENE-3234) Provide limit on phrase analysis in FastVectorHighlighter
[ https://issues.apache.org/jira/browse/LUCENE-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054326#comment-13054326 ] Robert Muir commented on LUCENE-3234: - Oh I see, I think I'm nervous about testRepeatedTerms too. Maybe we can comment it out and just mention it's more of a benchmark? The problem could be that the test is timing-based... in general a machine could suddenly get busy at any time, especially since we run many tests in parallel, so I'm worried it could intermittently fail. Provide limit on phrase analysis in FastVectorHighlighter - Key: LUCENE-3234 URL: https://issues.apache.org/jira/browse/LUCENE-3234 Project: Lucene - Java Issue Type: Improvement Affects Versions: 2.9.4, 3.0.3, 3.1, 3.2, 3.3 Reporter: Mike Sokolov Assignee: Koji Sekiguchi Fix For: 3.4, 4.0 Attachments: LUCENE-3234.patch, LUCENE-3234.patch With larger documents, FVH can spend a lot of time trying to find the best-scoring snippet as it examines every possible phrase formed from matching terms in the document. If one is willing to accept less-than-perfect scoring by limiting the number of phrases that are examined, substantial speedups are possible. This is analogous to the Highlighter limit on the number of characters to analyze. The patch includes an artificial test case that shows a 1000x speedup. In a more normal test environment, with English documents and random queries, I am seeing speedups of around 3-10x when setting phraseLimit=1, which has the effect of selecting the first possible snippet in the document. Most of our sites operate in this way (just show the first snippet), so this would be a big win for us. With phraseLimit = -1, you get the existing FVH behavior. At larger values of phraseLimit, you may not get substantial speedup in the normal case, but you do get the benefit of protection against blow-up in pathological cases.
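The phraseLimit idea described in LUCENE-3234 above boils down to capping the candidate-phrase loop: stop scoring once a limit is hit, trading the best-scoring snippet for bounded work, with -1 keeping the exhaustive behavior. A sketch of that control flow (hypothetical code, not the actual FastVectorHighlighter patch):

```java
import java.util.List;

/** Sketch of phraseLimit: bound how many candidate phrases get examined.
 *  phraseLimit = -1 means unlimited (the pre-patch, exhaustive behavior). */
public class PhraseLimiter {
    static int countExamined(List<String> candidatePhrases, int phraseLimit) {
        int examined = 0;
        for (String phrase : candidatePhrases) {
            if (phraseLimit >= 0 && examined >= phraseLimit) {
                break; // accept a possibly worse snippet; cost stays bounded
            }
            examined++; // score this phrase here (scoring omitted)
        }
        return examined;
    }
}
```

With phraseLimit=1 this degenerates to taking the first possible snippet, which matches the 3-10x speedups reported above.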
Re: [VOTE] release 3.3
+1 I built PyLucene from the Lucene 3.3 sources, fixed a bug due to FieldComparator becoming generic, and all tests passed. Andi..
[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
[ https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054336#comment-13054336 ] Robert Muir commented on LUCENE-3235: - Mike, I installed 1.5.0_22 (amd64) on my linux machine, and I can't reproduce there either (I ran like 500 iterations). Maybe my hardware isn't concurrent enough? Or maybe you should un-overclock? :) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug Key: LUCENE-3235 URL: https://issues.apache.org/jira/browse/LUCENE-3235 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Not sure what's going on yet... but under Java 1.6 it seems not to hang, while under Java 1.5 it hangs fairly easily, on Linux. Java is 1.5.0_22. I suspect this is relevant: http://stackoverflow.com/questions/3292577/is-it-possible-for-concurrenthashmap-to-deadlock which refers to this JVM bug http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6865591 which then refers to this one http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6822370 It looks like that last bug was fixed in Java 1.6 but not 1.5.
[jira] [Commented] (LUCENE-3230) Make FSDirectory.fsync() public and static
[ https://issues.apache.org/jira/browse/LUCENE-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054348#comment-13054348 ] Michael McCandless commented on LUCENE-3230: OK, I think we should close this issue and open another (to switch to syncing the actual IOs we opened)... this will be a challenge for Lucene though. Make FSDirectory.fsync() public and static -- Key: LUCENE-3230 URL: https://issues.apache.org/jira/browse/LUCENE-3230 Project: Lucene - Java Issue Type: New Feature Components: core/store Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 3.3, 4.0 I find FSDirectory.fsync() (today a protected instance method) very useful as a utility to sync() files. I'd like to create a FSDirectory.sync() utility which contains the exact same impl as FSDir.fsync(), and have the latter call it. We can have it part of IOUtils too, as it's a completely standalone utility. I would get rid of FSDir.fsync() if it wasn't protected (as if encouraging people to override it). I doubt anyone really overrides it (our core Directories don't). Also, while reviewing the code, I noticed that if an IOE occurs, the code sleeps for 5 msec. If an InterruptedException occurs then, it immediately throws ThreadIE, completely ignoring the fact that it slept due to the IOE. Shouldn't we at least pass IOE.getMessage() on to ThreadIE? The patch is trivial, so I'd like to get some feedback before I post it.
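The standalone sync utility proposed in the issue, including the point about not discarding the IOE's message on interruption, might look roughly like this. This is a sketch under assumptions (retry count, back-off) and not FSDirectory's actual implementation:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Sketch of a standalone fsync utility: retry briefly on IOException, and
 *  if interrupted while backing off, preserve the IOE context rather than
 *  silently dropping it. Hypothetical code, not FSDirectory.fsync(). */
public class SyncUtil {
    public static void sync(Path file) throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt < 5; attempt++) {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                ch.force(true); // flush file data and metadata to stable storage
                return;
            } catch (IOException ioe) {
                last = ioe;
                try {
                    Thread.sleep(5); // brief back-off before retrying
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    // keep the original IOE's message instead of discarding it
                    throw new IOException("interrupted while retrying sync: "
                            + ioe.getMessage(), ioe);
                }
            }
        }
        throw last; // all attempts failed
    }
}
```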
[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
[ https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054349#comment-13054349 ] Michael McCandless commented on LUCENE-3235: VERY interesting! Is anyone able to repro this hang besides me...?
[jira] [Commented] (LUCENE-3232) Move MutableValues to Common Module
[ https://issues.apache.org/jira/browse/LUCENE-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054350#comment-13054350 ] Michael McCandless commented on LUCENE-3232: OK, this sounds like a good plan... if we can get FQs factored out soonish then we can simply fix the grouping module to use that (ie, we don't need the common module to hold the ValueSource, etc.). I guess we keep the name common for now. Maybe as we slurp in more stuff from Solr I'll like the name better :) Move MutableValues to Common Module --- Key: LUCENE-3232 URL: https://issues.apache.org/jira/browse/LUCENE-3232 Project: Lucene - Java Issue Type: Sub-task Components: core/search Reporter: Chris Male Fix For: 4.0 Attachments: LUCENE-3232.patch, LUCENE-3232.patch Solr makes use of the MutableValue* series of classes to improve performance of grouping by FunctionQuery (I think). As such they are used in ValueSource implementations. Consequently we need to move these classes in order to move the ValueSources. As Yonik pointed out, these classes have use beyond just FunctionQuerys and might be used by both Solr and other modules. However I don't think they belong in Lucene core, since they aren't really related to search functionality. Therefore I think we should put them into a Common module, which can serve as a dependency to Solr and any module.
[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
[ https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054353#comment-13054353 ] Dawid Weiss commented on LUCENE-3235: - I don't think you can force -client if it's a 64-bit release and you have tons of memory, can you? You can check by running java -client -version -- this is what it tells me, for example:
{noformat}
dweiss@dweiss-linux:~/work/lucene/lucene-trunk$ java -client -version
java version "1.6.0_16"
Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
{noformat}
Can you get a remote stack dump of the whole VM (or run it from the console and send it a signal to dump all threads)?
[jira] [Updated] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term
[ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3225: --- Attachment: LUCENE-3225.patch OK, new patch: I added a new seekExact method (instead of a new boolean to seek); renamed the existing seek methods to either seekCeil or seekExact; changed seekExact(long ord) to not return a value (it's an error to pass an out-of-bounds ord to this method). I think it's ready! Optimize TermsEnum.seek when caller doesn't need next term -- Key: LUCENE-3225 URL: https://issues.apache.org/jira/browse/LUCENE-3225 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.0 Attachments: LUCENE-3225.patch, LUCENE-3225.patch Some codecs are able to save CPU if the caller is only interested in exact matches. EG, the Memory codec and SimpleText can do a more efficient FSTEnum lookup if they know the caller doesn't need to know the term following the seek term. We have cases like this in Lucene, eg when IW deletes documents by Term: if the term is not found in a given segment then it doesn't need to know the ceiling term. Likewise when TermQuery looks up the term in each segment. I had done this change as part of LUCENE-3030, which is a new terms index that's able to save seeking for exact-only lookups, but now that we have the Memory codec that can also save CPU I think we should commit this today. The change adds a boolean onlyExact param to seek(BytesRef).
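The seekCeil/seekExact split above can be illustrated with a toy sorted term dictionary. This is a hypothetical stand-in for TermsEnum, not Lucene's API: the point is that an exact-only lookup answers found/not-found and can skip computing the following (ceiling) term when the target is absent.

```java
import java.util.Arrays;

/** Toy "term dictionary" over a sorted String[], illustrating why
 *  seekExact can be cheaper than seekCeil: no ceiling term is located. */
public class TermSeek {
    final String[] sortedTerms;

    TermSeek(String[] sortedTerms) {
        this.sortedTerms = sortedTerms;
    }

    /** Exact-only lookup: just found / not found. */
    boolean seekExact(String target) {
        return Arrays.binarySearch(sortedTerms, target) >= 0;
    }

    /** Ceiling lookup: the smallest term >= target, or null if none. */
    String seekCeil(String target) {
        int idx = Arrays.binarySearch(sortedTerms, target);
        if (idx < 0) idx = -idx - 1; // insertion point = ceiling position
        return idx < sortedTerms.length ? sortedTerms[idx] : null;
    }
}
```

Callers like deletes-by-Term, which only care whether the term exists in a segment, would use the exact-only path.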
[jira] [Updated] (LUCENE-3233) HuperDuperSynonymsFilter™
[ https://issues.apache.org/jira/browse/LUCENE-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3233: --- Attachment: LUCENE-3223.patch Dumping my current state on FSTSynonymFilter -- it compiles but it's got tons of bugs I'm sure! I added a trivial initial test. HuperDuperSynonymsFilter™ - Key: LUCENE-3233 URL: https://issues.apache.org/jira/browse/LUCENE-3233 Project: Lucene - Java Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-3223.patch, LUCENE-3233.patch The current synonymsfilter uses a lot of ram and cpu, especially at build time. I think yesterday I heard about huge synonyms files three times. So, I think we should use an FST-based structure, sharing the inputs and outputs. And we should be more efficient with the tokenStream api, e.g. using save/restoreState instead of cloneAttributes()
[jira] [Updated] (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2793: --- Attachment: LUCENE-2793-nrt.patch Patch, fixing NRTCachingDir to no longer have anything to do with the merge scheduler (yay!). Directory createOutput and openInput should take an IOContext - Key: LUCENE-2793 URL: https://issues.apache.org/jira/browse/LUCENE-2793 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Michael McCandless Assignee: Varun Thacker Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2793-nrt.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch Today for merging we pass down a larger readBufferSize than for searching because we get better performance. I think we should generalize this to a class (IOContext), which would hold the buffer size, but then could hold other flags like DIRECT (bypass OS's buffer cache), SEQUENTIAL, etc. Then, we can make the DirectIOLinuxDirectory fully usable because we would only use DIRECT/SEQUENTIAL during merging. This will require fixing how IW pools readers, so that a reader opened for merging is not then used for searching, and vice/versa. Really, it's only all the open file handles that need to be different -- we could in theory share del docs, norms, etc, if that were somehow possible.
[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054359#comment-13054359 ] Michael McCandless commented on LUCENE-2793: I took a quick look @ the branch -- it's looking good! Some small stuff:
* Should IOContext and MergeInfo be in oal.store, not .index?
* I think SegmentMerger should receive an IOCtx from its caller, and then pass that to all the IO ops it invokes? But the code has a nocommit about tripping an assert -- which one?
* I think on flush the IOContext should include num docs and estimated segment size (we can roughly pull this from RAM used for the segment), but we should include a comment that this is only approx.
* Somehow, lucene/contrib/demo/data is deleted on the branch. We should check if anything else is missing!
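The IOContext being shaped on the branch can be pictured as a small value class: a hint passed to createOutput/openInput describing why the I/O is happening, so a Directory can pick buffer sizes (or direct I/O) per use. Purely illustrative; the names and buffer sizes below are assumptions, not the branch's actual API:

```java
/** Sketch of an IOContext-like hint for Directory.createOutput/openInput:
 *  merges read sequentially and benefit from big buffers, searches do small
 *  random reads. Illustrative values, not Lucene's real IOContext. */
public class IOHint {
    enum Context { MERGE, FLUSH, READ }

    final Context context;
    final int bufferSize;

    IOHint(Context context, int bufferSize) {
        this.context = context;
        this.bufferSize = bufferSize;
    }

    /** Merges stream through whole files; a large buffer pays off. */
    static IOHint forMerge() { return new IOHint(Context.MERGE, 1 << 16); }

    /** Searches do small random reads; keep the buffer modest. */
    static IOHint forRead() { return new IOHint(Context.READ, 1 << 10); }
}
```

The reader-pooling fix mentioned in the description follows from this: a reader opened with a merge hint must not be handed out for searching, since the underlying file handles are configured differently.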
[jira] [Commented] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term
[ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054360#comment-13054360 ] Dawid Weiss commented on LUCENE-3225: - I like this one better. boolean args are cryptic (even if I do use them from time to time).
[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
[ https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054363#comment-13054363 ] Michael McCandless commented on LUCENE-3235: Indeed, java -client -version shows it's still using the server VM -- you're right!
[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
[ https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054362#comment-13054362 ] Michael McCandless commented on LUCENE-3235: Yes, the stack looks just like the stack overflow link I posted -- several threads stuck in sun.misc.Unsafe.park ;) java -Xint definitely does not hang... it ran for like 4200 iterations.
[jira] [Commented] (LUCENE-3225) Optimize TermsEnum.seek when caller doesn't need next term
[ https://issues.apache.org/jira/browse/LUCENE-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054364#comment-13054364 ] Simon Willnauer commented on LUCENE-3225: - Looks good, +1 to commit! Thanks for working on that.
[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
[ https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054366#comment-13054366 ] Dawid Weiss commented on LUCENE-3235: I'm same as Robert: +1 to drop 1.5...
[jira] [Resolved] (LUCENE-3230) Make FSDirectory.fsync() public and static
[ https://issues.apache.org/jira/browse/LUCENE-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-3230. Resolution: Won't Fix

I opened LUCENE-3237 to improve how fsync works. After we move sync() to IndexOutput, a public static sync() API won't make much sense.

Make FSDirectory.fsync() public and static
Key: LUCENE-3230 URL: https://issues.apache.org/jira/browse/LUCENE-3230 Project: Lucene - Java Issue Type: New Feature Components: core/store Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Fix For: 3.3, 4.0

I find FSDirectory.fsync() (today a protected instance method) very useful as a utility to sync() files. I'd like to create an FSDirectory.sync() utility which contains the exact same impl as FSDir.fsync(), and have the latter call it. We could make it part of IOUtils too, as it's a completely standalone utility. I would get rid of FSDir.fsync() if it weren't protected (as if encouraging people to override it). I doubt anyone really overrides it (our core Directories don't). Also, while reviewing the code, I noticed that if an IOE occurs, the code sleeps for 5 msec. If an InterruptedException occurs then, it immediately throws ThreadIE, completely ignoring the fact that it slept due to the IOE. Shouldn't we at least pass IOE.getMessage() on to the ThreadIE? The patch is trivial, so I'd like to get some feedback before I post it.
[jira] [Created] (LUCENE-3237) FSDirectory.fsync() may not work properly
FSDirectory.fsync() may not work properly
Key: LUCENE-3237 URL: https://issues.apache.org/jira/browse/LUCENE-3237 Project: Lucene - Java Issue Type: Bug Components: core/store Reporter: Shai Erera Fix For: 3.4, 4.0

Spinoff from LUCENE-3230. FSDirectory.fsync() opens a new RAF, syncs its FileDescriptor, and closes the RAF. It is not clear that this syncs whatever was written to the file by other FileDescriptors. It would be better if we performed this operation on the actual RAF/FileOS which wrote the data. We can add sync() to IndexOutput, and FSIndexOutput will do that. Directory-wise, we should stop syncing on file names and instead sync on the IOs that performed the write operations.
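The proposed direction can be sketched as follows -- a hypothetical IndexOutput-like wrapper for illustration, not the actual patch: keep the descriptor that performed the writes and fsync through it, rather than reopening the file by name and syncing a fresh descriptor.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

/** Sketch: an output that can durably sync its own writes, because sync()
 *  goes through the same FileDescriptor that wrote the bytes. */
class SyncingOutput implements AutoCloseable {
    private final FileOutputStream out;

    SyncingOutput(File f) throws IOException { out = new FileOutputStream(f); }

    void write(byte[] b) throws IOException { out.write(b); }

    void sync() throws IOException {
        out.flush();          // push any buffered bytes down to the OS
        out.getFD().sync();   // fsync the descriptor that performed the writes
    }

    @Override public void close() throws IOException { out.close(); }
}
```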
[JENKINS-MAVEN] Lucene-Solr-Maven-3.x #161: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-3.x/161/

1 test failed. REGRESSION: org.apache.lucene.index.TestCheckIndex.testLuceneConstantVersion

Error Message: Invalid version: 3.3-SNAPSHOT

Stack Trace:
java.lang.AssertionError: Invalid version: 3.3-SNAPSHOT
    at org.junit.Assert.fail(Assert.java:91)
    at org.junit.Assert.assertTrue(Assert.java:43)
    at org.apache.lucene.index.TestCheckIndex.testLuceneConstantVersion(TestCheckIndex.java:98)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.rules.TestWatchman$1.evaluate(TestWatchman.java:48)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1272)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1190)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
    at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:35)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:146)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:97)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.maven.surefire.booter.ProviderFactory$ClassLoaderProxy.invoke(ProviderFactory.java:103)
    at $Proxy0.invoke(Unknown Source)
    at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:145)
    at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcess(SurefireStarter.java:87)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:69)

Build Log (for compile errors): [...truncated 15489 lines...]
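For context, this failure is expected under the Maven build: judging by the test name and message, the test asserts that the Lucene version constant looks like a release version, and Maven injects a "-SNAPSHOT" suffix. A hedged sketch of such a check (the actual pattern in TestCheckIndex may differ):

```java
import java.util.regex.Pattern;

/** Sketch of a release-version check: "X.Y" or "X.Y.Z" passes,
 *  "3.3-SNAPSHOT" does not. */
class VersionCheck {
    private static final Pattern RELEASE = Pattern.compile("\\d+\\.\\d+(\\.\\d+)?");

    static boolean isReleaseVersion(String v) {
        return RELEASE.matcher(v).matches();
    }
}
```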
[jira] [Updated] (LUCENE-3226) rename SegmentInfos.FORMAT_3_1 and improve description in CheckIndex
[ https://issues.apache.org/jira/browse/LUCENE-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-3226: Attachment: LUCENE-3226.patch

The patch improves the CheckIndex output (it includes information about the oldest/newest segments). Along the way I fixed a bug in StringHelper.versionComparator (it could overflow if Integer.MIN/MAX_VALUE were used). The changes to TestDemo won't be committed; I just included them here so you can run the test and check the output.

rename SegmentInfos.FORMAT_3_1 and improve description in CheckIndex
Key: LUCENE-3226 URL: https://issues.apache.org/jira/browse/LUCENE-3226 Project: Lucene - Java Issue Type: Improvement Affects Versions: 3.1, 3.2 Reporter: Hoss Man Fix For: 3.3, 4.0 Attachments: LUCENE-3226.patch, LUCENE-3226.patch

A 3.2 user recently asked if something was wrong because CheckIndex was reporting his (newly built) index version as...

{noformat} Segments file=segments_or numSegments=1 version=FORMAT_3_1 [Lucene 3.1] {noformat}

It seems like there are two very confusing pieces of information here...

1) the variable name SegmentInfos.FORMAT_3_1 seems like a poor choice. All other FORMAT_* constants in SegmentInfos are descriptive of the actual change made, and not specific to the version in which they were introduced.

2) whatever the name of the FORMAT_* variable, CheckIndex is labeling it "Lucene 3.1", which is misleading since that format is always used in 3.2 (and probably 3.3, etc...).

I suggest:

a) rename FORMAT_3_1 to something like FORMAT_SEGMENT_RECORDS_VERSION

b) change CheckIndex so that the label for the newest format always ends with "and later" (ie: "Lucene 3.1 and later"), so when we release versions w/o a format change we don't have to remember to manually list them in CheckIndex. When we *do* make format changes and update CheckIndex, "and later" can be replaced with "to X.Y" and the new format can be added.
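The versionComparator overflow mentioned in the patch note is the classic subtraction idiom: returning `a - b` from a comparator overflows when the operands are near Integer.MIN/MAX_VALUE. A sketch of the safe form -- the parsing details here are assumed, not taken from the patch:

```java
import java.util.Comparator;

/** Compares dotted version strings ("3.1" vs "3.10") numerically per part.
 *  Uses Integer.compare instead of subtraction, which cannot overflow. */
class SafeVersionComparator implements Comparator<String> {
    @Override public int compare(String a, String b) {
        String[] pa = a.split("\\."), pb = b.split("\\.");
        for (int i = 0; i < Math.max(pa.length, pb.length); i++) {
            int va = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int vb = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            int cmp = Integer.compare(va, vb); // safe even for MIN/MAX_VALUE
            if (cmp != 0) return cmp;
        }
        return 0;
    }
}
```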
[jira] [Commented] (LUCENE-3232) Move MutableValues to Common Module
[ https://issues.apache.org/jira/browse/LUCENE-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054395#comment-13054395 ] Chris Male commented on LUCENE-3232:

bq. if we can get FQs factored out soonish

This is the last issue preventing me from doing just that :)

bq. I guess we keep the name common for now.

Awesome. I find it fairly common (ha) in projects to have a common module. If it doesn't pan out, then we can either rename it or slurp it into another module.

Move MutableValues to Common Module
Key: LUCENE-3232 URL: https://issues.apache.org/jira/browse/LUCENE-3232 Project: Lucene - Java Issue Type: Sub-task Components: core/search Reporter: Chris Male Fix For: 4.0 Attachments: LUCENE-3232.patch, LUCENE-3232.patch

Solr makes use of the MutableValue* series of classes to improve the performance of grouping by FunctionQuery (I think). As such they are used in ValueSource implementations. Consequently we need to move these classes in order to move the ValueSources. As Yonik pointed out, these classes have use beyond just FunctionQuerys and might be used by both Solr and other modules. However I don't think they belong in Lucene core, since they aren't really related to search functionality. Therefore I think we should put them into a Common module, which can serve as a dependency for Solr and any module.
[jira] [Commented] (LUCENE-3226) rename SegmentInfos.FORMAT_3_1 and improve description in CheckIndex
[ https://issues.apache.org/jira/browse/LUCENE-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054397#comment-13054397 ] Michael McCandless commented on LUCENE-3226: Patch looks good -- I like how CheckIndex now tells you version range of your segments.
[jira] [Updated] (LUCENE-3226) rename SegmentInfos.FORMAT_3_1 and improve description in CheckIndex
[ https://issues.apache.org/jira/browse/LUCENE-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-3226: Attachment: LUCENE-3226.patch Changed the message format a bit (Thanks Robert for the feedback). Now it prints 'version=x.y' if all segments are on the same version, or 'versions=[a.b .. c.d]' if there is more than one version. I plan to commit this.
[jira] [Commented] (LUCENE-3226) rename SegmentInfos.FORMAT_3_1 and improve description in CheckIndex
[ https://issues.apache.org/jira/browse/LUCENE-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054411#comment-13054411 ] Robert Muir commented on LUCENE-3226: +1
[jira] [Commented] (SOLR-2610) Add an option to delete index through CoreAdmin UNLOAD action
[ https://issues.apache.org/jira/browse/SOLR-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054415#comment-13054415 ] Mark Miller commented on SOLR-2610:

bq. I was approaching this particular issue more from the angle of making it useful for SolrCloud.

Where do you mention how this helps with SolrCloud?

bq. I can see how deleting configs can be useful to some people but is it worth introducing such an inconsistency i.e. you can delete config but cannot add it back? Anyways, it is best handled via a separate issue.

Why are you deleting cores only to add them back again with the same config? Do you really think it's inconsistent to actually be able to delete something? Does it really seem like a weird use case to say, "I want to delete a SolrCore I no longer have an interest in"? Looks like a few people have an interest in this issue, so I'm not sure why you rammed it in so quickly.

Add an option to delete index through CoreAdmin UNLOAD action
Key: SOLR-2610 URL: https://issues.apache.org/jira/browse/SOLR-2610 Project: Solr Issue Type: Improvement Components: multicore Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 3.3, 4.0 Attachments: SOLR-2610-branch3x.patch, SOLR-2610.patch

Right now, one can unload a Solr Core but the index files are left behind and consume disk space. We should have an option to delete the index when unloading a core.
[jira] [Commented] (LUCENE-3226) rename SegmentInfos.FORMAT_3_1 and improve description in CheckIndex
[ https://issues.apache.org/jira/browse/LUCENE-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054413#comment-13054413 ] Robert Muir commented on LUCENE-3226: Let's backport this to 3.3? A few issues have been found/fixed already, so I don't mind respinning with this one too, since I think it will eliminate confusion.
[jira] [Updated] (LUCENE-2308) Separately specify a field's type
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikola Tankovic updated LUCENE-2308: Attachment: LUCENE-2308-3.patch

Patch: copied oal.doc to oal.doc2 with the new FieldType changes and modified the TestDocument unit test, but I still have some failures. I think it's because references to oal.doc should change to oal.doc2, but I don't know exactly where, so I need some help. I changed the imports in IndexSearcher and IndexReader only.

Separately specify a field's type
Key: LUCENE-2308 URL: https://issues.apache.org/jira/browse/LUCENE-2308 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Labels: gsoc2011, lucene-gsoc-11, mentor Fix For: 4.0 Attachments: LUCENE-2308-2.patch, LUCENE-2308-3.patch, LUCENE-2308.patch, LUCENE-2308.patch

This came up from discussions on IRC. I'm summarizing here... Today when you make a Field to add to a document you can set things like indexed or not, stored or not, analyzed or not, and details like omitTfAP, omitNorms, index term vectors (separately controlling offsets/positions), etc. I think we should factor these out into a new class (FieldType?). Then you could re-use this FieldType instance across multiple fields. The Field instance would still hold the actual value. We could then do per-field analyzers by adding a setAnalyzer on the FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise for per-field codecs (with flex), where we now have PerFieldCodecWrapper). This would NOT be a schema! It's just refactoring what we already specify today. EG it's not serialized into the index. This has been discussed before, and I know Michael Busch opened a more ambitious (I think?) issue. I think this is a good first baby step. We could consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold off on that for starters...
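The refactoring described above can be sketched in a few lines. The names are assumed for illustration -- this is the shape of the proposal, not the attached patch: indexing options move into an immutable, reusable FieldType, and each Field holds only its name, value, and a reference to a shared type.

```java
/** Immutable bundle of per-field indexing options, shareable across fields. */
final class FieldType {
    final boolean indexed, stored, tokenized;

    FieldType(boolean indexed, boolean stored, boolean tokenized) {
        this.indexed = indexed;
        this.stored = stored;
        this.tokenized = tokenized;
    }
}

/** A field keeps only its own name and value; the options live in the type. */
final class Field {
    final String name, value;
    final FieldType type;

    Field(String name, String value, FieldType type) {
        this.name = name;
        this.value = value;
        this.type = type;
    }
}
```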
[jira] [Updated] (LUCENE-3203) Rate-limit IO used by merging
[ https://issues.apache.org/jira/browse/LUCENE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3203: --- Attachment: LUCENE-3203.patch New patch, applies to the IOContext branch. I think it's committable! It adds set/getMaxMergeWriteMBPerSec methods to FSDirectory. Rate-limit IO used by merging - Key: LUCENE-3203 URL: https://issues.apache.org/jira/browse/LUCENE-3203 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 3.3, 4.0 Attachments: LUCENE-3203.patch, LUCENE-3203.patch Large merges can mess up searches and increase NRT reopen time (see http://blog.mikemccandless.com/2011/06/lucenes-near-real-time-search-is-fast.html). A simple rate limiter improves the spikey NRT reopen times during big merges, so I think we should somehow make this possible. Likely this would reduce impact on searches as well. Typically apps that do indexing and searching on same box are in no rush to see the merges complete so this is a good tradeoff. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
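The mechanism can be sketched as a simple pacing calculation. This is an assumed design for illustration, not the attached patch: after each write, compute how far ahead of the target rate the writer is and pause for that long. Making the clock a parameter keeps the sketch deterministic.

```java
/** Paces a writer to a target MB/sec. The caller passes the current time so
 *  the logic is testable without sleeping; a real limiter would call
 *  System.nanoTime() and Thread.sleep() around this calculation. */
class SimpleRateLimiter {
    private final double bytesPerNs;
    private long startNs = -1;
    private long bytesWritten;

    SimpleRateLimiter(double mbPerSec) {
        bytesPerNs = mbPerSec * 1024 * 1024 / 1e9;
    }

    /** Returns how many ns to pause after writing `bytes` at time `nowNs`. */
    long pauseNs(long bytes, long nowNs) {
        if (startNs == -1) startNs = nowNs;
        bytesWritten += bytes;
        long targetNs = startNs + (long) (bytesWritten / bytesPerNs);
        return Math.max(0, targetNs - nowNs);
    }
}
```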
[jira] [Created] (SOLR-2621) Consider adding an option to remove all remnants of a SolrCore when unloading it.
Consider adding an option to remove all remnants of a SolrCore when unloading it. - Key: SOLR-2621 URL: https://issues.apache.org/jira/browse/SOLR-2621 Project: Solr Issue Type: New Feature Components: multicore Reporter: Mark Miller Priority: Minor Fix For: 4.0 We can use the new postClose hook from SOLR-2610 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3235) TestDoubleBarrelLRUCache hangs under Java 1.5, 3.x and trunk, likely JVM bug
[ https://issues.apache.org/jira/browse/LUCENE-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054433#comment-13054433 ] Mark Miller commented on LUCENE-3235: bq. +1 to drop 1.5... +1.
[jira] [Commented] (SOLR-2610) Add an option to delete index through CoreAdmin UNLOAD action
[ https://issues.apache.org/jira/browse/SOLR-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054443#comment-13054443 ] Shalin Shekhar Mangar commented on SOLR-2610:

bq. where do you mention how this helps with SolrCloud?

I didn't, and I'm sorry about that. I was just trying to tell you my perspective. These are small pieces that need to be fixed before tackling larger problems in SolrCloud, and this one seemed generally useful and simple enough by itself that I opened the issue without giving the bigger picture. Some of the other pieces are captured in SOLR-2595.

bq. Why are you deleting cores only to add them back again with the same config?

Hopefully SOLR-2595 will give you a better idea of what I was thinking. The use-case is to split and migrate pieces of an index, and this issue will help in deleting the leftover temporary cores.

bq. Do you really think it's inconsistent to actually be able to delete something?

The inconsistency is to be able to delete a configuration file when there is no way to add it back, but I'm not against the feature in general.

bq. Does it really seem like a weird use case to say, I want to delete a SolrCore I no longer have an interest in?

Absolutely not. If you want that feature, that's fine. You don't need permissions to put up a patch and commit it :)

bq. Looks like a few people have an interest in this issue, so I'm not sure why you rammed it in so quickly.

The issue clearly talks about deleting the index on unload, and that's what it does. And I got a +1 from you and Jason on the topic of the issue (or at least, that's what I assumed). I waited a day to commit -- would you like me to wait longer for future issues, or leave a comment to that effect? If the patch is not what you intended, go ahead and reopen/extend the scope of the issue, or open another issue.
[jira] [Resolved] (LUCENE-3226) rename SegmentInfos.FORMAT_3_1 and improve description in CheckIndex
[ https://issues.apache.org/jira/browse/LUCENE-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-3226. Resolution: Fixed Fix Version/s: 3.4 Assignee: Shai Erera Lucene Fields: [New, Patch Available] (was: [New]) Committed revision 1139284 (trunk). Committed revision 1139286 (3x). Committed revision 1139300 (3.3). Thanks Robert and Mike for the review!
[jira] [Commented] (SOLR-2610) Add an option to delete index through CoreAdmin UNLOAD action
[ https://issues.apache.org/jira/browse/SOLR-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054450#comment-13054450 ] Mark Miller commented on SOLR-2610: --- bq. And I got a +1 from you and Jason on the topic of the issue (or at least, that's what I assumed). I waited a day to commit - would you like me to wait longer for future issues or leave a comment to that effect? No, I think a day is fine - just a warning, perhaps? Both Jason and I liked the idea, but it just seemed like we were discussing some of the details and you committed kind of without warning. I'm not that concerned about it, just mentioning it. bq. If the patch is not what you intended, go ahead and reopen/extend the scope of the issue or open another issue. I think the patch is fine - I've tweaked a couple of little things on the changes entry, but the patch itself looks good so far. I opened SOLR-2621 to continue the other 'delete options' discussion. Add an option to delete index through CoreAdmin UNLOAD action - Key: SOLR-2610 URL: https://issues.apache.org/jira/browse/SOLR-2610 Project: Solr Issue Type: Improvement Components: multicore Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 3.3, 4.0 Attachments: SOLR-2610-branch3x.patch, SOLR-2610.patch Right now, one can unload a Solr Core but the index files are left behind and consume disk space. We should have an option to delete the index when unloading a core. -- This message is automatically generated by JIRA.
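As a usage sketch of the feature SOLR-2610 adds, a CoreAdmin UNLOAD call that also removes the index might look like this (the exact parameter name, `deleteIndex`, is assumed from the patch discussion, and the host, port, and core name are placeholders):

```shell
# Hypothetical example: unload core "core0" and delete its index files in
# one request, instead of leaving them behind on disk.
curl 'http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core0&deleteIndex=true'
```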
[jira] [Created] (LUCENE-3238) SpanMultiTermQueryWrapper with Prefix Query issue
SpanMultiTermQueryWrapper with Prefix Query issue - Key: LUCENE-3238 URL: https://issues.apache.org/jira/browse/LUCENE-3238 Project: Lucene - Java Issue Type: Bug Components: core/search Affects Versions: 3.3 Environment: Windows 7, JDK 1.6 Reporter: ludovic Boutros If we try to do a search with a SpanQuery and a PrefixQuery, this message is returned: "You can only use SpanMultiTermQueryWrapper with a suitable SpanRewriteMethod." The problem is in the WildcardQuery rewrite function. If the wildcard query is a prefix, a new prefix query is created, the rewrite method is set with the SpanRewriteMethod, and the prefix query is returned. But it's the rewritten prefix query which should be returned: - return rewritten; + return rewritten.rewrite(reader); I will attach a patch with a unit test included. -- This message is automatically generated by JIRA.
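The one-line fix in the report can be illustrated with a self-contained analogue (hypothetical class names, not the actual Lucene sources): returning the freshly built query loses the effect of the span rewrite method that was just configured on it, while returning the result of rewriting it yields the span-compatible query the wrapper expects.

```java
// Minimal toy model of the bug: WildcardQuery.rewrite() builds a
// PrefixQuery, sets the span rewrite method on it, but the buggy version
// returned the PrefixQuery itself instead of its rewritten form.
abstract class Query {
    abstract Query rewrite(String reader);
}

class SpanQuery extends Query {
    Query rewrite(String reader) { return this; }
}

class PrefixQuery extends Query {
    String rewriteMethod = "CONSTANT_SCORE";
    Query rewrite(String reader) {
        // With the span rewrite method set, rewriting produces a SpanQuery,
        // which is what SpanMultiTermQueryWrapper requires.
        return "SPAN".equals(rewriteMethod) ? new SpanQuery() : this;
    }
}

class WildcardQuery extends Query {
    Query rewrite(String reader) {
        PrefixQuery rewritten = new PrefixQuery();
        rewritten.rewriteMethod = "SPAN";
        // Buggy version: `return rewritten;` -- still a PrefixQuery.
        // The fix applies the configured rewrite method before returning:
        return rewritten.rewrite(reader);
    }
}

public class RewriteFix {
    public static void main(String[] args) {
        Query q = new WildcardQuery().rewrite("reader");
        System.out.println(q instanceof SpanQuery); // true after the fix
    }
}
```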
[jira] [Updated] (LUCENE-3238) SpanMultiTermQueryWrapper with Prefix Query issue
[ https://issues.apache.org/jira/browse/LUCENE-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ludovic Boutros updated LUCENE-3238: Attachment: LUCENE-3238.patch Here is the patch for the branch 3x. -- This message is automatically generated by JIRA.
[jira] [Assigned] (LUCENE-3238) SpanMultiTermQueryWrapper with Prefix Query issue
[ https://issues.apache.org/jira/browse/LUCENE-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-3238: --- Assignee: Robert Muir -- This message is automatically generated by JIRA.
[jira] [Updated] (LUCENE-3238) SpanMultiTermQueryWrapper with Prefix Query issue
[ https://issues.apache.org/jira/browse/LUCENE-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3238: Attachment: LUCENE-3238.patch Hi, definitely a bug, thank you! In my opinion, WildcardQuery should not try to override MultiTermQuery's rewrite here; it causes too many problems. Instead, in this case it should just return a PrefixTermEnum... this is the way we handle these things in trunk, and I think we should fix it here the same way. -- This message is automatically generated by JIRA.
[jira] [Commented] (LUCENE-3238) SpanMultiTermQueryWrapper with Prefix Query issue
[ https://issues.apache.org/jira/browse/LUCENE-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054465#comment-13054465 ] Uwe Schindler commented on LUCENE-3238: --- The fix is fine, but in my opinion the problem should be solved differently. I would like to make the rewrite method in MultiTermQuery final to prevent overriding. To correctly fix the issue, WildcardQuery only needs to return a PrefixTermEnum in its getEnum method. This is already fixed in Lucene 4.0. From looking at the code, SpanMultiTermQueryWrapper would not work correctly in all cases if the underlying query overrides rewrite(), as the rewritten query would again have the wrong type. -- This message is automatically generated by JIRA.
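The design Uwe describes, keep rewrite() final in the base class and let subclasses customize only the term enumeration, can be sketched in a standalone toy model (hypothetical names and string stand-ins, not the real Lucene API): a wrapper that swaps the rewrite method then works for every subclass, because no subclass can bypass it.

```java
// Toy sketch: rewrite() is final, so the configured rewrite method is
// always honored; subclasses only choose which term enum to enumerate.
abstract class MultiTermQuery {
    String rewriteMethod = "CONSTANT_SCORE";
    // Final: subclasses cannot override rewriting and return the wrong type.
    final String rewrite() { return rewriteMethod + "(" + getEnum() + ")"; }
    abstract String getEnum();
}

class PrefixQuery extends MultiTermQuery {
    String getEnum() { return "PrefixTermEnum"; }
}

class WildcardQuery extends MultiTermQuery {
    boolean isPrefix = true;
    // Instead of overriding rewrite(), just return the cheaper enum when
    // the pattern is a plain prefix.
    String getEnum() { return isPrefix ? "PrefixTermEnum" : "WildcardTermEnum"; }
}

public class FinalRewrite {
    public static void main(String[] args) {
        WildcardQuery q = new WildcardQuery();
        q.rewriteMethod = "SPAN"; // as a span wrapper would set it
        System.out.println(q.rewrite()); // SPAN(PrefixTermEnum)
    }
}
```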
[jira] [Commented] (LUCENE-3238) SpanMultiTermQueryWrapper with Prefix Query issue
[ https://issues.apache.org/jira/browse/LUCENE-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054466#comment-13054466 ] Uwe Schindler commented on LUCENE-3238: --- Patch is fine! Funny overlap, we both responded with the same answer :-) -- This message is automatically generated by JIRA.
[jira] [Updated] (SOLR-2618) Indexing and search on more then one type (Mapping)
[ https://issues.apache.org/jira/browse/SOLR-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Monica Storfjord updated SOLR-2618: --- Description: It would be very beneficial for a project that I am currently working on to have the ability to index and search on various subclasses of an object and map the objects directly to the actual domain object. This functionality exists in Hibernate Search, for instance. Is this something that future releases have in mind? I would think this is something that will make Solr more valuable to a lot of users. We are testing SolrJ 3.2 with the use of the SolrJ client and the web interface to index, change and search. It should be possible to make a solution that maps against a special type field (like field name=classtype type=class) in schema.xml that is indexed every time and uses reflection against the actual class? - Monica was: It would be very beneficial for a project that I am currently working on to have the ability to index and search on various subclasses of an object and map the objects directly to the actual domain object. We are planning to do an implementation of this feature, but if there is a Solr plugin or something that introduces this feature already, it will reduce the development time for us greatly! We are using SolrJ against an Apache Solr 3.2 instance to index, change and search. It should be possible to make a solution that maps against a special type field (field name=classtype type=class) in schema.xml that is indexed every time and uses reflection against the actual class? 
- Monica Summary: Indexing and search on more then one type (Mapping) (was: Indexing and search on more then one object) Indexing and search on more then one type (Mapping) --- Key: SOLR-2618 URL: https://issues.apache.org/jira/browse/SOLR-2618 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 3.2 Reporter: Monica Storfjord Priority: Minor -- This message is automatically generated by JIRA.
Re: [VOTE] release 3.3
This might be the cause of the test failures in the skiplist (I will investigate!). In general, not all tests are guaranteed to work correctly with tests.iter > 1; some tests have bugs! On Fri, Jun 24, 2011 at 10:45 AM, Uwe Schindler u...@thetaphi.de wrote: I forgot to mention, all tests of core were running 95 minutes using -Dtests.multiplier=100 and -Dtests.iter=100 ! - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Friday, June 24, 2011 4:44 PM To: dev@lucene.apache.org Subject: RE: [VOTE] release 3.3 iter Hi, I ran some smoke tests yesterday on the Lucene 2.9/3.0 release machine I used. The machine runs Java 1.5.0_22, Solaris x64, Opteron 16 cores. One of the tests was already fixed by Robert (I told him yesterday because it was always failing). The others are maybe serious, maybe not: [junit] Testsuite: org.apache.lucene.index.TestMultiLevelSkipList [junit] Testcase: testSimpleSkip(org.apache.lucene.index.TestMultiLevelSkipList): FAILED [junit] Wrong payload for the target 14: -106 expected:14 but was:-106 [junit] junit.framework.AssertionFailedError: Wrong payload for the target 14: -106 expected:14 but was:-106 [junit] at org.apache.lucene.index.TestMultiLevelSkipList.checkSkipTo(TestMultiLevelSkipList.java:87) [junit] at org.apache.lucene.index.TestMultiLevelSkipList.testSimpleSkip(TestMultiLevelSkipList.java:66) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1272) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1190) ... (100 repetitions of the same stack trace) and then about 100 times with different seeds: [junit] NOTE: reproduce with: ant test -Dtestcase=TestMultiLevelSkipList -Dtestmethod=testSimpleSkip -Dtests.seed=2861480580591035682:880958701285368932 But it's not reproducible, so maybe it's only the repetition causing this! 
This one is very serious and easy to reproduce with every printed seed: [junit] Testsuite: org.apache.lucene.util.TestOpenBitSet [junit] Testcase: testSmall(org.apache.lucene.util.TestOpenBitSet): Caused an ERROR [junit] -1 [junit] java.lang.ArrayIndexOutOfBoundsException: -1 [junit] at org.apache.lucene.util.OpenBitSet.prevSetBit(OpenBitSet.java:671) [junit] at org.apache.lucene.util.TestOpenBitSet.doPrevSetBit(TestOpenBitSet.java:53) [junit] at org.apache.lucene.util.TestOpenBitSet.doRandomSets(TestOpenBitSet.java:148) [junit] at org.apache.lucene.util.TestOpenBitSet.testSmall(TestOpenBitSet.java:192) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1272) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1190) [junit] [junit] [junit] Testcase: testSmall(org.apache.lucene.util.TestOpenBitSet): Caused an ERROR [junit] (null) [junit] java.lang.ArrayIndexOutOfBoundsException [junit] [junit] ... followed many more times by the (null) AIOOBE message... [junit] NOTE: reproduce with: ant test -Dtestcase=TestOpenBitSet -Dtestmethod=testSmall -Dtests.seed=-4526826707499307278:4139930264431857886 Again with different seeds. Not all 100 repetitions fail, but the seeds mentioned fail reproducibly. The good news: the PANGAEA index works fine, no readVInt hotspot problems with Java 6! Thanks Robert for fixing this in 3.1; after the changes in MMap they could have reappeared. So this release candidate is in my opinion broken! -1 to release. 
Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Thursday, June 23, 2011 10:18 PM To: dev@lucene.apache.org Subject: [VOTE] release 3.3 Artifacts here: http://s.apache.org/lusolr33rc0 working release notes here: http://wiki.apache.org/lucene-java/ReleaseNote33 http://wiki.apache.org/solr/ReleaseNote33 I ran the automated release test script in trunk/dev-tools/scripts/smokeTestRelease.py, and ran 'ant test' at the top level 50 times on windows. Here is my +1
[jira] [Updated] (LUCENE-3238) SpanMultiTermQueryWrapper with Prefix Query issue
[ https://issues.apache.org/jira/browse/LUCENE-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3238: Attachment: LUCENE-3238.patch Same patch, except I made MultiTermQuery's rewrite() final. In my opinion, this is a good backwards break; it will only fix bugs in someone's code if they have a custom MultiTermQuery: it's very tricky to override this (e.g. you must pass along boost, rewriteMethod, ...), and when you do, it still might cause problems (like this Span issue). It's also much easier to just return a simpler enum. -- This message is automatically generated by JIRA.
RE: [VOTE] release 3.3
The OpenBitSet test is in all cases serious (vs. the skiplist test, which is a test bug, that's true). The AIOOBE is caused inside OpenBitSet and that should never ever happen, even if you use it incorrectly! - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de
[jira] [Commented] (SOLR-2610) Add an option to delete index through CoreAdmin UNLOAD action
[ https://issues.apache.org/jira/browse/SOLR-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054489#comment-13054489 ] Shawn Heisey commented on SOLR-2610: bq. Shawn, that is not a use-case for RELOAD. The idea behind it is to reload an existing core's index with updated configuration changes and swap it with the existing core without causing downtime. It seems like your use-case is handled well with the stock CREATE, SWAP and UNLOAD+deleteIndex? CREATE requires that the caller be aware of internal server filesystem structures. For the typical use of CREATE, this is not really a problem, but if what you're trying to do is unload a core, delete its index, and then immediately recreate it with the same config, it would be very nice to not have to specify (or even know) the solr.xml configuration bits. In this particular case, the person who writes the scripts is the same person who maintains the Solr infrastructure (me) ... but that might not always be the case. Currently the build scripts don't know anything about the internal structure other than core names, and I'd like to keep it that way. Adding an option like deleteIndex to RELOAD seemed a logical way to handle this, since currently (1.4.1) I have to completely restart Solr when I wipe out an index directory. If this is not a logical progression, I would argue that CoreAdmin needs an entirely new action. Either way, if it's deemed desirable, it needs its own Jira issue. I brought it up here because it's at least tangentially related. 
Add an option to delete index through CoreAdmin UNLOAD action - Key: SOLR-2610 URL: https://issues.apache.org/jira/browse/SOLR-2610 Project: Solr Issue Type: Improvement Components: multicore Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 3.3, 4.0 Attachments: SOLR-2610-branch3x.patch, SOLR-2610.patch -- This message is automatically generated by JIRA.
Re: [VOTE] release 3.3
On Fri, Jun 24, 2011 at 10:54 AM, Uwe Schindler u...@thetaphi.de wrote: The OpenBitSet test is in all cases serious (vs. the skiplist test is a test bug, that true). The AIOOBE is caused inside OpenBitSet and that should never ever happen, even if you use it incorrectly! It's not clear that it's that serious, it only fails with java 5 for me (not java 6) :) Looks like a bug in java 5...
Re: [VOTE] release 3.3
Just some more info: I took away the seed and used -Dtests.iter=100 on this test: JAVA5: [junit] Testsuite: org.apache.lucene.util.TestOpenBitSet [junit] Tests run: 400, Failures: 0, Errors: 23, Time elapsed: 21.793 sec JAVA6: junit-sequential: [junit] Testsuite: org.apache.lucene.util.TestOpenBitSet [junit] Tests run: 400, Failures: 0, Errors: 0, Time elapsed: 19.719 sec So this test fails 23% of the time on java5. The reason we never caught it is that java5 is unmaintained and we cannot even test it in hudson... aka we cannot support this monster anymore.
RE: [VOTE] release 3.3
I assume the problem is the intrinsic; I will replace it with our own Hacker's Delight impl (like we do everywhere else in OpenBitSet; why did we use the platform method here?) and try again Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de
RE: [VOTE] release 3.3
The bug is *not* fixed by replacing Long.numberOfLeadingZeros(word) with BitUtils.nlz(word). So this is really strange. Also happens with -Xbatch. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de
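For context on the swap Uwe describes, a pure-Java "number of leading zeros" in the Hacker's Delight style might look like the sketch below. This is an assumption of what such an nlz helper could be, not the actual Lucene BitUtils source; whatever the implementation, it must agree with the JDK intrinsic Long.numberOfLeadingZeros for every input, which is exactly why the failure persisting after the swap pointed away from the intrinsic.

```java
// Hypothetical Hacker's Delight-style nlz: binary-search the highest set
// bit, halving the window at each step. No JVM intrinsic involved.
public final class Nlz {
    public static int nlz(long x) {
        if (x == 0) return 64;
        int n = 0;
        if (x >>> 32 == 0) { n += 32; x <<= 32; }
        if (x >>> 48 == 0) { n += 16; x <<= 16; }
        if (x >>> 56 == 0) { n += 8;  x <<= 8;  }
        if (x >>> 60 == 0) { n += 4;  x <<= 4;  }
        if (x >>> 62 == 0) { n += 2;  x <<= 2;  }
        if (x >>> 63 == 0) { n += 1; }
        return n;
    }

    public static void main(String[] args) {
        // Spot-check against the JDK intrinsic.
        long[] samples = {0L, 1L, 3L, 1L << 31, 1L << 62, -1L, 0x123456789abcdefL};
        for (long s : samples) {
            System.out.println(nlz(s) == Long.numberOfLeadingZeros(s));
        }
    }
}
```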
Re: [VOTE] release 3.3
And -Xint and -client.

On Fri, Jun 24, 2011 at 11:32 AM, Uwe Schindler u...@thetaphi.de wrote:
> The bug is *not* fixed by replacing Long.numberOfLeadingZeros(word) with BitUtils.nlz(word). So this is really strange. Also happens with -Xbatch.
[jira] [Updated] (LUCENE-2308) Separately specify a field's type
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikola Tankovic updated LUCENE-2308:
Attachment: LUCENE-2308-4.patch

Separately specify a field's type
-
Key: LUCENE-2308
URL: https://issues.apache.org/jira/browse/LUCENE-2308
Project: Lucene - Java
Issue Type: Improvement
Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
Labels: gsoc2011, lucene-gsoc-11, mentor
Fix For: 4.0
Attachments: LUCENE-2308-2.patch, LUCENE-2308-3.patch, LUCENE-2308-4.patch, LUCENE-2308-4.patch, LUCENE-2308.patch, LUCENE-2308.patch

This came up from discussions on IRC. I'm summarizing here...

Today, when you make a Field to add to a document, you can set things like indexed or not, stored or not, analyzed or not, details like omitTfAP, omitNorms, index term vectors (separately controlling offsets/positions), etc.

I think we should factor these out into a new class (FieldType?). Then you could re-use this FieldType instance across multiple fields. The Field instance would still hold the actual value. We could then do per-field analyzers by adding a setAnalyzer on the FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise for per-field codecs (with flex), where we now have PerFieldCodecWrapper).

This would NOT be a schema! It's just refactoring what we already specify today. E.g. it's not serialized into the index.

This has been discussed before, and I know Michael Busch opened a more ambitious (I think?) issue. I think this is a good first baby step. We could consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold off on that for starters...

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (LUCENE-2308) Separately specify a field's type
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikola Tankovic updated LUCENE-2308:
Attachment: LUCENE-2308-4.patch

Patch No. 4: passing the TestDemo unit test.
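The refactoring the issue proposes can be sketched in a few lines. The class names below (SketchFieldType, SketchField) are invented for illustration and are not the API from the attached patches; the sketch just shows the reuse the issue describes, one type instance shared by many Field instances:

```java
// Toy sketch of the proposed split: per-field flags live in a reusable
// type object, while the Field keeps only its name and value.
class SketchFieldType {
    boolean indexed = true;
    boolean stored = false;
    boolean analyzed = true;
    boolean omitNorms = false;
    boolean storeTermVectors = false;
}

class SketchField {
    final SketchFieldType type;
    final String name;
    final String value;

    SketchField(String name, String value, SketchFieldType type) {
        this.name = name;
        this.value = value;
        this.type = type;
    }
}

public class FieldTypeSketch {
    public static void main(String[] args) {
        // One type instance, re-used across many fields: the flags are
        // configured once, not on every Field as today.
        SketchFieldType titleType = new SketchFieldType();
        titleType.stored = true;

        SketchField a = new SketchField("title", "Lucene in Action", titleType);
        SketchField b = new SketchField("title", "Managing Gigabytes", titleType);
        System.out.println(a.type == b.type); // true: shared type
    }
}
```

A per-field analyzer would then hang off the shared type object instead of a wrapper class, which is exactly the simplification the issue argues for.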
RE: [VOTE] release 3.3
OK, the bug is not a Java 5 bug; it's just a difference in the BitSet impl between Java 5 and Java 6. Java 5's BitSet impl always allocates at least one word for new BitSet(0), so its size() differs from OpenBitSet's. This makes the test fail. The fix is to code the test correctly by using Math.min(BitSet.length(), OpenBitSet.size()) as the upper limit and not assume that the allocation strategy of both bitsets is identical.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
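The guard described above can be sketched as follows. This is not the actual TestOpenBitSet.patch; the OpenBitSet side is simulated by a plain capacity value, since only the Math.min bound matters here:

```java
import java.util.BitSet;

public class AllocationGuardSketch {
    // Bound a bit-by-bit comparison by the smaller of the two reported
    // extents instead of assuming both classes allocate identically
    // (Java 5's BitSet and OpenBitSet do not, per the message above).
    static int safeUpperBound(BitSet bs, long openBitSetNumBits) {
        return (int) Math.min(bs.length(), openBitSetNumBits);
    }

    public static void main(String[] args) {
        BitSet bs = new BitSet(0);
        bs.set(3);                    // length() is now 4 (highest set bit + 1)

        // Even if the OpenBitSet under test reports a larger capacity,
        // the loop below never indexes past what both sets can answer.
        int upper = safeUpperBound(bs, 128);
        for (int i = 0; i < upper; i++) {
            boolean bit = bs.get(i);  // compare against the OpenBitSet here
        }
        System.out.println(upper);    // prints 4
    }
}
```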
[jira] [Created] (LUCENE-3239) drop java 5 support
drop java 5 support
-
Key: LUCENE-3239
URL: https://issues.apache.org/jira/browse/LUCENE-3239
Project: Lucene - Java
Issue Type: Task
Reporter: Robert Muir

It's been discussed here and there, but I think we need to drop Java 5 support, for these reasons:

* It's totally untested by any continual build process. Testing Java 5 only when there is a release candidate ready is not enough. If we are to claim support, then we need a Hudson actually running the tests with Java 5.
* It's now unmaintained, so bugs have to either be hacked around, tests disabled, or warnings placed, but some things simply cannot be fixed... We cannot actually support something that is no longer maintained: we do find JRE bugs (http://wiki.apache.org/lucene-java/SunJavaBugs), and it's important that bugs actually get fixed; we cannot do everything with hacks.
* Because of its limitations, we do things like allow 20% slower grouping speed. I find it hard to believe we are sacrificing performance for this.

So, in summary: because we don't test it at all, because it's buggy and unmaintained, and because we are sacrificing performance, I think we need to cut over the build system for the next release to require Java 6.

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
RE: [VOTE] release 3.3
Here is the patch to fix the test bug; it's really a test bug, found thanks to Java 5. It was even wrong to use length(): correct would be size() for both, and to only use the minimum of both, as the allocation strategy may not be equal.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

TestOpenBitSet.patch
Description: Binary data
Re: [VOTE] release 3.3
I just took a quick peek at prevSetBit, and the implementation looks buggy (provided that it's legal for a user to pass an index that may be greater than the largest bit ever set). Here is the current code, which will cause an exception when wlen==0:

public int prevSetBit(int index) {
  if (index < 0) {
    return -1;
  }
  int i = index >> 6;
  if (i >= wlen) {
    i = wlen - 1;
  }
  final int subIndex = index & 0x3f;        // index within the word
  long word = (bits[i] << (63 - subIndex)); // skip all the bits to the left of index

All that needs to be done is to move the negative index check to the bottom (the first index < 0 check is not needed, since we do a signed shift):

public int prevSetBit(int index) {
  int i = index >> 6;
  if (i >= wlen) {
    i = wlen - 1;
  }
  if (i < 0) return -1;
  final int subIndex = index & 0x3f;        // index within the word
  long word = (bits[i] << (63 - subIndex)); // skip all the bits to the left of index

-Yonik
http://www.lucidimagination.com

On Fri, Jun 24, 2011 at 11:33 AM, Robert Muir rcm...@gmail.com wrote:
> And -Xint and -client
[jira] [Reopened] (LUCENE-3179) OpenBitSet.prevSetBit()
[ https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler reopened LUCENE-3179:
---
The testcase for prevSetBit has a bug that was found by testing with Java 5. It assumes that the allocation strategy for BitSet and OpenBitSet is identical, which it is not: e.g. Java 5's new BitSet(0) still allocates one word, while OpenBitSet does not. The attached patch fixes the issue.

OpenBitSet.prevSetBit()
---
Key: LUCENE-3179
URL: https://issues.apache.org/jira/browse/LUCENE-3179
Project: Lucene - Java
Issue Type: Improvement
Reporter: Paul Elschot
Priority: Minor
Fix For: 3.3, 4.0
Attachments: LUCENE-3179.patch, LUCENE-3179.patch, LUCENE-3179.patch, TestBitUtil.java, TestOpenBitSet.patch

Find a previous set bit in an OpenBitSet. Useful for parent testing in nested document query execution (LUCENE-2454).

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (LUCENE-3179) OpenBitSet.prevSetBit()
[ https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-3179:
--
Attachment: TestOpenBitSet.patch
[jira] [Commented] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054525#comment-13054525 ]

Uwe Schindler commented on LUCENE-3239:
---
As said yesterday to you privately: I agree with making Lucene trunk Java 6 only (surprise!), but 3.x should stay with Java 5. Is this ok for you? I know Simon will not agree, because he made DocValues for Android *g*

About Hudson testing: we may donate a machine to Infra just for Lucene tests, running something nice like Ubuntu. Stay tuned (no details, I just say that).
Re: [VOTE] release 3.3
On Fri, Jun 24, 2011 at 12:14 PM, Yonik Seeley yo...@lucidimagination.com wrote:
> All that needs to be done is to move the negative index check to the bottom (the first index < 0 check is not needed, since we do a signed shift).

And a further minor optimization, if we assume that negative indexes are not legal, is to move the (i < 0) check inside the if (i >= wlen) block (and just let a negative index passed by the user cause a natural AIOOBE).

-Yonik
http://www.lucidimagination.com
[jira] [Commented] (LUCENE-3238) SpanMultiTermQueryWrapper with Prefix Query issue
[ https://issues.apache.org/jira/browse/LUCENE-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054530#comment-13054530 ]

ludovic Boutros commented on LUCENE-3238:
---
I understand the patch, that's better indeed. :) Thanks.

SpanMultiTermQueryWrapper with Prefix Query issue
-
Key: LUCENE-3238
URL: https://issues.apache.org/jira/browse/LUCENE-3238
Project: Lucene - Java
Issue Type: Bug
Components: core/search
Affects Versions: 3.3
Environment: Windows 7, JDK 1.6
Reporter: ludovic Boutros
Assignee: Robert Muir
Attachments: LUCENE-3238.patch, LUCENE-3238.patch, LUCENE-3238.patch

If we try to do a search with a SpanQuery and a PrefixQuery, this message is returned: "You can only use SpanMultiTermQueryWrapper with a suitable SpanRewriteMethod." The problem is in the WildcardQuery rewrite function. If the wildcard query is a prefix, a new prefix query is created, the rewrite method is set with the SpanRewriteMethod, and the prefix query is returned. But it's the rewritten prefix query which should be returned:

- return rewritten;
+ return rewritten.rewrite(reader);

I will attach a patch with a unit test included.

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
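The one-line fix reflects a general property of the rewrite contract: rewrite() may return a query that itself still needs rewriting, so the result must be rewritten again rather than returned directly. The interfaces below are simplified stand-ins invented for illustration, not Lucene's actual Query API, but they show the same fixpoint behavior:

```java
// Minimal stand-in for the rewrite contract: rewriting repeats until the
// query is primitive, mirroring why "return rewritten" (one step short)
// was wrong and "return rewritten.rewrite(reader)" is needed.
interface MiniQuery {
    MiniQuery rewrite();
}

class Primitive implements MiniQuery {
    public MiniQuery rewrite() { return this; }   // already primitive
    public String toString() { return "Primitive"; }
}

class PrefixLike implements MiniQuery {
    public MiniQuery rewrite() { return new Primitive(); }
}

class WildcardLike implements MiniQuery {
    // The bug pattern: a wildcard that is really a prefix delegates to a
    // new prefix query. Returning that query unrewritten would leave a
    // non-primitive query in the tree; rewriting the result fixes it.
    public MiniQuery rewrite() {
        MiniQuery rewritten = new PrefixLike();
        return rewritten.rewrite();               // the fix from the patch
    }
}

public class RewriteSketch {
    public static void main(String[] args) {
        System.out.println(new WildcardLike().rewrite()); // Primitive
    }
}
```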
RE: [VOTE] release 3.3
Yonik, you are the best! If you look at the test, it's also broken somehow, because it uses length() vs. size() wrongly (I already reopened the https://issues.apache.org/jira/browse/LUCENE-3179 issue).

And please stop ranting about Java 5: it helped to find a bug in this impl. It's really broken, as OpenBitSet always allows indexes >= size (except the fast* methods).

Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
[jira] [Commented] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054542#comment-13054542 ]

Robert Muir commented on LUCENE-3239:
---
bq. About Hudson testing: We may donate a machine to Infra just for Lucene tests running something nice like Ubuntu, stay tuned (no details, I just say that).

And when this time comes, we could consider supporting Java 5. But right now, we don't have a way to test it.
[jira] [Updated] (SOLR-2382) DIH Cache Improvements
[ https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer updated SOLR-2382: - Attachment: SOLR-2382.patch Here is a version that passes parameters via the Context object, rather than by building maps. DIH Cache Improvements -- Key: SOLR-2382 URL: https://issues.apache.org/jira/browse/SOLR-2382 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: James Dyer Priority: Minor Attachments: SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch Functionality: 1. Provide a pluggable caching framework for DIH so that users can choose a cache implementation that best suits their data and application. 2. Provide a means to temporarily cache a child Entity's data without needing to create a special cached implementation of the Entity Processor (such as CachedSqlEntityProcessor). 3. Provide a means to write the final (root entity) DIH output to a cache rather than to Solr. Then provide a way for a subsequent DIH call to use the cache as an Entity input. Also provide the ability to do delta updates on such persistent caches. 4. Provide the ability to partition data across multiple caches that can then be fed back into DIH and indexed either to varying Solr Shards, or to the same Core in parallel. Use Cases: 1. We needed a flexible scalable way to temporarily cache child-entity data prior to joining to parent entities. - Using SqlEntityProcessor with Child Entities can cause an n+1 select problem. - CachedSqlEntityProcessor only supports an in-memory HashMap as a Caching mechanism and does not scale. - There is no way to cache non-SQL inputs (ex: flat files, xml, etc). 2. We needed the ability to gather data from long-running entities by a process that runs separate from our main indexing process. 3. We wanted the ability to do a delta import of only the entities that changed. 
- Lucene/Solr requires entire documents to be re-indexed, even if only a few fields changed. - Our data comes from 50+ complex sql queries and/or flat files. - We do not want to incur overhead re-gathering all of this data if only 1 entity's data changed. - Persistent DIH caches solve this problem. 4. We want the ability to index several documents in parallel (using 1.4.1, which did not have the threads parameter). 5. In the future, we may need to use Shards, creating a need to easily partition our source data into Shards. Implementation Details: 1. De-couple EntityProcessorBase from caching. - Created a new interface, DIHCache two implementations: - SortedMapBackedCache - An in-memory cache, used as default with CachedSqlEntityProcessor (now deprecated). - BerkleyBackedCache - A disk-backed cache, dependent on bdb-je, tested with je-4.1.6.jar - NOTE: the existing Lucene Contrib db project uses je-3.3.93.jar. I believe this may be incompatible due to Generic Usage. - NOTE: I did not modify the ant script to automatically get this jar, so to use or evaluate this patch, download bdb-je from http://www.oracle.com/technetwork/database/berkeleydb/downloads/index.html 2. Allow Entity Processors to take a cacheImpl parameter to cause the entity data to be cached (see EntityProcessorBase DIHCacheProperties). 3. Partially De-couple SolrWriter from DocBuilder - Created a new interface DIHWriter, two implementations: - SolrWriter (refactored) - DIHCacheWriter (allows DIH to write ultimately to a Cache). 4. Create a new Entity Processor, DIHCacheProcessor, which reads a persistent Cache as DIH Entity Input. 5. Support a partition parameter with both DIHCacheWriter and DIHCacheProcessor to allow for easy partitioning of source entity data. 6. Change the semantics of entity.destroy() - Previously, it was being called on each iteration of DocBuilder.buildDocument(). 
- Now it does one-time cleanup tasks (like closing or deleting a disk-backed cache) once the entity processor has completed. - The only out-of-the-box entity processor that previously implemented destroy() was LineEntityProcessor, so this is not a very invasive change. General Notes: We are near completion in converting our search functionality from a legacy search engine to Solr. However, I found that DIH did not support caching to the level of our prior product's data import utility. In order to get our data into Solr, I created these caching enhancements. Because I believe this has broad application, and because we would like this feature to be supported by the Community, I have front-ported this, enhanced, to Trunk.
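The pluggable-cache design described in the implementation details above can be sketched roughly as follows. This is a hypothetical analogue, not the SOLR-2382 patch itself: the interface and class names (RowCache, SortedMapRowCache, add, lookup) are illustrative stand-ins for the patch's DIHCache and SortedMapBackedCache.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical minimal analogue of a pluggable DIH cache: entity rows
// (field-name -> value maps) are stored under a key and looked up later
// when joining child data to parent entities.
interface RowCache {
    void add(Object key, Map<String, Object> row);
    List<Map<String, Object>> lookup(Object key); // all rows cached for key
    void destroy();                               // one-time cleanup, per the new destroy() semantics
}

// In-memory implementation in the spirit of SortedMapBackedCache; a
// disk-backed implementation (like BerkleyBackedCache) would persist the
// same key -> rows mapping to disk instead.
class SortedMapRowCache implements RowCache {
    private final SortedMap<Object, List<Map<String, Object>>> data = new TreeMap<>();

    public void add(Object key, Map<String, Object> row) {
        data.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
    }

    public List<Map<String, Object>> lookup(Object key) {
        return data.getOrDefault(key, Collections.emptyList());
    }

    public void destroy() {
        data.clear();
    }
}
```

With a cache like this populated once per import, a child entity's rows can be joined to each parent without re-running the child query, which is how the n+1 select problem from the use cases above gets avoided.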
RE: [VOTE] release 3.3
Hi Yonik, I wrote a test case that checks how prevSetBit behaves if I add your patch with the optimization. It still had a bug if the index is beyond the last word but not at a multiple of bitsPerWord. The following code is correct:

public int prevSetBit(int index) {
  int i = index >> 6;
  final int subIndex;
  if (i >= wlen) {
    i = wlen - 1;
    if (i < 0) return -1;
    subIndex = 0x3f;          // last possible bit
  } else {
    if (i < 0) return -1;
    subIndex = index & 0x3f;  // index within the word
  }
  long word = (bits[i] << (63-subIndex));  // skip all the bits to the left of index
  if (word != 0) {
    return (i << 6) + subIndex - Long.numberOfLeadingZeros(word); // See LUCENE-3197
  }
  while (--i >= 0) {
    word = bits[i];
    if (word != 0) {
      return (i << 6) + 63 - Long.numberOfLeadingZeros(word);
    }
  }
  return -1;
}

Your additional optimization with negative indexes is invalid, because on negative indexes prevSetBit() must return a negative value. If we don't do this, a typical loop like the following would throw an AIOOBE:

for (int i = bs.prevSetBit(0); i >= 0; i = bs.prevSetBit(i-1)) {
  // operate on index i here
}

Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

-----Original Message----- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Friday, June 24, 2011 6:29 PM To: dev@lucene.apache.org Subject: Re: [VOTE] release 3.3

On Fri, Jun 24, 2011 at 12:14 PM, Yonik Seeley yo...@lucidimagination.com wrote: All that needs to be done is to move the negative index check to the bottom (the first index < 0 check is not needed since we do a signed shift).

public int prevSetBit(int index) {
  int i = index >> 6;
  if (i >= wlen) {
    i = wlen - 1;
  }
  if (i < 0) return -1;
  final int subIndex = index & 0x3f;  // index within the word
  long word = (bits[i] << (63-subIndex));  // skip all the bits to the left of index

And a further minor optimization, if we assume that negative indexes are not legal, is to move the (i < 0) check inside the if (i >= wlen) block (and just let a negative index passed by the user cause a natural AIOOBE). 
-Yonik http://www.lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
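For readers without the Lucene source at hand: java.util.BitSet has offered the analogous previousSetBit(int) since Java 7, with exactly the contract debated above (it returns -1 when no earlier bit is set, and tolerates a fromIndex of -1). The backwards-iteration idiom from the thread can therefore be tried with the standard library; a small sketch:

```java
import java.util.Arrays;
import java.util.BitSet;

class PrevSetBitDemo {
    // Collects all set bits at or below startBit, scanning backwards with
    // the loop idiom from the thread. When bit 0 is set, the final step
    // calls previousSetBit(-1), which returns -1 and ends the loop -- the
    // behavior Uwe argues OpenBitSet.prevSetBit must also provide.
    static int[] setBitsDescending(BitSet bs, int startBit) {
        int[] out = new int[bs.cardinality()];
        int n = 0;
        for (int i = bs.previousSetBit(startBit); i >= 0; i = bs.previousSetBit(i - 1)) {
            out[n++] = i;
        }
        return Arrays.copyOf(out, n);
    }
}
```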
[jira] [Updated] (LUCENE-3179) OpenBitSet.prevSetBit()
[ https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3179: -- Attachment: LUCENE-3179-fix.patch Yonik mentioned on the mailing list that prevSetBit is broken for size==0 and also for indexes >= size. In those cases you always get an AIOOBE or even wrong results. In the case of an index >= the length of the bitset, the scanning must start at the last possible bit, so subIndex must be 0x3f and not simply the anded bits. This is my naive fix. Tests pass (I added an extra check to the test that starts beyond the end of the bitset to check prevSetBit). OpenBitSet.prevSetBit() --- Key: LUCENE-3179 URL: https://issues.apache.org/jira/browse/LUCENE-3179 Project: Lucene - Java Issue Type: Improvement Reporter: Paul Elschot Priority: Minor Fix For: 3.3, 4.0 Attachments: LUCENE-3179-fix.patch, LUCENE-3179.patch, LUCENE-3179.patch, LUCENE-3179.patch, TestBitUtil.java, TestOpenBitSet.patch Find a previous set bit in an OpenBitSet. Useful for parent testing in nested document query execution (LUCENE-2454). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
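The beyond-the-end case being fixed here can be illustrated with java.util.BitSet, whose previousSetBit documents the same clamping behavior: a fromIndex past the last word simply starts the scan at the highest set bit. This is an analogy to show the expected contract, not the OpenBitSet code itself.

```java
import java.util.BitSet;

class BeyondEndDemo {
    // previousSetBit(fromIndex) with fromIndex far past the end of the
    // bitset must clamp to the last word and scan from its last possible
    // bit, rather than indexing out of bounds or mis-computing subIndex.
    // For an empty set (the size==0 case from the issue) it returns -1.
    static int highestSetBitAtOrBelow(BitSet bs, int fromIndex) {
        return bs.previousSetBit(fromIndex);
    }
}
```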
[jira] [Commented] (SOLR-2382) DIH Cache Improvements
[ https://issues.apache.org/jira/browse/SOLR-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054575#comment-13054575 ] James Dyer commented on SOLR-2382: -- {quote} Why should the DocBuilder be even aware of DIHCache? Should it not be kept local to the EntityProcessor? {quote} You're right that when the cache is owned by an EntityProcessor, DocBuilder has no knowledge of it. But there is another way these caches can be used, described in the functionality section of this issue's description: {quote} 3. Provide a means to write the final (root entity) DIH output to a cache rather than to Solr ... Also provide the ability to do delta updates on such persistent caches. {quote} In this case, DocBuilder is outputting not to SolrWriter, but to DIHCacheWriter. It is arguable that DIHCacheWriter should not be instantiated in DocBuilder in this instance, as I currently have it. Perhaps it should happen up the stack in DataImporter, etc. But in any case, whenever DIHCacheWriter gets instantiated, it needs to know which CacheImpl to create and also pass on any parameters that CacheImpl needs. DIH Cache Improvements -- Key: SOLR-2382 URL: https://issues.apache.org/jira/browse/SOLR-2382 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: James Dyer Priority: Minor Attachments: SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch, SOLR-2382.patch
[jira] [Commented] (LUCENE-3179) OpenBitSet.prevSetBit()
[ https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054578#comment-13054578 ] Uwe Schindler commented on LUCENE-3179: --- The check for negative indexes must be done to make the following loop work (which is the standard way to iterate backwards over all set bits starting from startBit):

{code:java}
for (int i = bs.prevSetBit(startBit); i >= 0; i = bs.prevSetBit(i-1)) {
  // operate on index i here
}
{code}

This would fail with an AIOOBE when i == 0 on the last iteration (which happens if the 0th bit is set), because bs.prevSetBit(i-1) then has a negative parameter. The exit condition is checked later, so -1 must be allowed. OpenBitSet.prevSetBit() --- Key: LUCENE-3179 URL: https://issues.apache.org/jira/browse/LUCENE-3179 Project: Lucene - Java Issue Type: Improvement Reporter: Paul Elschot Priority: Minor Fix For: 3.3, 4.0 Attachments: LUCENE-3179-fix.patch, LUCENE-3179.patch, LUCENE-3179.patch, LUCENE-3179.patch, TestBitUtil.java, TestOpenBitSet.patch Find a previous set bit in an OpenBitSet. Useful for parent testing in nested document query execution (LUCENE-2454). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] release 3.3
On Fri, Jun 24, 2011 at 1:35 PM, Uwe Schindler u...@thetaphi.de wrote: Hi Yonik, I wrote a test case that checks how prevSetBit behaves if I add your patch with the optimization. It still had a bug if the index is beyond the last word but not at a multiple of bitsPerWord. Ahh, right, good catch! You want to start at the last bit rather than calculate the bit via MOD in that case. Your additional optimization with negative indexes is invalid, Well, invalid if negative indexes are valid. because on negative indexes prevSetBit() must return a negative value. If we don't do this, a typical loop like the following would throw an AIOOBE:

for (int i = bs.prevSetBit(0); i >= 0; i = bs.prevSetBit(i-1)) {
  // operate on index i here
}

Yep, that makes sense to allow. -Yonik http://www.lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3179) OpenBitSet.prevSetBit()
[ https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3179: -- Attachment: LUCENE-3179-fix.patch Modified patch. I moved the assignment of the word variable inside the if/else branch as well, since in the beyond-last-bit case we can optimize away the shift entirely. OpenBitSet.prevSetBit() --- Key: LUCENE-3179 URL: https://issues.apache.org/jira/browse/LUCENE-3179 Project: Lucene - Java Issue Type: Improvement Reporter: Paul Elschot Priority: Minor Fix For: 3.3, 4.0 Attachments: LUCENE-3179-fix.patch, LUCENE-3179-fix.patch, LUCENE-3179.patch, LUCENE-3179.patch, LUCENE-3179.patch, TestBitUtil.java, TestOpenBitSet.patch Find a previous set bit in an OpenBitSet. Useful for parent testing in nested document query execution (LUCENE-2454). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3179) OpenBitSet.prevSetBit()
[ https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054606#comment-13054606 ] Paul Elschot commented on LUCENE-3179: -- The 3179-fix patch looks good to me. I remember I had some doubts about which bit was actually the last one, and stopped worrying about it when the tests passed. This patch makes it very clear what the last bit is. OpenBitSet.prevSetBit() --- Key: LUCENE-3179 URL: https://issues.apache.org/jira/browse/LUCENE-3179 Project: Lucene - Java Issue Type: Improvement Reporter: Paul Elschot Priority: Minor Fix For: 3.3, 4.0 Attachments: LUCENE-3179-fix.patch, LUCENE-3179-fix.patch, LUCENE-3179.patch, LUCENE-3179.patch, LUCENE-3179.patch, TestBitUtil.java, TestOpenBitSet.patch Find a previous set bit in an OpenBitSet. Useful for parent testing in nested document query execution LUCENE-2454 . -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3179) OpenBitSet.prevSetBit()
[ https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054607#comment-13054607 ] Yonik Seeley commented on LUCENE-3179: -- +1, patch looks good Uwe. OpenBitSet.prevSetBit() --- Key: LUCENE-3179 URL: https://issues.apache.org/jira/browse/LUCENE-3179 Project: Lucene - Java Issue Type: Improvement Reporter: Paul Elschot Priority: Minor Fix For: 3.3, 4.0 Attachments: LUCENE-3179-fix.patch, LUCENE-3179-fix.patch, LUCENE-3179.patch, LUCENE-3179.patch, LUCENE-3179.patch, TestBitUtil.java, TestOpenBitSet.patch Find a previous set bit in an OpenBitSet. Useful for parent testing in nested document query execution LUCENE-2454 . -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054629#comment-13054629 ] Mark Miller commented on LUCENE-3239: - That seems reasonable to me - officially we endorse/support Java 6, but we can keep the 3.x line on Java 5 language features. I can live with that myself. drop java 5 support - Key: LUCENE-3239 URL: https://issues.apache.org/jira/browse/LUCENE-3239 Project: Lucene - Java Issue Type: Task Reporter: Robert Muir it's been discussed here and there, but I think we need to drop java 5 support, for these reasons: * it's totally untested by any continual build process. Testing java5 only when there is a release candidate ready is not enough. If we are to claim support then we need a hudson actually running the tests with java 5. * it's now unmaintained, so bugs have to either be hacked around, tests disabled, or warnings placed, but some things simply cannot be fixed... we cannot actually support something that is no longer maintained: we do find JRE bugs (http://wiki.apache.org/lucene-java/SunJavaBugs) and it's important that bugs actually get fixed: we cannot do everything with hacks. * because of its limitations, we do things like allow 20% slower grouping speed. I find it hard to believe we are sacrificing performance for this. So, in summary: because we don't test it at all, because it's buggy and unmaintained, and because we are sacrificing performance, I think we need to cut over the build system for the next release to require java 6. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2613) DIH Cache backed w/bdb-je
[ https://issues.apache.org/jira/browse/SOLR-2613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer updated SOLR-2613: - Attachment: SOLR-2613.patch This version keeps BerkleyBackedCache in sync with the last version from SOLR-2382. It takes parameters from a Context object rather than from a Map. DIH Cache backed w/bdb-je - Key: SOLR-2613 URL: https://issues.apache.org/jira/browse/SOLR-2613 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Affects Versions: 4.0 Reporter: James Dyer Priority: Minor Attachments: SOLR-2613.patch, SOLR-2613.patch This is spun out of SOLR-2382, which provides a framework for multiple caching implementations with DIH. This cache implementation is fast and flexible, supporting persistence and delta updates. However, it depends on Berkeley Database Java Edition, so in order to evaluate and use it you must download bdb-je from Oracle and accept the license requirements. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054634#comment-13054634 ] Stefan Matheis (steffkes) commented on SOLR-2399: - Young, bq. ..., and I keep on hitting a build failed. Why did your build fail? Like Ryan already said, the UI is for trunk .. but the code should still build even on 3.x branches? Stefan Solr Admin Interface, reworked -- Key: SOLR-2399 URL: https://issues.apache.org/jira/browse/SOLR-2399 Project: Solr Issue Type: Improvement Components: web gui Reporter: Stefan Matheis (steffkes) Assignee: Ryan McKinley Priority: Minor Fix For: 4.0 Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, SOLR-2399-110606.patch, SOLR-2399-110622.patch, SOLR-2399-admin-interface.patch, SOLR-2399-analysis-stopwords.patch, SOLR-2399-fluid-width.patch, SOLR-2399-sorting-fields.patch, SOLR-2399-wip-notice.patch, SOLR-2399.patch *The idea was to create a new, fresh (and hopefully clean) Solr Admin Interface.* [Based on this [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]] *Features:* * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png] * [Query-Form|http://files.mathe.is/solr-admin/02_query.png] * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png] * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, SOLR-2400) * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png] * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482) * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png] * [Replication|http://files.mathe.is/solr-admin/10_replication.png] * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png] * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459) ** Stub (using static data) Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI I've quickly created a Github-Repository 
(Just for me, to keep track of the changes) » https://github.com/steffkes/solr-admin -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2619) two sfields in geospatial search
[ https://issues.apache.org/jira/browse/SOLR-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054663#comment-13054663 ] David Smiley commented on SOLR-2619: I am confused trying to understand this Wish. Is the situation two indexed points (each in a separate field), and you want to find documents that are within a particular radius from either point? That's already supported. two sfields in geospatial search Key: SOLR-2619 URL: https://issues.apache.org/jira/browse/SOLR-2619 Project: Solr Issue Type: Wish Components: clients - php Affects Versions: 3.2 Environment: Using with drupal Reporter: jose rodriguez Fix For: 3.2 Is it possible to create a query with two sfields (geospatial search)? I mean two different pt and d values, one for each field. If I need a from and a to, then I need fields around the from coordinate and around the to coordinate. Thanks. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
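The two-field filtering David refers to can be written by combining two geofilt filter queries; this is a sketch, with field names and coordinates made up for illustration (substitute your schema's LatLonType fields):

```text
# one filter per spatial field (documents must be near BOTH points):
fq={!geofilt sfield=from_coord pt=45.15,-93.85 d=5}
fq={!geofilt sfield=to_coord pt=44.97,-93.26 d=5}

# documents near EITHER point, via the _query_ nested-query hook:
fq=_query_:"{!geofilt sfield=from_coord pt=45.15,-93.85 d=5}" OR _query_:"{!geofilt sfield=to_coord pt=44.97,-93.26 d=5}"
```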
[jira] [Resolved] (LUCENE-3179) OpenBitSet.prevSetBit()
[ https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-3179. --- Resolution: Fixed Assignee: Paul Elschot Committed 3.x branch revision: 1139430 Committed trunk revision: 1139431 Committed 3.3 branch revision: 1139433 Thanks Yonik! OpenBitSet.prevSetBit() --- Key: LUCENE-3179 URL: https://issues.apache.org/jira/browse/LUCENE-3179 Project: Lucene - Java Issue Type: Improvement Reporter: Paul Elschot Assignee: Paul Elschot Priority: Minor Fix For: 3.3, 4.0 Attachments: LUCENE-3179-fix.patch, LUCENE-3179-fix.patch, LUCENE-3179.patch, LUCENE-3179.patch, LUCENE-3179.patch, TestBitUtil.java, TestOpenBitSet.patch Find a previous set bit in an OpenBitSet. Useful for parent testing in nested document query execution LUCENE-2454 . -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3228) build should allow you (especially hudson) to refer to a local javadocs installation instead of downloading
[ https://issues.apache.org/jira/browse/LUCENE-3228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054679#comment-13054679 ] Hoss Man commented on LUCENE-3228: -- +1 bq. I think we should allow you to optionally set a sysprop using linkoffline hell, why bother with the sysprop? .. let's just commit the package-list files for all third party libs we use into dev-tools and completely eliminate the need for network access when building javadocs. build should allow you (especially hudson) to refer to a local javadocs installation instead of downloading --- Key: LUCENE-3228 URL: https://issues.apache.org/jira/browse/LUCENE-3228 Project: Lucene - Java Issue Type: Task Reporter: Robert Muir Assignee: Robert Muir Currently, we fail on all javadocs warnings. However, you get a warning if it cannot download the package-list from sun.com. So I think we should allow you to optionally set a sysprop using linkoffline. Then we would get far fewer fake Hudson failures. I feel like Mike opened an issue for this already but I cannot find it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3179) OpenBitSet.prevSetBit()
[ https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054681#comment-13054681 ] Uwe Schindler commented on LUCENE-3179: --- One more comment: when working on the code, I noticed that the symmetry all other methods have between long and int variants is broken here. For consistency we should add the long method, too. I just don't like the missing consistency. Also: OpenBitSet.nextSetBit() does not use Long.numberOfTrailingZeros(), but the new prevSetBit() does use Long.numberOfLeadingZeros(). As both methods have intrinsics, why only use one of them? Yonik? Any comments? OpenBitSet.prevSetBit() --- Key: LUCENE-3179 URL: https://issues.apache.org/jira/browse/LUCENE-3179 Project: Lucene - Java Issue Type: Improvement Reporter: Paul Elschot Assignee: Paul Elschot Priority: Minor Fix For: 3.3, 4.0 Attachments: LUCENE-3179-fix.patch, LUCENE-3179-fix.patch, LUCENE-3179.patch, LUCENE-3179.patch, LUCENE-3179.patch, TestBitUtil.java, TestOpenBitSet.patch Find a previous set bit in an OpenBitSet. Useful for parent testing in nested document query execution (LUCENE-2454). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
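The symmetry Uwe points out rests on two JIT-intrinsic helpers: within a single 64-bit word, the lowest set bit falls out of Long.numberOfTrailingZeros and the highest out of Long.numberOfLeadingZeros. A minimal sketch of the per-word step of each scan (the surrounding word-array loop from OpenBitSet is omitted):

```java
class WordScan {
    // Index of the lowest set bit in a word, or -1 if none: the per-word
    // step a nextSetBit-style forward scan could use via the
    // Long.numberOfTrailingZeros intrinsic.
    static int lowestSetBit(long word) {
        return word == 0 ? -1 : Long.numberOfTrailingZeros(word);
    }

    // Index of the highest set bit in a word, or -1 if none: the per-word
    // step prevSetBit uses via the Long.numberOfLeadingZeros intrinsic.
    static int highestSetBit(long word) {
        return word == 0 ? -1 : 63 - Long.numberOfLeadingZeros(word);
    }
}
```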
Re: svn commit: r1137601 - /lucene/dev/trunk/lucene/src/test-framework/org/apache/lucene/search/Query Utils.java
: URL: http://svn.apache.org/viewvc?rev=1137601&view=rev
: Log: revert speedup, the wrapping causes fc insanity. the only reason
: this works today is that a new index is created in every setup/teardown,
: which also makes these tests slow...

there are some other tests where a single field is used in multiple ways that would normally cause insanity over the life of a single test (Solr's TestSort comes to mind). the solution used there was to have the test method directly call assertSaneFieldCache and purgeFieldCache after each chunk of work to ensure no expected insanity exists the next time assertSaneFieldCache is called

: Modified:
:     lucene/dev/trunk/lucene/src/test-framework/org/apache/lucene/search/QueryUtils.java
:
: Modified: lucene/dev/trunk/lucene/src/test-framework/org/apache/lucene/search/QueryUtils.java
: URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/test-framework/org/apache/lucene/search/QueryUtils.java?rev=1137601&r1=1137600&r2=1137601&view=diff
: ==============================================================================
: --- lucene/dev/trunk/lucene/src/test-framework/org/apache/lucene/search/QueryUtils.java (original)
: +++ lucene/dev/trunk/lucene/src/test-framework/org/apache/lucene/search/QueryUtils.java Mon Jun 20 12:00:12 2011
: @@ -148,35 +148,23 @@ public class QueryUtils {
:      // we can't put deleted docs before the nested reader, because
:      // it will throw off the docIds
:      IndexReader[] readers = new IndexReader[] {
: -      edge < 0 ? r : emptyReaders[0],
: -      emptyReaders[0],
: -      new MultiReader(edge < 0 ? emptyReaders[4] : emptyReaders[0],
: -          emptyReaders[0],
: -          0 == edge ? r : emptyReaders[0]),
: -      0 < edge ? emptyReaders[0] : emptyReaders[7],
: -      emptyReaders[0],
: -      new MultiReader(0 < edge ? emptyReaders[0] : emptyReaders[5],
: -          emptyReaders[0],
: -          0 < edge ? r : emptyReaders[0])
: +      edge < 0 ? r : IndexReader.open(makeEmptyIndex(random, 0), true),
: +      IndexReader.open(makeEmptyIndex(random, 0), true),
: +      new MultiReader(IndexReader.open(makeEmptyIndex(random, edge < 0 ? 4 : 0), true),
: +          IndexReader.open(makeEmptyIndex(random, 0), true),
: +          0 == edge ? r : IndexReader.open(makeEmptyIndex(random, 0), true)),
: +      IndexReader.open(makeEmptyIndex(random, 0 < edge ? 0 : 7), true),
: +      IndexReader.open(makeEmptyIndex(random, 0), true),
: +      new MultiReader(IndexReader.open(makeEmptyIndex(random, 0 < edge ? 0 : 5), true),
: +          IndexReader.open(makeEmptyIndex(random, 0), true),
: +          0 < edge ? r : IndexReader.open(makeEmptyIndex(random, 0), true))
:      };
:      IndexSearcher out = LuceneTestCase.newSearcher(new MultiReader(readers));
:      out.setSimilarityProvider(s.getSimilarityProvider());
:      return out;
:    }
: -
: -  static final IndexReader[] emptyReaders = new IndexReader[8];
: -  static {
: -    try {
: -      emptyReaders[0] = makeEmptyIndex(new Random(0), 0);
: -      emptyReaders[4] = makeEmptyIndex(new Random(0), 4);
: -      emptyReaders[5] = makeEmptyIndex(new Random(0), 5);
: -      emptyReaders[7] = makeEmptyIndex(new Random(0), 7);
: -    } catch (IOException ex) {
: -      throw new RuntimeException(ex);
: -    }
: -  }
:
: -  private static IndexReader makeEmptyIndex(Random random, final int numDeletedDocs)
: +  private static Directory makeEmptyIndex(Random random, final int numDeletedDocs)
:      throws IOException {
:      Directory d = new MockDirectoryWrapper(random, new RAMDirectory());
:      IndexWriter w = new IndexWriter(d, new IndexWriterConfig(
: @@ -200,7 +188,8 @@ public class QueryUtils {
:      IndexReader r = IndexReader.open(d, true);
:      Assert.assertEquals("reader has wrong number of deleted docs",
:          numDeletedDocs, r.numDeletedDocs());
: -    return r;
: +    r.close();
: +    return d;
:    }
:
:    /** alternate scorer skipTo(),skipTo(),next(),next(),skipTo(),skipTo(), etc

-Hoss

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org