[jira] Commented: (LUCENE-2181) benchmark for collation

2010-01-11 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799056#action_12799056 ] Steven Rowe commented on LUCENE-2181: - +1, once again, tests all pass, and "ant collat

[jira] Updated: (LUCENE-2204) FastVectorHighlighter: some classes and members should be publicly accessible to implement FragmentsBuilder

2010-01-11 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated LUCENE-2204: --- Attachment: LUCENE-2204.patch A patch attached. It includes reset methods for Tokenizer that

[jira] Updated: (LUCENE-2204) FastVectorHighlighter: some classes and members should be publicly accessible to implement FragmentsBuilder

2010-01-11 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated LUCENE-2204: --- Summary: FastVectorHighlighter: some classes and members should be publicly accessible to im

[jira] Created: (LUCENE-2204) FastVectorHighlighter:

2010-01-11 Thread Koji Sekiguchi (JIRA)
FastVectorHighlighter: --- Key: LUCENE-2204 URL: https://issues.apache.org/jira/browse/LUCENE-2204 Project: Lucene - Java Issue Type: Improvement Components: contrib/highlighter Affects Versions: 3.0, 2.9.1

[jira] Updated: (LUCENE-2204) FastVectorHighlighter: some classes and members should be publicly accessible

2010-01-11 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated LUCENE-2204: --- Summary: FastVectorHighlighter: some classes and members should be publicly accessible (was

[jira] Commented: (LUCENE-2203) improved snowball testing

2010-01-11 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798957#action_12798957 ] Uwe Schindler commented on LUCENE-2203: --- The revision no. in the "svn co" works exac

Re: Compound File Default

2010-01-11 Thread Jason Rutherglen
Maybe the default can be conditional on the platform like NIOFSDirectory. On Mon, Jan 11, 2010 at 1:25 PM, Marvin Humphrey wrote: > On Mon, Jan 11, 2010 at 03:20:17PM -0500, Grant Ingersoll wrote: >> Should we really still be defaulting to true for setUseCompoundFile?  Do >> people still run out
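A minimal sketch of what a platform-conditional default might look like, in plain Java with no Lucene dependency. The class and method names are hypothetical, and the policy (compound format on Mac OS X only, because of the 256-descriptor limit reported in this thread) is an assumption made for illustration, not something the thread agreed on.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: picks a default for the
// compound-file setting from the operating system name.
class CompoundFileDefault {

    // Assumed policy: default to the compound format only on Mac OS X,
    // where the per-process open-file limit is reported as low (256).
    static boolean defaultUseCompoundFile(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        return os.contains("mac");
    }
}
```

A caller could pass `System.getProperty("os.name")` and hand the result to `setUseCompoundFile`; whether the OS name is a good proxy for the descriptor limit is exactly the kind of question the thread leaves open.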

Re: Compound File Default

2010-01-11 Thread Marvin Humphrey
On Mon, Jan 11, 2010 at 03:20:17PM -0500, Grant Ingersoll wrote: > Should we really still be defaulting to true for setUseCompoundFile? Do > people still run out of file handles? Yep. You're going to smack up against that limit pretty quick on Mac OS X: mar...@smokey:~ $ ulimit -n 256

Re: Compound File Default

2010-01-11 Thread Otis Gospodnetic
+1. I never liked having the compound format be the default, since increasing the max # of open file handles is a well-documented thing, at least in the UNIX world. Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

Re: Compound File Default

2010-01-11 Thread Michael McCandless
+1 I think we should make it Version dependent... Mike On Mon, Jan 11, 2010 at 3:20 PM, Grant Ingersoll wrote: > Should we really still be defaulting to true for setUseCompoundFile?  Do > people still run out of file handles?  If so, why not have them turn it on, > instead of everyone else ha

Compound File Default

2010-01-11 Thread Grant Ingersoll
Should we really still be defaulting to true for setUseCompoundFile? Do people still run out of file handles? If so, why not have them turn it on, instead of everyone else having to turn it off? -Grant

[jira] Commented: (LUCENE-2203) improved snowball testing

2010-01-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798800#action_12798800 ] Robert Muir commented on LUCENE-2203: - Simon, these files are large (70MB) but so is t

[jira] Commented: (LUCENE-2203) improved snowball testing

2010-01-11 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798795#action_12798795 ] Simon Willnauer commented on LUCENE-2203: - Robert, those tests seem to be very exte

[jira] Updated: (LUCENE-2181) benchmark for collation

2010-01-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2181: Attachment: LUCENE-2181.patch Steven thanks, in addition to your comments I also changed the confi

[jira] Commented: (LUCENE-2201) more performance improvements for snowball

2010-01-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798748#action_12798748 ] Robert Muir commented on LUCENE-2201: - all tests from LUCENE-2203 pass with this patch

[jira] Commented: (LUCENE-2203) improved snowball testing

2010-01-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798737#action_12798737 ] Robert Muir commented on LUCENE-2203: - it's worth mentioning for the two broken languag

[jira] Updated: (LUCENE-2203) improved snowball testing

2010-01-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2203: Attachment: LUCENE-2203.patch attached is a patch that does an svn checkout of rev 500 (which is w

[jira] Created: (LUCENE-2203) improved snowball testing

2010-01-11 Thread Robert Muir (JIRA)
improved snowball testing - Key: LUCENE-2203 URL: https://issues.apache.org/jira/browse/LUCENE-2203 Project: Lucene - Java Issue Type: Test Components: contrib/analyzers Reporter: Robert Muir

[jira] Commented: (LUCENE-2181) benchmark for collation

2010-01-11 Thread Steven Rowe (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798679#action_12798679 ] Steven Rowe commented on LUCENE-2181: - +1, tests all pass, and "ant collation" produce

[jira] Updated: (LUCENE-2181) benchmark for collation

2010-01-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2181: Attachment: LUCENE-2181.patch corrected this testReadTokens(), it tests by adding up token freq ac

[jira] Commented: (LUCENE-2181) benchmark for collation

2010-01-11 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798655#action_12798655 ] Robert Muir commented on LUCENE-2181: - bq. I just ran the contrib/benchmark tests, and

Re: update doc by query

2010-01-11 Thread Michael McCandless
Also, if the only reason why you're committing is so a reader can see the changes (ie, you don't need so much "safety"), you should use IndexWriter.getReader instead. commit is really only needed for safety (ie known recovery points on crash), or, for cases where the reader must be opened in a dif
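The distinction drawn above, between making changes visible to a reader and making them safe against a crash, can be illustrated with a toy model. This is not Lucene code: `ToyWriter` and its methods are hypothetical stand-ins that only mimic the described behavior with two in-memory lists.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, NOT Lucene's implementation: separates "visible to a
// reader" from "durable after a crash".
class ToyWriter {
    private final List<String> committed = new ArrayList<>(); // survives a crash
    private final List<String> pending = new ArrayList<>();   // buffered, not yet durable

    void addDocument(String doc) {
        pending.add(doc);
    }

    // Reader obtained from the writer: sees committed and pending
    // documents alike, so no commit is needed just for visibility.
    List<String> getReader() {
        List<String> view = new ArrayList<>(committed);
        view.addAll(pending);
        return view;
    }

    // Commit establishes a recovery point: pending documents become durable.
    void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    // Simulated crash: everything added since the last commit is lost.
    void crash() {
        pending.clear();
    }
}
```

In this model, a document added after the last `commit()` shows up in `getReader()` immediately but disappears after `crash()`, which is the trade-off the message describes.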

Re: update doc by query

2010-01-11 Thread Sanne Grinovero
Then I wouldn't need it and can still improve performance by using periodic commits, nice! thanks for explaining this, Sanne On Mon, Jan 11, 2010 at 10:57 AM, Michael McCandless wrote: > On Sun, Jan 10, 2010 at 6:13 PM, Sanne Grinovero > wrote: >> Even if it's not strictly needed anymore, could

Re: update doc by query

2010-01-11 Thread Michael McCandless
On Sun, Jan 10, 2010 at 6:13 PM, Sanne Grinovero wrote: > Even if it's not strictly needed anymore, could it improve performance? I think there should be no real performance gains/losses one way or another. The current updateDocument call basically boils down to delete then add. > Right now I n
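The remark that an update basically boils down to delete then add can be sketched with a small in-memory model. This is an illustration under stated assumptions, not Lucene's implementation: `ToyIndex` is hypothetical, documents are plain field-to-value maps, and matching is by exact field value rather than by a real query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory index, NOT Lucene code: documents are field -> value maps.
class ToyIndex {
    private final List<Map<String, String>> docs = new ArrayList<>();

    void addDocument(Map<String, String> doc) {
        docs.add(new HashMap<>(doc));
    }

    // Removes every document whose field equals value; returns how many were removed.
    int deleteDocuments(String field, String value) {
        int before = docs.size();
        docs.removeIf(d -> value.equals(d.get(field)));
        return before - docs.size();
    }

    // Update modeled as delete-then-add, as described in the message above.
    void updateDocument(String field, String value, Map<String, String> doc) {
        deleteDocuments(field, value);
        addDocument(doc);
    }

    int size() {
        return docs.size();
    }

    // Values of one field across all documents, in insertion order.
    List<String> values(String field) {
        List<String> out = new ArrayList<>();
        for (Map<String, String> d : docs) {
            out.add(d.get(field));
        }
        return out;
    }
}
```

Because the update is just a delete followed by an add, calling `deleteDocuments` and `addDocument` separately leaves the toy index in the same state as `updateDocument`, which matches the point that neither route should be meaningfully faster.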