[jira] [Resolved] (PYLUCENE-24) Shared JCC object on Linux requires setuptools patch
[ https://issues.apache.org/jira/browse/PYLUCENE-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andi Vajda resolved PYLUCENE-24. Resolution: Fixed Thank you, Caleb, I applied your patch. I simplified the setup.py logic a bit as require('setuptools') seems to work also on distribute. Changes are checked into trunk rev 1396894. Shared JCC object on Linux requires setuptools patch Key: PYLUCENE-24 URL: https://issues.apache.org/jira/browse/PYLUCENE-24 Project: PyLucene Issue Type: Bug Environment: Linux Reporter: Caleb Burns Labels: build, jcc, linux, pylucene, python Attachments: jcc-linux.patch Original Estimate: 0h Remaining Estimate: 0h The current method to build JCC as a shared object on Linux requires patching the setuptools package. Here's a patch to JCC that monkey-patches the setuptools Library and Extension classes to avoid the manual patch. It works with setuptools-0.6c7-11 and distribute-0.6.1+ without the need to manually patch setuptools. These are the same versions that the current manual patches work with: patch.43.0.6c7 works with setuptools-0.6c7-10 and distribute-0.6.1+ while patch.43.0.6.c11 works with setuptools-0.6c11. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
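The monkey-patching approach in the attached patch can be sketched in plain Python. This is an illustrative stand-in, not the actual jcc-linux.patch: the `thirdparty` module and the class names below are invented for the sketch, while the real patch targets setuptools' Library and Extension classes.

```python
import types

# Stand-in for a third-party module we cannot modify on disk
# (in the real patch this is setuptools, with its Library and
# Extension classes).
thirdparty = types.ModuleType("thirdparty")

class Extension:
    def __init__(self, name, sources):
        self.name = name
        self.sources = sources

thirdparty.Extension = Extension

# The "monkey patch": a subclass that adds the behavior the build
# needs, installed over the original class before any other code
# looks it up through the module.
class SharedExtension(thirdparty.Extension):
    def __init__(self, name, sources, shared=True):
        super().__init__(name, sources)
        self.shared = shared  # flag a shared-object build

thirdparty.Extension = SharedExtension

# Code that later asks the module for Extension transparently gets
# the patched class, with no edits to the installed package.
ext = thirdparty.Extension("jcc", ["jcc.cpp"])
print(ext.shared)
```

The advantage over a manual patch is exactly what the issue describes: users install a stock setuptools/distribute, and JCC's setup.py swaps in the extended classes at import time.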
[jira] [Commented] (SOLR-3377) eDismax: A fielded query wrapped by parens is not recognized
[ https://issues.apache.org/jira/browse/SOLR-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473024#comment-13473024 ] Shawn Heisey commented on SOLR-3377: I confirmed (using the solr example) that 4.0-BETA and lucene_solr_4_0 can parse a query similar to my test query perfectly. Query URL created by filling out the admin query interface: http://server:8983/solr/collection1/select?q=((cat%3Astring1)+OR+(cat%3Astring2)+OR+(cat%3Astring3)+(Kitchen+Sink))&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text%5E2 Just for completeness, I also tried it in the lucene_solr_3_6 example. It behaves just like 3.5.0 does, including working when the spaces are added. Solr Implementation Version: 3.6.2-SNAPSHOT 1396476 - root - 2012-10-09 23:48:08 I will file a separate bug on 3.6.1 related to this one. eDismax: A fielded query wrapped by parens is not recognized Key: SOLR-3377 URL: https://issues.apache.org/jira/browse/SOLR-3377 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 3.6 Reporter: Jan Høydahl Assignee: Yonik Seeley Priority: Critical Fix For: 4.0-BETA Attachments: SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch As reported by bernd on the user list, a query like this {{q=(name:test)}} will yield 0 hits in 3.6 while it worked in 3.5. It works without the parens. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: what's happening with jenkins?
No, this looks like something else. The at: unknown suggests there are no more tests or suites on the stack for that JVM. The suite was terminated properly. The runner does a System.exit (and is permitted to do so) so any threads would be terminated. Looks like a bug in my code somewhere. I added saving of the *.events artifact and am downloading the workspace right now to inspect what's going on. Dawid On Tue, Oct 9, 2012 at 11:10 PM, Robert Muir rcm...@gmail.com wrote: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-java7/534/console has been running for 24 hours... it looks like a test got hung and was timed out after two hours, but maybe spawned some zombies and the test runner allows their hearts to beat forever??? now we see: [junit4:junit4] HEARTBEAT J0: 2012-10-09T21:09:39, stalled for 74767s at: unknown
[jira] [Comment Edited] (SOLR-3377) eDismax: A fielded query wrapped by parens is not recognized
[ https://issues.apache.org/jira/browse/SOLR-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473024#comment-13473024 ] Shawn Heisey edited comment on SOLR-3377 at 10/10/12 6:51 AM: -- I confirmed (using the solr example) that 4.0-BETA and lucene_solr_4_0 can parse a query similar to my test query perfectly. Query URL created by filling out the admin query interface: http://server:8983/solr/collection1/select?q=((cat%3Astring1)+OR+(cat%3Astring2)+OR+(cat%3Astring3)+(Kitchen+Sink))&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text%5E2 was (Author: elyograg): I confirmed (using the solr example) that 4.0-BETA and lucene_solr_4_0 can parse a query similar to my test query perfectly. Query URL created by filling out the admin query interface: http://server:8983/solr/collection1/select?q=((cat%3Astring1)+OR+(cat%3Astring2)+OR+(cat%3Astring3)+(Kitchen+Sink))&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text%5E2 Just for completeness, I also tried it in the lucene_solr_3_6 example. It behaves just like 3.5.0 does, including working when the spaces are added. Solr Implementation Version: 3.6.2-SNAPSHOT 1396476 - root - 2012-10-09 23:48:08 I will file a separate bug on 3.6.1 related to this one. eDismax: A fielded query wrapped by parens is not recognized Key: SOLR-3377 URL: https://issues.apache.org/jira/browse/SOLR-3377 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 3.6 Reporter: Jan Høydahl Assignee: Yonik Seeley Priority: Critical Fix For: 4.0-BETA Attachments: SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch, SOLR-3377.patch As reported by bernd on the user list, a query like this {{q=(name:test)}} will yield 0 hits in 3.6 while it worked in 3.5. It works without the parens.
Re: what's happening with jenkins?
From what I can tell, both forked JVMs completed their runs (that is, emitted a QUIT event back to the runner). Aborted by uschindler Uwe, next time don't be so fast (or take a stack trace so that I can see what's going on :). Evidently something stalled but I've no idea where. Dawid On Wed, Oct 10, 2012 at 8:37 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: No, this looks like something else. The at: unknown suggests there are no more tests or suites on the stack for that JVM. The suite was terminated properly. The runner does a System.exit (and is permitted to do so) so any threads would be terminated. Looks like a bug in my code somewhere. I added saving of the *.events artifact and am downloading the workspace right now to inspect what's going on. Dawid On Tue, Oct 9, 2012 at 11:10 PM, Robert Muir rcm...@gmail.com wrote: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-java7/534/console has been running for 24 hours... it looks like a test got hung and was timed out after two hours, but maybe spawned some zombies and the test runner allows their hearts to beat forever??? now we see: [junit4:junit4] HEARTBEAT J0: 2012-10-09T21:09:39, stalled for 74767s at: unknown
[jira] [Created] (SOLR-3923) eDismax: complex fielded query with parens is not recognized
Shawn Heisey created SOLR-3923: -- Summary: eDismax: complex fielded query with parens is not recognized Key: SOLR-3923 URL: https://issues.apache.org/jira/browse/SOLR-3923 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 3.5 Reporter: Shawn Heisey Fix For: 3.6.2 This is similar to SOLR-3377. That bug appears to have fixed this problem for 4.x. I can see the effects of SOLR-3377 when I test a query similar to below on the Solr 3.6 example, which is expected because SOLR-3377 was found in 3.6 but only fixed in 4.0. This bug is a little different, and exists in 3.5.0 for sure, possibly earlier. The first part of the parsed query looks right, but then something weird happens and it gets interpreted as a very strange phrase query. query URL sent to solr 3.5.0 example: {code}http://localhost:8983/solr/collection1/select?q=%28%28cat%3Astring1%29+%28Kitchen+Sink%29%29&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text^2.0 {code} parsedquery_toString: {code}+((cat:string1 ((text:kitchen) (text:sink)))~2) (text:cat:string1 kitchen sink^2.0) {code} Adding some spaces before and after cat:string1 fixes it: {code}http://localhost:8983/solr/collection1/select?q=%28%28%20cat%3Astring1%20%29+%28Kitchen+Sink%29%29&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text^2.0 {code} {code}+((cat:string1 ((text:kitchen) (text:sink)))~2) (text:kitchen sink^2.0) {code}
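For reference, query URLs like the ones above are easier to build (and harder to mis-escape) with a URL-encoding helper. A hypothetical reconstruction of the workaround request, with the extra spaces around the fielded clause; the host and core name are taken from the example above:

```python
from urllib.parse import urlencode

# Parameters from the report, with the workaround applied:
# spaces before and after the fielded clause cat:string1.
params = {
    "q": "(( cat:string1 ) (Kitchen Sink))",
    "wt": "xml",
    "debugQuery": "true",
    "defType": "edismax",
    "qf": "text",
    "pf": "text^2.0",
}

# urlencode percent-escapes the parens, colon, caret, and spaces
# and joins the parameters with '&' separators.
url = "http://localhost:8983/solr/collection1/select?" + urlencode(params)
print(url)
```

Building the query string this way avoids hand-escaping mistakes when reproducing parser bugs like this one.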
[jira] [Updated] (LUCENE-4462) Publishing flushed segments is single threaded and too costly
[ https://issues.apache.org/jira/browse/LUCENE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-4462: Attachment: LUCENE-4462.patch here is a new patch adding back the safety forcePurge. I will commit this to trunk and let it bake in a bit before I backport. I will keep this issue open until it's ported. Publishing flushed segments is single threaded and too costly - Key: LUCENE-4462 URL: https://issues.apache.org/jira/browse/LUCENE-4462 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Reporter: Michael McCandless Assignee: Simon Willnauer Fix For: 4.1, 5.0 Attachments: LUCENE-4462.patch, LUCENE-4462.patch Spinoff from http://lucene.markmail.org/thread/4li6bbomru35qn7w The new TestBagOfPostings failed the build because it timed out after 2 hours ... but in digging I found that it was a starvation issue: the 4 threads were flushing segments much faster than the 1 thread could publish them. I think this is because publishing segments (DocumentsWriter.publishFlushedSegment) is actually rather costly (creates CFS file if necessary, writes .si, etc.). I committed a workaround for now, to prevent starvation (see svn diff -c 1394704 https://svn.apache.org/repos/asf/lucene/dev/trunk), but we really should address the root cause by moving these costly ops into flush() so that publishing is a low cost operation.
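The starvation the description talks about has a generic producer/consumer shape: several flushing threads hand finished segments to a single publisher. A minimal sketch (a Python stand-in, not Lucene code) of the bounded-queue back-pressure idea, where producers block instead of running unboundedly ahead of the publisher:

```python
import queue
import threading

# Bounding the queue is the back-pressure: producers block in put()
# when the single publisher falls behind, instead of starving it.
backlog = queue.Queue(maxsize=8)

def producer(n):
    # Stand-in for a flushing thread handing off n finished segments.
    for i in range(n):
        backlog.put(i)

def publisher(total, out):
    # Stand-in for the single publish thread draining the backlog.
    for _ in range(total):
        out.append(backlog.get())
        backlog.task_done()

published = []
producers = [threading.Thread(target=producer, args=(10,)) for _ in range(4)]
pub = threading.Thread(target=publisher, args=(40, published))

for t in producers:
    t.start()
pub.start()
for t in producers:
    t.join()
pub.join()

print(len(published))
```

The issue's proposed root-cause fix goes further than back-pressure: move the expensive work (CFS creation, writing .si) into the parallel flush step so that the serialized publish step stays cheap.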
[jira] [Updated] (LUCENE-4462) Publishing flushed segments is single threaded and too costly
[ https://issues.apache.org/jira/browse/LUCENE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-4462: Component/s: core/index Lucene Fields: New, Patch Available (was: New) Affects Version/s: 4.0 4.0-ALPHA 4.0-BETA Fix Version/s: 5.0 4.1
[jira] [Commented] (LUCENE-4462) Publishing flushed segments is single threaded and too costly
[ https://issues.apache.org/jira/browse/LUCENE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473069#comment-13473069 ] Simon Willnauer commented on LUCENE-4462: - Committed to trunk in revision 1396500
[jira] [Created] (SOLR-3924) The solr weakness fault tolerance on ext4
shou aoki created SOLR-3924: --- Summary: The solr weakness fault tolerance on ext4 Key: SOLR-3924 URL: https://issues.apache.org/jira/browse/SOLR-3924 Project: Solr Issue Type: Bug Affects Versions: 3.5 Environment: Ubuntu 12.04 LTS, Filesystem is ext4. Reporter: shou aoki A few days ago our machine (running Solr) crashed. We rebooted the machine and Solr, and Solr appeared to behave normally. However, Solr was not actually working, even though: - The Solr core exists. - The /tmp/solr directory and the /tmp/solr/data/solr_core directory exist. - /tmp/solr/solr.xml exists and we wrote a solr/cores/core tag in it. - We can start the Solr server without an exception. Yet $ curl "http://localhost:8983/solr/solr_core/select?" *returns 404 Not Found.* I also found that the file /tmp/solr/data/solr_core/index/segments.gen is empty. So I think Solr has no (or weak) fault tolerance here. I hope Solr can grow into a crash-tolerant server, for example by handling files as atomically as possible.
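The atomic file handling the reporter asks for usually means write-to-temp, fsync, then rename, so a crash can never leave a half-written or empty file behind. A minimal sketch of that pattern (an illustration only, not Lucene's actual segments.gen handling; atomic_write is an invented helper):

```python
import os
import tempfile

def atomic_write(path, data):
    # Write to a temp file in the same directory so the final rename
    # stays on one filesystem and is atomic.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the bytes to disk before the rename
        os.replace(tmp, path)     # atomic: readers see old or new, never partial
    except BaseException:
        os.unlink(tmp)
        raise

target = os.path.join(tempfile.mkdtemp(), "segments.gen")
atomic_write(target, b"generation 42")
print(open(target, "rb").read())
```

Without the fsync-before-rename step, a crash on ext4 (with delayed allocation) can leave exactly the zero-length file the reporter observed.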
[jira] [Updated] (SOLR-3924) The solr weakness fault tolerance on ext4
[ https://issues.apache.org/jira/browse/SOLR-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shou aoki updated SOLR-3924: Description: A few days ago our machine (running Solr) crashed. We rebooted the machine and Solr, and Solr appeared to behave normally. However, Solr was not actually working, even though: - The Solr core exists. - The /tmp/solr directory and the /tmp/solr/data/solr_core directory exist. - /tmp/solr/solr.xml exists and the solr/cores/core tag exists in it. - We can start the Solr server without an exception. Yet $ curl "http://localhost:8983/solr/solr_core/select?" *returns 404 Not Found.* I also found that the file /tmp/solr/data/solr_core/index/segments.gen is empty. So I think Solr has no (or weak) fault tolerance here. I hope Solr can grow into a crash-tolerant server, for example by handling files as atomically as possible. was: (the same text, except the solr.xml item previously read: /tmp/solr/solr.xml exists and we wrote a solr/cores/core tag in it.)
[jira] [Resolved] (SOLR-3924) The solr weakness fault tolerance on ext4
[ https://issues.apache.org/jira/browse/SOLR-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved SOLR-3924. - Resolution: Duplicate Fix Version/s: 3.6 This is a duplicate of LUCENE-3627 and was already fixed in Lucene 3.6.0.
[jira] [Closed] (SOLR-3924) The solr weakness fault tolerance on ext4
[ https://issues.apache.org/jira/browse/SOLR-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler closed SOLR-3924. ---
[jira] [Commented] (SOLR-3924) The solr weakness fault tolerance on ext4
[ https://issues.apache.org/jira/browse/SOLR-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473110#comment-13473110 ] shou aoki commented on SOLR-3924: - Thank you for the information, Schindler!
[jira] [Created] (LUCENE-4470) Expose SpanFirst in eDismax
Markus Jelsma created LUCENE-4470: - Summary: Expose SpanFirst in eDismax Key: LUCENE-4470 URL: https://issues.apache.org/jira/browse/LUCENE-4470 Project: Lucene - Core Issue Type: Improvement Components: modules/queryparser Affects Versions: 4.0-BETA Environment: solr-spec 5.0.0.2012.10.09.19.29.59 solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59 Reporter: Markus Jelsma Fix For: 4.1, 5.0 Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. This issue adds the SF-parameter (SpanFirst) and takes a FIELD~DISTANCE^BOOST formatted value. For example, sf=title~5^2 will give a boost of 2 if one of the normal clauses, originally generated for automatic phrase queries, is located within five positions from the field's start. Unit test is included and all tests pass.
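The FIELD~DISTANCE^BOOST value format described above could be parsed along these lines. This is a hypothetical sketch, not the parser from the attached patch; parse_sf and its regex are invented for illustration:

```python
import re

# field up to the first '~', an integer distance, and an optional
# '^boost' suffix defaulting to 1.0 -- e.g. sf=title~5^2
SF_PATTERN = re.compile(r"^(?P<field>[^~^]+)~(?P<distance>\d+)(?:\^(?P<boost>[\d.]+))?$")

def parse_sf(value):
    """Parse a FIELD~DISTANCE^BOOST string into (field, distance, boost)."""
    m = SF_PATTERN.match(value)
    if not m:
        raise ValueError(f"bad sf value: {value!r}")
    return (m.group("field"),
            int(m.group("distance")),
            float(m.group("boost") or 1.0))

print(parse_sf("title~5^2"))  # ('title', 5, 2.0)
```

A parser like this would feed a SpanFirstQuery-style clause: match within DISTANCE positions of the field's start, weighted by BOOST.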
[jira] [Updated] (LUCENE-4470) Expose SpanFirst in eDismax
[ https://issues.apache.org/jira/browse/LUCENE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-4470: -- Attachment: SOLR-4470-trunk-1.patch
[jira] [Closed] (LUCENE-4470) Expose SpanFirst in eDismax
[ https://issues.apache.org/jira/browse/LUCENE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma closed LUCENE-4470. - Resolution: Invalid Fix Version/s: (was: 5.0) (was: 4.1) Accidentally added to Lucene. I'll close this and open it in the Solr project. Sorry.
[jira] [Created] (SOLR-3925) Expose SpanFirst in eDismax
Markus Jelsma created SOLR-3925: --- Summary: Expose SpanFirst in eDismax Key: SOLR-3925 URL: https://issues.apache.org/jira/browse/SOLR-3925 Project: Solr Issue Type: Improvement Components: query parsers Affects Versions: 4.0-BETA Environment: solr-spec 5.0.0.2012.10.09.19.29.59 solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59 Reporter: Markus Jelsma Fix For: 4.1, 5.0 Attachments: SOLR-3925-trunk-1.patch Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. This issue adds the SF-parameter (SpanFirst) and takes a FIELD~DISTANCE^BOOST formatted value. For example, sf=title~5^2 will give a boost of 2 if one of the normal clauses, originally generated for automatic phrase queries, is located within five positions from the field's start. Unit test is included and all tests pass.
[jira] [Updated] (SOLR-3925) Expose SpanFirst in eDismax
[ https://issues.apache.org/jira/browse/SOLR-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated SOLR-3925: Attachment: SOLR-3925-trunk-1.patch
Re: VOTE: release 4.0 (RC2)
+1 On 10 October 2012 05:23, Yonik Seeley yo...@lucidworks.com wrote: +1 -Yonik http://lucidworks.com On Sat, Oct 6, 2012 at 4:10 AM, Robert Muir rcm...@gmail.com wrote: artifacts here: http://s.apache.org/lusolr40rc2 Thanks for the good inspection of rc#1 and the bugs you found, which turned up test bugs and other bugs! I am happy this was all discovered and sorted out before release. The vote stays open until Wednesday; the weekend is just extra time for evaluating the RC. -- Kind regards, Martijn van Groningen
[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9490 - Failure!
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/9490/ 1 tests failed. REGRESSION: org.apache.lucene.util.TestPagedBytes.testDataInputOutput Error Message: n must be positive Stack Trace: java.lang.IllegalArgumentException: n must be positive at __randomizedtesting.SeedInfo.seed([E2AD98D7834D0534:B9E34FD3E8763D47]:0) at java.util.Random.nextInt(Random.java:300) at com.carrotsearch.randomizedtesting.AssertingRandom.nextInt(AssertingRandom.java:81) at org.apache.lucene.util.TestPagedBytes.testDataInputOutput(TestPagedBytes.java:68) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:722) Build Log: [...truncated 484 lines...] [junit4:junit4] Suite: org.apache.lucene.util.TestPagedBytes [junit4:junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestPagedBytes -Dtests.method=testDataInputOutput -Dtests.seed=E2AD98D7834D0534 -Dtests.slow=true -Dtests.locale=fr_CA -Dtests.timezone=Pacific/Samoa -Dtests.file.encoding=ISO-8859-1 [junit4:junit4] ERROR 0.05s J5 | TestPagedBytes.testDataInputOutput [junit4:junit4] Throwable #1: java.lang.IllegalArgumentException: n must be positive [junit4:junit4]
Re: [JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 9490 - Failure!
I committed a fix ... test bug. Mike McCandless http://blog.mikemccandless.com On Wed, Oct 10, 2012 at 8:35 AM, buil...@flonkings.com wrote: Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/9490/ 1 tests failed. REGRESSION: org.apache.lucene.util.TestPagedBytes.testDataInputOutput Error Message: n must be positive
[jira] [Created] (SOLR-3926) solrj should support better way of finding active sorts
Eirik Lygre created SOLR-3926: - Summary: solrj should support better way of finding active sorts Key: SOLR-3926 URL: https://issues.apache.org/jira/browse/SOLR-3926 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 4.0-BETA Reporter: Eirik Lygre Priority: Minor The Solrj api uses orthogonal concepts for setting/removing and getting sort information. Setting/removing uses a combination of (name, order), while the getters return a String of the form "name order":
{code}
public SolrQuery setSortField(String field, ORDER order);
public SolrQuery addSortField(String field, ORDER order);
public SolrQuery removeSortField(String field, ORDER order);
public String[] getSortFields();
public String getSortField();
{code}
If you want to use the current sort information to present a list of active sorts, with the possibility to remove them, you need to manually parse the string(s) returned from getSortFields() to recreate the information required by removeSortField(). Not difficult, but not convenient either :-) Therefore this suggestion: add a new method {{public Map<String, ORDER> getSortFieldMap();}} which returns an ordered map of active sort fields.
An example implementation is shown below (here as a utility method living outside SolrQuery; the rewrite should be trivial):
{code}
public Map<String, ORDER> getSortFieldMap(SolrQuery query) {
  String[] actualSortFields = query.getSortFields();
  if (actualSortFields == null || actualSortFields.length == 0)
    return Collections.emptyMap();
  Map<String, ORDER> sortFieldMap = new LinkedHashMap<String, ORDER>();
  for (String sortField : actualSortFields) {
    String[] fieldSpec = sortField.split(" ");
    sortFieldMap.put(fieldSpec[0], ORDER.valueOf(fieldSpec[1]));
  }
  return sortFieldMap;
}
{code}
For what it's worth, this is possible client code:
{code}
System.out.println("Active sorts");
Map<String, ORDER> fieldMap = getSortFieldMap(query);
for (String field : fieldMap.keySet()) {
  System.out.println("- " + field + "; dir=" + fieldMap.get(field));
}
{code}
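The proposed parsing logic can be exercised standalone. The sketch below substitutes a plain String[] for SolrQuery.getSortFields() and a local ORDER enum for SolrJ's; both are stand-ins, since the real types live in SolrJ.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class SortFieldMapDemo {
    // Stand-in for SolrQuery.ORDER (asc/desc); the real enum lives in SolrJ.
    enum ORDER { asc, desc }

    // Parses "field order" strings (the format getSortFields() returns) into
    // an ordered map; LinkedHashMap preserves the sort priority.
    static Map<String, ORDER> getSortFieldMap(String[] actualSortFields) {
        if (actualSortFields == null || actualSortFields.length == 0)
            return Collections.emptyMap();
        Map<String, ORDER> sortFieldMap = new LinkedHashMap<String, ORDER>();
        for (String sortField : actualSortFields) {
            String[] fieldSpec = sortField.split(" ");
            sortFieldMap.put(fieldSpec[0], ORDER.valueOf(fieldSpec[1]));
        }
        return sortFieldMap;
    }

    public static void main(String[] args) {
        Map<String, ORDER> m = getSortFieldMap(new String[] {"price asc", "score desc"});
        System.out.println(m); // {price=asc, score=desc}
    }
}
```

The map's keys can then be fed straight back into removeSortField(), which is exactly the round trip the issue asks for.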
Re: Q: dataset with full text + judgements for IR eval
TREC? On Oct 4, 2012, at 10:49 AM, Otis Gospodnetic wrote: Hi, I checked the Wiki, but couldn't find any references to dataset that have: * full document content * queries with relevance judgements Are there any such datasets available? Thanks, Otis Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm Grant Ingersoll http://www.lucidworks.com
[jira] [Commented] (SOLR-3923) eDismax: complex fielded query with parens is not recognized
[ https://issues.apache.org/jira/browse/SOLR-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473194#comment-13473194 ] Jack Krupansky commented on SOLR-3923: -- It looks like the pf phrase boosting is not ignoring fielded terms. eDismax: complex fielded query with parens is not recognized Key: SOLR-3923 URL: https://issues.apache.org/jira/browse/SOLR-3923 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 3.5 Reporter: Shawn Heisey Fix For: 3.6.2 This is similar to SOLR-3377. The fix for that bug appears to have fixed this problem for 4.x. I can see the effects of SOLR-3377 when I test a query similar to the one below on the Solr 3.6 example, which is expected because SOLR-3377 was found in 3.6 but only fixed in 4.0. This bug is a little different, and exists in 3.5.0 for sure, possibly earlier. The first part of the parsed query looks right, but then something weird happens and it gets interpreted as a very strange phrase query. Query URL sent to the Solr 3.5.0 example:
{code}
http://localhost:8983/solr/collection1/select?q=%28%28cat%3Astring1%29+%28Kitchen+Sink%29%29&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text^2.0
{code}
parsedquery_toString:
{code}
+((cat:string1 ((text:kitchen) (text:sink)))~2) (text:"cat:string1 kitchen sink"^2.0)
{code}
Adding some spaces before and after cat:string1 fixes it:
{code}
http://localhost:8983/solr/collection1/select?q=%28%28%20cat%3Astring1%20%29+%28Kitchen+Sink%29%29&wt=xml&debugQuery=true&defType=edismax&qf=text&pf=text^2.0
{code}
{code}
+((cat:string1 ((text:kitchen) (text:sink)))~2) (text:"kitchen sink"^2.0)
{code}
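The difference between the failing and working queries is easier to see once the q parameter is percent-decoded. A stdlib-only sketch; the encoded strings are copied from the report above, and URLDecoder's '+'-as-space handling matches how a servlet container reads query parameters:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class EdismaxQueryDemo {
    // Decodes a percent-encoded q parameter the way a servlet container would:
    // '+' becomes a space, %28/%29/%3A become '(', ')' and ':'.
    static String qParam(String encoded) throws UnsupportedEncodingException {
        return URLDecoder.decode(encoded, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Failing and working q parameters from the report above
        System.out.println(qParam("%28%28cat%3Astring1%29+%28Kitchen+Sink%29%29"));
        // ((cat:string1) (Kitchen Sink))
        System.out.println(qParam("%28%28%20cat%3Astring1%20%29+%28Kitchen+Sink%29%29"));
        // (( cat:string1 ) (Kitchen Sink))
    }
}
```

Decoded, the only difference is the whitespace around the fielded clause, which is what flips the 3.5.0 parser between the broken phrase interpretation and the correct one.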
[jira] [Created] (LUCENE-4471) Test4GBStoredFields
Adrien Grand created LUCENE-4471: Summary: Test4GBStoredFields Key: LUCENE-4471 URL: https://issues.apache.org/jira/browse/LUCENE-4471 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Priority: Minor Fix For: 4.1, 5.0 Yesterday I fixed a bug (integer overflow) that only happens when a fields data (.fdt) file grows larger than 4GB. We should have a test for that.
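The failure mode behind this class of bug is plain Java int truncation: once a file pointer into the .fdt file exceeds 2^32, casting it to int silently drops the high bits. A minimal sketch of just the arithmetic, unrelated to Lucene's actual code paths:

```java
public class IntOverflowDemo {
    // Truncating a long file pointer to int, as an overflow bug would.
    static int truncate(long filePointer) {
        return (int) filePointer;
    }

    public static void main(String[] args) {
        long fourGB = 1L << 32;               // 4 GiB boundary
        long filePointer = fourGB + 42;       // a position just past it
        System.out.println(truncate(filePointer)); // 42 -- the high 32 bits are silently lost
    }
}
```

This is why the bug only shows up with files over 4GB, and why a test has to actually cross that boundary to catch it.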
[jira] [Created] (LUCENE-4472) Add setting that prevents merging on updateDocument
Simon Willnauer created LUCENE-4472: --- Summary: Add setting that prevents merging on updateDocument Key: LUCENE-4472 URL: https://issues.apache.org/jira/browse/LUCENE-4472 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.1, 5.0 Attachments: LUCENE-4472.patch Currently we always call maybeMerge if a segment was flushed after updateDocument. Some apps, and in particular ElasticSearch, use some hacky workarounds to disable that, i.e. for merge throttling. It should be easier to enable this kind of behavior.
[jira] [Updated] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-4472: Attachment: LUCENE-4472.patch Here is a patch that adds such an option to IWC via live settings.
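A stdlib-only sketch of the idea in the patch: a live, volatile flag consulted on the post-flush path before the implicit maybeMerge. All names here are hypothetical stand-ins, not the real IndexWriterConfig API:

```java
public class FlushMergeGateDemo {
    // Hypothetical stand-in for the proposed live IndexWriterConfig setting.
    private volatile boolean mergeAfterFlush = true;
    private int mergesConsidered = 0;

    void setMergeAfterFlush(boolean v) { mergeAfterFlush = v; }
    int mergesConsidered() { return mergesConsidered; }

    // Models IndexWriter's post-flush path: the implicit merge check only
    // runs while the gate is open; the app can still call maybeMerge itself.
    void onSegmentFlushed() {
        if (mergeAfterFlush) maybeMerge();
    }

    void maybeMerge() { mergesConsidered++; }

    public static void main(String[] args) {
        FlushMergeGateDemo w = new FlushMergeGateDemo();
        w.onSegmentFlushed();          // implicit merge check happens
        w.setMergeAfterFlush(false);   // flipped live, e.g. during bulk indexing
        w.onSegmentFlushed();          // suppressed
        w.maybeMerge();                // explicit, on the app's own schedule (as ES does)
        System.out.println(w.mergesConsidered()); // 2
    }
}
```

Because the flag is volatile, it can be flipped from another thread while indexing is in progress, which is what "live settings" buys over a construction-time option.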
[jira] [Updated] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-4471: Attachment: Test4GBStoredFields.java Test case that finds the bug that I fixed yesterday. The only problem is that it is very slow (~100s with Lucene40, up to 250s with Compressing).
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473245#comment-13473245 ] Dawid Weiss commented on LUCENE-4471: - Make it a @Nightly and not worry? :)
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473249#comment-13473249 ] Robert Muir commented on LUCENE-4471: - +1 for just making it a nightly test.
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473252#comment-13473252 ] Michael McCandless commented on LUCENE-4471: +1
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473255#comment-13473255 ] Michael McCandless commented on LUCENE-4472: Neat :) How does ES customize / throttle its merging? Maybe the setting should mean we never call maybeMerge implicitly? (Ie, neither on close nor NRT reader or any other time), rather than just singling out updateDocument/addDocument? Eg if we add other methods in the future (like field updates), it should also prevent those from doing merges?
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473260#comment-13473260 ] Uwe Schindler commented on LUCENE-4471: --- Does it produce a file of this size?
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473264#comment-13473264 ] selckin commented on LUCENE-4472: - +1, if you have a lot of indexes and they all start merging at the same time it can be quite taxing. I think ES has a dedicated configurable thread pool where, for each index, a maybeMerge() is scheduled on an interval (the size of the thread pool limits the number of concurrent merges).
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473265#comment-13473265 ] Shai Erera commented on LUCENE-4472: Patch looks good. One minor comment -- LiveIWC.setMaybeMergeAfterFlush() contains a redundant 'a' in its javadocs -- ".. after a each segment". bq. Maybe the setting should mean we never call maybeMerge implicitly? Perhaps the issue's description should change to "Add setting to prevent merging on segment flush". Because as I understand the fix, the check is made only after a segment has been flushed, which already covers addDocument/updateDocument as well as future field updates?
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473268#comment-13473268 ] Adrien Grand commented on LUCENE-4471: -- Yes it does (a little larger actually). I didn't find a way to lie to stored fields.
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473270#comment-13473270 ] Robert Muir commented on LUCENE-4472: - Can we consider instead giving MergePolicy the proper context here instead of adding a boolean? This seems more flexible.
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473279#comment-13473279 ] Adrien Grand commented on LUCENE-4471: -- Uwe, is it a problem?
[jira] [Commented] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473300#comment-13473300 ] Uwe Schindler commented on LUCENE-4471: --- Depends on the Jenkins server :-) The nightly tests are running only at Apache, and that one has enough space. The Windows one by SDDS might have disk space problems (a virtual Windows 7 machine with little disk space), but we don't run nightly on it.
[jira] [Created] (LUCENE-4473) BlockPF encodes offsets inefficiently
Robert Muir created LUCENE-4473: --- Summary: BlockPF encodes offsets inefficiently Key: LUCENE-4473 URL: https://issues.apache.org/jira/browse/LUCENE-4473 Project: Lucene - Core Issue Type: Sub-task Components: core/codecs Reporter: Robert Muir BlockPF encodes offsets inefficiently when writing a vint block. It should write these like Lucene40 does. Here is geonames (all 19 fields as text fields with offsets): trunk _68_Block_0.pos: 178700442 patch _68_Block_0.pos: 155929641
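The gain comes from delta-coding: absolute offsets grow without bound and soon cost two or more bytes per vInt, while gaps between tokens and token lengths stay small. The sketch below compares the two encodings with a toy vInt writer; the exact delta scheme shown is illustrative, not necessarily the one in the patch:

```java
import java.io.ByteArrayOutputStream;

public class OffsetVIntDemo {
    // Minimal vInt writer: 7 payload bits per byte, high bit means "more
    // bytes follow" -- the same idea as Lucene's DataOutput.writeVInt.
    static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
    }

    // Returns {absoluteBytes, deltaBytes} for a list of (start, end) offset pairs.
    static int[] encodedSizes(int[][] offsets) {
        ByteArrayOutputStream absolute = new ByteArrayOutputStream();
        ByteArrayOutputStream delta = new ByteArrayOutputStream();
        int lastEnd = 0;
        for (int[] o : offsets) {
            writeVInt(absolute, o[0]);        // absolute startOffset: keeps growing
            writeVInt(absolute, o[1]);        // absolute endOffset
            writeVInt(delta, o[0] - lastEnd); // gap since previous token: small
            writeVInt(delta, o[1] - o[0]);    // token length: small
            lastEnd = o[1];
        }
        return new int[] { absolute.size(), delta.size() };
    }

    public static void main(String[] args) {
        // Offsets for consecutive tokens deep inside a document
        int[][] offsets = { {1000, 1005}, {1006, 1012}, {1013, 1017}, {1018, 1029} };
        int[] sizes = encodedSizes(offsets);
        System.out.println(sizes[0] + " vs " + sizes[1]); // 16 vs 9
    }
}
```

The same effect at index scale is what the geonames numbers above show: ~178MB of .pos data on trunk against ~156MB with the patch.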
[jira] [Updated] (LUCENE-4473) BlockPF encodes offsets inefficiently
[ https://issues.apache.org/jira/browse/LUCENE-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4473: Attachment: LUCENE-4473.patch Fix For: 4.1 Patch. We already bumped Block's version in 4.1 to fix other bugs, so we don't need to do it again.
[jira] [Commented] (LUCENE-4467) SegmentReader.loadDeletedDocs FileNotFoundException load _hko_7.del - corrupted index
[ https://issues.apache.org/jira/browse/LUCENE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473330#comment-13473330 ] B.Nicolotti commented on LUCENE-4467: - We deleted the index folder and started from an empty index; the system worked without problems all day indexing the xml produced today, until we had another problem, see below. We have 2 web applications in one Tomcat java process that write the index. The two web applications use the same version of Lucene, 3.6.0. May this be a problem? Shouldn't each web application obtain the write.lock before writing to the index? The index is small, so I can attach it. Many thanks. Best regards. Wed Oct 10 17:34:05 CEST 2012:com.siap.WebServices.Utility.UtiIndexerLucene caught an exception: 16801917 java.io.FileNotFoundException e.toString():java.io.FileNotFoundException: _42.fdt, e.getMessage():_42.fdt org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:284) org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:303) org.apache.lucene.index.TieredMergePolicy.size(TieredMergePolicy.java:635) org.apache.lucene.index.TieredMergePolicy.useCompoundFile(TieredMergePolicy.java:613) org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:593) org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3580) org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3545) org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1852) org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1812) org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1776) com.siap.WebServices.Utility.UtiIndexerLucene.indexFile(UtiIndexerLucene.java:272) com.siap.WebServices.Utility.UtiLogPrintingThread.run(UtiLogPrintingThread.java:146) Somma controllo versione: Server info:Apache Tomcat/5.5.23@127.0.0.1(tomcatdemo) SegmentReader.loadDeletedDocs FileNotFoundException load _hko_7.del - corrupted index Key: LUCENE-4467 URL: 
https://issues.apache.org/jira/browse/LUCENE-4467 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.6 Environment: Currently using: java -version java version 1.5.0_13 Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05) Java HotSpot(TM) Client VM (build 1.5.0_13-b05, mixed mode, sharing) Tomcat 5.5 lucene 3.6.0 Reporter: B.Nicolotti Attachments: index.zip We're using lucene to index XML. We've had it in test on a server for some weeks with no problem, but today we've got the error below and the index seems no longer usable. Could you please tell us 1) is there a way to recover the index? 2) is there a way to avoid this error? I can supply the index if needed many thanks Tue Oct 09 17:41:02 CEST 2012:com.siap.WebServices.Utility.UtiIndexerLucene caught an exception: 32225010 java.io.FileNotFoundException e.toString():java.io.FileNotFoundException: /usr/local/WS_DynPkg/logs/index/_hko_7.del (No such file or directory), e.getMessage():/usr/local/WS_DynPkg/logs/index/_hko_7.del (No such file or directory) java.io.RandomAccessFile.open(Native Method) java.io.RandomAccessFile.init(RandomAccessFile.java:212) org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.init(SimpleFSDirectory.java:71) org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.init(SimpleFSDirectory.java:98) org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.init(NIOFSDirectory.java:92) org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:79) org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:345) org.apache.lucene.util.BitVector.init(BitVector.java:266) org.apache.lucene.index.SegmentReader.loadDeletedDocs(SegmentReader.java:160) org.apache.lucene.index.SegmentReader.get(SegmentReader.java:120) org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:696) org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:671) 
org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:244) org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3608) org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3545) org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1852) org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1812) org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1776) com.siap.WebServices.Utility.UtiIndexerLucene.delete(UtiIndexerLucene.java:143) com.siap.WebServices.Utility.UtiIndexerLucene.indexFile(UtiIndexerLucene.java:221) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
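The single-writer question raised above can be illustrated outside Lucene: the write.lock file plays the same role as an OS-level file lock, which only one writer may hold at a time. The sketch below uses plain java.nio rather than Lucene's actual locking code, and the class and lock-file names are made up for illustration:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteLockDemo {

    /** Tries to lock the same lock file from two writers; returns true only if the second also succeeds. */
    public static boolean secondWriterCanLock() {
        try {
            Path lockFile = Files.createTempFile("index-write", ".lock");
            try (FileChannel first = FileChannel.open(lockFile, StandardOpenOption.WRITE);
                 FileChannel second = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
                FileLock held = first.tryLock(); // first writer owns the lock now
                try {
                    FileLock other = second.tryLock(); // second writer's attempt
                    return other != null;
                } catch (OverlappingFileLockException e) {
                    return false; // within one JVM the overlapping request is rejected outright
                } finally {
                    if (held != null) held.release();
                }
            } finally {
                Files.deleteIfExists(lockFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second writer acquired lock: " + secondWriterCanLock());
    }
}
```

This mirrors why two web applications in one Tomcat process cannot each hold the index's write lock simultaneously; sharing one IndexWriter (or serializing access to it) avoids the conflict.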
[jira] [Updated] (LUCENE-4467) SegmentReader.loadDeletedDocs FileNotFoundException loading _hko_7.del - corrupted index
[ https://issues.apache.org/jira/browse/LUCENE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] B.Nicolotti updated LUCENE-4467: Attachment: index.zip
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-4474) CloseableThreadLocal maybePurge could be too expensive
Robert Muir created LUCENE-4474: --- Summary: CloseableThreadLocal maybePurge could be too expensive Key: LUCENE-4474 URL: https://issues.apache.org/jira/browse/LUCENE-4474 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir
Was doing some tests with the geonames database (19 fields, just using StandardAnalyzer), and noticed this in the profiler. It could be a ghost, but we should investigate anyway. It seems ridiculous for a situation like mine:
* indexing with one thread
* every 40 Analyzer.tokenStream() calls [basically every other doc], this thing is called
* it gets iterators over the map, checks threads, this and that. But of course there is only one thread!
Maybe it's a good idea if it checks size() first or something; at least don't do this stuff if size() == 1, as I bet a lot of people index with a single thread. Or maybe all this stuff is really cheap and it's just a ghost.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4474) CloseableThreadLocal maybePurge could be too expensive
[ https://issues.apache.org/jira/browse/LUCENE-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473336#comment-13473336 ] Uwe Schindler commented on LUCENE-4474: --- We should do this check in all cases! If there is only <= 1 entry in the map, we don't need to do anything (because this is the live thread!). CloseableThreadLocal maybePurge could be too expensive -- Key: LUCENE-4474 URL: https://issues.apache.org/jira/browse/LUCENE-4474 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir Was doing some tests with the geonames database (19 fields, just using StandardAnalyzer), and noticed this in the profiler. It could be a ghost, but we should investigate anyway. It seems ridiculous for a situation like mine: * indexing with one thread * every 40 Analyzer.tokenStream() calls [basically every other doc], this thing is called * it gets iterators over the map, checks threads, this and that. But of course there is only one thread! Maybe it's a good idea if it checks size() first or something; at least don't do this stuff if size() == 1, as I bet a lot of people index with a single thread. Or maybe all this stuff is really cheap and it's just a ghost. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
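The proposed short-circuit can be sketched in plain Java. The names (maybePurge, the per-thread map layout) are stand-ins, not CloseableThreadLocal's actual internals: when the map holds at most one entry, that entry must belong to the live calling thread, so the purge scan can be skipped.

```java
/** Hypothetical sketch of the size() short-circuit proposed for maybePurge. */
class PurgeSketch {
    static int scans = 0; // counts how often the expensive scan actually runs

    static void maybePurge(java.util.Map<Thread, Object> perThreadValues) {
        if (perThreadValues.size() <= 1) {
            return; // single-threaded indexing: the one entry is the live caller
        }
        scans++; // expensive path: iterate the map and drop entries for dead threads
        perThreadValues.keySet().removeIf(t -> !t.isAlive());
    }
}
```

With one thread indexing, the scan never runs; only once a second (possibly dead) thread's entry appears does the iteration happen.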
[jira] [Commented] (SOLR-3922) AbstractSolrTestCase duplicates a lot from SolrTestCaseJ4 and is one of the few lines of Solr test classes that do not inherit from SolrTestCaseJ4.
[ https://issues.apache.org/jira/browse/SOLR-3922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473343#comment-13473343 ] Mark Miller commented on SOLR-3922: --- Moving these tests over to SolrTestCaseJ4 should also bring some speed gains since the SolrTestCaseJ4 tests generally use the same CoreContainer/SolrCore across test methods. AbstractSolrTestCase duplicates a lot from SolrTestCaseJ4 and is one of the few lines of Solr test classes that do not inherit from SolrTestCaseJ4. --- Key: SOLR-3922 URL: https://issues.apache.org/jira/browse/SOLR-3922 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.1, 5.0 I plan on fixing both of these issues as part of my work on SOLR-3911. Most of AbstractSolrTestCase can go away. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4471) Test4GBStoredFields
[ https://issues.apache.org/jira/browse/LUCENE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-4471. -- Resolution: Fixed Assignee: Adrien Grand Committed (r1396656 on trunk and r1396671 on branch 4.x). Test4GBStoredFields --- Key: LUCENE-4471 URL: https://issues.apache.org/jira/browse/LUCENE-4471 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.1, 5.0 Attachments: Test4GBStoredFields.java Yesterday I fixed a bug (integer overflow) that only happens when a field's data (.fdt) file grows larger than 4GB. We should have a test for that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
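The class of bug this test guards against can be reproduced in a few lines: accumulating file sizes in a 32-bit int silently wraps once the total passes 2^31-1 bytes, while a long accumulator does not. The method names below are illustrative, not Lucene's actual code:

```java
/** Sketch of an integer-overflow bug of the kind LUCENE-4471's test is meant to catch. */
class OverflowSketch {
    static int sumAsInt(long[] fileSizes) {
        int total = 0;
        for (long s : fileSizes) total += (int) s; // buggy: wraps past 2 GB
        return total;
    }

    static long sumAsLong(long[] fileSizes) {
        long total = 0;
        for (long s : fileSizes) total += s; // correct: 64-bit accumulator
        return total;
    }
}
```

Two files of 3 GB and 2 GB sum to 5 GB with the long accumulator, but the int version wraps to a wrong (here positive but far smaller) value.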
[jira] [Created] (SOLR-3927) Ability to use CompressingStoredFieldsFormat
Adrien Grand created SOLR-3927: -- Summary: Ability to use CompressingStoredFieldsFormat Key: SOLR-3927 URL: https://issues.apache.org/jira/browse/SOLR-3927 Project: Solr Issue Type: Task Reporter: Adrien Grand Priority: Trivial It would be nice to let Solr users use {{CompressingStoredFieldsFormat}} to compress their stored fields (with warnings given that this feature is experimental and that we don't guarantee backwards compat for it). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3927) Ability to use CompressingStoredFieldsFormat
[ https://issues.apache.org/jira/browse/SOLR-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated SOLR-3927: --- Assignee: Adrien Grand Ability to use CompressingStoredFieldsFormat Key: SOLR-3927 URL: https://issues.apache.org/jira/browse/SOLR-3927 Project: Solr Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial It would be nice to let Solr users use {{CompressingStoredFieldsFormat}} to compress their stored fields (with warnings given that this feature is experimental and that we don't guarantee backwards compat for it). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3911) Make Directory and DirectoryFactory first class so that the majority of Solr's features work with any custom implementations.
[ https://issues.apache.org/jira/browse/SOLR-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-3911: -- Attachment: SOLR-3911.patch This issue has been very satisfying. All tests passing. I still force an fs directory for 2 solrcloud tests due to the recovery issue mentioned above. We can probably fix that in another issue. Make Directory and DirectoryFactory first class so that the majority of Solr's features work with any custom implementations. - Key: SOLR-3911 URL: https://issues.apache.org/jira/browse/SOLR-3911 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.1, 5.0 Attachments: SOLR-3911.patch, SOLR-3911.patch, SOLR-3911.patch The biggest issue is that many parts of Solr rely on a local file system based Directory implementation - most notably, replication. This should all be changed to use the Directory and DirectoryFactory abstractions. Other parts of the code that count on the local file system for making paths and getting file sizes should also be changed to use Directory and/or DirectoryFactory. Original title: Replication should work with any Directory impl, not just local filesystem based Directories. I've wanted to do this for a long time - there is no reason replication should not support any directory impl. This will let us use the mockdir for replication tests rather than having to force an FSDir and lose all the extra test checks and simulations. This will improve our testing around replication a lot, and allow custom Directory impls to be used on multi node Solr. Expanded scope - full first class support for DirectoryFactory and Directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3928) A PropertiesEntityProcessor for DIH
Tricia Jenkins created SOLR-3928: Summary: A PropertiesEntityProcessor for DIH Key: SOLR-3928 URL: https://issues.apache.org/jira/browse/SOLR-3928 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Tricia Jenkins Priority: Minor Fix For: 4.0 Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the [properties|http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html] file key/value pairs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3928) A PropertiesEntityProcessor for DIH
[ https://issues.apache.org/jira/browse/SOLR-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tricia Jenkins updated SOLR-3928: - Attachment: SOLR-3928.patch PropertiesEntityProcessor with test. It's in the dataimporthandler-extras directory, but dataimporthandler might make more sense. A PropertiesEntityProcessor for DIH --- Key: SOLR-3928 URL: https://issues.apache.org/jira/browse/SOLR-3928 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Tricia Jenkins Priority: Minor Labels: dih, patch, test Fix For: 4.0 Attachments: SOLR-3928.patch Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the [properties|http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html] file key/value pairs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: VOTE: release 4.0 (RC2)
+1 smoketest succeeded on macos 10.7.4. Michael On 10/6/12 1:10 AM, Robert Muir wrote: artifacts here: http://s.apache.org/lusolr40rc2 Thanks for the good inspection of rc#1 and finding bugs, which found test bugs and other bugs! I am happy this was all discovered and sorted out before release. vote stays open until wednesday, the weekend is just extra time for evaluating the RC. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3928) A PropertiesEntityProcessor for DIH
[ https://issues.apache.org/jira/browse/SOLR-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tricia Jenkins updated SOLR-3928: - Description: Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the [properties|http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html] file key/value pairs. (was: Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the a href=http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html;properties/a file key/value pairs.) A PropertiesEntityProcessor for DIH --- Key: SOLR-3928 URL: https://issues.apache.org/jira/browse/SOLR-3928 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Tricia Jenkins Priority: Minor Labels: dih, patch, test Fix For: 4.0 Attachments: SOLR-3928.patch Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the [properties|http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html] file key/value pairs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3928) A PropertiesEntityProcessor for DIH
[ https://issues.apache.org/jira/browse/SOLR-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tricia Jenkins updated SOLR-3928: - Fix Version/s: (was: 4.0) 4.1 A PropertiesEntityProcessor for DIH --- Key: SOLR-3928 URL: https://issues.apache.org/jira/browse/SOLR-3928 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Tricia Jenkins Priority: Minor Labels: dih, patch, test Fix For: 4.1 Attachments: SOLR-3928.patch Add a simple PropertiesEntityProcessor which can read from any DataSourceReader and output rows corresponding to the [properties|http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html] file key/value pairs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4473) BlockPF encodes offsets inefficiently
[ https://issues.apache.org/jira/browse/LUCENE-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473554#comment-13473554 ] Michael McCandless commented on LUCENE-4473: +1 BlockPF encodes offsets inefficiently - Key: LUCENE-4473 URL: https://issues.apache.org/jira/browse/LUCENE-4473 Project: Lucene - Core Issue Type: Sub-task Components: core/codecs Reporter: Robert Muir Fix For: 4.1 Attachments: LUCENE-4473.patch when writing a vint block. It should write these like Lucene40 does. Here is geonames (all 19 fields as textfields with offsets): trunk _68_Block_0.pos: 178700442 patch _68_Block_0.pos: 155929641 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473617#comment-13473617 ] Shay Banon commented on LUCENE-4472: Agree with Robert on the additional context flag; that would make things most flexible. A flag on IW makes things simpler from the user perspective though, because then there is no need to customize the built-in merge policies. Add setting that prevents merging on updateDocument --- Key: LUCENE-4472 URL: https://issues.apache.org/jira/browse/LUCENE-4472 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.1, 5.0 Attachments: LUCENE-4472.patch Currently we always call maybeMerge if a segment was flushed after updateDocument. Some apps and in particular ElasticSearch uses some hacky workarounds to disable that ie for merge throttling. It should be easier to enable this kind of behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473624#comment-13473624 ] Robert Muir commented on LUCENE-4472: - I don't think you need to customize anything built in. Just delegate and forward findMerges()? The problem is this doesn't really have the necessary context today: I think we should fix that. But I'd like policies around merging to stay in ... MergePolicy :) Otherwise IWC could easily get cluttered with conflicting options which makes it complex. Add setting that prevents merging on updateDocument --- Key: LUCENE-4472 URL: https://issues.apache.org/jira/browse/LUCENE-4472 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.1, 5.0 Attachments: LUCENE-4472.patch Currently we always call maybeMerge if a segment was flushed after updateDocument. Some apps and in particular ElasticSearch uses some hacky workarounds to disable that ie for merge throttling. It should be easier to enable this kind of behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4472) Add setting that prevents merging on updateDocument
[ https://issues.apache.org/jira/browse/LUCENE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473626#comment-13473626 ] Simon Willnauer commented on LUCENE-4472: - The MergePolicy is tricky here: since we clone the MP in IW, you need to actually pull and cast the MP from IW to change the setting if you want to do this in real time. Maybe we can add something like this to MP so we can change merge settings in real time too. Otherwise you need to build a special MP, but we can certainly do that. Add setting that prevents merging on updateDocument --- Key: LUCENE-4472 URL: https://issues.apache.org/jira/browse/LUCENE-4472 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Fix For: 4.1, 5.0 Attachments: LUCENE-4472.patch Currently we always call maybeMerge if a segment was flushed after updateDocument. Some apps and in particular ElasticSearch uses some hacky workarounds to disable that ie for merge throttling. It should be easier to enable this kind of behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
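The "just delegate and forward findMerges()" idea discussed in this thread can be sketched with a toy stand-in for the MergePolicy contract. MergeSource, NoMergeToggle, and the String-based segment lists are invented for illustration; Lucene's real MergePolicy API differs. The wrapper returns no merges while the flag is off and forwards to the wrapped policy otherwise:

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the part of MergePolicy that picks merges. */
interface MergeSource {
    List<String> findMerges(List<String> segments);
}

/** Delegating policy that can veto merges at runtime, e.g. during bulk indexing. */
class NoMergeToggle implements MergeSource {
    private final MergeSource delegate;
    volatile boolean mergesEnabled = true; // flipped from another thread in "real time"

    NoMergeToggle(MergeSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public List<String> findMerges(List<String> segments) {
        if (!mergesEnabled) {
            return Collections.emptyList(); // suppress merges without touching the delegate
        }
        return delegate.findMerges(segments);
    }
}
```

This keeps merge behavior inside the policy, as Robert suggests, while still letting an application flip the switch at runtime, which is the part Simon notes is awkward once IW has cloned the policy.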
Re: VOTE: release 4.0 (RC2)
if buschmi votes we are good :D simon On Wed, Oct 10, 2012 at 9:30 PM, Michael Busch busch...@gmail.com wrote: +1 smoketest succeeded on macos 10.7.4. Michael On 10/6/12 1:10 AM, Robert Muir wrote: artifacts here: http://s.apache.org/lusolr40rc2 Thanks for the good inspection of rc#1 and finding bugs, which found test bugs and other bugs! I am happy this was all discovered and sorted out before release. vote stays open until wednesday, the weekend is just extra time for evaluating the RC. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3929) support configuring IndexWriter max thread count in solrconfig
Patrick Hunt created SOLR-3929: -- Summary: support configuring IndexWriter max thread count in solrconfig Key: SOLR-3929 URL: https://issues.apache.org/jira/browse/SOLR-3929 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 4.1, 5.0 Lucene 3.1.0 added the ability to configure the IndexWriter's previously fixed internal thread limit by calling setMaxThreadStates. This parameter should be exposed through Solr configuration. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3929) support configuring IndexWriter max thread count in solrconfig
[ https://issues.apache.org/jira/browse/SOLR-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated SOLR-3929: --- Attachment: SOLR-3929.patch Added configuration parameter as indexConfig/maxIndexingThreads. Also added a test and example solrconfig.xml. support configuring IndexWriter max thread count in solrconfig -- Key: SOLR-3929 URL: https://issues.apache.org/jira/browse/SOLR-3929 Project: Solr Issue Type: New Feature Affects Versions: 3.1 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 4.1, 5.0 Attachments: SOLR-3929.patch Lucene 3.1.0 added the ability to configure the IndexWriter's previously fixed internal thread limit by calling setMaxThreadStates. This parameter should be exposed through Solr configuration. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
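Based on the comment's description of the patch (a parameter at indexConfig/maxIndexingThreads), the setting would presumably be declared along these lines in solrconfig.xml. The element name comes from the patch description above; the value shown is an arbitrary example, so treat this as a sketch rather than the committed syntax:

```xml
<indexConfig>
  <!-- cap IndexWriter's internal indexing thread states
       (maps to IndexWriterConfig.setMaxThreadStates) -->
  <maxIndexingThreads>8</maxIndexingThreads>
</indexConfig>
```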
[jira] [Resolved] (LUCENE-4473) BlockPF encodes offsets inefficiently
[ https://issues.apache.org/jira/browse/LUCENE-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-4473. - Resolution: Fixed Fix Version/s: 5.0 BlockPF encodes offsets inefficiently - Key: LUCENE-4473 URL: https://issues.apache.org/jira/browse/LUCENE-4473 Project: Lucene - Core Issue Type: Sub-task Components: core/codecs Reporter: Robert Muir Fix For: 4.1, 5.0 Attachments: LUCENE-4473.patch when writing a vint block. It should write these like Lucene40 does. Here is geonames (all 19 fields as textfields with offsets): trunk _68_Block_0.pos: 178700442 patch _68_Block_0.pos: 155929641 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2058) Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473765#comment-13473765 ] Ron Mayer commented on SOLR-2058: - I just tried them both (the committed one and my original patch); and at least they both produce much better relevancy on my test data than I was able to get without the patch. However, I agree with you that the change was probably unintentional and seems different from the way I think normal dismax queries work. TL;DR: I'm not sure. Anyone else care to either test and compare them, or just look at the code and see which is more reasonable? Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax Key: SOLR-2058 URL: https://issues.apache.org/jira/browse/SOLR-2058 Project: Solr Issue Type: Improvement Components: query parsers Environment: n/a Reporter: Ron Mayer Assignee: James Dyer Priority: Minor Fix For: 4.0-ALPHA Attachments: edismax_pf_with_slop_v2.1.patch, edismax_pf_with_slop_v2.patch, pf2_with_slop.patch, SOLR-2058-and-3351-not-finished.patch, SOLR-2058.patch http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3E {quote} From Ron Mayer r...@0ape.com ... my results might be even better if I had a couple different pf2s with different ps's at the same time. In particular: one with ps=0 to put a high boost on ones that have the right ordering of words. For example, ensuring that [the query]: red hat black jacket boosts only documents with red hats and not black hats. And another pf2 with a more modest boost with ps=5 or so to handle the query above, also boosting docs with red baseball hat. {quote} [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3E] {quote} From Yonik Seeley yo...@lucidimagination.com Perhaps fold it into the pf/pf2 syntax? pf=text^2 // current syntax...
makes phrases with a boost of 2 pf=text~1^2 // proposed syntax: makes phrases with a slop of 1 and a boost of 2 That actually seems pretty natural given the Lucene query syntax - an actual boosted sloppy phrase query already looks like {{text:foo bar~1^2}} -Yonik {quote} [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3E] {quote} From Chris Hostetter hossman_luc...@fucit.org Big +1 to this idea ... the existing ps param can stick around as the default for any field that doesn't specify its own slop in the pf/pf2/pf3 fields using the ~ syntax. -Hoss {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
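The proposed field~slop^boost syntax can be sketched with a small parser. This is a hypothetical illustration of how such a pf entry decomposes, not the actual Solr parsing code; the regex and the function name are assumptions:

```python
import re

def parse_pf(spec):
    """Split a pf entry of the form field~slop^boost into its parts.

    Both slop and boost are optional; slop defaults to 0 and boost to 1.0,
    mirroring the proposal in this thread (plain pf=field still parses).
    """
    m = re.fullmatch(r'([^~^]+)(?:~(\d+))?(?:\^([\d.]+))?', spec)
    field, slop, boost = m.group(1), m.group(2), m.group(3)
    return field, int(slop) if slop else 0, float(boost) if boost else 1.0

print(parse_pf("text~1^2"))  # ('text', 1, 2.0) - slop 1, boost 2
print(parse_pf("text^2"))    # ('text', 0, 2.0) - current syntax still works
```

Note how the ~ for slop and ^ for boost line up with Lucene's own query syntax for boosted sloppy phrases, which is what makes the proposal feel natural.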
[jira] [Commented] (SOLR-3221) Make Shard handler threadpool configurable
[ https://issues.apache.org/jira/browse/SOLR-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473791#comment-13473791 ] David Smiley commented on SOLR-3221: Greg, I heard you intend to add a small patch to flip Solr 4's default on this feature? I was picking on Erick since forever and he pawned it off on you. Make Shard handler threadpool configurable -- Key: SOLR-3221 URL: https://issues.apache.org/jira/browse/SOLR-3221 Project: Solr Issue Type: Improvement Affects Versions: 3.6, 4.0-ALPHA Reporter: Greg Bowyer Assignee: Erick Erickson Labels: distributed, http, shard Fix For: 3.6, 4.0-ALPHA Attachments: SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-3x_branch.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch, SOLR-3221-trunk.patch From profiling of monitor contention, as well as observations of the 95th and 99th percentile response times for nodes that perform distributed search (or "aggregator" nodes), it would appear that the HttpShardHandler code currently does a suboptimal job of managing outgoing shard-level requests. Presently the code contained within Lucene 3.5's SearchHandler and Lucene trunk / 3x's ShardHandlerFactory creates arbitrary threads in order to service distributed search requests. This is done to limit the size of the threadpool such that it does not consume resources in deployment configurations that do not use distributed search. This unfortunately has two impacts on the response time if the node coordinating the distribution is under high load. The usage of the MaxConnectionsPerHost configuration option results in aggressive activity on semaphores within HttpCommons; it has been observed that the aggregator can have a response time far greater than that of the searchers. 
The above monitor contention would appear to suggest that in some cases it's possible for liveness issues to occur and for simple queries to be starved of resources simply due to a lack of attention from the viewpoint of context switching, with, as mentioned above, the HttpCommons connections being hotly contended. The fair, queue-based configuration eliminates this, at the cost of throughput. This patch aims to make the threadpool largely configurable, allowing those using Solr to choose the throughput vs. latency balance they desire. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
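For context, the configurable shard-handler pool described in this issue is expressed in solrconfig.xml roughly as follows in Solr 4. This is a sketch based on the later reference documentation; parameter names and defaults may differ from individual patch revisions attached here:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <!-- Per-handler shard handler with an explicitly bounded thread pool -->
  <shardHandlerFactory class="HttpShardHandlerFactory">
    <int name="maxConnectionsPerHost">20</int>
    <int name="corePoolSize">0</int>         <!-- threads kept alive while idle -->
    <int name="maximumPoolSize">64</int>     <!-- cap on outgoing request threads -->
    <int name="maxThreadIdleTime">5</int>    <!-- seconds before an idle thread is reaped -->
    <int name="sizeOfQueue">-1</int>         <!-- -1: direct hand-off; >0: bounded queue -->
    <bool name="fairnessPolicy">false</bool> <!-- true favors latency fairness over throughput -->
  </shardHandlerFactory>
</requestHandler>
```

The sizeOfQueue and fairnessPolicy knobs correspond to the throughput-vs-latency trade-off the description mentions: a fair, queued pool smooths latency spikes at some cost in raw throughput.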
[jira] [Created] (LUCENE-4475) eDismax boost on multiValued fields
Bill Bell created LUCENE-4475: - Summary: eDismax boost on multiValued fields Key: LUCENE-4475 URL: https://issues.apache.org/jira/browse/LUCENE-4475 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.0 Reporter: Bill Bell We want to replace bq with boost, but we get the multi-valued field issue when we try to do the equivalent queries: HTTP ERROR 400 Problem accessing /solr/providersearch/select. Reason: can not use FieldCache on multivalued field: specialties_ids q=*:*&bq=multi_field:87^2&defType=dismax How do you do this using boost? q=*:*&boost=multi_field:87&defType=edismax -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3930) eDismax Multivalued boost
Bill Bell created SOLR-3930: --- Summary: eDismax Multivalued boost Key: SOLR-3930 URL: https://issues.apache.org/jira/browse/SOLR-3930 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: Bill Bell We want to replace bq with boost, but we get the multi-valued field issue when we try to do the equivalent queries: HTTP ERROR 400 Problem accessing /solr/providersearch/select. Reason: can not use FieldCache on multivalued field: specialties_ids q=*:*&bq=multi_field:87^2&defType=dismax How do you do this using boost? q=*:*&boost=multi_field:87&defType=edismax We know we can use bq with edismax, but we like the multiply feature of boost. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
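The "multiply feature" mentioned here is the key difference between the two parameters: bq adds the boost query's score on top of the main score, while the edismax boost parameter multiplies the main score by a function's value. A toy arithmetic sketch of the difference (the numbers are made up for illustration):

```python
def with_bq(main_score, bq_score):
    # bq: the boost query's score is added on top of the main score.
    return main_score + bq_score

def with_boost(main_score, boost_value):
    # boost: the main score is multiplied by the boost function's value.
    return main_score * boost_value

# Two docs with different main scores, each getting a boost of 2.0:
print(with_bq(1.0, 2.0), with_bq(4.0, 2.0))        # 3.0 6.0 - additive shift
print(with_boost(1.0, 2.0), with_boost(4.0, 2.0))  # 2.0 8.0 - proportional scaling
```

The multiplicative form preserves the relative ordering produced by the main query while scaling it, which is why it is often preferred over the additive bq.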
[jira] [Closed] (LUCENE-4475) eDismax boost on multiValued fields
[ https://issues.apache.org/jira/browse/LUCENE-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell closed LUCENE-4475. - Resolution: Fixed eDismax boost on multiValued fields --- Key: LUCENE-4475 URL: https://issues.apache.org/jira/browse/LUCENE-4475 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.0 Reporter: Bill Bell We want to replace bq with boost, but we get the multi-valued field issue when we try to do the equivalent queries: HTTP ERROR 400 Problem accessing /solr/providersearch/select. Reason: can not use FieldCache on multivalued field: specialties_ids q=*:*&bq=multi_field:87^2&defType=dismax How do you do this using boost? q=*:*&boost=multi_field:87&defType=edismax -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3931) Turn off coord() factor for scoring
Bill Bell created SOLR-3931: --- Summary: Turn off coord() factor for scoring Key: SOLR-3931 URL: https://issues.apache.org/jira/browse/SOLR-3931 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: Bill Bell We would like to remove the coordination factor from scoring. For small fields (like the name of a doctor), we do not want to score higher when the same term appears in the field more than once. That makes sense for books, not so much for formal names. /solr/select?q=*:*&coordFactor=false Default is true. (Note: we might want to make each of these optional - tf, idf, coord, queryNorm.) coord(q,d) is a score factor based on how many of the query terms are found in the specified document. Typically, a document that contains more of the query's terms will receive a higher score than another document with fewer query terms. This is a search-time factor computed in coord(q,d) by the Similarity in effect at search time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
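In Lucene's DefaultSimilarity, coord(q,d) is simply the fraction of query terms that match the document, so "turning it off" means replacing that fraction with 1.0 (in Lucene 4 Java code this corresponds to overriding Similarity's coord(int overlap, int maxOverlap) to return 1.0f). A toy sketch of the effect, not the Lucene code itself; the function names here are illustrative:

```python
def coord(overlap, max_overlap):
    # DefaultSimilarity-style coord: fraction of query terms present in the doc.
    return overlap / max_overlap

def doc_score(term_scores, num_query_terms, use_coord=True):
    # Toy disjunction score: sum of per-term scores, optionally scaled by coord.
    total = sum(term_scores)
    if use_coord:
        total *= coord(len(term_scores), num_query_terms)
    return total

# Three-term query; the doc matches two of the terms at 1.0 each:
print(doc_score([1.0, 1.0], 3))                   # 2.0 * 2/3 = 1.333...
print(doc_score([1.0, 1.0], 3, use_coord=False))  # 2.0 - coord disabled
```

With coord disabled, documents matching fewer query terms are no longer penalized relative to documents matching more of them, which is the behavior requested here for short name fields.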