[jira] [Updated] (SOLR-5594) Enable using extended field types with prefix queries for non-default encoded strings
[ https://issues.apache.org/jira/browse/SOLR-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5594: --- Attachment: SOLR-5594.patch Patch with the following changes: * Fixes to SimpleQParserPlugin and PrefixQParserPlugin * A test showing that prefix queries on integer fields work as they did prior to this change * Tests showing how two different custom field types override the getPrefixQuery() method Enable using extended field types with prefix queries for non-default encoded strings - Key: SOLR-5594 URL: https://issues.apache.org/jira/browse/SOLR-5594 Project: Solr Issue Type: Improvement Components: query parsers, Schema and Analysis Affects Versions: 4.6 Reporter: Anshum Gupta Assignee: Anshum Gupta Priority: Minor Attachments: SOLR-5594-branch_4x.patch, SOLR-5594.patch, SOLR-5594.patch, SOLR-5594.patch Enable users to use prefix queries with custom field types that apply non-default encoding/decoding to query strings, e.g. a custom field that works with base64-encoded query strings. Currently, the workaround is to override at the getRewriteMethod level. Perhaps having the prefix query also use the calling FieldType's readableToIndexed method would work better. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
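To make the base64 use-case concrete: the idea is that a base64-aware field type decodes the external query string into its indexed form (which is what a FieldType's readableToIndexed is for), so a prefix query runs against the decoded prefix rather than the raw base64 text. Below is a minimal, self-contained sketch of just that decoding step; Base64FieldSketch and its method are hypothetical stand-ins, not Solr's FieldType API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch of the decoding step a base64-aware field type might
// perform; this is NOT Solr's FieldType API.
public class Base64FieldSketch {
    // Analogue of readableToIndexed: map the external (base64) query string
    // to the form that was actually indexed.
    static String readableToIndexed(String external) {
        return new String(Base64.getDecoder().decode(external), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "aGVs" is base64 for "hel": a prefix query would then match terms
        // starting with the decoded prefix, not the raw base64 characters.
        System.out.println(readableToIndexed("aGVs"));
    }
}
```

In a real custom field type, this decoding would live inside the FieldType subclass so every query path, including prefix queries, goes through it.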
[jira] [Updated] (SOLR-5594) Enable using extended field types with prefix queries for non-default encoded strings
[ https://issues.apache.org/jira/browse/SOLR-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5594: --- Attachment: (was: SOLR-5594.patch)
[jira] [Updated] (SOLR-5594) Enable using extended field types with prefix queries for non-default encoded strings
[ https://issues.apache.org/jira/browse/SOLR-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5594: --- Attachment: SOLR-5594.patch There was something wrong with the last patch. Here's another one.
[jira] [Commented] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866512#comment-13866512 ] ASF subversion and git services commented on LUCENE-5376: - Commit 1556775 from [~mikemccand] in branch 'dev/branches/lucene5376' [ https://svn.apache.org/r1556775 ] LUCENE-5376: allow setting norms format, including compressed norms Add a demo search server Key: LUCENE-5376 URL: https://issues.apache.org/jira/browse/LUCENE-5376 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Attachments: lucene-demo-server.tgz I think it'd be useful to have a demo search server for Lucene. Rather than being fully featured, like Solr, it would be minimal, just wrapping the existing Lucene modules to show how you can make use of these features in a server setting. The purpose is to demonstrate how one can build a minimal search server on top of APIs like SearcherManager, SearcherLifetimeManager, etc. This is also useful for finding rough edges / issues in Lucene's APIs that make building a server unnecessarily hard. I don't think it should have back-compatibility promises (except Lucene's index back compatibility), so it's free to improve as Lucene's APIs change. As a starting point, I'll post what I built for the eating-your-own-dog-food search app for Lucene/Solr's jira issues http://jirasearch.mikemccandless.com (blog: http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It uses Netty to expose basic indexing/searching APIs via JSON, but it's very rough (lots of nocommits).
[jira] [Commented] (LUCENE-4906) PostingsHighlighter's PassageFormatter should allow for rendering to arbitrary objects
[ https://issues.apache.org/jira/browse/LUCENE-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866535#comment-13866535 ] ASF subversion and git services commented on LUCENE-4906: - Commit 1556786 from [~mikemccand] in branch 'dev/branches/lucene5376' [ https://svn.apache.org/r1556786 ] LUCENE-4906, LUCENE-5376: using the expert 'render to Object' APIs in PostingsHighlighter to render directly to JSONArray in lucene server PostingsHighlighter's PassageFormatter should allow for rendering to arbitrary objects -- Key: LUCENE-4906 URL: https://issues.apache.org/jira/browse/LUCENE-4906 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.6, 5.0 Attachments: LUCENE-4906.patch, LUCENE-4906.patch, LUCENE-4906.patch For example, in a server, I may want to render the highlight result to a JsonObject to send back to the front-end. Today, since we render to a String, I have to render to a JSON string and then re-parse it to a JsonObject, which is inefficient... Or (Rob's idea): if we make a query that's like MoreLikeThis but pulls terms from snippets instead, so you get proximity-influenced salient/expanded terms, then perhaps that renders to just an array of tokens or fragments from each snippet.
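The "render to arbitrary objects" idea can be sketched with a toy formatter that is generic over its output type, so a server can render snippets directly into a JSON-array-like structure instead of building a String that must be re-parsed. All names below are hypothetical illustrations, not Lucene's PassageFormatter API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration of rendering highlights to arbitrary objects: the output
// type is a type parameter, so a server can pick a JSON-array-like structure.
// Hypothetical names; not Lucene's PassageFormatter API.
public class ObjectFormatterSketch {
    interface SnippetFormatter<T> {
        T format(List<String> snippets);
    }

    // The old fixed behavior: render everything into one String.
    static final SnippetFormatter<String> TO_STRING =
        snippets -> String.join(" ... ", snippets);

    // Render directly to a list (standing in for a JSON array), avoiding the
    // render-to-String-then-reparse round trip.
    static final SnippetFormatter<List<String>> TO_ARRAY =
        snippets -> new ArrayList<>(snippets);

    public static void main(String[] args) {
        List<String> snippets = Arrays.asList("first <b>hit</b>", "second <b>hit</b>");
        System.out.println(TO_STRING.format(snippets));
        System.out.println(TO_ARRAY.format(snippets));
    }
}
```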
[jira] [Commented] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866536#comment-13866536 ] ASF subversion and git services commented on LUCENE-5376: - Commit 1556786 from [~mikemccand] in branch 'dev/branches/lucene5376' [ https://svn.apache.org/r1556786 ] LUCENE-4906, LUCENE-5376: using the expert 'render to Object' APIs in PostingsHighlighter to render directly to JSONArray in lucene server
[jira] [Commented] (LUCENE-5388) Eliminate construction over readers for Tokenizer
[ https://issues.apache.org/jira/browse/LUCENE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866561#comment-13866561 ] Benson Margulies commented on LUCENE-5388: -- Should I try to get the branch in git to match the .patch, or should I just let you proceed from here? I guess that might depend on reactions of others. Eliminate construction over readers for Tokenizer - Key: LUCENE-5388 URL: https://issues.apache.org/jira/browse/LUCENE-5388 Project: Lucene - Core Issue Type: Improvement Components: core/other Reporter: Benson Margulies Attachments: LUCENE-5388.patch In the modern world, Tokenizers are intended to be reusable, with input supplied via #setReader. The constructors that take Reader are a vestige. Worse yet, they invite people to make mistakes in handling the reader that tangle them up with the state machine in Tokenizer. The sensible thing is to eliminate these ctors, and force setReader usage.
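The reuse pattern the issue argues for (input supplied only via setReader, never through a constructor) might look like this toy tokenizer, a hypothetical stand-in and not Lucene's actual Tokenizer class:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;

// Toy stand-in for a reusable tokenizer: no Reader in the constructor,
// input arrives only through setReader(). Hypothetical; not Lucene's API.
public class ReusableTokenizerSketch {
    private Reader input;

    // The single entry point for input, as the issue proposes.
    public void setReader(Reader r) {
        this.input = r;
    }

    // Splits the current input on whitespace.
    public String[] tokenize() {
        try (BufferedReader br = new BufferedReader(input)) {
            StringBuilder sb = new StringBuilder();
            for (int c; (c = br.read()) != -1; ) {
                sb.append((char) c);
            }
            return sb.toString().trim().split("\\s+");
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        ReusableTokenizerSketch t = new ReusableTokenizerSketch();
        t.setReader(new StringReader("hello tokenizer world"));
        System.out.println(Arrays.toString(t.tokenize()));
        // Reuse the same instance: just supply a new Reader.
        t.setReader(new StringReader("reused instance"));
        System.out.println(Arrays.toString(t.tokenize()));
    }
}
```

With no Reader-taking constructor, there is exactly one way to hand input to the instance, which is the point of removing the ctors.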
[jira] [Commented] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs
[ https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866565#comment-13866565 ] Ramkumar Aiyengar commented on SOLR-5213: - Shalin, any objection to this patch going in? Maybe with SOLR-5338, the severity of the 0-shard case can be reduced from log.error, but the patch should be good otherwise. collections?action=SPLITSHARD parent vs. sub-shards numDocs --- Key: SOLR-5213 URL: https://issues.apache.org/jira/browse/SOLR-5213 Project: Solr Issue Type: Improvement Components: update Affects Versions: 4.4 Reporter: Christine Poerschke Assignee: Shalin Shekhar Mangar Attachments: SOLR-5213.patch The problem we saw was that splitting a shard took a long time, and at the end of it the sub-shards contained fewer documents than the original shard. The root cause was eventually tracked down to the disappearing documents not falling into the hash ranges of the sub-shards. Could SolrIndexSplitter's split report per-segment numDocs for parent and sub-shards, with at least a warning logged for any discrepancies (documents falling into none of the sub-shards or documents falling into several sub-shards)? Additionally, could a case be made for erroring out when discrepancies are detected, i.e. not proceeding with the shard split? Either always erroring, or adding a verifyNumDocs=false/true optional parameter to the SPLITSHARD action.
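The discrepancy check being requested can be illustrated with a small, self-contained sketch: given each document's routing hash and the sub-shards' hash ranges, count documents that fall into no range (lost on split) or into more than one (duplicated). This is hypothetical illustration code, not SolrIndexSplitter's implementation.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative discrepancy check, not SolrIndexSplitter's code: count docs
// whose routing hash lands in zero or in multiple sub-shard hash ranges.
public class SplitCheckSketch {
    // Inclusive hash range [min, max].
    static class Range {
        final int min, max;
        Range(int min, int max) { this.min = min; this.max = max; }
        boolean contains(int hash) { return hash >= min && hash <= max; }
    }

    // Returns {orphans, duplicates}: docs in no range, docs in more than one.
    static int[] check(List<Integer> docHashes, List<Range> subShardRanges) {
        int orphans = 0, duplicates = 0;
        for (int hash : docHashes) {
            int owners = 0;
            for (Range r : subShardRanges) {
                if (r.contains(hash)) owners++;
            }
            if (owners == 0) orphans++;
            else if (owners > 1) duplicates++;
        }
        return new int[] { orphans, duplicates };
    }

    public static void main(String[] args) {
        List<Range> ranges = Arrays.asList(new Range(0, 49), new Range(50, 99));
        // Hash 120 falls into neither sub-shard range: one orphan, no duplicates.
        System.out.println(Arrays.toString(check(Arrays.asList(10, 60, 120), ranges)));
    }
}
```

A nonzero count in either slot is exactly the kind of discrepancy the issue suggests warning on, or erroring out on before committing the split.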
[jira] [Comment Edited] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs
[ https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866565#comment-13866565 ] Ramkumar Aiyengar edited comment on SOLR-5213 at 1/9/14 11:25 AM: -- Shalin, any objection to this patch going in? Maybe with SOLR-5338, the severity of the 0-shard case can be reduced from log.error (alternatively, it could check for split.key being present and decide severity, if we want to be smarter), but the patch should be good otherwise. was (Author: andyetitmoves): Shalin, any objection to this patch going in? Maybe with SOLR-5338, the severity of the 0-shard case can be reduced from log.error, but the patch should be good otherwise.
[jira] [Commented] (SOLR-5594) Enable using extended field types with prefix queries for non-default encoded strings
[ https://issues.apache.org/jira/browse/SOLR-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866615#comment-13866615 ] Robert Muir commented on SOLR-5594: --- Can we avoid reformatting SimpleQParser here? It makes it impossible to review the changes.
[jira] [Commented] (LUCENE-5388) Eliminate construction over readers for Tokenizer
[ https://issues.apache.org/jira/browse/LUCENE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866617#comment-13866617 ] Robert Muir commented on LUCENE-5388: - Benson, don't worry about it, I think it's good. I just put up the patch so that Uwe might look at it.
[jira] [Commented] (LUCENE-5388) Eliminate construction over readers for Tokenizer
[ https://issues.apache.org/jira/browse/LUCENE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866627#comment-13866627 ] Uwe Schindler commented on LUCENE-5388: --- I am fine with this patch in trunk only. We can decide later if we backport.
[jira] [Commented] (LUCENE-5388) Eliminate construction over readers for Tokenizer
[ https://issues.apache.org/jira/browse/LUCENE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866640#comment-13866640 ] ASF subversion and git services commented on LUCENE-5388: - Commit 1556801 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1556801 ] LUCENE-5388: remove Reader from Tokenizer ctor (closes #16)
lucene-solr pull request: LUCENE-5388: code and highlighting changes to rem...
Github user benson-basis closed the pull request at: https://github.com/apache/lucene-solr/pull/16
[jira] [Updated] (LUCENE-5388) Eliminate construction over readers for Tokenizer
[ https://issues.apache.org/jira/browse/LUCENE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5388: -- Fix Version/s: 5.0
[jira] [Commented] (LUCENE-5388) Eliminate construction over readers for Tokenizer
[ https://issues.apache.org/jira/browse/LUCENE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866641#comment-13866641 ] Uwe Schindler commented on LUCENE-5388: --- Woohoo!
[jira] [Resolved] (LUCENE-5388) Eliminate construction over readers for Tokenizer
[ https://issues.apache.org/jira/browse/LUCENE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5388. - Resolution: Fixed Marking fixed for 5.0. Thanks a lot, Benson, for doing all the grunt work here. Note: if you really want a backport, please just open an issue and hash it out.
Re: Nested Grouping / Field Collapsing
Kranti, You've got it exactly. And yes, sorting and limiting the doclist within the nested groups will be supported. Joel Bernstein Search Engineer at Heliosearch On Wed, Jan 8, 2014 at 6:54 PM, Kranti Parisa kranti.par...@gmail.com wrote: Joel, 1) Collapse on the top-level group - done through CollapsingQParserPlugin 2) Expand a single page of collapsed results to display nested groups - probably done through ExpandComponent Is that correct? And does the scope of ExpandComponent include options to sort and limit the docList within the nested groups? Which means we first create the top-level groups, and while expanding each group we create nested groups and allow passing the sort and limit params? Thanks, Kranti K. Parisa http://www.linkedin.com/in/krantiparisa On Wed, Jan 8, 2014 at 5:48 PM, Joel Bernstein joels...@gmail.com wrote: Kranti, I'm wondering if this can be separated into two phases: 1) Collapse on the top-level group. 2) Expand a single page of collapsed results to display nested groups. I'll be working on the ExpandComponent shortly, which will expand a single page of results that were collapsed by the CollapsingQParserPlugin. This seems like something that could be implemented as part of the ExpandComponent. Joel Joel Bernstein Search Engineer at Heliosearch On Wed, Jan 8, 2014 at 12:28 PM, Kranti Parisa kranti.par...@gmail.com wrote: Has anyone got the latest updates for https://issues.apache.org/jira/browse/SOLR-2553 ? I am trying to take a look at the implementation and see how complex this is to achieve. If someone else had a look into it earlier, could you please share your thoughts/comments? Thanks, Kranti K. Parisa http://www.linkedin.com/in/krantiparisa
Analysis API next step: Reader->CharFilter?
Now that we're forcing everyone to think about the Analysis API in 5.0, what do you think of making the fundamental input source be a CharFilter, thus removing the need for instanceof-ing? To touch a hotter potato, I also wonder about 'reset()'. In a world where the only way to put something in there is setReader, do we need 'reset' in between setReader and incrementToken?
[jira] [Created] (LUCENE-5390) Loosen assert in IW on pending event after close
Simon Willnauer created LUCENE-5390: --- Summary: Loosen assert in IW on pending event after close Key: LUCENE-5390 URL: https://issues.apache.org/jira/browse/LUCENE-5390 Project: Lucene - Core Issue Type: Task Affects Versions: 4.6, 5.0, 4.7, 4.6.1 Reporter: Simon Willnauer Priority: Minor Fix For: 5.0, 4.7, 4.6.1 Sometimes the assert in the IW is tripped due to pending merge events. Those events can always happen, but they are meaningless since we close / rollback the IW anyway. I suggest we loosen the assert here to not fail if there are only pending merge events. {noformat} 1 tests failed. REGRESSION: org.apache.lucene.index.TestIndexWriterWithThreads.testRollbackAndCommitWithThreads Error Message: Captured an uncaught exception in thread: Thread[id=288, name=Thread-222, state=RUNNABLE, group=TGRP-TestIndexWriterWithThreads] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=288, name=Thread-222, state=RUNNABLE, group=TGRP-TestIndexWriterWithThreads] Caused by: java.lang.RuntimeException: java.lang.AssertionError: [org.apache.lucene.index.DocumentsWriter$MergePendingEvent@67ef293b] at __randomizedtesting.SeedInfo.seed([98DFB1602D9F9A2A]:0) at org.apache.lucene.index.TestIndexWriterWithThreads$1.run(TestIndexWriterWithThreads.java:619) Caused by: java.lang.AssertionError: [org.apache.lucene.index.DocumentsWriter$MergePendingEvent@67ef293b] at org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:2026) at org.apache.lucene.index.TestIndexWriterWithThreads$1.run(TestIndexWriterWithThreads.java:575) {noformat}
Re: Analysis API next step: Reader->CharFilter?
On Thu, Jan 9, 2014 at 9:08 AM, Benson Margulies bimargul...@gmail.com wrote: Now that we're forcing everyone to think about the Analysis API in 5.0, what do you think of making the fundamental input source be a CharFilter, thus removing the need for instanceof-ing? Personally, I don't like doing that, because when we change a parameter from a 'standard JDK' one to a custom Lucene one, it makes the API harder to grok, as it's more classes the user *must* wrap their head around. On the other hand, today users only have to grok CharFilter if they want to do CharFiltering, which is pretty expert. Instanceofs are cheap in Java; what is the benefit? To touch a hotter potato, I also wonder about 'reset()'. In a world where the only way to put something in there is setReader, do we need 'reset' in between setReader and incrementToken? But the main issue is TokenStream: it doesn't have any concept of Readers baked in. So there must be a way to reset state in things like TokenFilters, too.
[jira] [Updated] (LUCENE-5390) Loosen assert in IW on pending event after close
[ https://issues.apache.org/jira/browse/LUCENE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5390: Attachment: LUCENE-5390.patch Here is a patch.
[jira] [Commented] (SOLR-5579) Leader stops processing collection-work-queue after failed collection reload
[ https://issues.apache.org/jira/browse/SOLR-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866696#comment-13866696 ] Eric Bus commented on SOLR-5579: Just a quick update: the leader again stopped working. I had to restart the cluster to get everything working again. The script that is running to check the status did not work, so unfortunately I don't have additional information from the logs. When I do, I'll report back here. Leader stops processing collection-work-queue after failed collection reload Key: SOLR-5579 URL: https://issues.apache.org/jira/browse/SOLR-5579 Project: Solr Issue Type: Bug Affects Versions: 4.5.1 Environment: Debian Linux 6.0 running on VMWare Using embedded SOLR Jetty. Reporter: Eric Bus Assignee: Mark Miller Labels: collections, queue I've been experiencing the same problem a few times now. My leader in /overseer_elect/leader stops processing the collection queue at /overseer/collection-queue-work. The queue will build up and it will trigger an alert in my monitoring tool. I haven't been able to pinpoint the reason that the leader stops, but usually I kill the leader node to trigger a leader election. The new node will pick up the queue. And this is where the problems start. When the new leader is processing the queue and picks up a reload for a shard without an active leader, the queue stops. It keeps repeating the message that there is no active leader for the shard. But a new leader is never elected: {quote} ERROR - 2013-12-24 14:43:40.390; org.apache.solr.common.SolrException; Error while trying to recover. 
core=magento_349_shard1_replica1:org.apache.solr.common.SolrException: No registered leader was found, collection:magento_349 slice:shard1 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:482) at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:465) at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:317) at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:219) ERROR - 2013-12-24 14:43:40.391; org.apache.solr.cloud.RecoveryStrategy; Recovery failed - trying again... (7) core=magento_349_shard1_replica1 INFO - 2013-12-24 14:43:40.391; org.apache.solr.cloud.RecoveryStrategy; Wait 256.0 seconds before trying to recover again (8) {quote} Is the leader election in some way connected to the collection queue? If so, can this be a deadlock, because it won't elect until the reload is complete? -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5594) Enable using extended field types with prefix queries for non-default encoded strings
[ https://issues.apache.org/jira/browse/SOLR-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866713#comment-13866713 ] Anshum Gupta commented on SOLR-5594: Robert, I thought about how to handle SimpleQParser with this change before I even put up this patch, but I can't think of another way to handle it here. This seems like the best way to go as far as handling SimpleQParser for this change is concerned. Enable using extended field types with prefix queries for non-default encoded strings - Key: SOLR-5594 URL: https://issues.apache.org/jira/browse/SOLR-5594 Project: Solr Issue Type: Improvement Components: query parsers, Schema and Analysis Affects Versions: 4.6 Reporter: Anshum Gupta Assignee: Anshum Gupta Priority: Minor Attachments: SOLR-5594-branch_4x.patch, SOLR-5594.patch, SOLR-5594.patch, SOLR-5594.patch Enable users to be able to use prefix query with custom field types with non-default encoding/decoding for queries more easily. e.g. having a custom field work with base64 encoded query strings. Currently, the workaround for it is to have the override at getRewriteMethod level. Perhaps having the prefixQuery also use the calling FieldType's readableToIndexed method would work better.
[jira] [Commented] (SOLR-5594) Enable using extended field types with prefix queries for non-default encoded strings
[ https://issues.apache.org/jira/browse/SOLR-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866720#comment-13866720 ] Robert Muir commented on SOLR-5594: --- That's not what I mean: I mean that in the patch it's not possible to see your actual logic changes, because every single line of code is reformatted. Enable using extended field types with prefix queries for non-default encoded strings - Key: SOLR-5594 URL: https://issues.apache.org/jira/browse/SOLR-5594 Project: Solr Issue Type: Improvement Components: query parsers, Schema and Analysis Affects Versions: 4.6 Reporter: Anshum Gupta Assignee: Anshum Gupta Priority: Minor Attachments: SOLR-5594-branch_4x.patch, SOLR-5594.patch, SOLR-5594.patch, SOLR-5594.patch Enable users to be able to use prefix query with custom field types with non-default encoding/decoding for queries more easily. e.g. having a custom field work with base64 encoded query strings. Currently, the workaround for it is to have the override at getRewriteMethod level. Perhaps having the prefixQuery also use the calling FieldType's readableToIndexed method would work better.
[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866723#comment-13866723 ] ASF subversion and git services commented on SOLR-1301: --- Commit 1556846 from [~steve_rowe] in branch 'dev/trunk' [ https://svn.apache.org/r1556846 ] SOLR-1301: IntelliJ config: morphlines-cell Solr contrib needs lucene-core test-scope dependency Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce. - Key: SOLR-1301 URL: https://issues.apache.org/jira/browse/SOLR-1301 Project: Solr Issue Type: New Feature Reporter: Andrzej Bialecki Assignee: Mark Miller Fix For: 5.0, 4.7 Attachments: README.txt, SOLR-1301-hadoop-0-20.patch, SOLR-1301-hadoop-0-20.patch, SOLR-1301-maven-intellij.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SolrRecordWriter.java, commons-logging-1.0.4.jar, commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, log4j-1.2.15.jar This patch contains a contrib module that provides distributed indexing (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is twofold: * provide an API that is familiar to Hadoop developers, i.e. that of OutputFormat * avoid unnecessary export and (de)serialization of data maintained on HDFS. SolrOutputFormat consumes data produced by reduce tasks directly, without storing it in intermediate files. Furthermore, by using an EmbeddedSolrServer, the indexing task is split into as many parts as there are reducers, and the data to be indexed is not sent over the network. 
Design -- Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, which in turn uses SolrRecordWriter to write this data. SolrRecordWriter instantiates an EmbeddedSolrServer, and it also instantiates an implementation of SolrDocumentConverter, which is responsible for turning Hadoop (key, value) into a SolrInputDocument. This data is then added to a batch, which is periodically submitted to EmbeddedSolrServer. When reduce task completes, and the OutputFormat is closed, SolrRecordWriter calls commit() and optimize() on the EmbeddedSolrServer. The API provides facilities to specify an arbitrary existing solr.home directory, from which the conf/ and lib/ files will be taken. This process results in the creation of as many partial Solr home directories as there were reduce tasks. The output shards are placed in the output directory on the default filesystem (e.g. HDFS). Such part-N directories can be used to run N shard servers. Additionally, users can specify the number of reduce tasks, in particular 1 reduce task, in which case the output will consist of a single shard. An example application is provided that processes large CSV files and uses this API. It uses a custom CSV processing to avoid (de)serialization overhead. This patch relies on hadoop-core-0.19.1.jar - I attached the jar to this issue, you should put it in contrib/hadoop/lib. Note: the development of this patch was sponsored by an anonymous contributor and approved for release under Apache License. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
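The record-writer flow described above (convert each key/value pair to a document, add it to a batch, flush periodically, then commit on close) can be sketched independently of Hadoop and Solr. The class below is a hypothetical stand-in, not the actual SolrRecordWriter:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for SolrRecordWriter's batching behavior: documents
// accumulate until batchSize is reached, then the batch is flushed; close()
// flushes the remainder and "commits", mirroring commit()/optimize() on the
// embedded server.
public class BatchingWriter {
    private final int batchSize;
    private final List<String> batch = new ArrayList<>();
    public int flushes = 0;
    public boolean committed = false;

    public BatchingWriter(int batchSize) { this.batchSize = batchSize; }

    public void write(String key, String value) {
        // Stands in for SolrDocumentConverter turning (key, value) into a SolrInputDocument.
        batch.add(key + "=" + value);
        if (batch.size() >= batchSize) flush();
    }

    private void flush() {
        // In the real writer this submits the batch to EmbeddedSolrServer.
        batch.clear();
        flushes++;
    }

    public void close() {
        if (!batch.isEmpty()) flush();
        committed = true; // commit() and optimize() happen here in the real writer
    }
}
```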
[jira] [Commented] (SOLR-5594) Enable using extended field types with prefix queries for non-default encoded strings
[ https://issues.apache.org/jira/browse/SOLR-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866728#comment-13866728 ] Anshum Gupta commented on SOLR-5594: I'm sorry, I misread it. Perhaps it's something that IDEA did. Let me have a look at it and fix that. Thanks for pointing that out. Enable using extended field types with prefix queries for non-default encoded strings - Key: SOLR-5594 URL: https://issues.apache.org/jira/browse/SOLR-5594 Project: Solr Issue Type: Improvement Components: query parsers, Schema and Analysis Affects Versions: 4.6 Reporter: Anshum Gupta Assignee: Anshum Gupta Priority: Minor Attachments: SOLR-5594-branch_4x.patch, SOLR-5594.patch, SOLR-5594.patch, SOLR-5594.patch Enable users to be able to use prefix query with custom field types with non-default encoding/decoding for queries more easily. e.g. having a custom field work with base64 encoded query strings. Currently, the workaround for it is to have the override at getRewriteMethod level. Perhaps having the prefixQuery also use the calling FieldType's readableToIndexed method would work better.
[jira] [Commented] (SOLR-5379) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866729#comment-13866729 ] Markus Jelsma commented on SOLR-5379: - Yes +1 Query-time multi-word synonym expansion --- Key: SOLR-5379 URL: https://issues.apache.org/jira/browse/SOLR-5379 Project: Solr Issue Type: Improvement Components: query parsers Reporter: Tien Nguyen Manh Labels: multi-word, queryparser, synonym Fix For: 4.7 Attachments: quoted.patch, synonym-expander.patch While dealing with synonyms at query time, Solr fails to work with multi-word synonyms for two reasons: - First, the Lucene query parser tokenizes the user query by spaces, so it splits a multi-word term into separate terms before feeding them to the synonym filter; the synonym filter therefore can't recognize the multi-word term in order to expand it. - Second, if the synonym filter expands into multiple terms that contain a multi-word synonym, SolrQueryParserBase currently uses MultiPhraseQuery to handle synonyms, but MultiPhraseQuery doesn't work with terms that have different numbers of words. For the first one, we can quote all multi-word synonyms in the user query so that the Lucene query parser doesn't split them. There is a jira task related to this one: https://issues.apache.org/jira/browse/LUCENE-2605. For the second, we can replace MultiPhraseQuery with an appropriate BooleanQuery of SHOULD clauses containing multiple PhraseQuery instances when the token stream has a multi-word synonym.
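The first workaround, quoting multi-word synonyms so the query parser doesn't split them, can be sketched as a simple pre-processing step. This is a naive illustration (the synonym set is hypothetical, and it does not handle overlapping or already-quoted synonyms):

```java
import java.util.Set;

public class SynonymQuoter {
    // Wraps each known multi-word synonym in quotes so that whitespace
    // tokenization in the query parser leaves it intact for the synonym filter.
    public static String quoteMultiWord(String query, Set<String> multiWordSynonyms) {
        String result = query;
        for (String syn : multiWordSynonyms) {
            result = result.replace(syn, "\"" + syn + "\"");
        }
        return result;
    }
}
```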
Re: Iterating BinaryDocValues
Don't you think it's worth raising a jira regarding those 'new byte[]' allocations? I'm able to provide a patch if you wish. On Wed, Jan 8, 2014 at 2:02 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: FWIW, a micro benchmark shows a 4% gain from reusing the incoming BytesRef.bytes in short binary docvalues (Test2BBinaryDocValues.testVariableBinary() with mmap directory). I wonder why it doesn't read into the incoming bytes: https://github.com/apache/lucene-solr/blame/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java#L401 On Wed, Jan 8, 2014 at 12:53 AM, Michael McCandless luc...@mikemccandless.com wrote: Going sequentially should help, if the pages are not hot (in the OS's IO cache). You can also use a different DVFormat, e.g. Direct, but this holds all bytes in RAM. Mike McCandless http://blog.mikemccandless.com On Tue, Jan 7, 2014 at 1:09 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: Joel, I tried to hack it straightforwardly, but found no free gain there. The only attempt I can suggest is to try to reuse bytes in https://github.com/apache/lucene-solr/blame/trunk/lucene/core/src/java/org/apache/lucene/codecs/lucene45/Lucene45DocValuesProducer.java#L401 right now it allocates bytes every time, which besides GC pressure can also hurt memory access locality. Could you try to fix the memory waste and repeat the performance test? Have a good hack! On Mon, Dec 23, 2013 at 9:51 PM, Joel Bernstein joels...@gmail.com wrote: Hi, I'm looking for a faster way to perform large scale docId-to-BytesRef lookups for BinaryDocValues. I'm finding that I can't get the performance that I need from the random access seek in the BinaryDocValues interface. I'm wondering if sequentially scanning the docValues would be a faster approach. I have a BitSet of matching docs, so if I sequentially moved through the docValues I could test each one against that bitset.
Wondering if that approach would be faster for bulk extracts and how tricky it would be to add an iterator to the BinaryDocValues interface? Thanks, Joel -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
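The sequential-scan idea in the thread, walking docIds in increasing order and skipping non-matching docs via the BitSet so reads stay page-local, can be sketched with plain Java stand-ins (a String[] takes the place of BinaryDocValues; this is not Lucene's API):

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class SequentialScan {
    // Sequential bulk extract: visit matching docIds in increasing order,
    // collecting a value per doc. values[doc] stands in for the
    // BinaryDocValues lookup that fills a reused BytesRef.
    public static List<String> extract(String[] values, BitSet matches) {
        List<String> out = new ArrayList<>();
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            out.add(values[doc]);
        }
        return out;
    }
}
```

The ordered traversal is the point: random-order seeks touch pages unpredictably, while an ascending scan gives the OS cache and readahead a chance to help.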
[jira] [Updated] (LUCENE-5354) Blended score in AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remi Melisson updated LUCENE-5354: -- Attachment: LUCENE-5354_3.patch Hi! Here is new patch including your comment for the coefficient calculation (I guess a Lambda function would be perfect here!). I ran the performance test on my laptop, here is the results compared to the AnalyzingInfixSuggester : -- construction time AnalyzingInfixSuggester input: 50001, time[ms]: 1780 [+- 367.58] BlendedInfixSuggester input: 50001, time[ms]: 6507 [+- 2106.52] -- prefixes: 2-4, num: 7, onlyMorePopular: false AnalyzingInfixSuggester queries: 50001, time[ms]: 6804 [+- 1403.13], ~kQPS: 7 BlendedInfixSuggester queries: 50001, time[ms]: 26503 [+- 2624.41], ~kQPS: 2 -- prefixes: 6-9, num: 7, onlyMorePopular: false AnalyzingInfixSuggester queries: 50001, time[ms]: 3995 [+- 551.20], ~kQPS: 13 BlendedInfixSuggester queries: 50001, time[ms]: 5355 [+- 1295.41], ~kQPS: 9 -- prefixes: 100-200, num: 7, onlyMorePopular: false AnalyzingInfixSuggester queries: 50001, time[ms]: 2626 [+- 588.43], ~kQPS: 19 BlendedInfixSuggester queries: 50001, time[ms]: 1980 [+- 574.16], ~kQPS: 25 -- RAM consumption AnalyzingInfixSuggester size[B]:1,430,920 BlendedInfixSuggester size[B]:1,630,488 If you have any idea on how we could improve the performance, let me know (see above my comment for your previous suggestion to avoid visiting term vectors). Blended score in AnalyzingInfixSuggester Key: LUCENE-5354 URL: https://issues.apache.org/jira/browse/LUCENE-5354 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Affects Versions: 4.4 Reporter: Remi Melisson Priority: Minor Labels: suggester Attachments: LUCENE-5354.patch, LUCENE-5354_2.patch, LUCENE-5354_3.patch I'm working on a custom suggester derived from the AnalyzingInfix. 
I require what is called a blended score (//TODO ln.399 in AnalyzingInfixSuggester) to transform the suggestion weights depending on the position of the searched term(s) in the text. Right now, I'm using an easy solution : If I want 10 suggestions, then I search against the current ordered index for the 100 first results and transform the weight : bq. a) by using the term position in the text (found with TermVector and DocsAndPositionsEnum) or bq. b) by multiplying the weight by the score of a SpanQuery that I add when searching and return the updated 10 most weighted suggestions. Since we usually don't need to suggest so many things, the bigger search + rescoring overhead is not so significant but I agree that this is not the most elegant solution. We could include this factor (here the position of the term) directly into the index. So, I can contribute to this if you think it's worth adding it. Do you think I should tweak AnalyzingInfixSuggester, subclass it or create a dedicated class ? -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
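One way to fold term position into the suggestion weight, in the spirit of option (a) above, is a position-based coefficient. The exact formula belongs to the patch; the linear decay below is only a hypothetical stand-in for illustration:

```java
public class BlendedWeight {
    // Hypothetical position-based coefficient: full weight for a match at
    // position 0, decaying linearly with position and bottoming out at 10%
    // of the original weight so late matches still surface.
    public static double blend(long weight, int termPosition, int maxPosition) {
        double coef = 1.0 - Math.min(termPosition, maxPosition) / (double) maxPosition;
        return weight * Math.max(coef, 0.1);
    }
}
```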
[jira] [Updated] (SOLR-5594) Enable using extended field types with prefix queries for non-default encoded strings
[ https://issues.apache.org/jira/browse/SOLR-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5594: --- Attachment: SOLR-5594.patch Fixed the reformatting, however as things have moved (and there's been a level change.. new inner classes etc) it still looks a little tricky but yes, it's no longer just reformatted code in the patch. Enable using extended field types with prefix queries for non-default encoded strings - Key: SOLR-5594 URL: https://issues.apache.org/jira/browse/SOLR-5594 Project: Solr Issue Type: Improvement Components: query parsers, Schema and Analysis Affects Versions: 4.6 Reporter: Anshum Gupta Assignee: Anshum Gupta Priority: Minor Attachments: SOLR-5594-branch_4x.patch, SOLR-5594.patch, SOLR-5594.patch, SOLR-5594.patch, SOLR-5594.patch Enable users to be able to use prefix query with custom field types with non-default encoding/decoding for queries more easily. e.g. having a custom field work with base64 encoded query strings. Currently, the workaround for it is to have the override at getRewriteMethod level. Perhaps having the prefixQuery also use the calling FieldType's readableToIndexed method would work better. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
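The base64 use case from the description boils down to converting the readable (external) form of the prefix into its indexed form before the prefix query is built. A minimal stand-alone sketch of such a readableToIndexed-style conversion follows; the class is hypothetical and only mimics the idea, not Solr's actual FieldType API:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64FieldType {
    // Stand-in for a custom FieldType's readableToIndexed(): the external
    // (readable) value is base64, the indexed form is the decoded text.
    // getPrefixQuery() would run the user's prefix through this before
    // constructing the PrefixQuery, which is what the patch enables.
    public static String readableToIndexed(String external) {
        byte[] decoded = Base64.getDecoder().decode(external);
        return new String(decoded, StandardCharsets.UTF_8);
    }
}
```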
[jira] [Commented] (LUCENE-5389) Even more doc for construction of TokenStream components
[ https://issues.apache.org/jira/browse/LUCENE-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866765#comment-13866765 ] Benson Margulies commented on LUCENE-5389: -- [~rcmuir] I think that this is ready to go. If you commit this and merge down to 4.x, I can then tackle work on this file for the new stuff. Even more doc for construction of TokenStream components Key: LUCENE-5389 URL: https://issues.apache.org/jira/browse/LUCENE-5389 Project: Lucene - Core Issue Type: Improvement Reporter: Benson Margulies There are more useful things to tell would-be authors of tokenizers. Let's tell them.
[jira] [Commented] (LUCENE-5389) Even more doc for construction of TokenStream components
[ https://issues.apache.org/jira/browse/LUCENE-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866766#comment-13866766 ] Robert Muir commented on LUCENE-5389: - Thanks Benson! I'll take a look at this in a bit. Even more doc for construction of TokenStream components Key: LUCENE-5389 URL: https://issues.apache.org/jira/browse/LUCENE-5389 Project: Lucene - Core Issue Type: Improvement Reporter: Benson Margulies There are more useful things to tell would-be authors of tokenizers. Let's tell them. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: overriding getRangeQuery
On Jan 9, 2014, at 16:00, Shawn Grant shawn.gr...@orcatec.com wrote: Updating the parameters in the extension's method and rebuilding wasn't enough to fix the problem for me. Not sure what I'm missing. I checked in a fix on trunk a couple of days ago. Did you remove extensions.jar so that it would be rebuilt and rewrapped ? If this still doesn't fix it, please write a small test case so that I can reproduce this. Thanks ! It looks like this bug also affects the analysis of the range clause. I need the terms to be case sensitive so I'm using a per-field analyzer to make sure that field doesn't get lowercased but it's getting ignored and sent to the default analyzer. That's probably an issue with using Lucene itself. You should ask about this on java-u...@lucene.apache.org. Andi.. On 01/03/2014 05:38 PM, Andi Vajda wrote: On Jan 3, 2014, at 21:35, Shawn Grant shawn.gr...@orcatec.com wrote: whoops, bad link expansion. Was supposed to be: getRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive); Yes, that would be the problem. The signature changed but the extension's didn't. Andi.. On 01/03/2014 04:33 PM, Shawn Grant wrote: I have a subclass of PythonQueryParser that overrides several methods but I can't seem to get it to use getRangeQuery. 
I noticed that the method definition in PythonQueryParser is: getRangeQuery(String field, String part1, String part2, boolean inclusive); but the Lucene definition for QueryParser (in QueryParserBase, see https://lucene.apache.org/core/4_4_0/queryparser/org/apache/lucene/queryparser/classic/QueryParserBase.html) is: getRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive). Is that an issue?
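The root cause in this thread is an override whose signature no longer matches the superclass, so the subclass method silently becomes an unrelated overload. In plain Java, `@Override` turns that drift into a compile error. The classes below are hypothetical stand-ins, not PyLucene's actual wrappers:

```java
// Stand-in for QueryParserBase after the signature change to five arguments.
class BaseParser {
    public String getRangeQuery(String field, String part1, String part2,
                                boolean startInclusive, boolean endInclusive) {
        return "base:" + field;
    }
}

// Stand-in for the extension. With @Override, keeping the old four-argument
// signature would no longer compile, so the mismatch is caught immediately
// instead of the base implementation being called at runtime.
public class CustomParser extends BaseParser {
    @Override
    public String getRangeQuery(String field, String part1, String part2,
                                boolean startInclusive, boolean endInclusive) {
        return "custom:" + field + ":" + part1 + ".." + part2;
    }
}
```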
[jira] [Comment Edited] (LUCENE-5354) Blended score in AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866760#comment-13866760 ] Remi Melisson edited comment on LUCENE-5354 at 1/9/14 4:57 PM: --- Hi! Here is new patch including your comment for the coefficient calculation (I guess a Lambda function would be perfect here!). I ran the performance test on my laptop, here are the results compared to the AnalyzingInfixSuggester : -- construction time AnalyzingInfixSuggester input: 50001, time[ms]: 1780 [+- 367.58] BlendedInfixSuggester input: 50001, time[ms]: 6507 [+- 2106.52] -- prefixes: 2-4, num: 7, onlyMorePopular: false AnalyzingInfixSuggester queries: 50001, time[ms]: 6804 [+- 1403.13], ~kQPS: 7 BlendedInfixSuggester queries: 50001, time[ms]: 26503 [+- 2624.41], ~kQPS: 2 -- prefixes: 6-9, num: 7, onlyMorePopular: false AnalyzingInfixSuggester queries: 50001, time[ms]: 3995 [+- 551.20], ~kQPS: 13 BlendedInfixSuggester queries: 50001, time[ms]: 5355 [+- 1295.41], ~kQPS: 9 -- prefixes: 100-200, num: 7, onlyMorePopular: false AnalyzingInfixSuggester queries: 50001, time[ms]: 2626 [+- 588.43], ~kQPS: 19 BlendedInfixSuggester queries: 50001, time[ms]: 1980 [+- 574.16], ~kQPS: 25 -- RAM consumption AnalyzingInfixSuggester size[B]:1,430,920 BlendedInfixSuggester size[B]:1,630,488 If you have any idea on how we could improve the performance, let me know (see above my comment for your previous suggestion to avoid visiting term vectors). was (Author: rmelisson): Hi! Here is new patch including your comment for the coefficient calculation (I guess a Lambda function would be perfect here!). 
I ran the performance test on my laptop, here is the results compared to the AnalyzingInfixSuggester : -- construction time AnalyzingInfixSuggester input: 50001, time[ms]: 1780 [+- 367.58] BlendedInfixSuggester input: 50001, time[ms]: 6507 [+- 2106.52] -- prefixes: 2-4, num: 7, onlyMorePopular: false AnalyzingInfixSuggester queries: 50001, time[ms]: 6804 [+- 1403.13], ~kQPS: 7 BlendedInfixSuggester queries: 50001, time[ms]: 26503 [+- 2624.41], ~kQPS: 2 -- prefixes: 6-9, num: 7, onlyMorePopular: false AnalyzingInfixSuggester queries: 50001, time[ms]: 3995 [+- 551.20], ~kQPS: 13 BlendedInfixSuggester queries: 50001, time[ms]: 5355 [+- 1295.41], ~kQPS: 9 -- prefixes: 100-200, num: 7, onlyMorePopular: false AnalyzingInfixSuggester queries: 50001, time[ms]: 2626 [+- 588.43], ~kQPS: 19 BlendedInfixSuggester queries: 50001, time[ms]: 1980 [+- 574.16], ~kQPS: 25 -- RAM consumption AnalyzingInfixSuggester size[B]:1,430,920 BlendedInfixSuggester size[B]:1,630,488 If you have any idea on how we could improve the performance, let me know (see above my comment for your previous suggestion to avoid visiting term vectors). Blended score in AnalyzingInfixSuggester Key: LUCENE-5354 URL: https://issues.apache.org/jira/browse/LUCENE-5354 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Affects Versions: 4.4 Reporter: Remi Melisson Priority: Minor Labels: suggester Attachments: LUCENE-5354.patch, LUCENE-5354_2.patch, LUCENE-5354_3.patch I'm working on a custom suggester derived from the AnalyzingInfix. I require what is called a blended score (//TODO ln.399 in AnalyzingInfixSuggester) to transform the suggestion weights depending on the position of the searched term(s) in the text. Right now, I'm using an easy solution : If I want 10 suggestions, then I search against the current ordered index for the 100 first results and transform the weight : bq. a) by using the term position in the text (found with TermVector and DocsAndPositionsEnum) or bq. 
b) by multiplying the weight by the score of a SpanQuery that I add when searching and return the updated 10 most weighted suggestions. Since we usually don't need to suggest so many things, the bigger search + rescoring overhead is not so significant but I agree that this is not the most elegant solution. We could include this factor (here the position of the term) directly into the index. So, I can contribute to this if you think it's worth adding it. Do you think I should tweak AnalyzingInfixSuggester, subclass it or create a dedicated class ? -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters
[ https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866868#comment-13866868 ] ASF subversion and git services commented on SOLR-5541: --- Commit 1556903 from [~joel.bernstein] in branch 'dev/trunk' [ https://svn.apache.org/r1556903 ] SOLR-5541: Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters Key: SOLR-5541 URL: https://issues.apache.org/jira/browse/SOLR-5541 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 4.6 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.7 Attachments: SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch The QueryElevationComponent currently uses an xml file to map query strings to elevateIds and excludeIds. This ticket adds the ability to pass in elevateIds and excludeIds through two new http parameters elevateIds and excludeIds. This will allow more sophisticated business logic to be used in selecting which ids to elevate/exclude. Proposed syntax: http://localhost:8983/solr/elevate?q=*:*&elevateIds=3,4&excludeIds=6,8 The elevateIds and excludeIds point to the unique document Id.
[jira] [Commented] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters
[ https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866922#comment-13866922 ] ASF subversion and git services commented on SOLR-5541: --- Commit 1556923 from [~joel.bernstein] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1556923 ] SOLR-5541: Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters Key: SOLR-5541 URL: https://issues.apache.org/jira/browse/SOLR-5541 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 4.6 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.7 Attachments: SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch The QueryElevationComponent currently uses an xml file to map query strings to elevateIds and excludeIds. This ticket adds the ability to pass in elevateIds and excludeIds through two new http parameters elevateIds and excludeIds. This will allow more sophisticated business logic to be used in selecting which ids to elevate/exclude. Proposed syntax: http://localhost:8983/solr/elevate?q=*:*&elevateIds=3,4&excludeIds=6,8 The elevateIds and excludeIds point to the unique document Id.
[jira] [Resolved] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters
[ https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein resolved SOLR-5541. -- Resolution: Fixed Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters Key: SOLR-5541 URL: https://issues.apache.org/jira/browse/SOLR-5541 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 4.6 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.7 Attachments: SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch The QueryElevationComponent currently uses an xml file to map query strings to elevateIds and excludeIds. This ticket adds the ability to pass in elevateIds and excludeIds through two new http parameters elevateIds and excludeIds. This will allow more sophisticated business logic to be used in selecting which ids to elevate/exclude. Proposed syntax: http://localhost:8983/solr/elevate?q=*:*&elevateIds=3,4&excludeIds=6,8 The elevateIds and excludeIds point to the unique document Id.
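Assembling the elevation request from id lists is simple string work; the URL shape follows the proposed syntax in the issue, while the helper class itself is hypothetical:

```java
import java.util.List;

public class ElevationRequest {
    // Builds the elevation request from document id lists, mirroring the
    // proposed syntax: /elevate?q=...&elevateIds=...&excludeIds=...
    // The ids are the documents' unique ids.
    public static String buildUrl(String q, List<String> elevate, List<String> exclude) {
        return "http://localhost:8983/solr/elevate?q=" + q
                + "&elevateIds=" + String.join(",", elevate)
                + "&excludeIds=" + String.join(",", exclude);
    }
}
```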
Re: Nested Grouping / Field Collapsing
That's cool. just curious, do you have any tentative timelines for the ExpandComponent? Thanks, Kranti K. Parisa http://www.linkedin.com/in/krantiparisa On Thu, Jan 9, 2014 at 8:37 AM, Joel Bernstein joels...@gmail.com wrote: Kranti, You've got it exactly. And yes sorting and limiting the doclist within the nested groups will be supported. Joel Bernstein Search Engineer at Heliosearch On Wed, Jan 8, 2014 at 6:54 PM, Kranti Parisa kranti.par...@gmail.comwrote: Joel, 1) Collapse on the top level group. - done thru CollapsingQParserPlugin 2) Expand a single page of collapsed results to display nested groups. - probably done thru ExpandComponent Is that correct? and does the scope of ExpandComponent includes the options to sort and limit the docList within the nested groups? Which means, we are going to first create the top level groups and while expanding each group, we create nested groups and allow to pass the sort, limit params? Thanks, Kranti K. Parisa http://www.linkedin.com/in/krantiparisa On Wed, Jan 8, 2014 at 5:48 PM, Joel Bernstein joels...@gmail.comwrote: Kranti, I'm wondering if this can be separated into two phases: 1) Collapse on the top level group. 2) Expand a single page of collapsed results to display nested groups. I'll be working on the ExpandComponent shortly, which will expand a single page of results that were collapsed by the CollapsingQParserPlugin. This seems like something that could be implemented as part of the ExpandComponent. Joel Joel Bernstein Search Engineer at Heliosearch On Wed, Jan 8, 2014 at 12:28 PM, Kranti Parisa kranti.par...@gmail.comwrote: Anyone has got latest updates for https://issues.apache.org/jira/browse/SOLR-2553 ? I am trying to take a look at the implementation and see how complex this is to achieve. If someone else had a look into it earlier, could you please share your thoughts/comments.. Thanks, Kranti K. Parisa http://www.linkedin.com/in/krantiparisa
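The two phases Joel describes, collapse to one document per group, then expand a single group with its own sort and limit, can be sketched over plain collections. Field and class names here are hypothetical; this is a conceptual model, not the CollapsingQParserPlugin or ExpandComponent code:

```java
import java.util.*;
import java.util.stream.Collectors;

public class CollapseExpand {
    public record Doc(String id, String groupKey, double score) {}

    // Phase 1: keep only the highest-scoring doc per group (the collapsed page).
    public static List<Doc> collapse(List<Doc> docs) {
        Map<String, Doc> best = new LinkedHashMap<>();
        for (Doc d : docs) {
            best.merge(d.groupKey(), d, (a, b) -> a.score() >= b.score() ? a : b);
        }
        return new ArrayList<>(best.values());
    }

    // Phase 2: expand one collapsed group, applying a per-group sort and limit,
    // which is the behavior Joel confirms the ExpandComponent will support.
    public static List<Doc> expand(List<Doc> docs, String groupKey, int limit) {
        return docs.stream()
                .filter(d -> d.groupKey().equals(groupKey))
                .sorted(Comparator.comparingDouble(Doc::score).reversed())
                .limit(limit)
                .collect(Collectors.toList());
    }
}
```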
[jira] [Created] (SOLR-5621) Let Solr use Lucene's SearcherManager
Tomás Fernández Löbbe created SOLR-5621: --- Summary: Let Solr use Lucene's SearcherManager Key: SOLR-5621 URL: https://issues.apache.org/jira/browse/SOLR-5621 Project: Solr Issue Type: Improvement Affects Versions: 5.0 Reporter: Tomás Fernández Löbbe It would be nice if Solr could take advantage of Lucene's SearcherManager and get rid of most of the logic related to managing Searchers in SolrCore. I've been taking a look at how possible it is to achieve this, and even though I haven't finished the changes (there are some use cases that still don't work exactly the same) it looks like it is possible to do. Some things could still use a lot of improvement (like the realtime searcher management) and some others are not yet implemented, like searchers on deck or IndexReaderFactory. I'm attaching an initial patch (many TODOs yet).
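The core contract that both SolrCore's searcher logic and Lucene's SearcherManager implement is reference-counted acquire/release with an atomic swap on refresh: callers pin a consistent view, and an old searcher is closed only after its last user releases it. A plain-Java sketch of that contract — this is not the Lucene class, and the names are illustrative:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of SearcherManager-style lifecycle management (not Lucene code).
public class SearcherManagerSketch {
    static final class Searcher {
        final int generation;
        final AtomicInteger refCount = new AtomicInteger(1); // manager's own ref
        volatile boolean closed;
        Searcher(int generation) { this.generation = generation; }
        void incRef() { refCount.incrementAndGet(); }
        void decRef() { if (refCount.decrementAndGet() == 0) closed = true; }
    }

    private Searcher current = new Searcher(0);

    // Pin the current searcher; callers must release() it when done.
    synchronized Searcher acquire() {
        current.incRef();
        return current;
    }

    synchronized void release(Searcher s) { s.decRef(); }

    // Swap in a new searcher generation; the old one is closed only
    // once every outstanding acquire() has been released.
    synchronized void maybeRefresh() {
        Searcher old = current;
        current = new Searcher(old.generation + 1);
        old.decRef(); // drop the manager's reference to the old generation
    }

    public static void main(String[] args) {
        SearcherManagerSketch m = new SearcherManagerSketch();
        Searcher s = m.acquire();
        m.maybeRefresh();
        m.release(s);
        System.out.println("old searcher closed after release: " + s.closed);
    }
}
```

This is the piece of SolrCore logic the patch proposes to delete in favor of the Lucene implementation.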
[jira] [Updated] (SOLR-5621) Let Solr use Lucene's SearcherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomás Fernández Löbbe updated SOLR-5621: Attachment: SOLR-5621.patch
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SearcherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866962#comment-13866962 ] Uwe Schindler commented on SOLR-5621: - Thanks for opening this. This is really a good idea; I had the same idea in the past, but my Solr internals knowledge was too limited to be successful here.
[jira] [Commented] (LUCENE-5390) Loosen assert in IW on pending event after close
[ https://issues.apache.org/jira/browse/LUCENE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866979#comment-13866979 ] Michael McCandless commented on LUCENE-5390: +1 Loosen assert in IW on pending event after close Key: LUCENE-5390 URL: https://issues.apache.org/jira/browse/LUCENE-5390 Project: Lucene - Core Issue Type: Task Affects Versions: 4.6, 5.0, 4.7, 4.6.1 Reporter: Simon Willnauer Priority: Minor Fix For: 5.0, 4.7, 4.6.1 Attachments: LUCENE-5390.patch Sometimes the assert in the IW is tripped due to pending merge events. Those events can always happen but they are meaningless since we close / rollback the IW anyway. I suggest we loosen the assert here to not fail if there are only pending merge events. noformat 1 tests failed. REGRESSION: org.apache.lucene.index.TestIndexWriterWithThreads.testRollbackAndCommitWithThreads Error Message: Captured an uncaught exception in thread: Thread[id=288, name=Thread-222, state=RUNNABLE, group=TGRP-TestIndexWriterWithThreads] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=288, name=Thread-222, state=RUNNABLE, group=TGRP-TestIndexWriterWithThreads] Caused by: java.lang.RuntimeException: java.lang.AssertionError: [org.apache.lucene.index.DocumentsWriter$MergePendingEvent@67ef293b] at __randomizedtesting.SeedInfo.seed([98DFB1602D9F9A2A]:0) at org.apache.lucene.index.TestIndexWriterWithThreads$1.run(TestIndexWriterWithThreads.java:619) Caused by: java.lang.AssertionError: [org.apache.lucene.index.DocumentsWriter$MergePendingEvent@67ef293b] at org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:2026) at org.apache.lucene.index.TestIndexWriterWithThreads$1.run(TestIndexWriterWithThreads.java:575) /noformat -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SearcherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866980#comment-13866980 ] Michael McCandless commented on SOLR-5621: -- +1, this would be awesome.
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SearcherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866983#comment-13866983 ] Yonik Seeley commented on SOLR-5621: It seems like a ton of change, and a lot of risk, to gain really no additional functionality.
[jira] [Commented] (LUCENE-5354) Blended score in AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866989#comment-13866989 ] Michael McCandless commented on LUCENE-5354: Thanks Remi, the performance seems fine? But I realized this is not the best benchmark, since all suggestions are just a single token. New patch looks great; I think we should commit this approach, and performance improvements can come later if necessary. bq. see above my comment for your previous suggestion to avoid visiting term vectors Oh, the idea I had was to not use term vectors at all: you can get a TermsEnum for the normal inverted index, and then visit each term from the query, and then .advance to each doc from the top N results. But we can do this later ... I'll commit this patch (I'll make some small code style improvements, e.g. adding { } around all ifs). Blended score in AnalyzingInfixSuggester Key: LUCENE-5354 URL: https://issues.apache.org/jira/browse/LUCENE-5354 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Affects Versions: 4.4 Reporter: Remi Melisson Priority: Minor Labels: suggester Attachments: LUCENE-5354.patch, LUCENE-5354_2.patch, LUCENE-5354_3.patch I'm working on a custom suggester derived from the AnalyzingInfix. I require what is called a blended score (//TODO ln.399 in AnalyzingInfixSuggester) to transform the suggestion weights depending on the position of the searched term(s) in the text. Right now, I'm using an easy solution : If I want 10 suggestions, then I search against the current ordered index for the 100 first results and transform the weight : bq. a) by using the term position in the text (found with TermVector and DocsAndPositionsEnum) or bq. b) by multiplying the weight by the score of a SpanQuery that I add when searching and return the updated 10 most weighted suggestions. 
Since we usually don't need to suggest so many things, the bigger search + rescoring overhead is not so significant, but I agree that this is not the most elegant solution. We could include this factor (here, the position of the term) directly in the index. So, I can contribute to this if you think it's worth adding. Do you think I should tweak AnalyzingInfixSuggester, subclass it, or create a dedicated class?
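The position-based blending Remi describes (option a) can be shown with a toy scorer: a suggestion's weight is scaled by the first token position at which the query term matches, so matches on the first token keep full weight. A self-contained sketch of the idea — the `1/(1 + position)` coefficient is illustrative, not the suggester's actual formula:

```java
import java.util.*;

// Toy illustration of position-blended suggestion weights (not Lucene code).
public class BlendedWeight {
    // Scale weight by first match position: position 0 keeps full weight,
    // later positions are progressively discounted.
    static double blend(long weight, int firstPosition) {
        return weight / (1.0 + firstPosition);
    }

    // First whitespace-token position whose token starts with the query
    // term (prefix match, as in an infix suggester); -1 if none.
    static int firstPosition(String text, String term) {
        String[] tokens = text.toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].startsWith(term)) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(blend(100, firstPosition("top gun", "gun")));
    }
}
```

In the real implementation the position would come from the index (term vectors or positions), not from re-tokenizing the stored text; the sketch only shows why "gun club" should outrank "top gun" for the query "gun" at equal raw weight.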
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 1207 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1207/ Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 10415 lines...] [junit4] JVM J0: stdout was not empty, see: /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20140109_195727_377.sysout [junit4] JVM J0: stdout (verbatim) [junit4] # [junit4] # A fatal error has been detected by the Java Runtime Environment: [junit4] # [junit4] # SIGBUS (0xa) at pc=0x0001394fd59f, pid=210, tid=111879 [junit4] # [junit4] # JRE version: Java(TM) SE Runtime Environment (7.0_45-b18) (build 1.7.0_45-b18) [junit4] # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode bsd-amd64 ) [junit4] # Problematic frame: [junit4] # C 0x0001394fd59f [junit4] # [junit4] # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try ulimit -c unlimited before starting Java again [junit4] # [junit4] # An error report file with more information is saved as: [junit4] # /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/J0/hs_err_pid210.log [junit4] # [junit4] # If you would like to submit a bug report, please visit: [junit4] # http://bugreport.sun.com/bugreport/crash.jsp [junit4] # The crash happened outside the Java Virtual Machine in native code. [junit4] # See problematic frame for where to report the bug. [junit4] # [junit4] JVM J0: EOF [...truncated 1 lines...] 
[junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_45.jdk/Contents/Home/jre/bin/java -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=DC1FA1AD8188BFD4 -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. -Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Djdk.map.althashing.threshold=0 -Dtests.disableHdfs=true -Dfile.encoding=ISO-8859-1 -classpath
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SearcherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867007#comment-13867007 ] Tomás Fernández Löbbe commented on SOLR-5621: - That's true; however, I think it's good because it allows Solr to reuse Lucene's components instead of duplicating the code. I understand that the SearcherManager was not originally used because it didn't exist by the time Solr was created, but now that it does (and AFAIK it's a Lucene best practice for cases like this) we should try to adopt it. Also, I think it would allow Solr to use Lucene's SearcherLifetimeManager for searcher leases, which I think could allow Solr to use internal docids for distributed search instead of the unique key. I know leases could be implemented in Solr too without using the SearcherLifetimeManager, but that way we continue duplicating functionality instead of using what's already built.
[jira] [Commented] (SOLR-4647) Grouping is broken on docvalues-only fields
[ https://issues.apache.org/jira/browse/SOLR-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867009#comment-13867009 ] Iker Huerga commented on SOLR-4647: --- Hi, I've been able to replicate the issue, which I think happens when stored=false is set in schema.xml for the docValues field type. I could start working on a patch for it if nobody else is already working on it. Thanks Iker Grouping is broken on docvalues-only fields --- Key: SOLR-4647 URL: https://issues.apache.org/jira/browse/SOLR-4647 Project: Solr Issue Type: Bug Affects Versions: 4.2 Reporter: Adrien Grand Labels: newdev There are a few places where grouping uses FieldType.toObject(SchemaField.createField(String, float)) to translate a String field value to an Object. The problem is that createField returns null when the field is neither stored nor indexed, even if it has doc values. An option to fix it could be to use the ValueSource instead to resolve the Object value (similarly to NumericFacets).
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SearcherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867020#comment-13867020 ] Yonik Seeley commented on SOLR-5621: bq. That's true, however I think it's good because it allows Solr to reuse Lucene's components instead of duplicate the code. That's not a good enough reason for me. It would be if one were about to write the Solr code and it already existed in Lucene... but that's not the case. Lucene did the duplication of code here, and there's no reason Solr should have to move just because duplicated code now exists.
[jira] [Commented] (LUCENE-5390) Loosen assert in IW on pending event after close
[ https://issues.apache.org/jira/browse/LUCENE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867016#comment-13867016 ] ASF subversion and git services commented on LUCENE-5390: - Commit 1556942 from [~simonw] in branch 'dev/trunk' [ https://svn.apache.org/r1556942 ] LUCENE-5390: Loosen assert in IW on pending event after close
[jira] [Commented] (LUCENE-5390) Loosen assert in IW on pending event after close
[ https://issues.apache.org/jira/browse/LUCENE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867033#comment-13867033 ] ASF subversion and git services commented on LUCENE-5390: - Commit 1556947 from [~simonw] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1556947 ] LUCENE-5390: Loosen assert in IW on pending event after close
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SearcherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867053#comment-13867053 ] Tomás Fernández Löbbe commented on SOLR-5621: - I'm not saying that Solr duplicated Lucene code or the other way around; I'm just saying that at this point, the code is duplicated. Lucene can't use Solr code, but Solr can use Lucene's. Making that happen would not only remove part of the code from Solr, it would also improve the testing in both Lucene and Solr. Using custom code also creates the need for more custom code (like in my previous example with the SearcherLifetimeManager). I think that as Lucene evolves, Solr should keep up to date with Lucene's changes and best practices; after all, it's the same Apache project, right? I do think these are good reasons.
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SearcherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867062#comment-13867062 ] Yonik Seeley commented on SOLR-5621: bq. it's the same Apache project, right? It was supposed to be. Hasn't exactly worked out well IMO.
[jira] [Commented] (LUCENE-5345) range facets don't work with float/double fields
[ https://issues.apache.org/jira/browse/LUCENE-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867065#comment-13867065 ] ASF subversion and git services commented on LUCENE-5345: - Commit 1556952 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1556952 ] LUCENE-5345: add new BlendedInfixSuggester range facets don't work with float/double fields Key: LUCENE-5345 URL: https://issues.apache.org/jira/browse/LUCENE-5345 Project: Lucene - Core Issue Type: Bug Components: modules/facet Reporter: Michael McCandless Fix For: 5.0, 4.7 Attachments: LUCENE-5345.patch With LUCENE-5297 we generalized range faceting to accept a ValueSource. But when I tried to use this to facet by distance (< 1 km, < 2 km, etc.), it's not working ... the problem is that the RangeAccumulator always uses .longVal and assumes this was a double encoded as a long (via DoubleField).
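The encoding assumption behind the bug: DoubleField stores doubles as order-preserving "sortable" longs, so a RangeAccumulator that always reads `.longVal` and decodes it as a double breaks for genuine long/int sources (and for floats, which use an analogous sortable-int scheme). The mapping itself, reconstructed here in plain Java — it mirrors what Lucene's NumericUtils does, but this is a standalone sketch, not the Lucene class:

```java
// Standalone reconstruction of the double <-> sortable-long mapping
// (mirrors Lucene's NumericUtils; not the Lucene class itself).
public class SortableDouble {
    // Flip all 63 value bits for negatives (sign bit stays set), so the
    // signed long ordering matches the double ordering.
    static long doubleToSortableLong(double v) {
        long bits = Double.doubleToLongBits(v);
        return bits ^ ((bits >> 63) & 0x7fffffffffffffffL);
    }

    static double sortableLongToDouble(long l) {
        return Double.longBitsToDouble(l ^ ((l >> 63) & 0x7fffffffffffffffL));
    }

    public static void main(String[] args) {
        long enc = doubleToSortableLong(2.5);
        System.out.println(enc + " -> " + sortableLongToDouble(enc));
    }
}
```

Applying this decode unconditionally is exactly what goes wrong: a value that was indexed as a plain long comes back as a nonsense double, which is why the accumulator needs to dispatch on the underlying numeric type.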
[jira] [Commented] (LUCENE-5374) Call processEvents before IndexWriter is closed
[ https://issues.apache.org/jira/browse/LUCENE-5374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867066#comment-13867066 ] ASF subversion and git services commented on LUCENE-5374: - Commit 1556953 from [~simonw] in branch 'dev/branches/lucene_solr_4_6' [ https://svn.apache.org/r1556953 ] LUCENE-5374: Call IW#processEvents before IndexWriter is closed Call processEvents before IndexWriter is closed --- Key: LUCENE-5374 URL: https://issues.apache.org/jira/browse/LUCENE-5374 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.6 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 5.0, 4.7 Attachments: LUCENE-5374.patch We saw failures on jenkins that complain about processing events in the IW while the IW is already closed: {noformat} com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=193, name=Thread-133, state=RUNNABLE, group=TGRP-TestIndexWriterWithThreads] Caused by: java.lang.RuntimeException: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed at __randomizedtesting.SeedInfo.seed([3FAF37E1AFFB2502]:0) at org.apache.lucene.index.TestIndexWriterWithThreads$1.run(TestIndexWriterWithThreads.java:619) Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:645) at org.apache.lucene.index.IndexWriter.numDeletedDocs(IndexWriter.java:622) at org.apache.lucene.index.IndexWriter.segString(IndexWriter.java:4265) at org.apache.lucene.index.IndexWriter.publishFlushedSegment(IndexWriter.java:2324) at org.apache.lucene.index.DocumentsWriterFlushQueue$FlushTicket.publishFlushedSegment(DocumentsWriterFlushQueue.java:198) at org.apache.lucene.index.DocumentsWriterFlushQueue$FlushTicket.finishFlush(DocumentsWriterFlushQueue.java:213) at 
org.apache.lucene.index.DocumentsWriterFlushQueue$SegmentFlushTicket.publish(DocumentsWriterFlushQueue.java:249) at org.apache.lucene.index.DocumentsWriterFlushQueue.innerPurge(DocumentsWriterFlushQueue.java:116) at org.apache.lucene.index.DocumentsWriterFlushQueue.forcePurge(DocumentsWriterFlushQueue.java:138) at org.apache.lucene.index.DocumentsWriter.purgeBuffer(DocumentsWriter.java:185) at org.apache.lucene.index.IndexWriter.purge(IndexWriter.java:4634) at org.apache.lucene.index.DocumentsWriter$ForcedPurgeEvent.process(DocumentsWriter.java:701) at org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:4665) at org.apache.lucene.index.IndexWriter.processEvents(IndexWriter.java:4657) at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1067) at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2106) at org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:2024) at org.apache.lucene.index.TestIndexWriterWithThreads$1.run(TestIndexWriterWithThreads.java:575) {noformat} we need to process the events before we enter the finally block in IW#closeInternal -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5345) range facets don't work with float/double fields
[ https://issues.apache.org/jira/browse/LUCENE-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867069#comment-13867069 ] ASF subversion and git services commented on LUCENE-5345: - Commit 1556954 from [~mikemccand] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1556954 ] LUCENE-5345: add new BlendedInfixSuggester
[jira] [Commented] (LUCENE-5345) range facets don't work with float/double fields
[ https://issues.apache.org/jira/browse/LUCENE-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867070#comment-13867070 ] Michael McCandless commented on LUCENE-5345: Woops, above commit was for LUCENE-5354 instead. range facets don't work with float/double fields Key: LUCENE-5345 URL: https://issues.apache.org/jira/browse/LUCENE-5345 Project: Lucene - Core Issue Type: Bug Components: modules/facet Reporter: Michael McCandless Fix For: 5.0, 4.7 Attachments: LUCENE-5345.patch With LUCENE-5297 we generalized range faceting to accept a ValueSource. But, when I tried to use this to facet by distance ( 1 km, 2 km, etc.), it's not working ... the problem is that the RangeAccumulator always uses .longVal and assumes this was a double encoded as a long (via DoubleField).
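The encoding that RangeAccumulator assumes can be sketched in isolation. The methods below mirror the idea behind Lucene's NumericUtils.doubleToSortableLong/sortableLongToDouble (a simplified re-implementation for illustration, not the library source): the encoded longs sort in the same order as the original doubles, so a range accumulator that sees raw .longVal output must decode it rather than treat the long itself as the value.

```java
// Sketch of the sortable-long encoding of doubles that range faceting relies on.
// Mirrors the idea behind Lucene's NumericUtils; simplified, not library source.
public class SortableDouble {
    public static long doubleToSortableLong(double v) {
        long bits = Double.doubleToLongBits(v);
        // For negative doubles, flip the 63 non-sign bits so that the signed
        // long ordering matches the natural double ordering.
        if (bits < 0) bits ^= 0x7fffffffffffffffL;
        return bits;
    }

    public static double sortableLongToDouble(long v) {
        // Undo the flip applied to negative values, then reinterpret the bits.
        if (v < 0) v ^= 0x7fffffffffffffffL;
        return Double.longBitsToDouble(v);
    }
}
```

The bug described above is exactly the missing decode step: comparing a range endpoint against the encoded long instead of against sortableLongToDouble(longVal).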
[jira] [Resolved] (LUCENE-5354) Blended score in AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-5354. Resolution: Fixed Fix Version/s: 4.7 5.0 Thanks Remi! I committed with the wrong issue LUCENE-5345 by accident... Blended score in AnalyzingInfixSuggester Key: LUCENE-5354 URL: https://issues.apache.org/jira/browse/LUCENE-5354 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Affects Versions: 4.4 Reporter: Remi Melisson Priority: Minor Labels: suggester Fix For: 5.0, 4.7 Attachments: LUCENE-5354.patch, LUCENE-5354_2.patch, LUCENE-5354_3.patch I'm working on a custom suggester derived from the AnalyzingInfix. I require what is called a blended score (//TODO ln.399 in AnalyzingInfixSuggester) to transform the suggestion weights depending on the position of the searched term(s) in the text. Right now, I'm using an easy solution : If I want 10 suggestions, then I search against the current ordered index for the 100 first results and transform the weight : bq. a) by using the term position in the text (found with TermVector and DocsAndPositionsEnum) or bq. b) by multiplying the weight by the score of a SpanQuery that I add when searching and return the updated 10 most weighted suggestions. Since we usually don't need to suggest so many things, the bigger search + rescoring overhead is not so significant but I agree that this is not the most elegant solution. We could include this factor (here the position of the term) directly into the index. So, I can contribute to this if you think it's worth adding it. Do you think I should tweak AnalyzingInfixSuggester, subclass it or create a dedicated class ?
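The position-dependent weighting Remi describes can be sketched as a simple blending function. The class and method names are hypothetical and the reciprocal decay is an illustrative assumption, not necessarily the coefficient the committed suggester uses:

```java
// Hedged sketch of the "blended score" idea: scale a suggestion's stored
// weight by a coefficient that decays with the position of the matched term,
// so matches near the start of the text rank higher.
public class BlendedWeight {
    // Illustrative reciprocal decay: position 0 keeps full weight,
    // position 1 keeps half, position 2 a third, and so on.
    public static double positionCoefficient(int position) {
        return 1.0 / (position + 1);
    }

    public static double blend(long weight, int position) {
        return weight * positionCoefficient(position);
    }
}
```

Rescoring the top-N candidates with such a function is the cheap variant Remi started with; baking the position factor into the index avoids the over-fetch (100 candidates for 10 suggestions) at lookup time.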
[jira] [Commented] (LUCENE-5390) Loosen assert in IW on pending event after close
[ https://issues.apache.org/jira/browse/LUCENE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867074#comment-13867074 ] ASF subversion and git services commented on LUCENE-5390: - Commit 1556956 from [~simonw] in branch 'dev/branches/lucene_solr_4_6' [ https://svn.apache.org/r1556956 ] LUCENE-5390: Loosen assert in IW on pending event after close Loosen assert in IW on pending event after close Key: LUCENE-5390 URL: https://issues.apache.org/jira/browse/LUCENE-5390 Project: Lucene - Core Issue Type: Task Affects Versions: 4.6, 5.0, 4.7, 4.6.1 Reporter: Simon Willnauer Priority: Minor Fix For: 5.0, 4.7, 4.6.1 Attachments: LUCENE-5390.patch Sometimes the assert in the IW is tripped due to pending merge events. Those events can always happen but they are meaningless since we close / rollback the IW anyway. I suggest we loosen the assert here to not fail if there are only pending merge events. {noformat} 1 tests failed. REGRESSION: org.apache.lucene.index.TestIndexWriterWithThreads.testRollbackAndCommitWithThreads Error Message: Captured an uncaught exception in thread: Thread[id=288, name=Thread-222, state=RUNNABLE, group=TGRP-TestIndexWriterWithThreads] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=288, name=Thread-222, state=RUNNABLE, group=TGRP-TestIndexWriterWithThreads] Caused by: java.lang.RuntimeException: java.lang.AssertionError: [org.apache.lucene.index.DocumentsWriter$MergePendingEvent@67ef293b] at __randomizedtesting.SeedInfo.seed([98DFB1602D9F9A2A]:0) at org.apache.lucene.index.TestIndexWriterWithThreads$1.run(TestIndexWriterWithThreads.java:619) Caused by: java.lang.AssertionError: [org.apache.lucene.index.DocumentsWriter$MergePendingEvent@67ef293b] at org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:2026) at org.apache.lucene.index.TestIndexWriterWithThreads$1.run(TestIndexWriterWithThreads.java:575) {noformat}
[jira] [Resolved] (LUCENE-5390) Loosen assert in IW on pending event after close
[ https://issues.apache.org/jira/browse/LUCENE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-5390. - Resolution: Fixed Loosen assert in IW on pending event after close Key: LUCENE-5390 URL: https://issues.apache.org/jira/browse/LUCENE-5390 Project: Lucene - Core Issue Type: Task Affects Versions: 4.6, 5.0, 4.7, 4.6.1 Reporter: Simon Willnauer Priority: Minor Fix For: 5.0, 4.7, 4.6.1 Attachments: LUCENE-5390.patch Sometimes the assert in the IW is tripped due to pending merge events. Those events can always happen but they are meaningless since we close / rollback the IW anyway. I suggest we loosen the assert here to not fail if there are only pending merge events. {noformat} 1 tests failed. REGRESSION: org.apache.lucene.index.TestIndexWriterWithThreads.testRollbackAndCommitWithThreads Error Message: Captured an uncaught exception in thread: Thread[id=288, name=Thread-222, state=RUNNABLE, group=TGRP-TestIndexWriterWithThreads] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=288, name=Thread-222, state=RUNNABLE, group=TGRP-TestIndexWriterWithThreads] Caused by: java.lang.RuntimeException: java.lang.AssertionError: [org.apache.lucene.index.DocumentsWriter$MergePendingEvent@67ef293b] at __randomizedtesting.SeedInfo.seed([98DFB1602D9F9A2A]:0) at org.apache.lucene.index.TestIndexWriterWithThreads$1.run(TestIndexWriterWithThreads.java:619) Caused by: java.lang.AssertionError: [org.apache.lucene.index.DocumentsWriter$MergePendingEvent@67ef293b] at org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:2026) at org.apache.lucene.index.TestIndexWriterWithThreads$1.run(TestIndexWriterWithThreads.java:575) {noformat}
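The loosened check could then ignore pending merge events and fail only on anything else. Here is a sketch with a hypothetical event model (not Lucene source; the event names are illustrative):

```java
import java.util.List;

// Sketch of the loosened assertion the issue proposes: pending *merge*
// events after close/rollback are harmless, so only a non-merge event
// left in the queue should trip the assert.
public class PendingEventCheck {
    public enum Event { MERGE_PENDING, FORCED_PURGE, APPLY_DELETES }

    public static boolean onlyMergeEventsPending(List<Event> pending) {
        // true for an empty queue or a queue containing only merge events
        return pending.stream().allMatch(e -> e == Event.MERGE_PENDING);
    }
}
```

The assert in close/rollback would then become `assert onlyMergeEventsPending(eventQueue)` instead of requiring the queue to be empty.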
[jira] [Created] (LUCENE-5391) uax29urlemailtokenizer - unexpected tokenisation of index2.php (and other inputs)
Chris created LUCENE-5391: - Summary: uax29urlemailtokenizer - unexpected tokenisation of index2.php (and other inputs) Key: LUCENE-5391 URL: https://issues.apache.org/jira/browse/LUCENE-5391 Project: Lucene - Core Issue Type: Bug Reporter: Chris The uax29urlemailtokenizer tokenises index2.php as: URL index2.ph ALPHANUM p While it does not do the same for index.php Screenshot from analyser: http://postimg.org/image/aj6c98n3b/
[jira] [Commented] (SOLR-5543) solr.xml duplicat eentries after SWAP 4.6
[ https://issues.apache.org/jira/browse/SOLR-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867142#comment-13867142 ] Shawn Heisey commented on SOLR-5543: With a verbal OK from [~markrmil...@gmail.com] via IRC, I am backporting the fix for this issue to 4.6.1. Both precommit and tests in solr/ are passing. The commits for trunk and branch_4x have no code changes, I'm just moving the CHANGES.txt entry. solr.xml duplicat eentries after SWAP 4.6 - Key: SOLR-5543 URL: https://issues.apache.org/jira/browse/SOLR-5543 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Bill Bell Assignee: Alan Woodward Fix For: 5.0, 4.7 Attachments: SOLR-5543.patch We are having issues with SWAP CoreAdmin in 4.6. Using legacy solr.xml we issue a CoreAdmin SWAP, and we want it persistent. It has been running flawlessly since 4.5. Now it creates duplicate lines in solr.xml. Even the example multi core schema in doesn't work with persistent=true - it creates duplicate lines in solr.xml.
<cores adminPath="/admin/cores">
  <core name="autosuggest" loadOnStartup="true" instanceDir="autosuggest" transient="false"/>
  <core name="citystateprovider" loadOnStartup="true" instanceDir="citystateprovider" transient="false"/>
  <core name="collection1" loadOnStartup="true" instanceDir="collection1" transient="false"/>
  <core name="facility" loadOnStartup="true" instanceDir="facility" transient="false"/>
  <core name="inactiveproviders" loadOnStartup="true" instanceDir="inactiveproviders" transient="false"/>
  <core name="linesvcgeo" instanceDir="linesvcgeo" loadOnStartup="true" transient="false"/>
  <core name="linesvcgeofull" instanceDir="linesvcgeofull" loadOnStartup="true" transient="false"/>
  <core name="locationgeo" loadOnStartup="true" instanceDir="locationgeo" transient="false"/>
  <core name="market" loadOnStartup="true" instanceDir="market" transient="false"/>
  <core name="portalprovider" loadOnStartup="true" instanceDir="portalprovider" transient="false"/>
  <core name="practice" loadOnStartup="true" instanceDir="practice" transient="false"/>
  <core name="provider" loadOnStartup="true" instanceDir="provider" transient="false"/>
  <core name="providersearch" loadOnStartup="true" instanceDir="providersearch" transient="false"/>
  <core name="tridioncomponents" loadOnStartup="true" instanceDir="tridioncomponents" transient="false"/>
  <core name="linesvcgeo" instanceDir="linesvcgeo" loadOnStartup="true" transient="false"/>
  <core name="linesvcgeofull" instanceDir="linesvcgeofull" loadOnStartup="true" transient="false"/>
</cores>
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SeacherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867147#comment-13867147 ] Uwe Schindler commented on SOLR-5621: - Hi Yonik, please don't be unfair to Tomás: You might be right, that this is too risky for the stable branch, but we have still LuSolr trunk, so I see no problem committing this (once it's done) to Solr trunk. It can then bake for a long time, until Solr 5.0 is released. You have to recognize that he opened this issue with affects/fix version 5.0. As Tomás describes: bq. Also, I think it would allow Solr to also use Lucene's SearcherLifetimeManager for searcher leases, which I think could allow Solr to use internal docids for distributed search instead of the unique key. This is a perfect use-case, although I am not sure if this would be easy. But a job for another followup issue for Solr 5.0. bq. It was supposed to be. Hasn't exactly worked out well IMO. You are the only one that uses this statement. In my opinion the same Apache project worked perfectly: - We got a lot of additional per-segment stuff in Solr. - I helped a lot to get lots of API changes in Lucene into Solr, e.g. the refactoring of Document, IndexReader. Others helped with TermsEnum,... - Better Analyzer support in Solr. Users don't need to write factories for stuff that's already in Lucene. Just plug e.g. lucene-analysis-kuromoji into your lib/ folder and it automatically works thanks to SPI. If we would still have factories solely in Solr, one would have to write factories for all Lucene modules or we would need to ship them with Solr Core (so depending on stuff like kuromoji the user doesn't need). - All codec support was mostly written by (originally) Lucene committers. With your statement, you are the only person who fights against working together even more! Some examples: - The Facet module improved so much, why not allow to use it from Solr?
To me it looks like you are against. Just because you would need to configure in the schema which fields you want to facet on! The current Solr faceting, uninverting all stuff, is a disaster performance- and memory-wise. - Extracting factories from Solr: No, you were against it, because your enemy ES could use it - But we did it anyhow. That's good! And ES has not yet completely taken over your code, so where is the problem? With the given possibilities for improvements and better maintainability of this code we are on the right path. I am sure with the new code maybe the crazy Solr failures hitting us all the time from Solr tests may get better (you know, the damnful: java.lang.AssertionError: ERROR: SolrIndexSearcher opens=59 closes=58). Uwe Let Solr use Lucene's SeacherManager Key: SOLR-5621 URL: https://issues.apache.org/jira/browse/SOLR-5621 Project: Solr Issue Type: Improvement Affects Versions: 5.0 Reporter: Tomás Fernández Löbbe Attachments: SOLR-5621.patch It would be nice if Solr could take advantage of Lucene's SearcherManager and get rid of most of the logic related to managing Searchers in SolrCore. I've been taking a look at how possible it is to achieve this, and even if I haven't finished with the changes (there are some use cases that are still not working exactly the same) it looks like it is possible to do. Some things still could use a lot of improvement (like the realtime searcher management) and some others not yet implemented, like Searchers on deck or IndexReaderFactory. I'm attaching an initial patch (many TODOs yet).
[jira] [Commented] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters
[ https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867145#comment-13867145 ] Jan Høydahl commented on SOLR-5541: --- Great feature. Valid ids may contain commas. Shouldn't this feature provide a way to elevate/exclude such docs? Either allow escaping, i.e. {{one\,id,another\,id}}, or allow configuring the separator, i.e. {{elevate.sep=;}}. Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters Key: SOLR-5541 URL: https://issues.apache.org/jira/browse/SOLR-5541 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 4.6 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.7 Attachments: SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch The QueryElevationComponent currently uses an xml file to map query strings to elevateIds and excludeIds. This ticket adds the ability to pass in elevateIds and excludeIds through two new http parameters elevateIds and excludeIds. This will allow more sophisticated business logic to be used in selecting which ids to elevate/exclude. Proposed syntax: http://localhost:8983/solr/elevate?q=*:*&elevatedIds=3,4&excludeIds=6,8 The elevateIds and excludeIds point to the unique document Id.
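The escaping Jan proposes could be handled by a splitter that treats a backslash-escaped comma as part of the id. This is a hypothetical helper sketched for illustration, not an existing Solr API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposed escaping: split an elevateIds/excludeIds parameter
// on commas, but keep backslash-escaped commas ("one\,id,another\,id")
// inside the id. Hypothetical helper, not Solr code.
public class IdListParser {
    public static List<String> split(String param) {
        List<String> ids = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < param.length(); i++) {
            char c = param.charAt(i);
            if (c == '\\' && i + 1 < param.length() && param.charAt(i + 1) == ',') {
                cur.append(',');   // escaped comma is part of the id
                i++;               // skip the comma we just consumed
            } else if (c == ',') {
                ids.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        ids.add(cur.toString());
        return ids;
    }
}
```

A configurable separator (the {{elevate.sep=;}} alternative) would be the same loop with the ',' literal replaced by the configured character.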
[jira] [Commented] (SOLR-5543) solr.xml duplicat eentries after SWAP 4.6
[ https://issues.apache.org/jira/browse/SOLR-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867155#comment-13867155 ] ASF subversion and git services commented on SOLR-5543: --- Commit 1556965 from [~elyograg] in branch 'dev/branches/lucene_solr_4_6' [ https://svn.apache.org/r1556965 ] SOLR-5543: Backport to 4.6 branch, for 4.6.1 release. solr.xml duplicat eentries after SWAP 4.6 - Key: SOLR-5543 URL: https://issues.apache.org/jira/browse/SOLR-5543 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Bill Bell Assignee: Alan Woodward Fix For: 5.0, 4.7 Attachments: SOLR-5543.patch We are having issues with SWAP CoreAdmin in 4.6. Using legacy solr.xml we issue a COreodmin SWAP, and we want it persistent. It has been running flawless since 4.5. Now it creates duplicate lines in solr.xml. Even the example multi core schema in doesn't work with persistent=true - it creates duplicate lines in solr.xml. cores adminPath=/admin/cores core name=autosuggest loadOnStartup=true instanceDir=autosuggest transient=false/ core name=citystateprovider loadOnStartup=true instanceDir=citystateprovider transient=false/ core name=collection1 loadOnStartup=true instanceDir=collection1 transient=false/ core name=facility loadOnStartup=true instanceDir=facility transient=false/ core name=inactiveproviders loadOnStartup=true instanceDir=inactiveproviders transient=false/ core name=linesvcgeo instanceDir=linesvcgeo loadOnStartup=true transient=false/ core name=linesvcgeofull instanceDir=linesvcgeofull loadOnStartup=true transient=false/ core name=locationgeo loadOnStartup=true instanceDir=locationgeo transient=false/ core name=market loadOnStartup=true instanceDir=market transient=false/ core name=portalprovider loadOnStartup=true instanceDir=portalprovider transient=false/ core name=practice loadOnStartup=true instanceDir=practice transient=false/ core name=provider loadOnStartup=true instanceDir=provider transient=false/ core 
name=providersearch loadOnStartup=true instanceDir=providersearch transient=false/ core name=tridioncomponents loadOnStartup=true instanceDir=tridioncomponents transient=false/ core name=linesvcgeo instanceDir=linesvcgeo loadOnStartup=true transient=false/ core name=linesvcgeofull instanceDir=linesvcgeofull loadOnStartup=true transient=false/ /cores
[jira] [Comment Edited] (SOLR-5621) Let Solr use Lucene's SeacherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867147#comment-13867147 ] Uwe Schindler edited comment on SOLR-5621 at 1/9/14 10:08 PM: -- Hi Yonik, please don't be unfair to Tomás: You might be right, that this is too risky for the stable branch, but we have still LuSolr trunk, so I see no problem committing this (once its done) to Solr trunk. It can then bake for long time, until Solr 5.0 is released. You have to recognize that he opened this issue with affects/fix version 5.0. As Tomás describes: bq. Also, I think it would allow Solr to also use Lucene's SearcherLifetimeManager for searcher leases, which I think could allow Solr to use internal docids for distributed search instead of the unique key. This is a perfect use-case, although I am not sure if this would be easy. But a Job for another followup issue for Solr 5.0. bq. It was supposed to be. Hasn't exactly worked out well IMO. You are the only one that uses this statement. In my opinion the same Apache project worked perfectly: - We got a lot of additional per-segment stuff in Solr. - I helped a lot to get lot's of API changes in Lucene into Solr, e.g. the refactoring of Document, IndexReader. Others helped with TermsEnum,... - Better Analyzer support in Solr. Users don't need to write factories for stuff that's already in Lucene. Just plugin e.g. {{lucene-analysis-kuromoji.jar}} into your lib/ folder and it automatically works thanks to SPI. If we would still have factories solely in Solr, one would have to write factories for all Lucene modules or we would need to ship with them with Solr Core (so dependiing on stuff like kuromoji the user don't needs). - All codec support was mostly written by (originally) Lucene committers. With your statement, you are the only person who fights against working together even more! Some examples: - The Facet module improved so much, why not allow to use it from Solr? 
To me it looks like you are against. Just because you would need to configure in the schema which fields you want facet on! The current Solr facetting uninverting all stuff is a disaster performance- and memory wise - Extracting factories from Solr: No you were against, because your enemy ES could use it - But we did it anyhow. That's good! And ES did not yet completely took over your code, so where is the problem? With the given possibilities for improvements and better maintainability of this code we are on the right path. I am sure with the new code maybe the crazy Solr failures hitting us all the time from Solr tests maybe get better (you know, the damnful: java.lang.AssertionError: ERROR: SolrIndexSearcher opens=59 closes=58). Uwe was (Author: thetaphi): Hi Yonik, please don't be unfair to Tomás: You might be right, that this is too risky for the stable branch, but we have still LuSolr trunk, so I see no problem committing this (once its done) to Solr trunk. It can then bake for long time, until Solr 5.0 is released. You have to recognize that he opened this issue with affects/fix version 5.0. As Tomás describes: bq. Also, I think it would allow Solr to also use Lucene's SearcherLifetimeManager for searcher leases, which I think could allow Solr to use internal docids for distributed search instead of the unique key. This is a perfect use-case, although I am not sure if this would be easy. But a Job for another followup issue for Solr 5.0. bq. It was supposed to be. Hasn't exactly worked out well IMO. You are the only one that uses this statement. In my opinion the same Apache project worked perfectly: - We got a lot of additional per-segment stuff in Solr. - I helped a lot to get lot's of API changes in Lucene into Solr, e.g. the refactoring of Document, IndexReader. Others helped with TermsEnum,... - Better Analyzer support in Solr. Users don't need to write factories for stuff that's already in Lucene. Just plugin e.g. 
lucene-analysis-kuromoji into your lib/ folder and it automatically works thaks to SPI. If we would still have facories solely in Sold, one would have to write factories for all Lucene modules or we would need to ship with them with Solr Core (so dependiing on stuff like kuromoji the user don't needs). - All codec support was mostly written by (originally) Lucene committers. With your statement, you are the only person who fights against working together even more! Some examples: - The Facet module improved so much, why not allow to use it from Solr? To me it looks like you are against. Just because you would need to configure in the schema which fields you want facet on! The current Solr facetting uninverting all stuff is a disaster performance- and memory wise - Extracting factories from Solr: No you were against, because your enemy ES could use it - But we did it anyhow. That's good! And ES did
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867165#comment-13867165 ] Shawn Heisey commented on SOLR-5615: Noted while backporting SOLR-5543 to the 4.6 branch: In trunk's CHANGES.txt file, this issue number shows up in the 4.6.1 section, but does not appear to have been actually backported to the 4.6 branch yet. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5615.patch, SOLR-5615.patch, SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits on the event thread till the cluster state is updated with the new leader - ZK sends a watch update to OL, but it is blocked on the event thread waiting for it. Oops. This finally breaks out when the attempt to register itself as a replica times out after 20 mins.
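The core of the deadlock is that OL waits on the event thread for a notification that only the event thread itself can deliver. A minimal, Solr-free illustration of that pattern (hypothetical names; a 200 ms timeout stands in for the 20-minute one):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Minimal illustration (not Solr code) of the deadlock pattern described:
// a single event thread blocks waiting for a notification that can only be
// delivered by a task on that same thread, so the wait always times out.
public class EventThreadDeadlock {
    // Returns false when the wait times out, mirroring the 20-minute break-out.
    public static boolean waitOnEventThread(long timeoutMs) {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        CountDownLatch leaderPublished = new CountDownLatch(1);
        try {
            // Event thread blocks waiting for the cluster-state update...
            Future<Boolean> blocked = eventThread.submit(
                () -> leaderPublished.await(timeoutMs, TimeUnit.MILLISECONDS));
            // ...but the update can only be delivered via the same event
            // thread, which is occupied, so this task runs too late.
            eventThread.submit(leaderPublished::countDown);
            return blocked.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            eventThread.shutdownNow();
        }
    }
}
```

The fix direction implied by the report is to move the blocking wait off the ZK event thread, so watch notifications can still be processed while the replica registration waits.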
[jira] [Commented] (SOLR-5543) solr.xml duplicat eentries after SWAP 4.6
[ https://issues.apache.org/jira/browse/SOLR-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867167#comment-13867167 ] ASF subversion and git services commented on SOLR-5543: --- Commit 1556968 from [~elyograg] in branch 'dev/trunk' [ https://svn.apache.org/r1556968 ] SOLR-5543: move changes entry from 4.7.0 to 4.6.1. solr.xml duplicat eentries after SWAP 4.6 - Key: SOLR-5543 URL: https://issues.apache.org/jira/browse/SOLR-5543 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Bill Bell Assignee: Alan Woodward Fix For: 5.0, 4.7 Attachments: SOLR-5543.patch We are having issues with SWAP CoreAdmin in 4.6. Using legacy solr.xml we issue a COreodmin SWAP, and we want it persistent. It has been running flawless since 4.5. Now it creates duplicate lines in solr.xml. Even the example multi core schema in doesn't work with persistent=true - it creates duplicate lines in solr.xml. cores adminPath=/admin/cores core name=autosuggest loadOnStartup=true instanceDir=autosuggest transient=false/ core name=citystateprovider loadOnStartup=true instanceDir=citystateprovider transient=false/ core name=collection1 loadOnStartup=true instanceDir=collection1 transient=false/ core name=facility loadOnStartup=true instanceDir=facility transient=false/ core name=inactiveproviders loadOnStartup=true instanceDir=inactiveproviders transient=false/ core name=linesvcgeo instanceDir=linesvcgeo loadOnStartup=true transient=false/ core name=linesvcgeofull instanceDir=linesvcgeofull loadOnStartup=true transient=false/ core name=locationgeo loadOnStartup=true instanceDir=locationgeo transient=false/ core name=market loadOnStartup=true instanceDir=market transient=false/ core name=portalprovider loadOnStartup=true instanceDir=portalprovider transient=false/ core name=practice loadOnStartup=true instanceDir=practice transient=false/ core name=provider loadOnStartup=true instanceDir=provider transient=false/ core name=providersearch 
loadOnStartup=true instanceDir=providersearch transient=false/ core name=tridioncomponents loadOnStartup=true instanceDir=tridioncomponents transient=false/ core name=linesvcgeo instanceDir=linesvcgeo loadOnStartup=true transient=false/ core name=linesvcgeofull instanceDir=linesvcgeofull loadOnStartup=true transient=false/ /cores
[jira] [Updated] (SOLR-5543) solr.xml duplicat eentries after SWAP 4.6
[ https://issues.apache.org/jira/browse/SOLR-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-5543: --- Fix Version/s: 4.6.1 solr.xml duplicat eentries after SWAP 4.6 - Key: SOLR-5543 URL: https://issues.apache.org/jira/browse/SOLR-5543 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Bill Bell Assignee: Alan Woodward Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5543.patch We are having issues with SWAP CoreAdmin in 4.6. Using legacy solr.xml we issue a COreodmin SWAP, and we want it persistent. It has been running flawless since 4.5. Now it creates duplicate lines in solr.xml. Even the example multi core schema in doesn't work with persistent=true - it creates duplicate lines in solr.xml. cores adminPath=/admin/cores core name=autosuggest loadOnStartup=true instanceDir=autosuggest transient=false/ core name=citystateprovider loadOnStartup=true instanceDir=citystateprovider transient=false/ core name=collection1 loadOnStartup=true instanceDir=collection1 transient=false/ core name=facility loadOnStartup=true instanceDir=facility transient=false/ core name=inactiveproviders loadOnStartup=true instanceDir=inactiveproviders transient=false/ core name=linesvcgeo instanceDir=linesvcgeo loadOnStartup=true transient=false/ core name=linesvcgeofull instanceDir=linesvcgeofull loadOnStartup=true transient=false/ core name=locationgeo loadOnStartup=true instanceDir=locationgeo transient=false/ core name=market loadOnStartup=true instanceDir=market transient=false/ core name=portalprovider loadOnStartup=true instanceDir=portalprovider transient=false/ core name=practice loadOnStartup=true instanceDir=practice transient=false/ core name=provider loadOnStartup=true instanceDir=provider transient=false/ core name=providersearch loadOnStartup=true instanceDir=providersearch transient=false/ core name=tridioncomponents loadOnStartup=true instanceDir=tridioncomponents transient=false/ core name=linesvcgeo 
instanceDir=linesvcgeo loadOnStartup=true transient=false/ core name=linesvcgeofull instanceDir=linesvcgeofull loadOnStartup=true transient=false/ /cores
[jira] [Commented] (SOLR-5543) solr.xml duplicat eentries after SWAP 4.6
[ https://issues.apache.org/jira/browse/SOLR-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867170#comment-13867170 ] ASF subversion and git services commented on SOLR-5543: --- Commit 1556969 from [~elyograg] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1556969 ] SOLR-5543: Move changes entry from 4.7.0 to 4.6.1 (merge trunk r1556968) solr.xml duplicate entries after SWAP 4.6 - Key: SOLR-5543 URL: https://issues.apache.org/jira/browse/SOLR-5543 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Bill Bell Assignee: Alan Woodward Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5543.patch We are having issues with CoreAdmin SWAP in 4.6. Using legacy solr.xml we issue a CoreAdmin SWAP, and we want it persistent. It had been running flawlessly since 4.5. Now it creates duplicate lines in solr.xml. Even the example multi-core setup doesn't work with persistent=true - it creates duplicate lines in solr.xml.
<cores adminPath="/admin/cores">
  <core name="autosuggest" loadOnStartup="true" instanceDir="autosuggest" transient="false"/>
  <core name="citystateprovider" loadOnStartup="true" instanceDir="citystateprovider" transient="false"/>
  <core name="collection1" loadOnStartup="true" instanceDir="collection1" transient="false"/>
  <core name="facility" loadOnStartup="true" instanceDir="facility" transient="false"/>
  <core name="inactiveproviders" loadOnStartup="true" instanceDir="inactiveproviders" transient="false"/>
  <core name="linesvcgeo" instanceDir="linesvcgeo" loadOnStartup="true" transient="false"/>
  <core name="linesvcgeofull" instanceDir="linesvcgeofull" loadOnStartup="true" transient="false"/>
  <core name="locationgeo" loadOnStartup="true" instanceDir="locationgeo" transient="false"/>
  <core name="market" loadOnStartup="true" instanceDir="market" transient="false"/>
  <core name="portalprovider" loadOnStartup="true" instanceDir="portalprovider" transient="false"/>
  <core name="practice" loadOnStartup="true" instanceDir="practice" transient="false"/>
  <core name="provider" loadOnStartup="true" instanceDir="provider" transient="false"/>
  <core name="providersearch" loadOnStartup="true" instanceDir="providersearch" transient="false"/>
  <core name="tridioncomponents" loadOnStartup="true" instanceDir="tridioncomponents" transient="false"/>
  <core name="linesvcgeo" instanceDir="linesvcgeo" loadOnStartup="true" transient="false"/>
  <core name="linesvcgeofull" instanceDir="linesvcgeofull" loadOnStartup="true" transient="false"/>
</cores>
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SeacherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867172#comment-13867172 ] Yonik Seeley commented on SOLR-5621: There's boatloads of FUD in your response Uwe, but I'm too tired of the politics to respond to them all. Solr support for Lucene faceting doesn't exist because no one has developed a patch yet. Let Solr use Lucene's SeacherManager Key: SOLR-5621 URL: https://issues.apache.org/jira/browse/SOLR-5621 Project: Solr Issue Type: Improvement Affects Versions: 5.0 Reporter: Tomás Fernández Löbbe Fix For: 5.0 Attachments: SOLR-5621.patch It would be nice if Solr could take advantage of Lucene's SearcherManager and get rid of most of the logic related to managing Searchers in SolrCore. I've been taking a look at how possible it is to achieve this, and even though I haven't finished the changes (there are some use cases that still don't work exactly the same) it looks like it is possible to do. Some things could still use a lot of improvement (like the realtime searcher management) and some others are not yet implemented, like searchers on deck or IndexReaderFactory. I'm attaching an initial patch (many TODOs yet).
[jira] [Updated] (SOLR-5621) Let Solr use Lucene's SeacherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Otis Gospodnetic updated SOLR-5621: --- Fix Version/s: 5.0 Let Solr use Lucene's SeacherManager Key: SOLR-5621 URL: https://issues.apache.org/jira/browse/SOLR-5621 Project: Solr Issue Type: Improvement Affects Versions: 5.0 Reporter: Tomás Fernández Löbbe Fix For: 5.0 Attachments: SOLR-5621.patch
[jira] [Commented] (SOLR-5615) Deadlock while trying to recover after a ZK session expiry
[ https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867180#comment-13867180 ] Mark Miller commented on SOLR-5615: --- Yeah, I started it, but it turns out it's difficult without backporting another fix first. Deadlock while trying to recover after a ZK session expiry -- Key: SOLR-5615 URL: https://issues.apache.org/jira/browse/SOLR-5615 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 4.6 Reporter: Ramkumar Aiyengar Assignee: Mark Miller Fix For: 5.0, 4.7, 4.6.1 Attachments: SOLR-5615.patch, SOLR-5615.patch, SOLR-5615.patch The sequence of events which might trigger this is as follows: - Leader of a shard, say OL, has a ZK expiry - The new leader, NL, starts the election process - NL, through Overseer, clears the current leader (OL) for the shard from the cluster state - OL reconnects to ZK, calls onReconnect from the event thread (main-EventThread) - OL marks itself down - OL sets up watches for cluster state, and then retrieves it (with no leader for this shard) - NL, through Overseer, updates cluster state to mark itself leader for the shard - OL tries to register itself as a replica, and waits on the event thread until the cluster state is updated with the new leader - ZK sends a watch update to OL, but OL is blocked on the event thread waiting for it. Oops. This finally breaks out when the attempt to register as a replica times out after 20 minutes.
[jira] [Commented] (SOLR-5379) Query-time multi-word synonym expansion
[ https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867202#comment-13867202 ] Nolan Lawson commented on SOLR-5379: +1 as well. Tien's patch also seems to be a better candidate seeing as it includes Java tests, whereas my tests are in Python 'cuz I was lazy. :) Query-time multi-word synonym expansion --- Key: SOLR-5379 URL: https://issues.apache.org/jira/browse/SOLR-5379 Project: Solr Issue Type: Improvement Components: query parsers Reporter: Tien Nguyen Manh Labels: multi-word, queryparser, synonym Fix For: 4.7 Attachments: quoted.patch, synonym-expander.patch While dealing with synonyms at query time, Solr fails to work with multi-word synonyms for two reasons: - First, the Lucene query parser tokenizes the user query by space, so it splits a multi-word term into separate terms before feeding them to the synonym filter, and the synonym filter can't recognize the multi-word term to do expansion. - Second, if the synonym filter expands into multiple terms which contain a multi-word synonym: SolrQueryParserBase currently uses MultiPhraseQuery to handle synonyms, but MultiPhraseQuery doesn't work with terms that have different numbers of words. For the first, we can quote all multi-word synonyms in the user query so that the Lucene query parser doesn't split them. There is a related JIRA task: https://issues.apache.org/jira/browse/LUCENE-2605. For the second, we can replace MultiPhraseQuery with an appropriate BooleanQuery of SHOULD clauses containing multiple PhraseQuerys when the token stream has a multi-word synonym.
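The first workaround above (quote multi-word synonyms before the query parser splits them on whitespace) can be sketched in isolation. This is an illustrative helper, not the attached quoted.patch: the class name and the hard-coded synonym list are assumptions, and a real implementation would consult the analyzer's synonym map.

```java
import java.util.List;

// Sketch: wrap any known multi-word synonym in quotes so a whitespace-splitting
// query parser treats it as a single phrase instead of separate terms.
public class SynonymQuoter {
    public static String quoteMultiWord(String query, List<String> multiWordSynonyms) {
        String result = query;
        for (String phrase : multiWordSynonyms) {
            // Skip phrases that are absent or already quoted.
            if (result.contains(phrase) && !result.contains("\"" + phrase + "\"")) {
                result = result.replace(phrase, "\"" + phrase + "\"");
            }
        }
        return result;
    }
}
```

The quoted phrase then survives parsing intact and can reach the synonym filter as one unit.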
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SeacherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867218#comment-13867218 ] Michael McCandless commented on SOLR-5621: -- +1 to do this in trunk, and give it time to bake. A future cutover to SearcherLifetimeManager makes sense too; then Solr doesn't need to load stored documents to get the id field to reference documents anymore. Just use the searcher version + docID. Refactoring code is a healthy and ongoing process in good open-source projects, like ours. Yes, there is short-term risk of instability, but over time this trades off for a stronger long-term design for Solr. Solr should also be [more] per-segment, use Lucene Filters, etc. Let Solr use Lucene's SeacherManager Key: SOLR-5621 URL: https://issues.apache.org/jira/browse/SOLR-5621 Project: Solr Issue Type: Improvement Affects Versions: 5.0 Reporter: Tomás Fernández Löbbe Fix For: 5.0 Attachments: SOLR-5621.patch
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SeacherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867226#comment-13867226 ] Michael McCandless commented on SOLR-5621: -- bq. Solr support for lucene faceting doesn't exist because no one has developed a patch yet. In fact, cutting over to SearcherManager is a good step towards adding Lucene facets to Solr: the jump from SearcherManager (a ReferenceManager<IndexSearcher>) to SearcherTaxonomyManager (a ReferenceManager over an IndexSearcher + TaxoReader) is easy. Let Solr use Lucene's SeacherManager Key: SOLR-5621 URL: https://issues.apache.org/jira/browse/SOLR-5621 Project: Solr Issue Type: Improvement Affects Versions: 5.0 Reporter: Tomás Fernández Löbbe Fix For: 5.0 Attachments: SOLR-5621.patch
[jira] [Commented] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters
[ https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867246#comment-13867246 ] Yonik Seeley commented on SOLR-5541: The canonical way to do this in Solr is a StrUtils.splitSmart variant (the second one that doesn't do quotes, I imagine) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters Key: SOLR-5541 URL: https://issues.apache.org/jira/browse/SOLR-5541 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 4.6 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.7 Attachments: SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch The QueryElevationComponent currently uses an xml file to map query strings to elevateIds and excludeIds. This ticket adds the ability to pass in elevateIds and excludeIds through two new http parameters, elevateIds and excludeIds. This will allow more sophisticated business logic to be used in selecting which ids to elevate/exclude. Proposed syntax: http://localhost:8983/solr/elevate?q=*:*&elevateIds=3,4&excludeIds=6,8 The elevateIds and excludeIds point to the unique document Id.
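The escape-aware comma split being discussed can be sketched standalone. This is not Solr's actual StrUtils.splitSmart (the variant Yonik references); the class and method names here are illustrative, but the behavior — a backslash escapes the separator so ids may contain literal commas — matches what Joel describes adopting.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of an escape-aware separator split for id lists like elevateIds=3,4.
public class IdSplitter {
    public static List<String> split(String s, char sep) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                cur.append(s.charAt(++i)); // escaped char kept literally
            } else if (c == sep) {
                out.add(cur.toString());   // separator ends the current id
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        out.add(cur.toString());           // final id
        return out;
    }
}
```

With this, `elevateIds=doc\,1,doc2` yields the two ids `doc,1` and `doc2`.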
[jira] [Updated] (SOLR-5618) false query result cache hits possible when duplicate filter queries exist in one query -- discovered via: Reproducible failure from TestFiltering.testRandomFiltering
[ https://issues.apache.org/jira/browse/SOLR-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-5618: --- Attachment: SOLR-5618.patch bq. Probably most efficient for small lists would be to make a copy of one list and then remove equivalent elements as they are found. Attached patch fixes the bug and adds some randomized testing on the QueryResultKey equality comparisons, ensuring that both the positive and negative situations are covered. (I'm still running full tests, but unless there are any objections I'll probably commit & backport to 4.6.1 ASAP) false query result cache hits possible when duplicate filter queries exist in one query -- discovered via: Reproducible failure from TestFiltering.testRandomFiltering -- Key: SOLR-5618 URL: https://issues.apache.org/jira/browse/SOLR-5618 Project: Solr Issue Type: Bug Affects Versions: 4.5, 4.5.1, 4.6 Reporter: Hoss Man Assignee: Hoss Man Fix For: 4.6.1 Attachments: SOLR-5618.patch, SOLR-5618.patch, SOLR-5618.patch, SOLR-5618.patch, SOLR-5618.patch SOLR-5057 introduced a bug in queryResultCaching such that the following circumstances can result in a false cache hit... * identical main query in both requests * identical number of filter queries in both requests * a filter query from one request exists multiple times in the other request * sum of hashCodes for all filter queries is equal in both requests Details of how this problem was initially uncovered are listed below... Uwe's jenkins found this in java8... http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9004/consoleText {noformat} [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestFiltering -Dtests.method=testRandomFiltering -Dtests.seed=C22042E80957AE3E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=ar_LY -Dtests.timezone=Asia/Katmandu -Dtests.file.encoding=UTF-8 [junit4] FAILURE 16.9s J1 | TestFiltering.testRandomFiltering [junit4] Throwable #1: java.lang.AssertionError: FAILURE: iiter=11 qiter=336 request=[q, {!frange v=val_i l=0 u=1 cost=139 tag=t}, fq, {!frange v=val_i l=0 u=1}, fq, {! cost=92}-_query_:{!frange v=val_i l=1 u=1}, fq, {!frange v=val_i l=0 u=1 cache=true tag=t}, fq, {! cache=true tag=t}-_query_:{!frange v=val_i l=1 u=1}] [junit4] at __randomizedtesting.SeedInfo.seed([C22042E80957AE3E:DD43E12DEC70EE37]:0) [junit4] at org.apache.solr.search.TestFiltering.testRandomFiltering(TestFiltering.java:327) {noformat} The seed fails consistently for me on trunk using java7, and on 4x using both java7 and java6 - details to follow in comment.
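The copy-and-remove comparison Hoss quotes above can be sketched as a standalone multiset check. The names here are illustrative, not Solr's QueryResultKey code; the point is that equal hashCode sums (the buggy shortcut) are not enough — the filter-query lists must match element-for-element with the same multiplicities.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: compare two lists as multisets by copying one and removing
// elements as they are matched. Efficient enough for small fq lists.
public class UnorderedCompare {
    public static <T> boolean sameElements(List<T> a, List<T> b) {
        if (a.size() != b.size()) return false;
        List<T> remaining = new ArrayList<>(b);      // mutable copy of b
        for (T item : a) {
            if (!remaining.remove(item)) return false; // no unmatched copy left
        }
        return true; // every element matched with equal multiplicity
    }
}
```

A request with fq list [A, A] and one with [A, B] have the same length and can have equal hashCode sums, yet this check correctly rejects them.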
[jira] [Updated] (SOLR-5617) Default SolrResourceLoader restrictions may be too tight
[ https://issues.apache.org/jira/browse/SOLR-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-5617: Summary: Default SolrResourceLoader restrictions may be too tight (was: Default classloader restrictions may be too tight) Default SolrResourceLoader restrictions may be too tight Key: SOLR-5617 URL: https://issues.apache.org/jira/browse/SOLR-5617 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Shawn Heisey Priority: Minor Labels: security Fix For: 5.0, 4.7 SOLR-4882 introduced restrictions for the Solr class loader that cause resources outside the instanceDir to fail to load. This is a very good goal, but what if you have common resources like included config files that are outside instanceDir but are still fully inside the solr home? I can understand not wanting to load resources from an arbitrary path, but the solr home and its children should be about as trustworthy as instanceDir. Ideally I'd like to have anything that's in $\{solr.solr.home\} trusted automatically. If I need to define a system property to make this happen, I'm OK with that -- as long as I don't have to turn off the safety checking entirely.
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SeacherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867272#comment-13867272 ] Ryan McKinley commented on SOLR-5621: - bq. Refactoring code is a healthy and ongoing process in good open-source projects, like ours. Yes, there is short-term risk of instability, but over time this trades off for a stronger long-term design for Solr. +1 for trunk Let Solr use Lucene's SeacherManager Key: SOLR-5621 URL: https://issues.apache.org/jira/browse/SOLR-5621 Project: Solr Issue Type: Improvement Affects Versions: 5.0 Reporter: Tomás Fernández Löbbe Fix For: 5.0 Attachments: SOLR-5621.patch
[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)
[ https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867274#comment-13867274 ] Hoss Man commented on SOLR-5463: I haven't seen any negative feedback or suspicious jenkins failures, so unless someone sees a problem I'll start backporting tomorrow. Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging) -- Key: SOLR-5463 URL: https://issues.apache.org/jira/browse/SOLR-5463 Project: Solr Issue Type: New Feature Reporter: Hoss Man Assignee: Hoss Man Fix For: 5.0 Attachments: SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man__MissingStringLastComparatorSource.patch I'd like to revisit a solution to the problem of deep paging in Solr, leveraging an HTTP based API similar to how IndexSearcher.searchAfter works at the Lucene level: require the clients to provide back a token indicating the sort values of the last document seen on the previous page. This is similar to the cursor model I've seen in several other REST APIs that support pagination over large sets of results (notably the twitter API and its since_id param) except that we'll want something that works with arbitrary multi-level sort criteria that can be either ascending or descending. SOLR-1726 laid some initial ground work here and was committed quite a while ago, but the key bit of argument parsing to leverage it was commented out due to some problems (see comments in that issue). It's also somewhat out of date at this point: at the time it was committed, IndexSearcher only supported searchAfter for simple scores, not arbitrary field sorts; and the params added in SOLR-1726 suffer from this limitation as well. --- I think it would make sense to start fresh with a new issue with a focus on ensuring that we have deep paging which: * supports arbitrary field sorts in addition to sorting by score * works in distributed mode {panel:title=Basic Usage} * send a request with {{sort=X&start=0&rows=N&cursorMark=*}} ** sort can be anything, but must include the uniqueKey field (as a tie breaker) ** N can be any number you want per page ** start must be 0 ** \* denotes you want to use a cursor starting at the beginning mark * parse the response body and extract the (String) {{nextCursorMark}} value * replace the \* value in your initial request params with the {{nextCursorMark}} value from the response in the subsequent request * repeat until the {{nextCursorMark}} value stops changing, or you have collected as many docs as you need {panel}
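The client loop in the Basic Usage panel can be sketched with a stubbed-in page fetch standing in for the real HTTP request (sort=X&start=0&rows=N&cursorMark=...). The `fetch` method, the `Page` type, and the offset-based cursor encoding are all hypothetical stand-ins; only the loop shape — pass `*` first, feed back `nextCursorMark`, stop when it stops changing — mirrors the proposed API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the cursorMark pagination loop against a fake 5-doc index.
public class CursorWalk {
    static final List<String> INDEX = List.of("a", "b", "c", "d", "e");

    record Page(List<String> docs, String nextCursorMark) {}

    // Stub fetch: our fake "cursorMark" just encodes the offset already seen.
    static Page fetch(String cursorMark, int rows) {
        int from = cursorMark.equals("*") ? 0 : Integer.parseInt(cursorMark);
        int to = Math.min(from + rows, INDEX.size());
        return new Page(INDEX.subList(from, to), String.valueOf(to));
    }

    public static List<String> collectAll(int rows) {
        List<String> all = new ArrayList<>();
        String mark = "*"; // '*' = start at the beginning mark
        while (true) {
            Page p = fetch(mark, rows);
            all.addAll(p.docs());
            if (p.nextCursorMark().equals(mark)) break; // mark stopped changing
            mark = p.nextCursorMark();                  // feed token back
        }
        return all;
    }
}
```

Unlike start/rows paging, each request carries only the opaque mark, so the server never has to skip over earlier pages.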
[jira] [Commented] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters
[ https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867279#comment-13867279 ] Joel Bernstein commented on SOLR-5541: -- Looks like splitSmart will give us \ escapes. I'll slide that in. Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters Key: SOLR-5541 URL: https://issues.apache.org/jira/browse/SOLR-5541 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 4.6 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.7 Attachments: SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch
[jira] [Commented] (SOLR-5617) Default SolrResourceLoader restrictions may be too tight
[ https://issues.apache.org/jira/browse/SOLR-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867287#comment-13867287 ] Uwe Schindler commented on SOLR-5617: - Hi Shawn, in fact the code was written exactly to support symbolic links! So your workaround is actually wanted. The idea of also using the Solr home directory is theoretically possible, if you were to extend SolrResourceLoader.getResource to also look in the parent ResourceLoader. There is already work done so that this may work in the future (if ResourceLoaders had the same parent-child relations as ClassLoaders), but currently it is not easily possible. There is also another elegant workaround: if the file is not in the config dir directly, SolrResourceLoader looks in the classpath (through the core's ClassLoader) and tries to find the file from there. So the easiest for you is to add the shared directory as an additional lib folder in the solrconfig.xml of all cores. You may need to pack the files as a JAR, but we can improve Solr here so that it also accepts non-jarred classpath components for lib directives. That's in fact the cleanest solution, also working on Windows without symlinks. Also this is easy for the user to understand: just add another lib / classes / whatever-name folder where your shared config files are. Default SolrResourceLoader restrictions may be too tight Key: SOLR-5617 URL: https://issues.apache.org/jira/browse/SOLR-5617 Project: Solr Issue Type: Bug Affects Versions: 4.6 Reporter: Shawn Heisey Priority: Minor Labels: security Fix For: 5.0, 4.7
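Uwe's lib-folder workaround amounts to a one-line addition to each core's solrconfig.xml. The directory layout below is an assumption (a `shared` folder holding a JAR of the common config files, one level above the instanceDir); the `<lib>` directive itself is standard solrconfig syntax.

```xml
<config>
  <!-- Assumed layout: shared config files packed as a JAR in ../shared,
       relative to this core's instanceDir. The lib directive puts it on the
       core's classpath, so SolrResourceLoader can resolve the files from
       there without relaxing the instanceDir restriction. -->
  <lib dir="../shared" regex=".*\.jar" />
  <!-- ... rest of the core's configuration ... -->
</config>
```

Each core pointing at the same folder gets the same shared resources, with no symlinks required.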
[jira] [Updated] (SOLR-5617) Default SolrResourceLoader restrictions may be too tight
[ https://issues.apache.org/jira/browse/SOLR-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-5617: Fix Version/s: (was: 4.7) Issue Type: Task (was: Bug) Default SolrResourceLoader restrictions may be too tight Key: SOLR-5617 URL: https://issues.apache.org/jira/browse/SOLR-5617 Project: Solr Issue Type: Task Affects Versions: 4.6 Reporter: Shawn Heisey Priority: Minor Labels: security Fix For: 5.0
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SeacherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867291#comment-13867291 ] Chris Male commented on SOLR-5621: -- +1 for trunk Let Solr use Lucene's SeacherManager Key: SOLR-5621 URL: https://issues.apache.org/jira/browse/SOLR-5621 Project: Solr Issue Type: Improvement Affects Versions: 5.0 Reporter: Tomás Fernández Löbbe Fix For: 5.0 Attachments: SOLR-5621.patch
[jira] [Commented] (SOLR-5618) false query result cache hits possible when duplicate filter queries exist in one query -- discovered via: Reproducible failure from TestFiltering.testRandomFiltering
[ https://issues.apache.org/jira/browse/SOLR-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867305#comment-13867305 ] ASF subversion and git services commented on SOLR-5618: --- Commit 1556988 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1556988 ] SOLR-5618: Fix false cache hits in queryResultCache when hashCodes are equal and duplicate filter queries exist in one of the requests false query result cache hits possible when duplicate filter queries exist in one query -- discovered via: Reproducible failure from TestFiltering.testRandomFiltering -- Key: SOLR-5618 URL: https://issues.apache.org/jira/browse/SOLR-5618 Project: Solr Issue Type: Bug Affects Versions: 4.5, 4.5.1, 4.6 Reporter: Hoss Man Assignee: Hoss Man Fix For: 4.6.1 Attachments: SOLR-5618.patch, SOLR-5618.patch, SOLR-5618.patch, SOLR-5618.patch, SOLR-5618.patch
[jira] [Commented] (SOLR-5463) Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging)
[ https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867316#comment-13867316 ] Joel Bernstein commented on SOLR-5463: -- This is a great feature. I think this should work automatically with the CollapsingQParserPlugin so there's some grouping support. I'll do some testing on this to confirm. Provide cursor/token based searchAfter support that works with arbitrary sorting (ie: deep paging) -- Key: SOLR-5463 URL: https://issues.apache.org/jira/browse/SOLR-5463 Project: Solr Issue Type: New Feature Reporter: Hoss Man Assignee: Hoss Man Fix For: 5.0 Attachments: SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man__MissingStringLastComparatorSource.patch I'd like to revist a solution to the problem of deep paging in Solr, leveraging an HTTP based API similar to how IndexSearcher.searchAfter works at the lucene level: require the clients to provide back a token indicating the sort values of the last document seen on the previous page. This is similar to the cursor model I've seen in several other REST APIs that support pagnation over a large sets of results (notable the twitter API and it's since_id param) except that we'll want something that works with arbitrary multi-level sort critera that can be either ascending or descending. SOLR-1726 laid some initial ground work here and was commited quite a while ago, but the key bit of argument parsing to leverage it was commented out due to some problems (see comments in that issue). 
It's also somewhat out of date at this point: at the time it was committed, IndexSearcher only supported searchAfter for simple scores, not arbitrary field sorts; and the params added in SOLR-1726 suffer from this limitation as well. --- I think it would make sense to start fresh with a new issue with a focus on ensuring that we have deep paging which:
* supports arbitrary field sorts in addition to sorting by score
* works in distributed mode
{panel:title=Basic Usage}
* send a request with {{sort=X&start=0&rows=N&cursorMark=*}}
** sort can be anything, but must include the uniqueKey field (as a tie breaker)
** N can be any number you want per page
** start must be 0
** \* denotes you want to use a cursor starting at the beginning mark
* parse the response body and extract the (String) {{nextCursorMark}} value
* replace the \* value in your initial request params with the {{nextCursorMark}} value from the response in the subsequent request
* repeat until the {{nextCursorMark}} value stops changing, or you have collected as many docs as you need
{panel}
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
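The fetch/extract/replace/repeat loop from the Basic Usage panel can be sketched as a client-side loop. This is a minimal, self-contained simulation, not real Solr client code: the page() method stands in for an HTTP request, and the DOCS list, method names, and the encoding of the cursor as the last doc id are illustrative assumptions (Solr's actual cursorMark is an opaque encoding of the last document's sort values).

```java
import java.util.ArrayList;
import java.util.List;

public class CursorPagingSketch {
    // Stand-in for the index: doc ids already in sort order (uniqueKey as tie breaker).
    static final List<Integer> DOCS = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

    // Simulates one Solr request: up to 'rows' docs after the cursor position.
    static List<Integer> page(String cursorMark, int rows) {
        int from = cursorMark.equals("*") ? 0 : DOCS.indexOf(Integer.parseInt(cursorMark)) + 1;
        return DOCS.subList(from, Math.min(from + rows, DOCS.size()));
    }

    // An empty page leaves the cursor unchanged, which tells the client to stop.
    static String nextCursorMark(String cursorMark, List<Integer> docs) {
        return docs.isEmpty() ? cursorMark : String.valueOf(docs.get(docs.size() - 1));
    }

    public static List<Integer> fetchAll(int rows) {
        List<Integer> all = new ArrayList<>();
        String cursor = "*";                 // '*' = start at the beginning mark
        while (true) {
            List<Integer> docs = page(cursor, rows);
            all.addAll(docs);
            String next = nextCursorMark(cursor, docs);
            if (next.equals(cursor)) break;  // cursor stopped changing: done
            cursor = next;                   // feed nextCursorMark into the next request
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(3));
    }
}
```

Note that when the total is an exact multiple of the page size, the client only learns it is done from one extra request whose cursor comes back unchanged, which mirrors the "repeat until the nextCursorMark value stops changing" rule above.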
[jira] [Commented] (SOLR-5618) false query result cache hits possible when duplicate filter queries exist in one query -- discovered via: Reproducible failure from TestFiltering.testRandomFiltering
[ https://issues.apache.org/jira/browse/SOLR-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867325#comment-13867325 ] ASF subversion and git services commented on SOLR-5618: --- Commit 1556996 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1556996 ] SOLR-5618: Fix false cache hits in queryResultCache when hashCodes are equal and duplicate filter queries exist in one of the requests (merge r1556988) false query result cache hits possible when duplicate filter queries exist in one query -- discovered via: Reproducible failure from TestFiltering.testRandomFiltering -- Key: SOLR-5618 URL: https://issues.apache.org/jira/browse/SOLR-5618 Project: Solr Issue Type: Bug Affects Versions: 4.5, 4.5.1, 4.6 Reporter: Hoss Man Assignee: Hoss Man Fix For: 4.6.1 Attachments: SOLR-5618.patch, SOLR-5618.patch, SOLR-5618.patch, SOLR-5618.patch, SOLR-5618.patch SOLR-5057 introduced a bug in queryResultCaching such that the following circumstances can result in a false cache hit...
* identical main query in both requests
* identical number of filter queries in both requests
* a filter query from one request exists multiple times in the other request
* the sum of hashCodes for all filter queries is equal in both requests
Details of how this problem was initially uncovered are listed below... Uwe's Jenkins found this in java8... http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9004/consoleText
{noformat}
[junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestFiltering -Dtests.method=testRandomFiltering -Dtests.seed=C22042E80957AE3E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=ar_LY -Dtests.timezone=Asia/Katmandu -Dtests.file.encoding=UTF-8
[junit4] FAILURE 16.9s J1 | TestFiltering.testRandomFiltering
[junit4] Throwable #1: java.lang.AssertionError: FAILURE: iiter=11 qiter=336 request=[q, {!frange v=val_i l=0 u=1 cost=139 tag=t}, fq, {!frange v=val_i l=0 u=1}, fq, {! cost=92}-_query_:{!frange v=val_i l=1 u=1}, fq, {!frange v=val_i l=0 u=1 cache=true tag=t}, fq, {! cache=true tag=t}-_query_:{!frange v=val_i l=1 u=1}]
[junit4] at __randomizedtesting.SeedInfo.seed([C22042E80957AE3E:DD43E12DEC70EE37]:0)
[junit4] at org.apache.solr.search.TestFiltering.testRandomFiltering(TestFiltering.java:327)
{noformat}
The seed fails consistently for me on trunk using java7, and on 4x using both java7 and java6 - details to follow in comment.
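The four conditions above combine more easily than they might look, because distinct Java strings (and, by analogy, distinct Query objects) can share a hashCode. Below is a minimal sketch of the flaw, using strings as stand-ins for parsed filter queries and a deliberately naive cache-key comparison; the naiveKeyEquals method and "*:*" main query are illustrative assumptions, not Solr's actual queryResultCache key, and only show why a count-plus-hash-sum check over an unordered filter list is unsafe.

```java
import java.util.List;

public class HashSumCollisionSketch {
    // A naive cache-key check in the spirit of the bug: same main query,
    // same filter count, and same sum of filter hashCodes => treated as a hit.
    static boolean naiveKeyEquals(String q1, List<String> fqs1, String q2, List<String> fqs2) {
        return q1.equals(q2)
                && fqs1.size() == fqs2.size()
                && fqs1.stream().mapToInt(String::hashCode).sum()
                   == fqs2.stream().mapToInt(String::hashCode).sum();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are the classic String.hashCode collision: both hash to 2112.
        List<String> fqs1 = List.of("Aa", "BB"); // two distinct filters
        List<String> fqs2 = List.of("Aa", "Aa"); // one filter, duplicated
        System.out.println("Aa".hashCode() == "BB".hashCode());       // true
        // False cache hit: the naive key matches even though the filters differ.
        System.out.println(naiveKeyEquals("*:*", fqs1, "*:*", fqs2)); // true
        System.out.println(fqs1.equals(fqs2));                        // false
    }
}
```

This is exactly the shape of the failing request in the log: the duplicated filter in one request balances the hash sum against a different filter in the other, so equality of the filter multisets must be checked directly.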
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SearcherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867327#comment-13867327 ] Mark Miller commented on SOLR-5621: --- I don't think we can flat out reject refactoring or code contributions because they might destabilize code that's been around for a while. If we do that, Solr will not evolve properly. I sympathize with the idea that we don't want to add a lot of instability - I'm fighting that battle with SolrCloud while it's still in its hardening phase. However, it's an argument that can easily be carried too far. Honestly, I wouldn't be that happy to have such a big change in 5 that is not in 4 - it starts making development and back porting a major pain. But the sad fact is, this is exactly what 5.0 is for and all about. I have not read the patch, so I don't know if I am for or against this, but simply sharing code with Lucene adds to the contributors to the code, so flat out, there are certainly some advantages. Perhaps some disadvantages too, but without a doubt, advantages. Anyway, I think we need to judge this on the technical merits of the final patch. Let Solr use Lucene's SearcherManager Key: SOLR-5621 URL: https://issues.apache.org/jira/browse/SOLR-5621 Project: Solr Issue Type: Improvement Affects Versions: 5.0 Reporter: Tomás Fernández Löbbe Fix For: 5.0 Attachments: SOLR-5621.patch It would be nice if Solr could take advantage of Lucene's SearcherManager and get rid of most of the logic related to managing Searchers in SolrCore. I've been taking a look at how possible it is to achieve this, and even if I haven't finished the changes (there are some use cases that are still not working exactly the same), it looks like it is possible to do. Some things could still use a lot of improvement (like the realtime searcher management) and some others are not yet implemented, like Searchers on deck or IndexReaderFactory. I'm attaching an initial patch (many TODOs yet).
[jira] [Commented] (SOLR-5621) Let Solr use Lucene's SearcherManager
[ https://issues.apache.org/jira/browse/SOLR-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867331#comment-13867331 ] Mark Miller commented on SOLR-5621: --- One way to make such a refactoring a bit more palatable IMO, is to add a lot to the testing around this rather than just relying on the existing tests...
[jira] [Comment Edited] (LUCENE-5391) uax29urlemailtokenizer - unexpected tokenisation of index2.php (and other inputs)
[ https://issues.apache.org/jira/browse/LUCENE-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867359#comment-13867359 ] Steve Rowe edited comment on LUCENE-5391 at 1/10/14 12:52 AM: -- I understand why index.php is not broken up: the URL rule matches index.ph, but the ALPHANUM rule has a longer match, so it wins. Conversely, ALPHANUM does not match index2.php (likely because the [number][period] sequence is not allowed), so the shorter URL match is tokenized. Another improperly broken-up filename-looking thing: index-h.php - the URL rule matches index-h.ph, but the ALPHANUM rule doesn't match (likely because of the hyphen). I think the fix here is to disallow URLs when there is no trailing port, path, query or fragment, and the following character is [-A-Za-z0-9] (allowable domain label characters). I'll make a patch. was (Author: steve_rowe): I understand why index.php is not broken up: the URL rule matches index.ph, but the ALPHANUM rule has a longer match, so it wins. Conversely, ALPHANUM does not match index2.php (likely because the {[number][period]} sequence is not allowed), so the shorter URL match is tokenized. Another improperly broken-up filename-looking thing: index-h.php - the URL rule matches index-h.ph, but the ALPHANUM rule doesn't match (likely because of the hyphen). I think the fix here is to disallow URLs when there is no trailing port, path, query or fragment, and the following character is [-A-Za-z0-9] (allowable domain label characters). I'll make a patch.
uax29urlemailtokenizer - unexpected tokenisation of index2.php (and other inputs) --- Key: LUCENE-5391 URL: https://issues.apache.org/jira/browse/LUCENE-5391 Project: Lucene - Core Issue Type: Bug Reporter: Chris Geeringh The uax29urlemailtokenizer tokenises index2.php as: URL index2.ph ALPHANUM p While it does not do the same for index.php Screenshot from analyser: http://postimg.org/image/aj6c98n3b/
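The longest-match reasoning in Steve's comment can be reproduced with a toy maximal-munch scanner. The two regexes below are illustrative stand-ins, not the real JFlex/UAX29 grammar: they only encode the two properties at issue, namely that ALPHANUM allows a period only after a letter, and that the URL rule here matches a bare hostname ending in a two-letter TLD such as ".ph". At each position every rule is tried and the longest match wins, just as in the generated tokenizer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MaximalMunchSketch {
    // Toy stand-ins for the JFlex rules (NOT the real UAX29 grammar).
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        // ALPHANUM: letters/digits; '.' allowed only when preceded by a letter.
        RULES.put("ALPHANUM", Pattern.compile("[A-Za-z0-9]+(?:(?<=[A-Za-z])\\.[A-Za-z][A-Za-z0-9]*)*"));
        // URL: bare hostname ending in a two-letter TLD.
        RULES.put("URL", Pattern.compile("[A-Za-z0-9-]+\\.[A-Za-z]{2}"));
    }

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            String bestRule = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> e : RULES.entrySet()) {
                Matcher m = e.getValue().matcher(text).region(pos, text.length());
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestRule = e.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestRule == null) { pos++; continue; } // skip chars no rule matches
            tokens.add(bestRule + ":" + text.substring(pos, bestEnd));
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("index.php"));   // ALPHANUM has the longer match and wins
        System.out.println(tokenize("index2.php"));  // ALPHANUM stops at the digit, URL wins
        System.out.println(tokenize("index-h.php")); // ALPHANUM stops at the hyphen, URL wins
    }
}
```

Under these toy rules, "index.php" comes out as one ALPHANUM token, while "index2.php" and "index-h.php" split into a URL token plus a stray ALPHANUM "p", which is exactly the misbehavior reported in the issue.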
[jira] [Commented] (LUCENE-5391) uax29urlemailtokenizer - unexpected tokenisation of index2.php (and other inputs)
[ https://issues.apache.org/jira/browse/LUCENE-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867359#comment-13867359 ] Steve Rowe commented on LUCENE-5391: I understand why index.php is not broken up: the URL rule matches index.ph, but the ALPHANUM rule has a longer match, so it wins. Conversely, ALPHANUM does not match index2.php (likely because the {[number][period]} sequence is not allowed), so the shorter URL match is tokenized. Another improperly broken-up filename-looking thing: index-h.php - the URL rule matches index-h.ph, but the ALPHANUM rule doesn't match (likely because of the hyphen). I think the fix here is to disallow URLs when there is no trailing port, path, query or fragment, and the following character is [-A-Za-z0-9] (allowable domain label characters). I'll make a patch.
[jira] [Commented] (SOLR-5618) false query result cache hits possible when duplicate filter queries exist in one query -- discovered via: Reproducible failure from TestFiltering.testRandomFiltering
[ https://issues.apache.org/jira/browse/SOLR-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867398#comment-13867398 ] ASF subversion and git services commented on SOLR-5618: --- Commit 1557008 from hoss...@apache.org in branch 'dev/branches/lucene_solr_4_6' [ https://svn.apache.org/r1557008 ] SOLR-5618: Fix false cache hits in queryResultCache when hashCodes are equal and duplicate filter queries exist in one of the requests (merge r1556988)
[jira] [Resolved] (SOLR-5618) false query result cache hits possible when duplicate filter queries exist in one query -- discovered via: Reproducible failure from TestFiltering.testRandomFiltering
[ https://issues.apache.org/jira/browse/SOLR-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-5618. Resolution: Fixed Fix Version/s: 4.7 5.0
[jira] [Commented] (LUCENE-5389) Even more doc for construction of TokenStream components
[ https://issues.apache.org/jira/browse/LUCENE-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867429#comment-13867429 ] Robert Muir commented on LUCENE-5389: - OK, I took a look. I had to make a fix for documentation-lint to pass: basically it didn't like the multiline \{@code} element you had for the code sample, because 'javadoc' would give an error that it couldn't find the closing brace. Maybe the \{@override} was messing it up. In general I've never used multiline \{@code} before... Anyway, I just made it consistent with other code samples by doing this:
{code}
<pre class="prettyprint">
public class ForwardingTokenizer extends Tokenizer {
  private Tokenizer delegate;
  ...
  {@literal @Override}
  public void reset() {
    super.reset();
    delegate.setReader(this.input);
    delegate.reset();
  }
}
</pre>
{code}
The class="prettyprint" gives colored syntax highlighting in the javadocs, and the override is escaped with literal. At least that's the way the others are done. I'm committing this. Do you want to make a patch to trunk-only to update the 5.x docs with respect to LUCENE-5388? Stuff like (A future release of Apache Lucene may remove the reader parameters from the Tokenizer constructors.) Thanks! Even more doc for construction of TokenStream components Key: LUCENE-5389 URL: https://issues.apache.org/jira/browse/LUCENE-5389 Project: Lucene - Core Issue Type: Improvement Reporter: Benson Margulies There are more useful things to tell would-be authors of tokenizers. Let's tell them.
[jira] [Commented] (LUCENE-5389) Even more doc for construction of TokenStream components
[ https://issues.apache.org/jira/browse/LUCENE-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867430#comment-13867430 ] ASF subversion and git services commented on LUCENE-5389: - Commit 1557010 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1557010 ] LUCENE-5389: Add more guidance in the analyis documentation package overview (closes #14)
lucene-solr pull request: LUCENE-5389: more analysis advice.
Github user benson-basis closed the pull request at: https://github.com/apache/lucene-solr/pull/14
[jira] [Resolved] (LUCENE-5389) Even more doc for construction of TokenStream components
[ https://issues.apache.org/jira/browse/LUCENE-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5389. - Resolution: Fixed Fix Version/s: 4.7 5.0
[jira] [Commented] (LUCENE-5389) Even more doc for construction of TokenStream components
[ https://issues.apache.org/jira/browse/LUCENE-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867431#comment-13867431 ] ASF subversion and git services commented on LUCENE-5389: - Commit 1557011 from [~rcmuir] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1557011 ] LUCENE-5389: Add more guidance in the analyis documentation package overview (closes #14)