[jira] Commented: (SOLR-236) Field collapsing

2010-01-19 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802186#action_12802186
 ] 

Martijn van Groningen commented on SOLR-236:


If the field is tokenized and has more than one token, your field collapse 
result will be incorrect. If I remember correctly, it will collapse only on 
the field's last token, which of course leads to weird collapse groups. Because 
of this check, users who have only one token per collapse field are out of 
luck. I think we should somehow let the user know that it is not possible to 
collapse on a tokenized field (at least one with multiple tokens), maybe by 
adding a warning to the response. Still, I think the exception is clearer, 
although of course it also prohibits the single-token case. 

bq. Or someone could come after me and write a patch that checks for 
multi-tokened fields somehow and throws an exception.
Checking whether a tokenized field contains only one token is really 
inefficient, because you have to check every collapse field of all documents. 
Now the check is done based on the field's definition in the schema.

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
 SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, 
 SOLR-236_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with similar values for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many consecutive results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)
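
As a rough illustration of how the three parameters from the description might be combined in a request, here is a minimal Java sketch that builds a query string. The class name, the query term and the field name "site" are assumptions made for the example; only the parameter names collapse.field, collapse.type and collapse.max come from the description above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: it only assembles a query string.
class CollapseQuerySketch {

    /** Builds "q=...&collapse.field=...&collapse.type=...&collapse.max=...". */
    static String buildQuery(String q, String field, String type, int max) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", q);
        params.put("collapse.field", field);             // field used to group results
        params.put("collapse.type", type);               // "normal" (default) or "adjacent"
        params.put("collapse.max", String.valueOf(max)); // results allowed before collapsing
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

Given the discussion in this thread, the collapse field in such a request would presumably need to be an untokenized field.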

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1726) Deep Paging and Large Results Improvements

2010-01-19 Thread Grant Ingersoll (JIRA)
Deep Paging and Large Results Improvements
--

 Key: SOLR-1726
 URL: https://issues.apache.org/jira/browse/SOLR-1726
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.5


There are possibly ways to improve the collection of results for deep paging 
by passing Solr/Lucene more information about the last page of results seen, 
thereby saving priority queue operations. See LUCENE-2215.

There may also be better options for retrieving large numbers of rows at a time 
that are worth exploring. See LUCENE-2127.
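
The first idea can be sketched roughly as follows. This is a minimal illustration, not Lucene or Solr API and not necessarily what LUCENE-2215 proposes: all class and method names are hypothetical, and it assumes hits are ranked by descending score with ties broken by ascending doc id. Hits that rank at or before the last hit of the previous page never enter the queue, so the queue only needs pageSize slots rather than offset + pageSize.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: collect the next page using the last hit already seen.
class DeepPagingSketch {

    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Rank order: higher score first, then lower doc id. */
    static final Comparator<Hit> RANK = new Comparator<Hit>() {
        public int compare(Hit a, Hit b) {
            int byScore = Float.compare(b.score, a.score);
            return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
        }
    };

    /** Returns up to pageSize (>= 1) hits ranked strictly after 'after' (null = first page). */
    static List<Hit> nextPage(Iterable<Hit> hits, Hit after, int pageSize) {
        // The head of the queue is the worst-ranked hit kept so far.
        PriorityQueue<Hit> queue =
                new PriorityQueue<Hit>(pageSize, Collections.reverseOrder(RANK));
        for (Hit hit : hits) {
            if (after != null && RANK.compare(hit, after) <= 0) {
                continue; // already shown on an earlier page: no queue operation at all
            }
            if (queue.size() < pageSize) {
                queue.add(hit);
            } else if (RANK.compare(hit, queue.peek()) < 0) {
                queue.poll(); // evict the current worst hit
                queue.add(hit);
            }
        }
        List<Hit> page = new ArrayList<Hit>(queue);
        Collections.sort(page, RANK);
        return page;
    }
}
```

A caller would pass the last element of each returned page as the `after` argument for the next request.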




[jira] Commented: (SOLR-236) Field collapsing

2010-01-19 Thread Yaniv S. (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802334#action_12802334
 ] 

Yaniv S. commented on SOLR-236:
---

Hi All, this is a very exciting feature and I'm trying to apply it on our 
system.
I've tried patching on 1.4 and on the trunk version but both give me build 
errors.
Any suggestions on how I can build 1.4 or latest with this patch?

Many Thanks,
Yaniv




[jira] Commented: (SOLR-1719) stock TokenFilterFactory for flattening positions

2010-01-19 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802342#action_12802342
 ] 

Otis Gospodnetic commented on SOLR-1719:


Does PositionFilterFactory  fix the problem?

 stock TokenFilterFactory for flattening positions
 -

 Key: SOLR-1719
 URL: https://issues.apache.org/jira/browse/SOLR-1719
 Project: Solr
  Issue Type: Wish
Reporter: Hoss Man

 People seem to occasionally be confused by why certain inputs result in 
 PhraseQueries instead of BooleanQueries...
 http://old.nabble.com/Understanding-the-query-parser-to27071483.html
 http://old.nabble.com/Tokenizer-question-to27099119.html
 ...it would probably be handy if there was a TokenFilterFactory provided out 
 of the box that just set the positionIncrement of every token to 0 to deal 
 with situations where people don't care about term positions at query time, 
 and are just using tokenization/analysis as a way to split up some input 
 string into multiple SHOULD clauses for a BooleanQuery




[jira] Resolved: (SOLR-1719) stock TokenFilterFactory for flattening positions

2010-01-19 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-1719.


Resolution: Not A Problem

Indeed

I thought we had something like that, but overlooked it when I went looking, so 
I opened this issue.




Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-19 Thread Chris Hostetter

: again. I don't think it matters if its the same FileChannel or not - you
: just can't use Native Locks within the same JVM, as the lock is held by
: the JVM - they are per process - so Lucene does its own little static
: map stuff to lock within JVM (simple in memory lock tracking) and uses
: the actual Native Lock for multiple JVMs (which is all its good for -
: process granularity). But obviously, the in memory locking doesn't work
: across webapps.

Assuming I'm understanding all of this correctly, that implies a bug in 
Lucene's NativeFSLockFactory when used in a multiple classloader type 
situation -- including any app running in a servlet container.

At a minimum, shouldn't NativeFSLock.obtain() be checking for 
OverlappingFileLockException and treating that as a failure to acquire the 
lock?
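
For illustration only, the suggested check could look something like the following with plain java.nio. This is a sketch, not the NativeFSLock code: the class and method names are hypothetical, and it only shows the idea of mapping OverlappingFileLockException to "lock not obtained".

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch, not Lucene code.
class NativeLockSketch {

    /** Returns the held lock, or null if the lock could not be acquired. */
    static FileLock tryObtain(Path lockFile) throws IOException {
        FileChannel channel = FileChannel.open(
                lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = null;
        try {
            // tryLock() returns null when another process holds the lock.
            lock = channel.tryLock();
        } catch (OverlappingFileLockException e) {
            // The lock is already held somewhere in this JVM (for example by
            // another webapp): treat it as a failure to acquire the lock.
        } finally {
            if (lock == null) {
                // Caveat from the FileLock documentation: on some systems, closing
                // any channel on the file releases all locks this JVM holds on it.
                channel.close();
            }
        }
        return lock;
    }
}
```

A caller would keep the returned FileLock (and its channel) open for as long as the lock is needed, then call release() and close the channel.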



-Hoss



[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-19 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802496#action_12802496
 ] 

Uri Boness commented on SOLR-1725:
--

The DIH ScriptTransformer can really be cleaned up using this patch as well. I 
didn't add it to this patch as I didn't know whether it was a good idea to put 
too much into one patch. 

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch


 A script based UpdateRequestProcessorFactory (uses JDK6 script engine 
 support). The main goal of this plugin is to be able to configure/write 
 update processors without the need to write and package Java code.
 The update request processor factory enables writing update processors in 
 scripts located in the {{solr.solr.home}} directory. The factory accepts one 
 (mandatory) configuration parameter named {{scripts}}, which accepts a 
 comma-separated list of file names. It will look for these files under the 
 {{conf}} directory in solr home. When multiple scripts are defined, their 
 execution order is defined by the lexicographical order of the script file 
 names (so {{scriptA.js}} will be executed before {{scriptB.js}}).
 The script language is resolved based on the script file extension (that is, 
 a *.js file will be treated as a JavaScript script), therefore an extension 
 is mandatory.
 Each script file is expected to have one or more methods with the same 
 signature as the methods in the {{UpdateRequestProcessor}} interface. It is 
 *not* required to define all methods, only those that are required by the 
 processing logic.
 The following variables are defined as global variables for each script:
  * {{req}} - The SolrQueryRequest
  * {{rsp}} - The SolrQueryResponse
  * {{logger}} - A logger that can be used for logging purposes in the script
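
The ordering and language-resolution rules in the description can be sketched in plain Java as below. This is not the patch's code: the class and method names are hypothetical, and the only real API used is the JDK 6 javax.script ScriptEngineManager. Whether a JavaScript engine is actually available depends on the JVM, so the lookup may return null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

// Hypothetical sketch of the rules described above, not the SOLR-1725 patch.
class ScriptOrderSketch {

    /** Splits the comma-separated "scripts" value and sorts the names lexicographically. */
    static List<String> executionOrder(String scriptsParam) {
        List<String> names = new ArrayList<String>();
        for (String name : scriptsParam.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Returns the text after the last dot; a missing extension is an error. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("Script file needs an extension: " + fileName);
        }
        return fileName.substring(dot + 1);
    }

    public static void main(String[] args) {
        ScriptEngineManager manager = new ScriptEngineManager();
        for (String name : executionOrder("scriptB.js, scriptA.js")) {
            // The engine is looked up by file extension; null means none is installed.
            ScriptEngine engine = manager.getEngineByExtension(extensionOf(name));
            System.out.println(name + " -> "
                    + (engine == null ? "no engine installed" : engine.getFactory().getEngineName()));
        }
    }
}
```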




[jira] Commented: (SOLR-236) Field collapsing

2010-01-19 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802512#action_12802512
 ] 

Martijn van Groningen commented on SOLR-236:


Hi Yaniv, I tried the same on the 1.4 branch (from svn) and on svn trunk. 
Applying the patch to both sources went fine, but when building (ant dist) on 
trunk I also got compile errors. This was because SolrQueryResponse moved 
from the request package to the response package. I will update the patch 
shortly. Building on the 1.4 branch went without any problems (ant dist). What 
errors occurred when you ran ant dist on the 1.4 branch?




Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-19 Thread Mark Miller
Chris Hostetter wrote:
 : again. I don't think it matters if its the same FileChannel or not - you
 : just can't use Native Locks within the same JVM, as the lock is held by
 : the JVM - they are per process - so Lucene does its own little static
 : map stuff to lock within JVM (simple in memory lock tracking) and uses
 : the actual Native Lock for multiple JVMs (which is all its good for -
 : process granularity). But obviously, the in memory locking doesn't work
 : across webapps.

 Assuming I'm understanding all of this correctly, that implies a bug in 
 Lucene's NativeFSLockFactory when used in a multiple classloader type 
 situation -- including any app running in a servlet container.

 At a minimum, shouldn't NativeFSLock.obtain() be checking for 
 OverlappingFileLockException and treating that as a failure to acquire the 
 lock?



 -Hoss

   
Perhaps - that should make it work in more cases - but in my simple
testing it's not 100% reliable.

If I start up two threads and try to get a lock (with the same
channel, or with different channels), first with one thread and then with
the other, sometimes it throws OverlappingFileLockException
... and sometimes it doesn't. From what I can tell, you certainly can't
count on it.

If you pause between attempts, it does appear to always work - so it
would certainly gain us a lot of ground, it would seem - but if the
attempts are back to back, both threads can still successfully get the lock.

This behavior could be OS dependent, as it's using OS-level locks.

FileChannel does appear to say that this should work (though it's
obviously not completely thread safe from what I can tell), but it also
says:

"File locks are held on behalf of the entire Java virtual machine.
They are not suitable for controlling access to a file by multiple
threads within the same virtual machine."

Which seems to be the case.

-- 
- Mark

http://www.lucidimagination.com





Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-19 Thread Chris Hostetter

:  At a minimum, shouldn't NativeFSLock.obtain() be checking for 
:  OverlappingFileLockException and treating that as a failure to acquire the 
:  lock?
...
: Perhaps - that should make it work in more cases - but in my simple
: testing its not 100% reliable.
...
: File locks are held on behalf of the entire Java virtual machine.
: They are not suitable for controlling access to a file by multiple
: threads within the same virtual machine.

...Grrr  so where does that leave us?

Yonik's added comment was that native isn't recommended when running 
multiple webapps in the same container. In truth, native *can* 
work when running multiple webapps in the same container, just as long as 
those webapps don't reference the same data dirs.

I'm worried that we should recommend people avoid native altogether, 
because even if you are only running one webapp, it seems like a reload 
of that app could trigger some similar bad behavior.

So what/how should we document all of this?

-Hoss



[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-19 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802607#action_12802607
 ] 

Mark Miller commented on SOLR-1725:
---

We might want to think about making this a contrib?

How do people feel about putting in core Solr code/functionality that requires 
Java 6?

So far you have been able to run all of Solr with just Java 5. Do we want to go 
down the road of some non contrib features only working with something higher 
than 5?

Should that only be allowed as a contrib?

We certainly want the functionality I think, but as far as I can tell this 
would break new ground in terms of Solr's jvm version requirements (noting that 
Uri has made it so that everything still builds and runs with 5, you just can't 
use this functionality unless you are on 6) 

Any opinions?




[jira] Commented: (SOLR-1677) Add support for o.a.lucene.util.Version for BaseTokenizerFactory and BaseTokenFilterFactory

2010-01-19 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802614#action_12802614
 ] 

Mark Miller commented on SOLR-1677:
---

If you are thinking of VERSION as alternate versions, I can see your point.

But I can't imagine that's what VERSION is for.

{quote} everyone else seems to have a very fixed view that these Version based 
changes are genuine improvements/bug-fixes, w/o any expectation that clients 
might/could subjectively decide "i want the old behavior", and that older 
Versions are supported purely for back-compatibility. {quote}

I don't think Version is meant to be used so that users can choose how things 
operate - personally I do see it as purely a way to get bad behavior for back 
compatibility. If that's not the case, we should not use Version in Lucene; we 
should make a Class2, and then you pick which you want. To me, Version is for 
fixing bugs or things that are clearly not the right way of doing things, not a 
choice list. If more than one choice makes sense, that should be done without 
Version. Perhaps it will be abused, but personally I'd push back. Version is 
not a functionality selector - it's a way to handle back compat for bugs and 
clear improvements - stuff we plan and hope to drop into a big black hole 
forever, not options that make sense and that we plan to keep around for users 
to mull over.

I'm also not that worried that users won't know what changed - they will just 
know that they are in the same boat as those downloading the latest and 
greatest Lucene for the first time, likely the best boat to be in when it comes 
to this stuff. If they want to manage things piecemeal, I'm still all for 
allowing Version per component for expert use. But man, I wouldn't want to be 
in that boat, managing all my components as they mimic various bugs/bad 
behavior.

When I download the latest Solr and do a fresh install, I want it to have all 
of the latest Lucene bugs fixed (not the case currently). When I have an old 
install, I want to be able to change one setting and reindex to get all known 
bugs fixed (currently not the case - heck, it's not even possible to run Solr 
currently with all the known Lucene bugs fixed).

 Add support for o.a.lucene.util.Version for BaseTokenizerFactory and 
 BaseTokenFilterFactory
 ---

 Key: SOLR-1677
 URL: https://issues.apache.org/jira/browse/SOLR-1677
 Project: Solr
  Issue Type: Sub-task
  Components: Schema and Analysis
Reporter: Uwe Schindler
 Attachments: SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch, 
 SOLR-1677.patch


 Since Lucene 2.9, a lot of analyzers use a Version constant to keep backwards 
 compatibility with old indexes created using older versions of Lucene. The 
 most important example is StandardTokenizer, which changed its behaviour with 
 posIncr and incorrect host token types in 2.4 and also in 2.9.
 In Lucene 3.0 this matchVersion ctor parameter is mandatory and in 3.1, with 
 much more Unicode support, almost every Tokenizer/TokenFilter needs this 
 Version parameter. In 2.9, the deprecated old ctors without Version take 
 LUCENE_24 as default to mimic the old behaviour, e.g. in StandardTokenizer.
 This patch adds basic support for the Lucene Version property to the base 
 factories. Subclasses then can use the luceneMatchVersion decoded enum (in 
 3.0) / Parameter (in 2.9) for constructing Tokenstreams. The code currently 
 contains a helper map to decode the version strings, but in 3.0 it can be 
 replaced by Version.valueOf(String), as Version is a Java 5 enum there. The 
 default value is Version.LUCENE_24 (as this is the default for the 
 no-version ctors in Lucene).
 This patch also removes unneeded conversions to CharArraySet from 
 StopFilterFactory (now done by Lucene since 2.9). The generics are also fixed 
 to match Lucene 3.0.
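
To make the decoding step concrete, here is a minimal sketch of parsing a configured version string with an enum's valueOf and falling back to the LUCENE_24 default when nothing is configured. It is not the patch's code: MatchVersion is a hypothetical stand-in for o.a.lucene.util.Version with only a few constants, and the class and method names are made up for the example.

```java
import java.util.Locale;

// Hypothetical sketch, not the SOLR-1677 patch.
class VersionDecodeSketch {

    /** Stand-in for o.a.lucene.util.Version; only a few constants are listed. */
    enum MatchVersion { LUCENE_24, LUCENE_29, LUCENE_30 }

    /** Mirrors the description: with no configured value, LUCENE_24 is used. */
    static final MatchVersion DEFAULT = MatchVersion.LUCENE_24;

    /** Decodes a configured string such as "LUCENE_29" (case-insensitive). */
    static MatchVersion parse(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT;
        }
        String name = configured.trim().toUpperCase(Locale.ROOT);
        try {
            // Enum.valueOf replaces a hand-written string-to-constant map.
            return MatchVersion.valueOf(name);
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unknown luceneMatchVersion: " + configured, e);
        }
    }
}
```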




[jira] Commented: (SOLR-1630) StringIndexOutOfBoundsException in SpellCheckComponent

2010-01-19 Thread Jay Hill (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802630#action_12802630
 ] 

Jay Hill commented on SOLR-1630:


I have seen another case of a production system hitting this exact same 
exception. I'm unable to reproduce it outside of production, but there it 
occurs on all queries with hyphenated words. For a search on: 
ochoa-brillembourg

SEVERE: java.lang.StringIndexOutOfBoundsException: String index out of range: -14
    at java.lang.AbstractStringBuilder.replace(AbstractStringBuilder.java:797)
    at java.lang.StringBuilder.replace(StringBuilder.java:271)
    at org.apache.solr.handler.component.SpellCheckComponent.toNamedList(SpellCheckComponent.java:248)
    at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:143)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
    at java.lang.Thread.run(Thread.java:619)


 StringIndexOutOfBoundsException in SpellCheckComponent
 --

 Key: SOLR-1630
 URL: https://issues.apache.org/jira/browse/SOLR-1630
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis, spellchecker
Affects Versions: 1.4
 Environment: Solr 1.4
 Lucene 2.9.1
 Win XP
 java version 1.6.0_14
Reporter: Robin Wojciki
Assignee: Shalin Shekhar Mangar
 Attachments: bug.xml, schema.xml, SOLR-1630.patch, solrconfig.xml, 
 spellcheckconfig.xml


 For some documents/search strings, the SpellCheckComponent throws 
 StringIndexOutOfBoundsException
 See: http://www.lucidimagination.com/search/document/3be6555227e031fc/
 h2. Replication
  * Save attached schema.xml and solrconfig.xml in 
 apache-solr-1.4.0/example/solr/conf
  * Start Solr
  * Index attached bug.xml
  * Query [http://localhost:8983/solr/select/?q=awehjse-wjkekw]
 It throws a StringIndexOutOfBoundsException
 {noformat} String index out of range: -7
 java.lang.StringIndexOutOfBoundsException: String index out of range: -7
   at java.lang.AbstractStringBuilder.replace(Unknown Source)
   at java.lang.StringBuilder.replace(Unknown Source)
   at org.apache.solr.handler.component.SpellCheckComponent.toNamedList(SpellCheckComponent.java:248)
   at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:143)
   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
 {noformat} 




[jira] Created: (SOLR-1727) SolrEventListener should extend NamedListInitializedPlugin

2010-01-19 Thread Noble Paul (JIRA)
SolrEventListener should extend NamedListInitializedPlugin
--

 Key: SOLR-1727
 URL: https://issues.apache.org/jira/browse/SOLR-1727
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul
Assignee: Noble Paul
Priority: Minor
 Fix For: 1.5


SolrEventListener has the init(NamedList args) method but does not extend 
NamedListInitializedPlugin, which is non-standard.




[jira] Updated: (SOLR-1727) SolrEventListener should extend NamedListInitializedPlugin

2010-01-19 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1727:
-

Attachment: SOLR-1727.patch




[jira] Created: (SOLR-1728) ResponseWriters should support byte[], ByteBuffer

2010-01-19 Thread Noble Paul (JIRA)
ResponseWriters should support byte[], ByteBuffer
-

 Key: SOLR-1728
 URL: https://issues.apache.org/jira/browse/SOLR-1728
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul
Priority: Minor


Only BinaryResponseWriter supports byte[] and ByteBuffer. The other writers 
should support these as well.




[jira] Updated: (SOLR-1728) ResponseWriters should support byte[], ByteBuffer

2010-01-19 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1728:
-

Fix Version/s: 1.5
 Assignee: Noble Paul
