[jira] Closed: (SOLR-1879) Error loading class 'Solr.ASCIIFoldingFilterFactory'

2010-04-12 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi closed SOLR-1879.


Resolution: Not A Problem

Adlene, please use the solr-user mailing list to get help.

http://lucene.apache.org/solr/mailing_lists.html


> Error loading class 'Solr.ASCIIFoldingFilterFactory'
> 
>
> Key: SOLR-1879
> URL: https://issues.apache.org/jira/browse/SOLR-1879
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
> Environment: Windows XP, Apache Tomcat 6
>Reporter: adlene sifi
>
> I am trying to use the Solr.ASCIIFoldingFilterFactory filter as follows:
> {code}
>   <filter class="Solr.ASCIIFoldingFilterFactory"/>
>   <filter class="solr.StopFilterFactory"
>           ignoreCase="true"
>           words="french_stopwords.txt"
>           enablePositionIncrements="true"
>           />
>   <charFilter class="solr.MappingCharFilterFactory"
>               mapping="mapping-ISOLatin1Accent.txt"/>
> ...
> {code}
> However, I receive the following error message when restarting the Apache
> Tomcat server:
> GRAVE: org.apache.solr.common.SolrException: Error loading class 
> 'Solr.ASCIIFoldingFilterFactory'
>   at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
>   at 
> org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:388)
> .
> Caused by: java.lang.ClassNotFoundException: Solr.ASCIIFoldingFilterFactory
>   at java.net.URLClassLoader$1.run(Unknown Source)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(Unknown Source)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   ... 40 more
> Could you please help me with this?
> Thanks a lot
> Adlene
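For reference, the stack trace points at the class-name prefix: Solr resolves analysis factories through the lowercase `solr.` package shorthand, so `Solr.ASCIIFoldingFilterFactory` (capital S) is looked up as a literal class name and fails with ClassNotFoundException. A corrected analyzer fragment might look like this (the tokenizer choice is an assumption; the file names are from the report):

```xml
<analyzer>
  <!-- char filters run before the tokenizer -->
  <charFilter class="solr.MappingCharFilterFactory"
              mapping="mapping-ISOLatin1Accent.txt"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
          words="french_stopwords.txt" enablePositionIncrements="true"/>
  <!-- lowercase "solr." prefix, unlike the failing "Solr." -->
  <filter class="solr.ASCIIFoldingFilterFactory"/>
</analyzer>
```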

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (SOLR-1878) RelaxQueryComponent - A new SearchComponent that relaxes the main query in a semiautomatic way

2010-04-11 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1878:
-

Summary: RelaxQueryComponent - A new SearchComponent that relaxes the 
main query in a semiautomatic way  (was: RelaxQueryComponent - A new 
SearchComponent that relaxes the main in a semiautomatic way)
Description: 
I have the following use case:

Imagine that you visit a web page for searching for an apartment to rent. You 
choose parameters, usually by marking check boxes, and this produces AND queries:

{code}
rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[100 TO *]
{code}

If the conditions are too tight, Solr may return few or zero leasehold 
properties. Because this is bad for both the site visitors and the owners, the 
owner may want to recommend that visitors relax the conditions to something 
like:

{code}
rent:[* TO 1700] AND bedroom:[2 TO *] AND floor:[100 TO *]
{code}

or:

{code}
rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[90 TO *]
{code}

And if the relaxed query gets a larger numFound than the original, the web page 
can provide a link with a comment such as "if you can pay an additional $100, 
${numFound} properties will be found!".

Today, I would need to implement a Solr client for this scenario, but that 
approach makes two round trips to show one page, creates a consistency problem, 
and is laborious, of course!

I'm thinking of a new SearchComponent that can be used with QueryComponent. It 
performs a search when the numFound of the main query is less than a threshold. 
Clients can specify via request parameters how the query should be relaxed.

  was:
I have the following use case:

Imagine that you visit a web page for searching an apartment for rent. You 
choose parameters, usually mark check boxes and this makes AND queries:

{code}
rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[100 TO *]
{code}

If the conditions are too tight, Solr may return few or zero leasehold 
properties. Because the things is not good for the site visitors and also 
owners, the owner may want to recommend the visitors to relax the conditions 
something like:

{code}
rent:[* TO 1700] AND bedroom:[2 TO *] AND floor:[100 TO *]
{code}

or:

{code}
rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[90 TO *]
{code}

And if the relaxed query get more numFound than original, the web page can 
provide a link with a comment "if you can pay additional $100, ${numFound} 
properties will be found!".

Today, I need to implement client for this scenario, but this way makes two 
round trips for showing one page and consistency problem (and laborious of 
course!).

I'm thinking a new SearchComponent that can be used with QueryComponent. It 
does search when numFound of the main query is less than a threshold. Clients 
can specify via request parameters how the query can be relaxed.


> RelaxQueryComponent - A new SearchComponent that relaxes the main query in a 
> semiautomatic way
> --
>
> Key: SOLR-1878
> URL: https://issues.apache.org/jira/browse/SOLR-1878
> Project: Solr
>  Issue Type: New Feature
>  Components: SearchComponents - other
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Priority: Minor
>
> I have the following use case:
> Imagine that you visit a web page for searching for an apartment to rent. You 
> choose parameters, usually by marking check boxes, and this produces AND queries:
> {code}
> rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[100 TO *]
> {code}
> If the conditions are too tight, Solr may return few or zero leasehold 
> properties. Because this is bad for both the site visitors and the owners, the 
> owner may want to recommend that visitors relax the conditions to something 
> like:
> {code}
> rent:[* TO 1700] AND bedroom:[2 TO *] AND floor:[100 TO *]
> {code}
> or:
> {code}
> rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[90 TO *]
> {code}
> And if the relaxed query gets a larger numFound than the original, the web page 
> can provide a link with a comment such as "if you can pay an additional $100, 
> ${numFound} properties will be found!".
> Today, I would need to implement a Solr client for this scenario, but that 
> approach makes two round trips to show one page, creates a consistency problem, 
> and is laborious, of course!
> I'm thinking of a new SearchComponent that can be used with QueryComponent. It 
> performs a search when the numFound of the main query is less than a threshold. 
> Clients can specify via request parameters how the query should be relaxed.





[jira] Created: (SOLR-1878) RelaxQueryComponent - A new SearchComponent that relaxes the main in a semiautomatic way

2010-04-11 Thread Koji Sekiguchi (JIRA)
RelaxQueryComponent - A new SearchComponent that relaxes the main in a 
semiautomatic way


 Key: SOLR-1878
 URL: https://issues.apache.org/jira/browse/SOLR-1878
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Priority: Minor


I have the following use case:

Imagine that you visit a web page for searching for an apartment to rent. You 
choose parameters, usually by marking check boxes, and this produces AND queries:

{code}
rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[100 TO *]
{code}

If the conditions are too tight, Solr may return few or zero leasehold 
properties. Because this is bad for both the site visitors and the owners, the 
owner may want to recommend that visitors relax the conditions to something 
like:

{code}
rent:[* TO 1700] AND bedroom:[2 TO *] AND floor:[100 TO *]
{code}

or:

{code}
rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[90 TO *]
{code}

And if the relaxed query gets a larger numFound than the original, the web page 
can provide a link with a comment such as "if you can pay an additional $100, 
${numFound} properties will be found!".

Today, I would need to implement a Solr client for this scenario, but that 
approach makes two round trips to show one page, creates a consistency problem, 
and is laborious, of course!

I'm thinking of a new SearchComponent that can be used with QueryComponent. It 
performs a search when the numFound of the main query is less than a threshold. 
Clients can specify via request parameters how the query should be relaxed.
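As a rough illustration of the relaxation step (this is not the proposed component; the class, method, and regex-based approach are invented for the sketch), widening one bound of a numeric range clause could look like:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RelaxSketch {
    // Matches a range clause like [* TO 1500] with a finite upper bound.
    private static final Pattern UPPER = Pattern.compile("\\[(\\*|\\d+) TO (\\d+)\\]");

    /** Raise the finite upper bound of the first range clause by delta. */
    public static String relaxUpperBound(String query, int delta) {
        Matcher m = UPPER.matcher(query);
        if (!m.find()) {
            return query; // nothing relaxable (e.g. upper bound is *)
        }
        int upper = Integer.parseInt(m.group(2)) + delta;
        return new StringBuilder(query)
                .replace(m.start(2), m.end(2), Integer.toString(upper))
                .toString();
    }

    public static void main(String[] args) {
        String q = "rent:[* TO 1500] AND bedroom:[2 TO *] AND floor:[100 TO *]";
        // prints rent:[* TO 1700] AND bedroom:[2 TO *] AND floor:[100 TO *]
        System.out.println(relaxUpperBound(q, 200));
    }
}
```

A component along these lines could apply such a rewrite to the main query, re-run the search, and report the new numFound when it exceeds the original.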





[jira] Updated: (SOLR-860) moreLikeThis Degug

2010-04-06 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-860:


  Component/s: (was: search)
   SearchComponents - other
 Priority: Minor  (was: Major)
Fix Version/s: (was: 1.5)
   3.1

> moreLikeThis Degug
> --
>
> Key: SOLR-860
> URL: https://issues.apache.org/jira/browse/SOLR-860
> Project: Solr
>  Issue Type: New Feature
>  Components: SearchComponents - other
>Affects Versions: 1.3
> Environment: Gentoo Linux, Solr 1.4, tomcat webserver
>Reporter: Jeff
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 3.1
>
> Attachments: SOLR-860.patch
>
>
> The moreLikeThis search component currently has no way to debug or see 
> information on the process.  This means that if moreLikeThis suggests another 
> document, there is no way to see why that document was picked in order to hone 
> the search.  Adding an explain output would be extremely useful in determining 
> why Solr is recommending the items.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-860) moreLikeThis Degug

2010-04-06 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-860:


Attachment: SOLR-860.patch

With the attached patch, the BooleanQueries constructed by MLT and the MLT 
helper function can be seen in the debug area.

Sample request and response:

{code}
http://localhost:8983/solr/select/?q=solr+ipod&indent=on&mlt=on&mlt.fl=features&mlt.mintf=1&mlt.count=2&debugQuery=on&wt=json
{code}

{code}
"debug":{
  "moreLikeThis":{
    "IW-02":{
      "rawMLTQuery":"",
      "boostedMLTQuery":"",
      "realMLTQuery":"+() -id:IW-02"},
    "SOLR1000":{
      "rawMLTQuery":"",
      "boostedMLTQuery":"",
      "realMLTQuery":"+() -id:SOLR1000"},
    "F8V7067-APL-KIT":{
      "rawMLTQuery":"",
      "boostedMLTQuery":"",
      "realMLTQuery":"+() -id:F8V7067-APL-KIT"},
    "MA147LL/A":{
      "rawMLTQuery":"features:2 features:0 features:lcd features:x features:3",
      "boostedMLTQuery":"features:2 features:0 features:lcd features:x features:3",
      "realMLTQuery":"+(features:2 features:0 features:lcd features:x features:3) -id:MA147LL/A"}},

}
{code}


> moreLikeThis Degug
> --
>
> Key: SOLR-860
> URL: https://issues.apache.org/jira/browse/SOLR-860
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
> Environment: Gentoo Linux, Solr 1.4, tomcat webserver
>Reporter: Jeff
>Assignee: Koji Sekiguchi
> Fix For: 1.5
>
> Attachments: SOLR-860.patch
>
>
> The moreLikeThis search component currently has no way to debug or see 
> information on the process.  This means that if moreLikeThis suggests another 
> document, there is no way to see why that document was picked in order to hone 
> the search.  Adding an explain output would be extremely useful in determining 
> why Solr is recommending the items.




[jira] Commented: (SOLR-860) moreLikeThis Degug

2010-03-29 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851072#action_12851072
 ] 

Koji Sekiguchi commented on SOLR-860:
-

At minimum, I'd like to see what the BooleanQuery constructed by MLT looks like. 
Can ResponseBuilder.addDebugInfo() be used for it?

> moreLikeThis Degug
> --
>
> Key: SOLR-860
> URL: https://issues.apache.org/jira/browse/SOLR-860
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
> Environment: Gentoo Linux, Solr 1.4, tomcat webserver
>Reporter: Jeff
>Assignee: Koji Sekiguchi
> Fix For: 1.5
>
>
> The moreLikeThis search component currently has no way to debug or see 
> information on the process.  This means that if moreLikeThis suggests another 
> document, there is no way to see why that document was picked in order to hone 
> the search.  Adding an explain output would be extremely useful in determining 
> why Solr is recommending the items.




[jira] Assigned: (SOLR-860) moreLikeThis Degug

2010-03-29 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reassigned SOLR-860:
---

Assignee: Koji Sekiguchi

> moreLikeThis Degug
> --
>
> Key: SOLR-860
> URL: https://issues.apache.org/jira/browse/SOLR-860
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
> Environment: Gentoo Linux, Solr 1.4, tomcat webserver
>Reporter: Jeff
>Assignee: Koji Sekiguchi
> Fix For: 1.5
>
>
> The moreLikeThis search component currently has no way to debug or see 
> information on the process.  This means that if moreLikeThis suggests another 
> document, there is no way to see why that document was picked in order to hone 
> the search.  Adding an explain output would be extremely useful in determining 
> why Solr is recommending the items.




[jira] Updated: (SOLR-1703) Sorting by function problems on multicore (more than one core)

2010-03-11 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1703:
-

Description: 
When using sort by function (for example, the dist function) on a multicore 
setup with more than one core (with a single core, i.e. the example deployment, 
the problem doesn't exist), the right schema is not always used. I think there 
is a problem with this portion of code:

QueryParsing.java:

{code}
public static FunctionQuery parseFunction(String func, IndexSchema schema)
    throws ParseException {
  SolrCore core = SolrCore.getSolrCore();
  return (FunctionQuery) (QParser.getParser(func, "func",
      new LocalSolrQueryRequest(core, new HashMap())).parse());
  // return new FunctionQuery(parseValSource(new StrParser(func), schema));
}
{code}

The code above uses a deprecated method to get the core; it sometimes gets the 
wrong core, making it impossible to find the right fields in the index. 

  was:
When using sort by function (for example dist function) with multicore with 
more than one core (on multicore with one core, ie. the example deployment the 
problem doesn`t exist) there is a problem with not using the right schema. I 
think there is a problem with this portion of code:

QueryParsing.java:

public static FunctionQuery parseFunction(String func, IndexSchema schema) 
throws ParseException {
SolrCore core = SolrCore.getSolrCore();
return (FunctionQuery) (QParser.getParser(func, "func", new 
LocalSolrQueryRequest(core, new HashMap())).parse());
// return new FunctionQuery(parseValSource(new StrParser(func), schema));
}

Code above uses deprecated method to get the core sometimes getting the wrong 
core effecting in impossibility to find the right fields in index. 


> Sorting by function problems on multicore (more than one core)
> --
>
> Key: SOLR-1703
> URL: https://issues.apache.org/jira/browse/SOLR-1703
> Project: Solr
>  Issue Type: Bug
>  Components: multicore, search
>Affects Versions: 1.5
> Environment: Linux (debian, ubuntu), 64bits
>Reporter: Rafał Kuć
>
> When using sort by function (for example, the dist function) on a multicore 
> setup with more than one core (with a single core, i.e. the example deployment, 
> the problem doesn't exist), the right schema is not always used. I think there 
> is a problem with this portion of code:
> QueryParsing.java:
> {code}
> public static FunctionQuery parseFunction(String func, IndexSchema schema) 
> throws ParseException {
> SolrCore core = SolrCore.getSolrCore();
> return (FunctionQuery) (QParser.getParser(func, "func", new 
> LocalSolrQueryRequest(core, new HashMap())).parse());
> // return new FunctionQuery(parseValSource(new StrParser(func), schema));
> }
> {code}
> The code above uses a deprecated method to get the core; it sometimes gets the 
> wrong core, making it impossible to find the right fields in the index. 
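As a toy illustration of the failure mode (plain Java; the names are invented, and this is not Solr code): resolving state through a process-wide static means a request against one core can silently pick up whichever core registered last:

```java
import java.util.Map;

public class StaticCoreSketch {
    // The "singleton" view of the world, like the deprecated SolrCore.getSolrCore().
    static String lastRegisteredCore;

    static String schemaFor(Map<String, String> schemas) {
        // Wrong: ignores which core the request was actually made against.
        return schemas.get(lastRegisteredCore);
    }

    public static void main(String[] args) {
        Map<String, String> schemas = Map.of("core0", "schema-with-dist-fields",
                                             "core1", "schema-without-them");
        lastRegisteredCore = "core0";
        lastRegisteredCore = "core1"; // a second core registers later
        // A request meant for core0 now silently resolves core1's schema:
        System.out.println(schemaFor(schemas)); // prints schema-without-them
    }
}
```

Passing the schema (or the request's own core) down explicitly, as the commented-out line in the snippet above hints, avoids the ambiguity.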




[jira] Commented: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2010-03-01 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839879#action_12839879
 ] 

Koji Sekiguchi commented on SOLR-1268:
--

bq. When using Dismax, the fast vector highlighter fails to return any 
highlighting when there is more than one column in qf (eg. "qf=Name Company")...

Right. See https://issues.apache.org/jira/browse/LUCENE-2243 .


> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1268-0_fragsize.patch, SOLR-1268-0_fragsize.patch, 
> SOLR-1268.patch, SOLR-1268.patch, SOLR-1268.patch
>
>





[jira] Updated: (SOLR-1297) Enable sorting by Function Query

2010-03-01 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1297:
-

Attachment: SOLR-1297-2.patch

When I set a *bit* complex function as the sort parameter, I got this error:

{panel}
Must declare sort field or function
org.apache.solr.common.SolrException: Must declare sort field or function
at 
org.apache.solr.search.QueryParsing.processSort(QueryParsing.java:376)
at org.apache.solr.search.QueryParsing.parseSort(QueryParsing.java:281)
at 
org.apache.solr.search.QueryParsingTest.testSort(QueryParsingTest.java:105)
{panel}

Attached are the fix and a test case.
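For illustration, a sort parameter of the kind described, nesting function queries, might look like this (the field names and weights are hypothetical):

```
sort=sum(product(popularity,0.5),price) desc, score desc
```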

> Enable sorting by Function Query
> 
>
> Key: SOLR-1297
> URL: https://issues.apache.org/jira/browse/SOLR-1297
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1297-2.patch, SOLR-1297.patch
>
>
> It would be nice if one could sort by FunctionQuery.  See also SOLR-773, 
> where this was first mentioned by Yonik as part of the generic solution to 
> geo-search.




[jira] Issue Comment Edited: (SOLR-1773) Field Collapsing (lightweight version)

2010-02-14 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833527#action_12833527
 ] 

Koji Sekiguchi edited comment on SOLR-1773 at 2/14/10 8:19 AM:
---

Oops, I'd glanced at the SOLR-236 related issues, but from the Description I 
thought it was only about finalizing the response format. I'll look into 
SOLR-1682. Thanks! :)

  was (Author: koji):
Oops, I've glanced at SOLR-236 related issues, but I wasn't awake to the 
existence. I'll look into SOLR-1682. Thanks! :)
  
> Field Collapsing (lightweight version)
> --
>
> Key: SOLR-1773
> URL: https://issues.apache.org/jira/browse/SOLR-1773
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Priority: Minor
> Attachments: LOADTEST.patch, SOLR-1773.patch
>
>
> I'd like to start another approach for field collapsing suggested by Yonik on 
> 19/Dec/09 at SOLR-236. Re-posting the idea:
> {code}
> === two pass collapsing algorithm for collapse.aggregate=max 
> 
> First pass: pretend that collapseCount=1
>   - Use a TreeSet as  a priority queue since one can remove and insert 
> entries.
>   - A HashMap will be used to map from collapse group to 
> top entry in the TreeSet
>   - compare new doc with smallest element in treeset.  If smaller discard and 
> go to the next doc.
>   - If new doc is bigger, look up it's group.  Use the Map to find if the 
> group has been added to the TreeSet and add it if not.
>   - If the new bigger doc is already in the TreeSet, compare with the 
> document in that group.  If bigger, update the node,
> remove and re-add to the TreeSet to re-sort.
> efficiency: the treeset and hashmap are both only the size of the top number 
> of docs we are looking at (10 for instance)
> We will now have the top 10 documents collapsed by the right field with a 
> collapseCount of 1.  Put another way, we have the top 10 groups.
> Second pass (if collapseCount>1):
>  - create a priority queue for each group (10) of size collapseCount
>  - re-execute the query (or if the sort within the collapse groups does not 
> involve score, we could just use the docids gathered during phase 1)
>  - for each document, find it's appropriate priority queue and insert
>  - optimization: we can use the previous info from phase1 to even avoid 
> creating a priority queue if no other items matched.
> So instead of creating collapse groups for every group in the set (as is done 
> now?), we create it for only 10 groups.
> Instead of collecting the score for every document in the set (40MB per 
> request for a 10M doc index is *big*) we re-execute the query if needed.
> We could optionally store the score as is done now... but I bet aggregate 
> throughput on large indexes would be better by just re-executing.
> Other thought: we could also cache the first phase in the query cache which 
> would allow one to quickly move to the 2nd phase for any collapseCount.
> {code}
> The restriction is:
> {quote}
> one would not be able to tell the total number of collapsed docs, or the 
> total number of hits (or the DocSet) after collapsing. So only 
> collapse.facet=before would be supported.
> {quote}
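The first pass described above can be sketched in self-contained Java (Doc, the comparator, and scores are simplified stand-ins for Lucene document IDs and collectors; this is not the attached patch):

```java
import java.util.*;

public class CollapseFirstPass {
    record Doc(int id, String group, float score) {}

    /** Keep the top n groups, one representative (best-scoring) doc per group. */
    static List<Doc> topGroups(List<Doc> docs, int n) {
        // Break score ties by docid so the TreeSet never collapses distinct docs.
        Comparator<Doc> byScore =
                Comparator.comparingDouble(Doc::score).thenComparingInt(Doc::id);
        TreeSet<Doc> queue = new TreeSet<>(byScore);   // priority queue of group tops
        Map<String, Doc> groupTop = new HashMap<>();   // group -> its entry in the queue
        for (Doc d : docs) {
            Doc current = groupTop.get(d.group());
            if (current != null) {
                if (byScore.compare(d, current) > 0) { // better doc for a known group
                    queue.remove(current);
                    queue.add(d);
                    groupTop.put(d.group(), d);
                }
            } else if (queue.size() < n) {             // room for a new group
                queue.add(d);
                groupTop.put(d.group(), d);
            } else if (byScore.compare(d, queue.first()) > 0) {
                Doc evicted = queue.pollFirst();       // evict the smallest group entry
                groupTop.remove(evicted.group());
                queue.add(d);
                groupTop.put(d.group(), d);
            }
        }
        return new ArrayList<>(queue.descendingSet()); // best group first
    }
}
```

As in the quoted description, the queue and map stay bounded by n, not by the result-set size; the second pass would then re-collect only within these n groups.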




[jira] Commented: (SOLR-1773) Field Collapsing (lightweight version)

2010-02-14 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833527#action_12833527
 ] 

Koji Sekiguchi commented on SOLR-1773:
--

Oops, I'd glanced at the SOLR-236 related issues, but I wasn't aware of its 
existence. I'll look into SOLR-1682. Thanks! :)

> Field Collapsing (lightweight version)
> --
>
> Key: SOLR-1773
> URL: https://issues.apache.org/jira/browse/SOLR-1773
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Priority: Minor
> Attachments: LOADTEST.patch, SOLR-1773.patch
>
>
> I'd like to start another approach for field collapsing suggested by Yonik on 
> 19/Dec/09 at SOLR-236. Re-posting the idea:
> {code}
> === two pass collapsing algorithm for collapse.aggregate=max 
> 
> First pass: pretend that collapseCount=1
>   - Use a TreeSet as  a priority queue since one can remove and insert 
> entries.
>   - A HashMap will be used to map from collapse group to 
> top entry in the TreeSet
>   - compare new doc with smallest element in treeset.  If smaller discard and 
> go to the next doc.
>   - If new doc is bigger, look up it's group.  Use the Map to find if the 
> group has been added to the TreeSet and add it if not.
>   - If the new bigger doc is already in the TreeSet, compare with the 
> document in that group.  If bigger, update the node,
> remove and re-add to the TreeSet to re-sort.
> efficiency: the treeset and hashmap are both only the size of the top number 
> of docs we are looking at (10 for instance)
> We will now have the top 10 documents collapsed by the right field with a 
> collapseCount of 1.  Put another way, we have the top 10 groups.
> Second pass (if collapseCount>1):
>  - create a priority queue for each group (10) of size collapseCount
>  - re-execute the query (or if the sort within the collapse groups does not 
> involve score, we could just use the docids gathered during phase 1)
>  - for each document, find it's appropriate priority queue and insert
>  - optimization: we can use the previous info from phase1 to even avoid 
> creating a priority queue if no other items matched.
> So instead of creating collapse groups for every group in the set (as is done 
> now?), we create it for only 10 groups.
> Instead of collecting the score for every document in the set (40MB per 
> request for a 10M doc index is *big*) we re-execute the query if needed.
> We could optionally store the score as is done now... but I bet aggregate 
> throughput on large indexes would be better by just re-executing.
> Other thought: we could also cache the first phase in the query cache which 
> would allow one to quickly move to the 2nd phase for any collapseCount.
> {code}
> The restriction is:
> {quote}
> one would not be able to tell the total number of collapsed docs, or the 
> total number of hits (or the DocSet) after collapsing. So only 
> collapse.facet=before would be supported.
> {quote}




[jira] Issue Comment Edited: (SOLR-1773) Field Collapsing (lightweight version)

2010-02-13 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833495#action_12833495
 ] 

Koji Sekiguchi edited comment on SOLR-1773 at 2/14/10 4:54 AM:
---

Random comments on the patch:

- TimeAllowed not supported
- cache not supported
- distributed search is not supported
- sort field is hard-coded in the patch
- collapse.type=adjacent is not supported
- collapse.aggregate is not supported (but supportable)
- not yet, but collapse.sort can be supported to specify sort criteria in 
collapse group

supported parameters:

|collapse|set to on to use field collapsing|
|collapse.field|field name to collapse (required)|
|collapse.limit|maximum number of collapsed docs to return in each collapse 
group. default is 0.|
|collapse.fl|comma- or space- delimited list of fields to return. multiValued 
field and TrieField are not supported yet|


  was (Author: koji):
Random comment on the patch:

- TimeAllowed not supported
- cache not supported
- distributed search is not supported
- sort field is hard-coded in the patch
- collapse.type=adjacent is not supported
- collapse.aggregate is not supported (but supportable)
- not yet, but collapse.sort can be supported to specify sort criteria in 
collapse group

supported parameters:

|collapse|set to on to use field collapsing|
|collapse.field|field name to collapse (required)|
|collapse.limit|maximum number of collapsed docs to return in each collapse 
group|
|collapse.fl|comma- or space- delimited list of fields to return|

  
> Field Collapsing (lightweight version)
> --
>
> Key: SOLR-1773
> URL: https://issues.apache.org/jira/browse/SOLR-1773
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Priority: Minor
> Attachments: LOADTEST.patch, SOLR-1773.patch
>
>
> I'd like to start another approach for field collapsing suggested by Yonik on 
> 19/Dec/09 at SOLR-236. Re-posting the idea:
> {code}
> === two pass collapsing algorithm for collapse.aggregate=max 
> 
> First pass: pretend that collapseCount=1
>   - Use a TreeSet as  a priority queue since one can remove and insert 
> entries.
>   - A HashMap will be used to map from collapse group to 
> top entry in the TreeSet
>   - compare new doc with smallest element in treeset.  If smaller discard and 
> go to the next doc.
>   - If new doc is bigger, look up it's group.  Use the Map to find if the 
> group has been added to the TreeSet and add it if not.
>   - If the new bigger doc is already in the TreeSet, compare with the 
> document in that group.  If bigger, update the node,
> remove and re-add to the TreeSet to re-sort.
> efficiency: the treeset and hashmap are both only the size of the top number 
> of docs we are looking at (10 for instance)
> We will now have the top 10 documents collapsed by the right field with a 
> collapseCount of 1.  Put another way, we have the top 10 groups.
> Second pass (if collapseCount>1):
>  - create a priority queue for each group (10) of size collapseCount
>  - re-execute the query (or if the sort within the collapse groups does not 
> involve score, we could just use the docids gathered during phase 1)
>  - for each document, find it's appropriate priority queue and insert
>  - optimization: we can use the previous info from phase1 to even avoid 
> creating a priority queue if no other items matched.
> So instead of creating collapse groups for every group in the set (as is done 
> now?), we create it for only 10 groups.
> Instead of collecting the score for every document in the set (40MB per 
> request for a 10M doc index is *big*) we re-execute the query if needed.
> We could optionally store the score as is done now... but I bet aggregate 
> throughput on large indexes would be better by just re-executing.
> Other thought: we could also cache the first phase in the query cache which 
> would allow one to quickly move to the 2nd phase for any collapseCount.
> {code}
> The restriction is:
> {quote}
> one would not be able to tell the total number of collapsed docs, or the 
> total number of hits (or the DocSet) after collapsing. So only 
> collapse.facet=before would be supported.
> {quote}




[jira] Issue Comment Edited: (SOLR-1773) Field Collapsing (lightweight version)

2010-02-13 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833495#action_12833495
 ] 

Koji Sekiguchi edited comment on SOLR-1773 at 2/14/10 4:51 AM:
---

Random comments on the patch:

- TimeAllowed not supported
- cache not supported
- distributed search is not supported
- sort field is hard-coded in the patch
- collapse.type=adjacent is not supported
- collapse.aggregate is not supported (but supportable)
- not yet, but collapse.sort can be supported to specify sort criteria in 
collapse group

supported parameters:

|collapse|set to on to use field collapsing|
|collapse.field|field name to collapse (required)|
|collapse.limit|maximum number of collapsed docs to return in each collapse 
group|
|collapse.fl|comma- or space- delimited list of fields to return|


> Field Collapsing (lightweight version)
> --
>
> Key: SOLR-1773
> URL: https://issues.apache.org/jira/browse/SOLR-1773
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Priority: Minor
> Attachments: LOADTEST.patch, SOLR-1773.patch
>
>




[jira] Updated: (SOLR-1773) Field Collapsing (lightweight version)

2010-02-13 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1773:
-

Attachment: LOADTEST.patch

A very rough/simple load test patch attached.

Average QTime over 1,000 random queries:

||num docs in index||SOLR-236||SOLR-1773||
|1M|321 ms|185 ms|
|10M|2,914 ms (*)|1,642 ms|

(*) I needed to set -Xmx1024m in this case (512m was enough for the other 
cases) to avoid OOM.

SOLR-1773 is roughly 43% faster.
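The speedup figure follows from the table; a quick sketch to check the arithmetic ("percent faster" here is taken to mean the percent reduction in average QTime, values copied from the table above):

```java
// Verify the speedup quoted above: percent reduction in average QTime.
public class Speedup {
    static double percentFaster(double baselineMs, double patchMs) {
        return 100.0 * (baselineMs - patchMs) / baselineMs;
    }

    public static void main(String[] args) {
        // Values from the load-test table: SOLR-236 vs SOLR-1773.
        System.out.printf("1M docs:  %.1f%% faster%n", percentFaster(321, 185));    // ~42.4%
        System.out.printf("10M docs: %.1f%% faster%n", percentFaster(2914, 1642));  // ~43.7%
    }
}
```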

> Field Collapsing (lightweight version)
> --
>
> Key: SOLR-1773
> URL: https://issues.apache.org/jira/browse/SOLR-1773
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Priority: Minor
> Attachments: LOADTEST.patch, SOLR-1773.patch
>
>




[jira] Commented: (SOLR-1773) Field Collapsing (lightweight version)

2010-02-13 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833495#action_12833495
 ] 

Koji Sekiguchi commented on SOLR-1773:
--

Random comments on the patch:

- timeAllowed is not supported
- the cache is not supported
- distributed search is not supported
- the sort field is hard-coded in the patch
- collapse.type=adjacent is not supported
- collapse.aggregate is not supported (but supportable)
- collapse.sort is not supported yet, but could be added

supported parameters:

||parameter||description||
|collapse|set to on to use field collapsing|
|collapse.field|field name to collapse on (required)|
|collapse.limit|maximum number of collapsed docs to return in each collapse group|
|collapse.fl|comma- or space-delimited list of fields to return|


> Field Collapsing (lightweight version)
> --
>
> Key: SOLR-1773
> URL: https://issues.apache.org/jira/browse/SOLR-1773
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Priority: Minor
> Attachments: SOLR-1773.patch
>
>




[jira] Updated: (SOLR-1773) Field Collapsing (lightweight version)

2010-02-13 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1773:
-

Attachment: SOLR-1773.patch

A first-draft, untested patch; use it for proof of concept only. In this 
patch, the sort field is hard-coded via a java.util.Comparator.
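As a rough illustration of what a hard-coded sort criterion via java.util.Comparator looks like (the Doc class and the score-descending ordering are invented for this sketch, not taken from the patch):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class HardCodedSort {
    static final class Doc {
        final int id; final float score;
        Doc(int id, float score) { this.id = id; this.score = score; }
    }

    // The sort criterion is fixed in code rather than read from a request
    // parameter: score descending, doc id ascending as a tie-breaker.
    static final Comparator<Doc> HARD_CODED =
            Comparator.<Doc>comparingDouble(d -> -d.score).thenComparingInt(d -> d.id);

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>(List.of(
                new Doc(1, 0.3f), new Doc(2, 0.9f), new Doc(3, 0.9f)));
        docs.sort(HARD_CODED);
        System.out.println(docs.get(0).id);  // 2 (highest score; lowest id wins the tie)
    }
}
```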

> Field Collapsing (lightweight version)
> --
>
> Key: SOLR-1773
> URL: https://issues.apache.org/jira/browse/SOLR-1773
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Priority: Minor
> Attachments: SOLR-1773.patch
>
>




[jira] Created: (SOLR-1773) Field Collapsing (lightweight version)

2010-02-13 Thread Koji Sekiguchi (JIRA)
Field Collapsing (lightweight version)
--

 Key: SOLR-1773
 URL: https://issues.apache.org/jira/browse/SOLR-1773
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Priority: Minor


I'd like to start another approach for field collapsing, suggested by Yonik on 
19/Dec/09 in SOLR-236. Re-posting the idea:

{code}
=== two-pass collapsing algorithm for collapse.aggregate=max ===

First pass: pretend that collapseCount=1
  - Use a TreeSet as a priority queue, since one can remove and insert entries.
  - A HashMap will be used to map from collapse group to the top entry in the 
TreeSet.
  - Compare the new doc with the smallest element in the TreeSet. If smaller, 
discard it and go to the next doc.
  - If the new doc is bigger, look up its group. Use the Map to find whether 
the group has been added to the TreeSet, and add it if not.
  - If the new, bigger doc's group is already in the TreeSet, compare with the 
document in that group. If bigger, update the node, then remove and re-add it 
to the TreeSet to re-sort.

Efficiency: the TreeSet and HashMap are both only the size of the top number 
of docs we are looking at (10, for instance).
We will now have the top 10 documents collapsed by the right field with a 
collapseCount of 1. Put another way, we have the top 10 groups.

Second pass (if collapseCount>1):
 - Create a priority queue for each group (10 of them) of size collapseCount.
 - Re-execute the query (or, if the sort within the collapse groups does not 
involve score, we could just use the docids gathered during phase 1).
 - For each document, find its appropriate priority queue and insert.
 - Optimization: we can use the info from phase 1 to avoid even creating a 
priority queue if no other items matched.

So instead of creating collapse groups for every group in the set (as is done 
now?), we create them for only 10 groups.
Instead of collecting the score for every document in the set (40MB per 
request for a 10M-doc index is *big*), we re-execute the query if needed.
We could optionally store the score as is done now... but I bet aggregate 
throughput on large indexes would be better by just re-executing.

Other thought: we could also cache the first phase in the query cache, which 
would allow one to quickly move to the 2nd phase for any collapseCount.
{code}

The restriction is:

{quote}
One would not be able to tell the total number of collapsed docs, or the total 
number of hits (or the DocSet), after collapsing. So only 
collapse.facet=before would be supported.
{quote}
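The first pass described above can be sketched in self-contained form. This is an illustration only, not the Solr patch: the Doc class, score-descending ranking, and the driver data are all invented for the sketch. A TreeSet keeps the current top-N group heads sorted, and a HashMap finds a group's current head in O(1):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class FirstPassCollapse {
    static final class Doc {
        final int id; final String group; final float score;
        Doc(int id, String group, float score) { this.id = id; this.group = group; this.score = score; }
        @Override public String toString() { return group + "#" + id; }
    }

    static List<Doc> topGroups(List<Doc> docs, int topN) {
        // Ascending by score; ties broken by doc id so distinct docs never compare equal.
        TreeSet<Doc> queue = new TreeSet<>(
                Comparator.<Doc>comparingDouble(d -> d.score).thenComparingInt(d -> d.id));
        Map<String, Doc> headByGroup = new HashMap<>();

        for (Doc doc : docs) {
            if (queue.size() >= topN && doc.score <= queue.first().score) {
                continue;  // not better than the smallest retained group head: discard
            }
            Doc head = headByGroup.get(doc.group);
            if (head == null) {
                // New group: admit it, evicting the smallest head if over capacity.
                queue.add(doc);
                headByGroup.put(doc.group, doc);
                if (queue.size() > topN) {
                    headByGroup.remove(queue.pollFirst().group);
                }
            } else if (doc.score > head.score) {
                // Better doc for an existing group: remove and re-add to re-sort.
                queue.remove(head);
                queue.add(doc);
                headByGroup.put(doc.group, doc);
            }
        }
        return new ArrayList<>(queue.descendingSet());  // best group heads first
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc(1, "a", 0.9f), new Doc(2, "a", 0.5f), new Doc(3, "b", 0.7f),
                new Doc(4, "c", 0.3f), new Doc(5, "b", 0.8f));
        System.out.println(topGroups(docs, 2));  // [a#1, b#5]
    }
}
```

Both structures stay bounded by topN (10, say), which is the whole point of the proposal: memory is proportional to the page size, not to the number of matching documents.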





[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2010-02-05 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1268:
-

Attachment: SOLR-1268.patch

The patch includes:

# eliminate the hl.useHighlighter parameter
# introduce the hl.useFastVectorHighlighter parameter (default: false)

Therefore, the traditional Highlighter will be used unless 
hl.useFastVectorHighlighter is set to true. I'll commit in a few days.

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1268-0_fragsize.patch, SOLR-1268-0_fragsize.patch, 
> SOLR-1268.patch, SOLR-1268.patch, SOLR-1268.patch
>
>





[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2010-02-04 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1268:
-

Attachment: SOLR-1268-0_fragsize.patch

Hmm, FVH doesn't work properly when fragsize=Integer.MAX_VALUE (see 
test0FragSize() in the attached patch; it shows that FVH cannot produce the 
whole-field snippet when fragsize=Integer.MAX_VALUE).

Now I think the (traditional) Highlighter should be the default even if the 
highlighting field's termVectors/termPositions/termOffsets are all true; FVH 
will be used only when hl.useFastVectorHighlighter is set to true. The 
hl.useFastVectorHighlighter parameter accepts per-field overrides. Also, FVH 
doesn't support fragsize=0.
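For reference, per-field overrides in Solr follow the f.&lt;field&gt;.&lt;param&gt; naming convention, where the per-field value wins over the global one. A minimal sketch of that resolution order (a plain map stands in for Solr's SolrParams here):

```java
import java.util.Map;

public class FieldParam {
    // Resolve a boolean parameter with a per-field override:
    // f.<field>.<param> wins over <param>, which wins over the default.
    static boolean getFieldBool(Map<String, String> params, String field,
                                String param, boolean def) {
        String v = params.get("f." + field + "." + param);  // per-field override
        if (v == null) v = params.get(param);               // global value
        return v == null ? def : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        Map<String, String> p = Map.of(
                "hl.useFastVectorHighlighter", "false",
                "f.title.hl.useFastVectorHighlighter", "true");
        // FVH on for "title" only; every other field falls back to the global value.
        System.out.println(getFieldBool(p, "title", "hl.useFastVectorHighlighter", false)); // true
        System.out.println(getFieldBool(p, "body", "hl.useFastVectorHighlighter", false));  // false
    }
}
```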

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1268-0_fragsize.patch, SOLR-1268-0_fragsize.patch, 
> SOLR-1268.patch, SOLR-1268.patch
>
>





[jira] Resolved: (SOLR-1753) StatsComponent throws java.lang.NullPointerException when getting statistics for facets in distributed search

2010-02-04 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1753.
--

Resolution: Fixed

Committed revision 906781. Thanks Janne!

> StatsComponent throws java.lang.NullPointerException when getting statistics 
> for facets in distributed search
> -
>
> Key: SOLR-1753
> URL: https://issues.apache.org/jira/browse/SOLR-1753
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
> Environment: Windows
>Reporter: Janne Majaranta
>Assignee: Koji Sekiguchi
> Fix For: 1.5
>
> Attachments: SOLR-1753.patch
>
>
> When using the StatsComponent with a sharded request and getting statistics 
> over facets, a NullPointerException is thrown.
> Stacktrace:
> java.lang.NullPointerException
>   at org.apache.solr.handler.component.StatsValues.accumulate(StatsValues.java:54)
>   at org.apache.solr.handler.component.StatsValues.accumulate(StatsValues.java:82)
>   at org.apache.solr.handler.component.StatsComponent.handleResponses(StatsComponent.java:116)
>   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
>   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
>   at java.lang.Thread.run(Unknown Source)




[jira] Commented: (SOLR-1753) StatsComponent throws java.lang.NullPointerException when getting statistics for facets in distributed search

2010-02-04 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829914#action_12829914
 ] 

Koji Sekiguchi commented on SOLR-1753:
--

Patch looks good! Will commit shortly.

> StatsComponent throws java.lang.NullPointerException when getting statistics 
> for facets in distributed search
> -
>
> Key: SOLR-1753
> URL: https://issues.apache.org/jira/browse/SOLR-1753
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
> Environment: Windows
>Reporter: Janne Majaranta
>Assignee: Koji Sekiguchi
> Fix For: 1.5
>
> Attachments: SOLR-1753.patch
>
>




[jira] Updated: (SOLR-1753) StatsComponent throws java.lang.NullPointerException when getting statistics for facets in distributed search

2010-02-04 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1753:
-

Affects Version/s: (was: 1.5)
Fix Version/s: 1.5

> StatsComponent throws java.lang.NullPointerException when getting statistics 
> for facets in distributed search
> -
>
> Key: SOLR-1753
> URL: https://issues.apache.org/jira/browse/SOLR-1753
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
> Environment: Windows
>Reporter: Janne Majaranta
>Assignee: Koji Sekiguchi
> Fix For: 1.5
>
> Attachments: SOLR-1753.patch
>
>




[jira] Assigned: (SOLR-1753) StatsComponent throws java.lang.NullPointerException when getting statistics for facets in distributed search

2010-02-04 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reassigned SOLR-1753:


Assignee: Koji Sekiguchi

> StatsComponent throws java.lang.NullPointerException when getting statistics 
> for facets in distributed search
> -
>
> Key: SOLR-1753
> URL: https://issues.apache.org/jira/browse/SOLR-1753
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
> Environment: Windows
>Reporter: Janne Majaranta
>Assignee: Koji Sekiguchi
> Fix For: 1.5
>
> Attachments: SOLR-1753.patch
>
>




[jira] Commented: (SOLR-236) Field collapsing

2010-02-04 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829522#action_12829522
 ] 

Koji Sekiguchi commented on SOLR-236:
-

The following snippet in CollapseComponent.doProcess():

{code}
DocListAndSet results = searcher.getDocListAndSet(rb.getQuery(),
  collapseResult == null ? rb.getFilters() : null,
  collapseResult.getCollapsedDocset(),
  rb.getSortSpec().getSort(),
  rb.getSortSpec().getOffset(),
  rb.getSortSpec().getCount(),
  rb.getFieldFlags());
{code}

The second argument implies that collapseResult may be null. If it is null, 
won't we get an NPE on the third line?
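The concern can be reproduced with a minimal stand-in. CollapseResult and the argument list below are simplified mock-ups, not the actual Solr types; the second method is just one possible guard, not the committed fix:

```java
public class NpeDemo {
    static final class CollapseResult {
        Object getCollapsedDocset() { return "docset"; }
    }

    // Mirrors the snippet above: the second argument is null-guarded,
    // but the third dereferences collapseResult unconditionally.
    static Object[] buildArgs(CollapseResult collapseResult, Object filters) {
        return new Object[] {
            collapseResult == null ? filters : null,
            collapseResult.getCollapsedDocset()   // NPE when collapseResult == null
        };
    }

    // One possible guard, extending the same null check to the docset argument.
    static Object[] buildArgsGuarded(CollapseResult collapseResult, Object filters) {
        return new Object[] {
            collapseResult == null ? filters : null,
            collapseResult == null ? null : collapseResult.getCollapsedDocset()
        };
    }

    public static void main(String[] args) {
        try {
            buildArgs(null, "filters");
        } catch (NullPointerException e) {
            System.out.println("NPE, as suspected");
        }
        System.out.println(buildArgsGuarded(null, "filters")[1]);  // null
    }
}
```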

> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Emmanuel Keller
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.5
>
> Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
> collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
> collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
> field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
> quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
> SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, 
> SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before 
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-236) Field collapsing

2010-02-01 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828039#action_12828039
 ] 

Koji Sekiguchi commented on SOLR-236:
-

A random comment: don't we also need to check that collapse.field is indexed in 
checkCollapseField()?

{code}
protected void checkCollapseField(IndexSchema schema) {
  SchemaField schemaField = schema.getFieldOrNull(collapseField);
  if (schemaField == null) {
    throw new RuntimeException("Could not collapse, because collapse field does not exist in the schema.");
  }

  if (schemaField.multiValued()) {
    throw new RuntimeException("Could not collapse, because collapse field is multivalued");
  }

  if (schemaField.getType().isTokenized()) {
    throw new RuntimeException("Could not collapse, because collapse field is tokenized");
  }
}
{code}

When I accidentally specified an unindexed field for collapse.field, I got 
unexpected results without any errors.
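A hedged sketch of the extra guard being suggested: Solr's real SchemaField exposes an indexed() accessor, but the class below is a self-contained stand-in so the example compiles on its own. The messages mirror the snippet above; the added check is the one for unindexed fields.

```java
// Sketch of checkCollapseField() with the proposed indexed() guard added.
// SchemaField here is a self-contained stand-in for Solr's class.
public class CollapseFieldCheck {
    static class SchemaField {
        private final boolean indexed, multiValued, tokenized;
        SchemaField(boolean indexed, boolean multiValued, boolean tokenized) {
            this.indexed = indexed; this.multiValued = multiValued; this.tokenized = tokenized;
        }
        boolean indexed() { return indexed; }
        boolean multiValued() { return multiValued; }
        boolean tokenized() { return tokenized; }
    }

    static void checkCollapseField(SchemaField f) {
        if (f == null) {
            throw new RuntimeException("Could not collapse, because collapse field does not exist in the schema.");
        }
        if (!f.indexed()) { // the proposed additional check
            throw new RuntimeException("Could not collapse, because collapse field is not indexed.");
        }
        if (f.multiValued()) {
            throw new RuntimeException("Could not collapse, because collapse field is multivalued");
        }
        if (f.tokenized()) {
            throw new RuntimeException("Could not collapse, because collapse field is tokenized");
        }
    }
}
```

With this guard, an unindexed collapse.field fails fast at startup instead of silently producing wrong results.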

> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Emmanuel Keller
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.5
>
> Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
> collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
> collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
> field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-5.patch, field-collapse-5.patch, 
> field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
> quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
> SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, 
> SOLR-236_collapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before 
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2010-01-29 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1268:
-

Attachment: SOLR-1268-0_fragsize.patch

{quote}
I have noticed an exception is thrown when using fragSize = 0 (which should 
return the whole field highlighted):
"fragCharSize(0) is too small. It must be 18 or higher. 
java.lang.IllegalArgumentException: fragCharSize(0) is too small. It must be 18 
or higher"
{quote}

Thanks, Marc.
Solr 1.4 uses NullFragmenter, which highlights the whole content when you set 
fragsize to 0, but FVH doesn't have such a feature because it uses a different 
algorithm.
In the attached patch, Solr sets fragsize to Integer.MAX_VALUE if the user tries 
to set 0 while FVH is used. This prevents the runtime error.
I think this is necessary at the Solr level because Solr automatically switches 
to FVH when the highlighted field has termVectors, termPositions, and 
termOffsets all set to true, unless hl.useHighlighter is set to true.
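The clamp described above can be sketched as follows (illustrative names, not the patch's actual code): when FVH is active and the user asks for fragsize 0, substitute Integer.MAX_VALUE so FVH's minimum-fragment-size check never fires.

```java
// Sketch of the fragsize clamp: fragsize 0 means "whole field", which FVH
// cannot express directly, so it is promoted to Integer.MAX_VALUE.
public class FragsizeClamp {
    static int effectiveFragsize(int requested, boolean useFastVectorHighlighter) {
        if (useFastVectorHighlighter && requested == 0) {
            return Integer.MAX_VALUE; // avoids FVH's "fragCharSize(0) is too small" error
        }
        return requested; // the regular highlighter handles 0 via NullFragmenter
    }
}
```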

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1268-0_fragsize.patch, SOLR-1268.patch, 
> SOLR-1268.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1731) ArrayIndexOutOfBoundsException when highlighting

2010-01-23 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804087#action_12804087
 ] 

Koji Sekiguchi commented on SOLR-1731:
--

So why don't you use uni-grams on both the index and query sides for the sku field?

{code}
(analyzer configuration XML stripped by the mail archiver; it sketched an
index/query analyzer producing uni-grams for the sku field)
{code}

{quote}
As far as my application cares, those are all equivalent and should just be 
indexed as:

a1280c
{quote}

To eliminate space/period/hyphen, mapping.txt would look like:

{code}
" " => ""
"." => ""
"-" => ""
{code}
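As a plain-Java illustration (not Solr's MappingCharFilter, just the same effect), the mapping above collapses the variant SKU spellings to one canonical token:

```java
// Illustration of the normalization the mapping.txt above achieves:
// drop spaces, periods, and hyphens so every SKU variant indexes identically.
public class SkuNormalize {
    static String normalize(String sku) {
        return sku.replaceAll("[ .\\-]", "").toLowerCase();
    }
}
```

"A 1280 C", "A-1280.C", and "a1280c" all normalize to the same string, so a uni-gram analyzer downstream sees one contiguous token stream.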



> ArrayIndexOutOfBoundsException when highlighting
> 
>
> Key: SOLR-1731
> URL: https://issues.apache.org/jira/browse/SOLR-1731
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 1.4
>Reporter: Tim Underwood
>Priority: Minor
>
> I'm seeing a java.lang.ArrayIndexOutOfBoundsException when trying to 
> highlight for certain queries.  The error seems to be an issue with the 
> combination of the ShingleFilterFactory, PositionFilterFactory and the 
> LengthFilterFactory. 
> Here's my fieldType definition:
> (fieldType XML mostly stripped by the mail archiver; the surviving attributes
> show omitNorms="true", a WordDelimiterFilter with generateNumberParts="0"
> catenateWords="0" catenateNumbers="0" catenateAll="1", a ShingleFilter with
> outputUnigrams="true", plus the PositionFilter and LengthFilter named above)
> 
> 
> Here's the field definition:
> (field XML stripped; it declared omitNorms="true" and used the fieldType above)
> Here's a sample doc:
> (document XML stripped; the field values were 1 and "A 1280 C")
> Doing a query for sku_new:"A 1280 C" and requesting highlighting throws the 
> exception (full stack trace below):  
> http://localhost:8983/solr/select/?q=sku_new%3A%22A+1280+C%22&version=2.2&start=0&rows=10&indent=on&&hl=on&hl.fl=sku_new&fl=*
> If I comment out the LengthFilterFactory from my query analyzer section 
> everything seems to work.  Commenting out just the PositionFilterFactory also 
> makes the exception go away and seems to work for this specific query.
> Full stack trace:
> java.lang.ArrayIndexOutOfBoundsException: -1
> at 
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:202)
> at 
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:414)
> at 
> org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:216)
> at 
> org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:184)
> at 
> org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:226)
> at 
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:335)
> at 
> org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
> at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
> at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
> at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
> at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
> at 
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
> at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
> at org.mortbay.jetty.Server.handle(Server.java:285)
> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
> at 
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
> at 
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
> at 
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (SOLR-1731) ArrayIndexOutOfBoundsException when highlighting

2010-01-22 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803976#action_12803976
 ] 

Koji Sekiguchi commented on SOLR-1731:
--

Can't you use WhitespaceTokenizer for index? 

> ArrayIndexOutOfBoundsException when highlighting
> 
>
> Key: SOLR-1731
> URL: https://issues.apache.org/jira/browse/SOLR-1731
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 1.4
>Reporter: Tim Underwood
>Priority: Minor
>
> I'm seeing a java.lang.ArrayIndexOutOfBoundsException when trying to 
> highlight for certain queries.  The error seems to be an issue with the 
> combination of the ShingleFilterFactory, PositionFilterFactory and the 
> LengthFilterFactory. 
> Here's my fieldType definition:
> (fieldType XML mostly stripped by the mail archiver; the surviving attributes
> show omitNorms="true", a WordDelimiterFilter with generateNumberParts="0"
> catenateWords="0" catenateNumbers="0" catenateAll="1", a ShingleFilter with
> outputUnigrams="true", plus the PositionFilter and LengthFilter named above)
> 
> 
> Here's the field definition:
> (field XML stripped; it declared omitNorms="true" and used the fieldType above)
> Here's a sample doc:
> (document XML stripped; the field values were 1 and "A 1280 C")
> Doing a query for sku_new:"A 1280 C" and requesting highlighting throws the 
> exception (full stack trace below):  
> http://localhost:8983/solr/select/?q=sku_new%3A%22A+1280+C%22&version=2.2&start=0&rows=10&indent=on&&hl=on&hl.fl=sku_new&fl=*
> If I comment out the LengthFilterFactory from my query analyzer section 
> everything seems to work.  Commenting out just the PositionFilterFactory also 
> makes the exception go away and seems to work for this specific query.
> Full stack trace:
> java.lang.ArrayIndexOutOfBoundsException: -1
> at 
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:202)
> at 
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:414)
> at 
> org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:216)
> at 
> org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:184)
> at 
> org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:226)
> at 
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:335)
> at 
> org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
> at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
> at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
> at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
> at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
> at 
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
> at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
> at org.mortbay.jetty.Server.handle(Server.java:285)
> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
> at 
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
> at 
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
> at 
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802014#action_12802014
 ] 

Koji Sekiguchi commented on SOLR-1725:
--

I like the idea, Uri. I've not looked into the patch yet; does it depend on Java 6? 
I think Solr supports Java 5. There is a ScriptTransformer in DIH which uses 
javax.script, but it looks like ScriptEngineManager is loaded at runtime.
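The runtime-loading trick alluded to above can be sketched like this (a hypothetical probe, not DIH's actual code): reference javax.script reflectively so the surrounding class still initializes on a Java 5 runtime, and enable the scripting feature only when the API is actually present.

```java
// Probe for javax.script without a compile-time dependency on it, so the
// surrounding class can still load on Java 5.
public class ScriptEngineProbe {
    static boolean scriptingAvailable() {
        try {
            Class.forName("javax.script.ScriptEngineManager");
            return true; // Java 6+: the scripting API is on the classpath
        } catch (ClassNotFoundException e) {
            return false; // Java 5: degrade gracefully instead of failing to load
        }
    }
}
```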

> Script based UpdateRequestProcessorFactory
> --
>
> Key: SOLR-1725
> URL: https://issues.apache.org/jira/browse/SOLR-1725
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Affects Versions: 1.4
>Reporter: Uri Boness
> Attachments: SOLR-1725.patch, SOLR-1725.patch
>
>
> A script based UpdateRequestProcessorFactory (Uses JDK6 script engine 
> support). The main goal of this plugin is to be able to configure/write 
> update processors without the need to write and package Java code.
> The update request processor factory enables writing update processors in 
> scripts located in the {{solr.solr.home}} directory. The factory accepts one 
> (mandatory) configuration parameter named {{scripts}} which accepts a 
> comma-separated list of file names. It will look for these files under the 
> {{conf}} directory in solr home. When multiple scripts are defined, their 
> execution order is defined by the lexicographical order of the script file 
> name (so {{scriptA.js}} will be executed before {{scriptB.js}}).
> The script language is resolved based on the script file extension (that is, 
> a *.js file will be treated as a JavaScript script), therefore an extension 
> is mandatory.
> Each script file is expected to have one or more methods with the same 
> signature as the methods in the {{UpdateRequestProcessor}} interface. It is 
> *not* required to define all methods, only those that are required by the 
> processing logic.
> The following variables are defined as global variables for each script:
>  * {{req}} - The SolrQueryRequest
>  * {{rsp}}- The SolrQueryResponse
>  * {{logger}} - A logger that can be used for logging purposes in the script
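The execution-order rule in the description above amounts to a lexicographical sort of the configured file names; a minimal sketch:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Scripts run in lexicographical order of file name, so scriptA.js precedes
// scriptB.js regardless of the order in which they were configured.
public class ScriptOrder {
    static List<String> executionOrder(List<String> scriptFiles) {
        List<String> ordered = new ArrayList<>(scriptFiles);
        Collections.sort(ordered);
        return ordered;
    }
}
```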

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1696) Deprecate old syntax and move configuration to HighlightComponent

2010-01-09 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1696:
-

Attachment: SOLR-1696.patch

A new patch attached: synced with trunk, plus a warning log when the deprecated 
syntax is found (the idea Chris mentioned above).

> Deprecate old  syntax and move configuration to 
> HighlightComponent
> 
>
> Key: SOLR-1696
> URL: https://issues.apache.org/jira/browse/SOLR-1696
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Noble Paul
> Fix For: 1.5
>
> Attachments: SOLR-1696.patch, SOLR-1696.patch
>
>
> There is no reason why we should have a custom syntax for highlighter 
> configuration.
> It can be treated like any other SearchComponent and all the configuration 
> can go in there.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1696) Deprecate old syntax and move configuration to HighlightComponent

2010-01-08 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798312#action_12798312
 ] 

Koji Sekiguchi commented on SOLR-1696:
--

I've just committed SOLR-1268. Now I'm trying to contribute a patch for this to 
sync with trunk...

> Deprecate old  syntax and move configuration to 
> HighlightComponent
> 
>
> Key: SOLR-1696
> URL: https://issues.apache.org/jira/browse/SOLR-1696
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Noble Paul
> Fix For: 1.5
>
> Attachments: SOLR-1696.patch
>
>
> There is no reason why we should have a custom syntax for highlighter 
> configuration.
> It can be treated like any other SearchComponent and all the configuration 
> can go in there.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2010-01-08 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1268.
--

Resolution: Fixed

Committed revision 897383.

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1268.patch, SOLR-1268.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter

2010-01-08 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798271#action_12798271
 ] 

Koji Sekiguchi commented on SOLR-1653:
--

Thanks, Paul! I've just committed revision 897357.

> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch, SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression to match the target of the 
> replacement string in the char stream.
> Usage:
> {code:title=schema.xml}
> (XML partially stripped by the mail archiver; a fieldType with
> positionIncrementGap="100" whose analyzer chained a PatternReplaceCharFilter
> with groupedPattern="([nN][oO]\.)\s*(\d+)" replaceGroups="1,2"
> blockDelimiters=":;" and a MappingCharFilter with
> mapping="mapping-ISOLatin1Accent.txt")
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1696) Deprecate old syntax and move configuration to HighlightComponent

2010-01-07 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797841#action_12797841
 ] 

Koji Sekiguchi commented on SOLR-1696:
--

Noble, thank you for opening this and attaching the patch! Are you planning to 
commit this shortly? I ask because I'm ready to commit SOLR-1268, which uses the 
old-style config. If you commit this first, I'll rewrite SOLR-1268; or I can 
assign SOLR-1696 to myself.

> Deprecate old  syntax and move configuration to 
> HighlightComponent
> 
>
> Key: SOLR-1696
> URL: https://issues.apache.org/jira/browse/SOLR-1696
> Project: Solr
>  Issue Type: Improvement
>  Components: highlighter
>Reporter: Noble Paul
> Fix For: 1.5
>
> Attachments: SOLR-1696.patch
>
>
> There is no reason why we should have a custom syntax for highlighter 
> configuration.
> It can be treated like any other SearchComponent and all the configuration 
> can go in there.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2010-01-04 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796147#action_12796147
 ] 

Koji Sekiguchi commented on SOLR-1268:
--

I'll commit in a few days if nobody objects.

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1268.patch, SOLR-1268.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2010-01-03 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796075#action_12796075
 ] 

Koji Sekiguchi commented on SOLR-1268:
--

I'm introducing  and  new sub tags of 
 in solrconfig.xml in this patch, rather than 
. I think we can open a separate ticket for moving 
 settings to , if needed.

FYI:
http://old.nabble.com/highlighting-setting-in-solrconfig.xml-td26984003.html

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1268.patch, SOLR-1268.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2010-01-03 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1268:
-

Attachment: SOLR-1268.patch

Added a few SolrFragmentsBuilders and test cases.

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1268.patch, SOLR-1268.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2010-01-02 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1268:
-

Attachment: SOLR-1268.patch

First draft, untested patch attached.

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1268.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1670) synonymfilter/map repeat bug

2009-12-19 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792928#action_12792928
 ] 

Koji Sekiguchi commented on SOLR-1670:
--

Robert, sorry, I wanted to say I agree with you regarding "the test for 
'repeats' has a flaw". The "boost TF" remark was just an aside, though I don't 
know whether it is an intentional feature or a side effect.

Why don't you fix the flaws in the SynonymFilter test in this ticket first, and 
then fix SOLR-1674? (I've not looked into SOLR-1674 yet.)

> synonymfilter/map repeat bug
> 
>
> Key: SOLR-1670
> URL: https://issues.apache.org/jira/browse/SOLR-1670
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Robert Muir
> Attachments: SOLR-1670_test.patch
>
>
> As part of converting tests for SOLR-1657, I ran into a problem with 
> SynonymFilter.
> The test for 'repeats' has a flaw: it uses this assertTokEqual construct, 
> which does not really validate that two lists of tokens are equal; it just 
> stops at the shorter one.
> {code}
> // repeats
> map.add(strings("a b"), tokens("ab"), orig, merge);
> map.add(strings("a b"), tokens("ab"), orig, merge);
> assertTokEqual(getTokList(map,"a b",false), tokens("ab"));
> /* in reality the result from getTokList is ab ab ab! */
> {code}
> When converted to assertTokenStreamContents, this problem surfaced. Attached 
> is an additional assertion to the existing test case.
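The flaw described above is easy to reproduce with a toy comparison (stand-in code, not the actual assertTokEqual): stopping at the shorter list makes unequal token lists compare equal.

```java
import java.util.List;

// Toy version of the flawed comparison: it iterates only up to the shorter
// list, so ["ab"] and ["ab", "ab", "ab"] wrongly compare as equal.
public class AssertTokEqualFlaw {
    static boolean flawedTokEqual(List<String> a, List<String> b) {
        int n = Math.min(a.size(), b.size()); // bug: trailing tokens are ignored
        for (int i = 0; i < n; i++) {
            if (!a.get(i).equals(b.get(i))) {
                return false;
            }
        }
        return true; // never checked that the lists have the same length
    }
}
```

A length check (or comparing against the full expected stream, as assertTokenStreamContents does) closes the hole.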

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1670) synonymfilter/map repeat bug

2009-12-19 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792920#action_12792920
 ] 

Koji Sekiguchi commented on SOLR-1670:
--

bq. the test for 'repeats' has a flaw: it uses this assertTokEqual construct, 
which does not really validate that two lists of tokens are equal; it just 
stops at the shorter one.

I agree with you regarding this part. But I'm not sure that the following 
size() should be 1 in your patch:

{code}
+assertEquals(1, getTokList(map,"a b",false).size());
{code}

If what "repeats" implies is repeating the same term intentionally, I think it 
can boost tf.
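The tf-boosting point can be seen with a toy term-frequency count (illustration only, not Lucene's scoring code):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Emitting the same synonym token several times inflates its term frequency,
// which a tf-based scorer then rewards.
public class TfCount {
    static Map<String, Integer> termFreq(List<String> tokens) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : tokens) {
            tf.merge(t, 1, Integer::sum);
        }
        return tf;
    }
}
```

If the repeated mapping really emits "ab ab ab", the term's tf is 3 instead of 1, so matching documents score higher.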

> synonymfilter/map repeat bug
> 
>
> Key: SOLR-1670
> URL: https://issues.apache.org/jira/browse/SOLR-1670
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Robert Muir
> Attachments: SOLR-1670_test.patch
>
>
> As part of converting tests for SOLR-1657, I ran into a problem with 
> SynonymFilter.
> The test for 'repeats' has a flaw: it uses this assertTokEqual construct, 
> which does not really validate that two lists of tokens are equal; it just 
> stops at the shorter one.
> {code}
> // repeats
> map.add(strings("a b"), tokens("ab"), orig, merge);
> map.add(strings("a b"), tokens("ab"), orig, merge);
> assertTokEqual(getTokList(map,"a b",false), tokens("ab"));
> /* in reality the result from getTokList is ab ab ab! */
> {code}
> When converted to assertTokenStreamContents, this problem surfaced. Attached 
> is an additional assertion to the existing test case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1653) add PatternReplaceCharFilter

2009-12-15 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1653.
--

Resolution: Fixed

Committed revision 890798. Thanks Shalin and Noble for taking time to review 
the patch.

> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch, SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression to match the target of the 
> replacement string in the char stream.
> Usage:
> {code:title=schema.xml}
> (XML partially stripped by the mail archiver; a fieldType with
> positionIncrementGap="100" whose analyzer chained a PatternReplaceCharFilter
> with groupedPattern="([nN][oO]\.)\s*(\d+)" replaceGroups="1,2"
> blockDelimiters=":;" and a MappingCharFilter with
> mapping="mapping-ISOLatin1Accent.txt")
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter

2009-12-14 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790572#action_12790572
 ] 

Koji Sekiguchi commented on SOLR-1653:
--

I see that the existing "PatternReplaceFilter" (not CharFilter) is using 
"pattern". But it uses "replacement", not "replaceWith". I think I'll use 
"pattern" and "replacement".

> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch, SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1653) add PatternReplaceCharFilter

2009-12-14 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1653:
-

Attachment: SOLR-1653.patch

My apologies: because I tried to correct offsets per group within a match when I 
started the first patch, I introduced my own syntax. But yes, now I've 
implemented offset correction per match, so I can use the standard syntax. Here 
is the new patch.

Usage:
{code:title=schema.xml}
<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="([nN][oO]\.)\s*(\d+)" replacement="$1$2"/>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}

If there are no objections, I'll commit later today.

> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch, SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter

2009-12-14 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790127#action_12790127
 ] 

Koji Sekiguchi commented on SOLR-1653:
--

bq. I guess this can be achieved with the matcher#replaceAll() directly 

You would be right if we didn't correct the offsets of the output char stream, 
but I need to process one match at a time.
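As a sketch of the point above (hypothetical Python, not the actual patch): iterating one match at a time lets us record an offset-correction entry per replacement, which a single matcher#replaceAll() call would not expose:

```python
import re

def replace_with_offset_map(pattern, replacement, text):
    """Apply replacements one match at a time, recording, at each point
    in the output where a replacement ended, the cumulative length delta
    back into the original text. replaceAll() would produce the same
    string but lose these per-match deltas needed for offset correction."""
    out = []
    corrections = []   # (output_position, cumulative_delta_into_input)
    last = 0
    delta = 0
    for m in re.finditer(pattern, text):
        out.append(text[last:m.start()])
        rep = m.expand(replacement)
        out.append(rep)
        delta += (m.end() - m.start()) - len(rep)
        corrections.append((sum(len(s) for s in out), delta))
        last = m.end()
    out.append(text[last:])
    return "".join(out), corrections

result, corr = replace_with_offset_map(r"(\w+)ing", r"\1", "see-ing looking")
# result == "see-ing look"; corr records that output offset 12 maps
# 3 characters further into the original input
```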

> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-1653) add PatternReplaceCharFilter

2009-12-14 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790056#action_12790056
 ] 

Koji Sekiguchi edited comment on SOLR-1653 at 12/14/09 9:30 AM:


Ok. I'll show you some samples ;-)

||INPUT||groupedPattern||replaceGroups||OUTPUT||comment||
|see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word|
|see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted|
|No.1 NO. no.  543|[nN][oO]\.\s*(\d+)|{#},1|#1  NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern|
|abc=1234=5678|(\w+)=(\d+)=(\d+)|3,{=},1,{=},2|5678=abc=1234|change the order of the groups|
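A rough Python sketch of the semantics in the table above (the helper name is made up, and blockDelimiters is not modeled; the real implementation is in the attached Java patch): plain items select capture groups by number, and {...} items are inserted as literals:

```python
import re

def apply_replace_groups(grouped_pattern, replace_groups, text):
    """Sketch of the proposed replaceGroups syntax: a comma-separated
    list where a bare number picks that capture group and {x} emits
    the literal x."""
    items = replace_groups.split(",")
    def build(m):
        parts = []
        for item in items:
            if item.startswith("{") and item.endswith("}"):
                parts.append(item[1:-1])           # literal text
            else:
                parts.append(m.group(int(item)))   # captured group
        return "".join(parts)
    return re.sub(grouped_pattern, build, text)

# Rows from the sample table above (simplified inputs):
apply_replace_groups(r"(\w+)ing", "1", "see-ing looking")             # "see-ing look"
apply_replace_groups(r"[nN][oO]\.\s*(\d+)", "{#},1", "No.1 no. 543")  # "#1 #543"
apply_replace_groups(r"(\w+)=(\d+)=(\d+)", "3,{=},1,{=},2", "abc=1234=5678")  # "5678=abc=1234"
```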


  was (Author: koji):
Ok. I'll show you some samples ;-)

||INPUT||groupedPattern||replaceGroups||OUTPUT||comment||
|see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word|
|see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted|
|No.1 NO. no.  543|[nN][oO]\.\s*(\d+)|{#},1|#1  NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern|
|abc-1234-5678|(\w+)=(\d+)=(\d+)|3,{=},1,{=},2|5678=abc=1234|change the order of the groups|

  
> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-1653) add PatternReplaceCharFilter

2009-12-14 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790056#action_12790056
 ] 

Koji Sekiguchi edited comment on SOLR-1653 at 12/14/09 9:28 AM:


Ok. I'll show you some samples ;-)

||INPUT||groupedPattern||replaceGroups||OUTPUT||comment||
|see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word|
|see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted|
|No.1 NO. no.  543|[nN][oO]\.\s*(\d+)|{#},1|#1  NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern|
|abc-1234-5678|(\w+)=(\d+)=(\d+)|3,{=},1,{=},2|5678-abc-1234|change the order of the groups|


  was (Author: koji):
Ok. I'll show you some samples ;-)

||INPUT||groupedPattern||replaceGroups||OUTPUT||comment||
|see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word|
|see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted|
|No.1 NO. no.  543|[nN][oO]\.\s*(\d+)|{#},1|#1  NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern|
|abc-1234-5678|(\w+)--(\d+)--(\d+)|3,{--},1,{--},2|5678-abc-1234|change the order of the groups|

  
> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-1653) add PatternReplaceCharFilter

2009-12-14 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790056#action_12790056
 ] 

Koji Sekiguchi edited comment on SOLR-1653 at 12/14/09 9:29 AM:


Ok. I'll show you some samples ;-)

||INPUT||groupedPattern||replaceGroups||OUTPUT||comment||
|see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word|
|see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted|
|No.1 NO. no.  543|[nN][oO]\.\s*(\d+)|{#},1|#1  NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern|
|abc-1234-5678|(\w+)=(\d+)=(\d+)|3,{=},1,{=},2|5678=abc=1234|change the order of the groups|


  was (Author: koji):
Ok. I'll show you some samples ;-)

||INPUT||groupedPattern||replaceGroups||OUTPUT||comment||
|see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word|
|see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted|
|No.1 NO. no.  543|[nN][oO]\.\s*(\d+)|{#},1|#1  NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern|
|abc-1234-5678|(\w+)=(\d+)=(\d+)|3,{=},1,{=},2|5678-abc-1234|change the order of the groups|

  
> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-1653) add PatternReplaceCharFilter

2009-12-14 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790056#action_12790056
 ] 

Koji Sekiguchi edited comment on SOLR-1653 at 12/14/09 9:27 AM:


Ok. I'll show you some samples ;-)

||INPUT||groupedPattern||replaceGroups||OUTPUT||comment||
|see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word|
|see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted|
|No.1 NO. no.  543|[nN][oO]\.\s*(\d+)|{#},1|#1  NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern|
|abc-1234-5678|(\w+)--(\d+)--(\d+)|3,{--},1,{--},2|5678-abc-1234|change the order of the groups|


  was (Author: koji):
Ok. I'll show you some samples ;-)

||INPUT||groupedPattern||replaceGroups||OUTPUT||comment||
|see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word|
|see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted|
|No.1 NO. no.  543|[nN][oO]\.\s*(\d+)|{#},1|#1  NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern|
|abc-1234-5678|(\w+)-(\d+)-(\d+)|3,{-},1,{-},2|5678-abc-1234|change the order of the groups|

  
> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter

2009-12-14 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790056#action_12790056
 ] 

Koji Sekiguchi commented on SOLR-1653:
--

Ok. I'll show you some samples ;-)

||INPUT||groupedPattern||replaceGroups||OUTPUT||comment||
|see-ing looking|(\w+)(ing)|1|see-ing look|remove "ing" from the end of word|
|see-ing looking|(\w+)ing|1|see-ing look|same as above. 2nd parentheses can be omitted|
|No.1 NO. no.  543|[nN][oO]\.\s*(\d+)|{#},1|#1  NO. #543|sample for literal. do not forget to set blockDelimiters other than period when you use period in groupedPattern|
|abc-1234-5678|(\w+)-(\d+)-(\d+)|3,{-},1,{-},2|5678-abc-1234|change the order of the groups|


> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1653) add PatternReplaceCharFilter

2009-12-13 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789957#action_12789957
 ] 

Koji Sekiguchi commented on SOLR-1653:
--

I'll commit in a few days.

> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (SOLR-1653) add PatternReplaceCharFilter

2009-12-13 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reassigned SOLR-1653:


Assignee: Koji Sekiguchi

> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1653) add PatternReplaceCharFilter

2009-12-13 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1653:
-

Attachment: SOLR-1653.patch

> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression for the target of replace 
> string in char stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1653) add PatternReplaceCharFilter

2009-12-13 Thread Koji Sekiguchi (JIRA)
add PatternReplaceCharFilter


 Key: SOLR-1653
 URL: https://issues.apache.org/jira/browse/SOLR-1653
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Priority: Minor
 Fix For: 1.5


Add a new CharFilter that uses a regular expression to match the target string 
to be replaced in the char stream.

Usage:
{code:title=schema.xml}
<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100" >
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                groupedPattern="([nN][oO]\.)\s*(\d+)"
                replaceGroups="1,2" blockDelimiters=":;"/>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1606) Integrate Near Realtime

2009-12-05 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786448#action_12786448
 ] 

Koji Sekiguchi commented on SOLR-1606:
--

Jason, I got a failure when running TestRefreshReader.

> Integrate Near Realtime 
> 
>
> Key: SOLR-1606
> URL: https://issues.apache.org/jira/browse/SOLR-1606
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1606.patch
>
>
> We'll integrate IndexWriter.getReader.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1607) use a proper key other than IndexReader for ExternalFileField and QueryElevationComponent to work properly when reopenReaders is set to true

2009-11-28 Thread Koji Sekiguchi (JIRA)
use a proper key other than IndexReader for ExternalFileField and 
QueryElevationComponent to work properly when reopenReaders is set to true


 Key: SOLR-1607
 URL: https://issues.apache.org/jira/browse/SOLR-1607
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Assignee: Koji Sekiguchi
Priority: Minor
 Fix For: 1.5


Since the reopenReaders feature was introduced in 1.4, it prevents the 
external_[fieldname] and elevate.xml files in dataDir from being reloaded when a 
commit is submitted.
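A hypothetical sketch (plain Python, not Solr's cache code) of the failure mode described: when a per-reader cache is keyed by the IndexReader instance and that instance survives a commit under reopenReaders, the changed file on disk is never re-read:

```python
class ExternalFileCache:
    """Stand-in for a cache of external_<fieldname> / elevate.xml contents
    keyed by the IndexReader object."""
    def __init__(self):
        self._cache = {}

    def get(self, reader_key, load):
        # Only loads when the key is unseen; a reused reader key
        # therefore pins the first value forever.
        if reader_key not in self._cache:
            self._cache[reader_key] = load()
        return self._cache[reader_key]

cache = ExternalFileCache()
reader = object()                       # reader object reused across a commit
v1 = cache.get(reader, lambda: "old file contents")
# ... commit happens, the file changes on disk, the reader is reused ...
v2 = cache.get(reader, lambda: "new file contents")
# v2 is still the old contents: the stale key hides the reload, which is
# why a key other than the IndexReader is needed.
```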

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1601) Schema browser does not indicate presence of charFilter

2009-11-25 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1601.
--

Resolution: Fixed

Committed revision 884180. Thanks, Jake.

> Schema browser does not indicate presence of charFilter
> ---
>
> Key: SOLR-1601
> URL: https://issues.apache.org/jira/browse/SOLR-1601
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Jake Brownell
>Assignee: Koji Sekiguchi
>Priority: Trivial
> Fix For: 1.5
>
> Attachments: SOLR-1601.patch
>
>
> My schema has a field defined as:
> {noformat}
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
>             catenateWords="1" catenateNumbers="1" catenateAll="0"
>             splitOnCaseChange="1" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
>             catenateWords="0" catenateNumbers="0" catenateAll="0"
>             splitOnCaseChange="1" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
> {noformat}
> and when I view the field in the schema browser, I see:
> {noformat}
> Tokenized:  true
> Class Name:  org.apache.solr.schema.TextField
> Index Analyzer: org.apache.solr.analysis.TokenizerChain 
> Tokenizer Class:  org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:  
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt 
> ignoreCase: true enablePositionIncrements: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: 
> 1 generateNumberParts: 1 catenateWords: 1 generateWordParts: 1 catenateAll: 0 
> catenateNumbers: 1 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: 
> protwords.txt }
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
> Query Analyzer: org.apache.solr.analysis.TokenizerChain 
> Tokenizer Class:  org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:  
> org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt 
> expand: true ignoreCase: true }
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt 
> ignoreCase: true enablePositionIncrements: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: 
> 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 
> catenateNumbers: 0 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: 
> protwords.txt }
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
> {noformat}
> It's not a big deal, but I expected to see some indication of the charFilter 
> that is in place.
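A tiny, purely illustrative sketch (hypothetical function, not Solr's admin JSP code) of the fix's intent: when rendering an analyzer chain, list its charFilters ahead of the tokenizer and filters instead of silently dropping them:

```python
def describe_chain(char_filters, tokenizer, filters):
    """Render an analyzer chain for display, including charFilters
    (the part the schema browser previously omitted)."""
    lines = []
    for cf in char_filters:
        lines.append(f"Char Filter: {cf}")
    lines.append(f"Tokenizer Class: {tokenizer}")
    for f in filters:
        lines.append(f"Filter: {f}")
    return lines

out = describe_chain(
    ["org.apache.solr.analysis.MappingCharFilterFactory"],
    "org.apache.solr.analysis.WhitespaceTokenizerFactory",
    ["org.apache.solr.analysis.StopFilterFactory"],
)
# out[0] now surfaces the MappingCharFilterFactory entry
```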

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1601) Schema browser does not indicate presence of charFilter

2009-11-25 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1601:
-

Attachment: SOLR-1601.patch

Will commit shortly.

> Schema browser does not indicate presence of charFilter
> ---
>
> Key: SOLR-1601
> URL: https://issues.apache.org/jira/browse/SOLR-1601
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Jake Brownell
>Assignee: Koji Sekiguchi
>Priority: Trivial
> Fix For: 1.5
>
> Attachments: SOLR-1601.patch
>
>
> My schema has a field defined as:
> {noformat}
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
>             catenateWords="1" catenateNumbers="1" catenateAll="0"
>             splitOnCaseChange="1" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
>             catenateWords="0" catenateNumbers="0" catenateAll="0"
>             splitOnCaseChange="1" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
> {noformat}
> and when I view the field in the schema browser, I see:
> {noformat}
> Tokenized:  true
> Class Name:  org.apache.solr.schema.TextField
> Index Analyzer: org.apache.solr.analysis.TokenizerChain 
> Tokenizer Class:  org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:  
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt 
> ignoreCase: true enablePositionIncrements: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: 
> 1 generateNumberParts: 1 catenateWords: 1 generateWordParts: 1 catenateAll: 0 
> catenateNumbers: 1 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: 
> protwords.txt }
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
> Query Analyzer: org.apache.solr.analysis.TokenizerChain 
> Tokenizer Class:  org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:  
> org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt 
> expand: true ignoreCase: true }
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt 
> ignoreCase: true enablePositionIncrements: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: 
> 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 
> catenateNumbers: 0 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: 
> protwords.txt }
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
> {noformat}
> It's not a big deal, but I expected to see some indication of the charFilter 
> that is in place.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1601) Schema browser does not indicate presence of charFilter

2009-11-25 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1601:
-

  Component/s: Schema and Analysis
Affects Version/s: 1.4
Fix Version/s: 1.5
 Assignee: Koji Sekiguchi

> Schema browser does not indicate presence of charFilter
> ---
>
> Key: SOLR-1601
> URL: https://issues.apache.org/jira/browse/SOLR-1601
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Jake Brownell
>Assignee: Koji Sekiguchi
>Priority: Trivial
> Fix For: 1.5
>
>
> My schema has a field defined as:
> {noformat}
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
>             catenateWords="1" catenateNumbers="1" catenateAll="0"
>             splitOnCaseChange="1" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
>             catenateWords="0" catenateNumbers="0" catenateAll="0"
>             splitOnCaseChange="1" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
> {noformat}
> and when I view the field in the schema browser, I see:
> {noformat}
> Tokenized:  true
> Class Name:  org.apache.solr.schema.TextField
> Index Analyzer: org.apache.solr.analysis.TokenizerChain 
> Tokenizer Class:  org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:  
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt 
> ignoreCase: true enablePositionIncrements: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: 
> 1 generateNumberParts: 1 catenateWords: 1 generateWordParts: 1 catenateAll: 0 
> catenateNumbers: 1 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: 
> protwords.txt }
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
> Query Analyzer: org.apache.solr.analysis.TokenizerChain 
> Tokenizer Class:  org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:  
> org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt 
> expand: true ignoreCase: true }
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt 
> ignoreCase: true enablePositionIncrements: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: 
> 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 
> catenateNumbers: 0 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: 
> protwords.txt }
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
> {noformat}
> It's not a big deal, but I expected to see some indication of the charFilter 
> that is in place.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1489) A UTF-8 character is output twice (Bug in Jetty)

2009-11-25 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1489:
-

Attachment: SOLR-1489.patch

Attached patch fixes the above failure, but I got another failure (no expires 
header):

{code}
Testcase: testCacheVetoHandler took 3.29 sec
Testcase: testCacheVetoException took 1.395 sec
FAILED
We got no Expires header
junit.framework.AssertionFailedError: We got no Expires header
at 
org.apache.solr.servlet.CacheHeaderTest.checkVetoHeaders(CacheHeaderTest.java:73)
at 
org.apache.solr.servlet.CacheHeaderTest.testCacheVetoException(CacheHeaderTest.java:59)

Testcase: testLastModified took 1.485 sec
Testcase: testEtag took 1.577 sec
Testcase: testCacheControl took 1.035 sec
{code}


> A UTF-8 character is output twice (Bug in Jetty)
> 
>
> Key: SOLR-1489
> URL: https://issues.apache.org/jira/browse/SOLR-1489
> Project: Solr
>  Issue Type: Bug
> Environment: Jetty-6.1.3
> Jetty-6.1.21
> Jetty-7.0.0RC6
>Reporter: Jun Ohtani
>Assignee: Koji Sekiguchi
>Priority: Critical
> Attachments: error_utf8-example.xml, jetty-6.1.22.jar, 
> jetty-util-6.1.22.jar, jettybugsample.war, jsp-2.1.zip, 
> servlet-api-2.5-20081211.jar, SOLR-1489.patch
>
>
> A UTF-8 character is output twice under particular conditions.
> The sample data is attached (error_utf8-example.xml).
> After registering only the sample data, click the following URL:
> http://localhost:8983/solr/select?q=*%3A*&version=2.2&start=0&rows=10&omitHeader=true&fl=attr_json&wt=json
> The sample data contains only "B", but the response is "BB".
> When wt=phps, an error occurs in the PHP unserialize() function.
> This looks like a bug in Jetty.
> jettybugsample.war is the simplest way to reproduce the problem.
> Copy it to example/webapps, start the Jetty server, and click the following URL:
> http://localhost:8983/jettybugsample/filter/hoge
> As before, "B" is output twice; System.out prints "B" only once.
> I have tested this on Jetty 6.1.3, 6.1.21, and 7.0.0rc6.
> (When testing with 6.1.21 or 7.0.0rc6, change "bufsize" from 128 to 512 in 
> web.xml.)
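The exact Jetty defect is not spelled out here, but this class of bug can be sketched in miniature (assumed buffer size and input; Python stand-in for a servlet output buffer): a multi-byte UTF-8 character straddling a write-buffer boundary must be carried over exactly once, and resending it from a stale carry-over position doubles the character:

```python
import codecs

text = "x" * 127 + "\u3042"            # 3-byte char straddles a 128-byte buffer
data = text.encode("utf-8")            # 130 bytes total
bufsize = 128

# Correct handling: an incremental decoder carries the partial bytes
# across the buffer boundary, so the character comes out exactly once.
dec = codecs.getincrementaldecoder("utf-8")()
correct = dec.decode(data[:bufsize]) + dec.decode(data[bufsize:])

# Buggy model: the full character is written, but the carry-over
# bookkeeping still points at the character's first byte (offset 127),
# so the next write resends those bytes and the character doubles,
# like "B" becoming "BB" in the report above.
chunk1 = data[:130]                    # everything, including the full char
chunk2 = data[127:]                    # resumed from the stale position
buggy = (chunk1 + chunk2).decode("utf-8")
# correct equals the original text; buggy ends with the character twice
```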

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1489) A UTF-8 character is output twice (Bug in Jetty)

2009-11-24 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782335#action_12782335
 ] 

Koji Sekiguchi commented on SOLR-1489:
--

Thanks, Ohtani-san.

Using the new Jetty jars (6.1.22), I ran ant test, but got a failure:

{code:title=TEST-org.apache.solr.servlet.CacheHeaderTest.txt}
Testcase: testCacheVetoHandler took 2.469 sec
Testcase: testCacheVetoException took 1.25 sec
FAILED
null expected:<[no-cache, ]no-store> but 
was:<[must-revalidate,no-cache,]no-store>
junit.framework.ComparisonFailure: null expected:<[no-cache, ]no-store> but 
was:<[must-revalidate,no-cache,]no-store>
at 
org.apache.solr.servlet.CacheHeaderTest.checkVetoHeaders(CacheHeaderTest.java:65)
at 
org.apache.solr.servlet.CacheHeaderTest.testCacheVetoException(CacheHeaderTest.java:59)

Testcase: testLastModified took 1.188 sec
Testcase: testEtag took 1.11 sec
Testcase: testCacheControl took 1.391 sec
{code}

According to SOLR-632, the cache-header-related test failed when we used 
jetty-6.1.11, and Lars filed https://jira.codehaus.org/browse/JETTY-646. Since 
that issue has been fixed, I thought jetty-6.1.22 should work. I haven't looked 
into the details of the cache header test, though.

> A UTF-8 character is output twice (Bug in Jetty)
> 
>
> Key: SOLR-1489
> URL: https://issues.apache.org/jira/browse/SOLR-1489
> Project: Solr
>  Issue Type: Bug
> Environment: Jetty-6.1.3
> Jetty-6.1.21
> Jetty-7.0.0RC6
>Reporter: Jun Ohtani
>Assignee: Koji Sekiguchi
>Priority: Critical
> Attachments: error_utf8-example.xml, jetty-6.1.22.jar, 
> jetty-util-6.1.22.jar, jettybugsample.war, jsp-2.1.zip, 
> servlet-api-2.5-20081211.jar
>
>
> A UTF-8 character is output twice under particular conditions.
> Sample data is attached (error_utf8-example.xml).
> After registering only the sample data, open the following URL:
> http://localhost:8983/solr/select?q=*%3A*&version=2.2&start=0&rows=10&omitHeader=true&fl=attr_json&wt=json
> The sample data contains only "B", but the response contains "BB".
> When wt=phps, an error occurs in PHP's unserialize() function.
> This looks like a bug in Jetty.
> jettybugsample.war is the simplest way to reproduce the problem.
> Copy it to example/webapps, start the Jetty server, and open the following URL:
> http://localhost:8983/jettybugsample/filter/hoge
> As before, B is output twice, while System.out prints B only once.
> I have tested this on Jetty 6.1.3, 6.1.21, and 7.0.0rc6.
> (When testing with 6.1.21 or 7.0.0rc6, change "bufsize" from 128 to 512 in
> web.xml.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1489) A UTF-8 character is output twice (Bug in Jetty)

2009-11-18 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779814#action_12779814
 ] 

Koji Sekiguchi commented on SOLR-1489:
--

Ok, http://jira.codehaus.org/browse/JETTY-1122 has been marked as fixed and 
Jetty 6.1.22 has been released. Ohtani-san, can you test the new Jetty with your 
test case to see whether the bug is gone? Thanks.

> A UTF-8 character is output twice (Bug in Jetty)
> 
>
> Key: SOLR-1489
> URL: https://issues.apache.org/jira/browse/SOLR-1489
> Project: Solr
>  Issue Type: Bug
> Environment: Jetty-6.1.3
> Jetty-6.1.21
> Jetty-7.0.0RC6
>Reporter: Jun Ohtani
>Assignee: Koji Sekiguchi
>Priority: Critical
> Attachments: error_utf8-example.xml, jettybugsample.war
>
>
> A UTF-8 character is output twice under particular conditions.
> Sample data is attached (error_utf8-example.xml).
> After registering only the sample data, open the following URL:
> http://localhost:8983/solr/select?q=*%3A*&version=2.2&start=0&rows=10&omitHeader=true&fl=attr_json&wt=json
> The sample data contains only "B", but the response contains "BB".
> When wt=phps, an error occurs in PHP's unserialize() function.
> This looks like a bug in Jetty.
> jettybugsample.war is the simplest way to reproduce the problem.
> Copy it to example/webapps, start the Jetty server, and open the following URL:
> http://localhost:8983/jettybugsample/filter/hoge
> As before, B is output twice, while System.out prints B only once.
> I have tested this on Jetty 6.1.3, 6.1.21, and 7.0.0rc6.
> (When testing with 6.1.21 or 7.0.0rc6, change "bufsize" from 128 to 512 in
> web.xml.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1506) Search multiple cores using MultiReader

2009-11-03 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773213#action_12773213
 ] 

Koji Sekiguchi commented on SOLR-1506:
--

bq. Commit doesn't work because reopen isn't supported by MultiReader.

Regarding MultiReader and reopen, I've set reopenReaders to false:

{code:title=solrconfig.xml}
<reopenReaders>false</reopenReaders>
  :

{code}


> Search multiple cores using MultiReader
> ---
>
> Key: SOLR-1506
> URL: https://issues.apache.org/jira/browse/SOLR-1506
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Priority: Trivial
> Fix For: 1.5
>
> Attachments: SOLR-1506.patch, SOLR-1506.patch
>
>
> I need to search over multiple cores, and SOLR-1477 is more
> complicated than expected, so here we'll create a MultiReader
> over the cores to allow searching on them.
> Maybe in the future we can add parallel searching; however,
> SOLR-1477, if it gets completed, provides that out of the box.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-822) CharFilter - normalize characters before tokenizer

2009-10-24 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769741#action_12769741
 ] 

Koji Sekiguchi commented on SOLR-822:
-

bq. Please update the Wiki for this feature. 

Done. :)

> CharFilter - normalize characters before tokenizer
> --
>
> Key: SOLR-822
> URL: https://issues.apache.org/jira/browse/SOLR-822
> Project: Solr
>  Issue Type: New Feature
>  Components: Analysis
>Affects Versions: 1.3
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.4
>
> Attachments: character-normalization.JPG, 
> japanese-h-to-k-mapping.txt, sample_mapping_ja.txt, sample_mapping_ja.txt, 
> SOLR-822-for-1.3.patch, SOLR-822-renameMethod.patch, SOLR-822.patch, 
> SOLR-822.patch, SOLR-822.patch, SOLR-822.patch, SOLR-822.patch
>
>
> A new plugin which can be placed in front of the tokenizer.
> {code:xml}
> <fieldType name="..." positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping_ja.txt" />
>     <tokenizer class="..."/>
>     <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
>   </analyzer>
> </fieldType>
> {code}
> CharFilters can be multiple (chained). I'll post a JPEG file to show a 
> character normalization sample soon.
> MOTIVATION:
> In Japan, there are two types of tokenizers -- N-gram (CJKTokenizer) and 
> morphological analyzers.
> When we use a morphological analyzer, we need to normalize characters 
> before tokenization, because the analyzer uses a Japanese dictionary to 
> detect terms.
> I'll post a patch soon, too.
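The mapping rules consumed by such a char filter use a simple "source" => "target" syntax. As a rough illustration of normalizing half-width katakana before a dictionary-based tokenizer sees the text (these particular pairs are made-up examples, not the contents of the attached sample_mapping_ja.txt), a tiny normalizer can be sketched like this:

```python
# Sketch only: a toy character-normalization step of the kind a
# CharFilter performs before tokenization. The mapping pairs below
# are illustrative assumptions, not the attached sample file.
MAPPING = {
    "ｱ": "ア",   # half-width katakana A -> full-width
    "ｲ": "イ",   # half-width katakana I -> full-width
}

def normalize(text: str) -> str:
    """Replace each mapped source character with its target."""
    return "".join(MAPPING.get(ch, ch) for ch in text)

print(normalize("ｱｲｳ"))  # third char has no rule and passes through
```

A real CharFilter must also track offset corrections so highlighting still points at the original text; this sketch ignores that concern.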

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-551) Solr replication should include the schema also

2009-10-23 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-551:


Component/s: (was: replication (scripts))
 replication (java)

change component from scripts to java

> Solr replication should include the schema also
> ---
>
> Key: SOLR-551
> URL: https://issues.apache.org/jira/browse/SOLR-551
> Project: Solr
>  Issue Type: Improvement
>  Components: replication (java)
>Affects Versions: 1.4
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
>
> The current Solr replication just copies the data directory. So if the
> schema changes and I re-index, it will blissfully copy the index
> and the slaves will fail because of an incompatible schema.
> So the steps we follow are
>  * Stop rsync on the slaves
>  * Update the master with the new schema
>  * Re-index the data
>  * For each slave
>  ** Kill the slave
>  ** Clean the data directory
>  ** Install the new schema
>  ** Restart
>  ** Do a manual snappull
> The amount of work the admin needs to do is quite significant
> (depending on the number of slaves). These are manual steps and very
> error-prone.
> The solution:
> Make the replication mechanism handle schema replication as well, so
> that all I need to do is change the master and the slaves sync
> automatically.
> What is a good way to implement this?
> We have an idea along the following lines.
> This should involve changes to the snapshooter and snappuller scripts
> and the snapinstaller components.
> Every time the snapshooter takes a snapshot, it must keep the timestamps
> of schema.xml and elevate.xml (all the files which might affect the
> runtime behavior in slaves).
> For subsequent snapshots, if the timestamp of any of them has changed,
> it must also copy all of them for replication.
> The snappuller copies the new directory as usual.
> The snapinstaller checks whether these config files are present;
> if yes,
>  * it can create a temporary core
>  * install the changed index and configuration
>  * load it completely and swap it out with the original core

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-561) Solr replication by Solr (for windows also)

2009-10-23 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-561:


Component/s: (was: replication (scripts))
 replication (java)

change component from scripts to java

> Solr replication by Solr (for windows also)
> ---
>
> Key: SOLR-561
> URL: https://issues.apache.org/jira/browse/SOLR-561
> Project: Solr
>  Issue Type: New Feature
>  Components: replication (java)
>Affects Versions: 1.4
> Environment: All
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.4
>
> Attachments: deletion_policy.patch, SOLR-561-core.patch, 
> SOLR-561-fixes.patch, SOLR-561-fixes.patch, SOLR-561-fixes.patch, 
> SOLR-561-full.patch, SOLR-561-full.patch, SOLR-561-full.patch, 
> SOLR-561-full.patch, SOLR-561.patch, SOLR-561.patch, SOLR-561.patch, 
> SOLR-561.patch, SOLR-561.patch, SOLR-561.patch, SOLR-561.patch, 
> SOLR-561.patch, SOLR-561.patch, SOLR-561.patch, SOLR-561.patch, 
> SOLR-561.patch, SOLR-561.patch, SOLR-561.patch
>
>
> The current replication strategy in Solr involves shell scripts. The 
> following are the drawbacks of that approach:
> * It does not work on Windows
> * Replication works as a separate piece, not integrated with Solr
> * Replication cannot be controlled from the Solr admin/JMX
> * Each operation requires a manual telnet to the host
> Doing the replication in Java has the following advantages:
> * Platform independence
> * Manual steps can be completely eliminated. Everything can be driven from 
> solrconfig.xml.
> ** Adding the URL of the master in the slaves should be enough to enable 
> replication. Other things, like the frequency of
> snapshoot/snappull, can also be configured. All other information can be 
> obtained automatically.
> * Start/stop can be triggered from solr/admin or JMX
> * We can get the status/progress while replication is going on, and also 
> abort an ongoing replication
> * No need to have a login on the machine
> * From a development perspective, we can unit test it
> This issue can track the implementation of Solr replication in Java.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1099) FieldAnalysisRequestHandler

2009-10-20 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1099.
--

Resolution: Fixed

Committed revision 827032. Thanks.

> FieldAnalysisRequestHandler
> ---
>
> Key: SOLR-1099
> URL: https://issues.apache.org/jira/browse/SOLR-1099
> Project: Solr
>  Issue Type: New Feature
>  Components: Analysis
>Affects Versions: 1.3
>Reporter: Uri Boness
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
> Attachments: AnalisysRequestHandler_refactored.patch, 
> analysis_request_handlers_incl_solrj.patch, 
> AnalysisRequestHandler_refactored1.patch, 
> FieldAnalysisRequestHandler_incl_test.patch, 
> SOLR-1099-ordered-TokenizerChain.patch, SOLR-1099.patch, SOLR-1099.patch, 
> SOLR-1099.patch
>
>
> The FieldAnalysisRequestHandler provides the analysis functionality of the 
> web admin page as a service. This handler accepts a fieldtype/fieldname 
> parameter and a value, and returns a breakdown of the analysis process as 
> its response. It is also possible to send a query value, which will use the 
> configured query analyzer, as well as a showmatch parameter, which will then 
> mark every matched token as a match.
> If this handler is added to the code base, I also recommend renaming the 
> current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having 
> them both inherit from one AnalysisRequestHandlerBase class which provides 
> the common functionality of the analysis breakdown and its translation to 
> named lists. This will also enhance the current AnalysisRequestHandler, which 
> right now is fairly simplistic.
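As a sketch of how a request to this handler might be assembled (the handler path /analysis/field and the analysis.* parameter names follow the Solr 1.4 handler; the host, field name, and sample values are illustrative assumptions; the URL is only built here, not sent):

```python
from urllib.parse import urlencode

# Build a FieldAnalysisRequestHandler query: analyze a value against the
# "text" field's index analyzer, run "fox" through the query analyzer,
# and ask for matching tokens to be marked (showmatch).
params = {
    "analysis.fieldname": "text",
    "analysis.fieldvalue": "The quick brown fox",
    "analysis.query": "fox",       # optional: also run the query-time analyzer
    "analysis.showmatch": "true",  # optional: flag tokens matching the query
    "wt": "json",
}
url = "http://localhost:8983/solr/analysis/field?" + urlencode(params)
print(url)
```

The response is a named-list breakdown of each tokenizer/filter stage, which is why the ordering concern raised later in this thread matters for clients that read the stages back sequentially.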

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1099) FieldAnalysisRequestHandler

2009-10-19 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1099:
-

Attachment: SOLR-1099-ordered-TokenizerChain.patch

I'd like to use NamedList rather than SimpleOrderedMap. If there are no 
objections, I'll commit soon. All tests pass.

> FieldAnalysisRequestHandler
> ---
>
> Key: SOLR-1099
> URL: https://issues.apache.org/jira/browse/SOLR-1099
> Project: Solr
>  Issue Type: New Feature
>  Components: Analysis
>Affects Versions: 1.3
>Reporter: Uri Boness
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
> Attachments: AnalisysRequestHandler_refactored.patch, 
> analysis_request_handlers_incl_solrj.patch, 
> AnalysisRequestHandler_refactored1.patch, 
> FieldAnalysisRequestHandler_incl_test.patch, 
> SOLR-1099-ordered-TokenizerChain.patch, SOLR-1099.patch, SOLR-1099.patch, 
> SOLR-1099.patch
>
>
> The FieldAnalysisRequestHandler provides the analysis functionality of the 
> web admin page as a service. This handler accepts a fieldtype/fieldname 
> parameter and a value, and returns a breakdown of the analysis process as 
> its response. It is also possible to send a query value, which will use the 
> configured query analyzer, as well as a showmatch parameter, which will then 
> mark every matched token as a match.
> If this handler is added to the code base, I also recommend renaming the 
> current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having 
> them both inherit from one AnalysisRequestHandlerBase class which provides 
> the common functionality of the analysis breakdown and its translation to 
> named lists. This will also enhance the current AnalysisRequestHandler, which 
> right now is fairly simplistic.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Reopened: (SOLR-1099) FieldAnalysisRequestHandler

2009-10-19 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reopened SOLR-1099:
--

  Assignee: Koji Sekiguchi  (was: Shalin Shekhar Mangar)

Hmm, I think the order of the Tokenizer/TokenFilters in the response was not 
taken into account. For example, I cannot read the Tokenizer/TokenFilters back 
from the Ruby response in order...

> FieldAnalysisRequestHandler
> ---
>
> Key: SOLR-1099
> URL: https://issues.apache.org/jira/browse/SOLR-1099
> Project: Solr
>  Issue Type: New Feature
>  Components: Analysis
>Affects Versions: 1.3
>Reporter: Uri Boness
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
> Attachments: AnalisysRequestHandler_refactored.patch, 
> analysis_request_handlers_incl_solrj.patch, 
> AnalysisRequestHandler_refactored1.patch, 
> FieldAnalysisRequestHandler_incl_test.patch, SOLR-1099.patch, 
> SOLR-1099.patch, SOLR-1099.patch
>
>
> The FieldAnalysisRequestHandler provides the analysis functionality of the 
> web admin page as a service. This handler accepts a fieldtype/fieldname 
> parameter and a value, and returns a breakdown of the analysis process as 
> its response. It is also possible to send a query value, which will use the 
> configured query analyzer, as well as a showmatch parameter, which will then 
> mark every matched token as a match.
> If this handler is added to the code base, I also recommend renaming the 
> current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having 
> them both inherit from one AnalysisRequestHandlerBase class which provides 
> the common functionality of the analysis breakdown and its translation to 
> named lists. This will also enhance the current AnalysisRequestHandler, which 
> right now is fairly simplistic.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1515) Javadoc typo in SolrQueryResponse

2009-10-17 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1515.
--

Resolution: Fixed

Committed revision 826321. Thanks.

> Javadoc typo in SolrQueryResponse
> -
>
> Key: SOLR-1515
> URL: https://issues.apache.org/jira/browse/SOLR-1515
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.3
> Environment: my local MacBook pro
>Reporter: Chris A. Mattmann
>Priority: Trivial
> Fix For: 1.4
>
> Attachments: SOLR-1515.101709.Mattmann.patch.txt
>
>
> There is a minute typo in the javadoc for 
> o.a.s.request.SolrQueryResponse.java. This patch fixes that.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1515) Javadoc typo in SolrQueryResponse

2009-10-17 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1515:
-

Fix Version/s: (was: 1.5)
   1.4

> Javadoc typo in SolrQueryResponse
> -
>
> Key: SOLR-1515
> URL: https://issues.apache.org/jira/browse/SOLR-1515
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.3
> Environment: my local MacBook pro
>Reporter: Chris A. Mattmann
>Priority: Trivial
> Fix For: 1.4
>
> Attachments: SOLR-1515.101709.Mattmann.patch.txt
>
>
> There is a minute typo in the javadoc for 
> o.a.s.request.SolrQueryResponse.java. This patch fixes that.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-670) UpdateHandler must provide a rollback feature

2009-10-12 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-670.
-

Resolution: Fixed

Committed revision 824380.

> UpdateHandler must provide a rollback feature
> -
>
> Key: SOLR-670
> URL: https://issues.apache.org/jira/browse/SOLR-670
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
> Attachments: SOLR-670-revert-cumulative-counts.patch, SOLR-670.patch, 
> SOLR-670.patch, SOLR-670.patch, SOLR-670.patch, SOLR-670.patch
>
>
> Lucene's IndexWriter already has a rollback method. There should be a 
> counterpart for it in _UpdateHandler_ so that users can do a rollback 
> over HTTP.
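A rough sketch of what a rollback over HTTP could look like from a client, assuming the XML update handler of a Solr 1.4 core at localhost:8983 (the host and path are illustrative assumptions; the request is only constructed here, not sent):

```python
from urllib import request

# Build a POST to the XML update handler whose body is the <rollback/>
# command, which discards uncommitted adds/deletes since the last commit.
def rollback_request(base="http://localhost:8983/solr"):
    return request.Request(
        base + "/update",
        data=b"<rollback/>",
        headers={"Content-Type": "text/xml"},
    )

req = rollback_request()
print(req.get_method(), req.get_full_url())
```

Sending it would be `urllib.request.urlopen(req)` against a running core; the later comments in this thread note that rollback should also reset the update handler's statistics counters.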

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-670) UpdateHandler must provide a rollback feature

2009-10-12 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-670:


Attachment: SOLR-670-revert-cumulative-counts.patch

The fix and test case. I'll commit soon.

> UpdateHandler must provide a rollback feature
> -
>
> Key: SOLR-670
> URL: https://issues.apache.org/jira/browse/SOLR-670
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
> Attachments: SOLR-670-revert-cumulative-counts.patch, SOLR-670.patch, 
> SOLR-670.patch, SOLR-670.patch, SOLR-670.patch, SOLR-670.patch
>
>
> Lucene's IndexWriter already has a rollback method. There should be a 
> counterpart for it in _UpdateHandler_ so that users can do a rollback 
> over HTTP.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Reopened: (SOLR-670) UpdateHandler must provide a rollback feature

2009-10-12 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reopened SOLR-670:
-

  Assignee: Koji Sekiguchi  (was: Shalin Shekhar Mangar)

Rollback should reset not only the adds/deletesById/deletesByQuery counts but 
also their cumulative counts.

> UpdateHandler must provide a rollback feature
> -
>
> Key: SOLR-670
> URL: https://issues.apache.org/jira/browse/SOLR-670
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
> Attachments: SOLR-670.patch, SOLR-670.patch, SOLR-670.patch, 
> SOLR-670.patch, SOLR-670.patch
>
>
> Lucene's IndexWriter already has a rollback method. There should be a 
> counterpart for it in _UpdateHandler_ so that users can do a rollback 
> over HTTP.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1504) empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and co.

2009-10-11 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1504.
--

Resolution: Fixed

Committed revision 824045.

> empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp 
> and co.
> ---
>
> Key: SOLR-1504
> URL: https://issues.apache.org/jira/browse/SOLR-1504
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1504.patch
>
>
> If you have the following mapping rule in mapping.txt:
> {code}
> # destination can be empty
> "NULL" => ""
> {code}
> you can get AIOOBE by specifying NULL for either index or query data in the 
> input form of analysis.jsp (and co. i.e. DocumentAnalysisRequestHandler and 
> FieldAnalysisRequestHandler).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1504) empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and co.

2009-10-11 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1504:
-

Attachment: SOLR-1504.patch

A patch for the fix. Will commit soon.

> empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp 
> and co.
> ---
>
> Key: SOLR-1504
> URL: https://issues.apache.org/jira/browse/SOLR-1504
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1504.patch
>
>
> If you have the following mapping rule in mapping.txt:
> {code}
> # destination can be empty
> "NULL" => ""
> {code}
> you can get AIOOBE by specifying NULL for either index or query data in the 
> input form of analysis.jsp (and co. i.e. DocumentAnalysisRequestHandler and 
> FieldAnalysisRequestHandler).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1504) empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and co.

2009-10-11 Thread Koji Sekiguchi (JIRA)
empty char mapping can cause ArrayIndexOutOfBoundsException in analysis.jsp and 
co.
---

 Key: SOLR-1504
 URL: https://issues.apache.org/jira/browse/SOLR-1504
 Project: Solr
  Issue Type: Bug
  Components: Analysis
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Assignee: Koji Sekiguchi
Priority: Minor
 Fix For: 1.4


If you have the following mapping rule in mapping.txt:

{code}
# destination can be empty
"NULL" => ""
{code}

you can get AIOOBE by specifying NULL for either index or query data in the 
input form of analysis.jsp (and co. i.e. DocumentAnalysisRequestHandler and 
FieldAnalysisRequestHandler).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2009-10-09 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1268:
-

Fix Version/s: 1.5

Marking it for 1.5 because there are no patches.

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2009-10-09 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reassigned SOLR-1268:


Assignee: Koji Sekiguchi

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (SOLR-1489) A UTF-8 character is output twice (Bug in Jetty)

2009-10-03 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reassigned SOLR-1489:


Assignee: Koji Sekiguchi

> A UTF-8 character is output twice (Bug in Jetty)
> 
>
> Key: SOLR-1489
> URL: https://issues.apache.org/jira/browse/SOLR-1489
> Project: Solr
>  Issue Type: Bug
> Environment: Jetty-6.1.3
> Jetty-6.1.21
> Jetty-7.0.0RC6
>Reporter: Jun Ohtani
>Assignee: Koji Sekiguchi
>Priority: Critical
> Attachments: error_utf8-example.xml, jettybugsample.war
>
>
> A UTF-8 character is output twice under particular conditions.
> Sample data is attached (error_utf8-example.xml).
> After registering only the sample data, open the following URL:
> http://localhost:8983/solr/select?q=*%3A*&version=2.2&start=0&rows=10&omitHeader=true&fl=attr_json&wt=json
> The sample data contains only "B", but the response contains "BB".
> When wt=phps, an error occurs in PHP's unserialize() function.
> This looks like a bug in Jetty.
> jettybugsample.war is the simplest way to reproduce the problem.
> Copy it to example/webapps, start the Jetty server, and open the following URL:
> http://localhost:8983/jettybugsample/filter/hoge
> As before, B is output twice, while System.out prints B only once.
> I have tested this on Jetty 6.1.3, 6.1.21, and 7.0.0rc6.
> (When testing with 6.1.21 or 7.0.0rc6, change "bufsize" from 128 to 512 in
> web.xml.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1489) A UTF-8 character is output twice (Bug in Jetty)

2009-10-03 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761900#action_12761900
 ] 

Koji Sekiguchi commented on SOLR-1489:
--

Good catch, Ohtani-san! I can reproduce the problem with the data and the filter 
you attached when running on Jetty. And thank you for opening the JIRA ticket 
against Jetty.
Since we are close to releasing 1.4, I don't want this to be a blocker, 
because as you said this is not a Solr bug. You can run Solr on servlet 
containers other than Jetty if you'd like.
I'd like to keep this issue open and watch 
http://jira.codehaus.org/browse/JETTY-1122 . Thanks.

> A UTF-8 character is output twice (Bug in Jetty)
> 
>
> Key: SOLR-1489
> URL: https://issues.apache.org/jira/browse/SOLR-1489
> Project: Solr
>  Issue Type: Bug
> Environment: Jetty-6.1.3
> Jetty-6.1.21
> Jetty-7.0.0RC6
>Reporter: Jun Ohtani
>Priority: Critical
> Attachments: error_utf8-example.xml, jettybugsample.war
>
>
> A UTF-8 character is output twice under particular conditions.
> Sample data is attached (error_utf8-example.xml).
> After registering only the sample data, open the following URL:
> http://localhost:8983/solr/select?q=*%3A*&version=2.2&start=0&rows=10&omitHeader=true&fl=attr_json&wt=json
> The sample data contains only "B", but the response contains "BB".
> When wt=phps, an error occurs in PHP's unserialize() function.
> This looks like a bug in Jetty.
> jettybugsample.war is the simplest way to reproduce the problem.
> Copy it to example/webapps, start the Jetty server, and open the following URL:
> http://localhost:8983/jettybugsample/filter/hoge
> As before, B is output twice, while System.out prints B only once.
> I have tested this on Jetty 6.1.3, 6.1.21, and 7.0.0rc6.
> (When testing with 6.1.21 or 7.0.0rc6, change "bufsize" from 128 to 512 in
> web.xml.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1481) phps writer ignores omitHeader parameter

2009-10-01 Thread Koji Sekiguchi (JIRA)
phps writer ignores omitHeader parameter


 Key: SOLR-1481
 URL: https://issues.apache.org/jira/browse/SOLR-1481
 Project: Solr
  Issue Type: Bug
  Components: search
Reporter: Koji Sekiguchi
Priority: Trivial
 Fix For: 1.4


My co-worker found this one. I expect he will attach a patch soon. :)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1423) Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others

2009-09-18 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1423.
--

Resolution: Fixed

Committed revision 816502. Thanks, Uwe!

> Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & 
> others
> 
>
> Key: SOLR-1423
> URL: https://issues.apache.org/jira/browse/SOLR-1423
> Project: Solr
>  Issue Type: Task
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Uwe Schindler
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
> Attachments: SOLR-1423-FieldType.patch, 
> SOLR-1423-fix-empty-tokens.patch, SOLR-1423-fix-empty-tokens.patch, 
> SOLR-1423-with-empty-tokens.patch, SOLR-1423.patch, SOLR-1423.patch, 
> SOLR-1423.patch
>
>
> Because of some backwards compatibility problems (LUCENE-1906) we changed the 
> CharStream/CharFilter API a little bit. Tokenizer now only has an input field 
> of type java.io.Reader (as before the CharStream code). To correct offsets, 
> it is now necessary to call the Tokenizer.correctOffset(int) method, which 
> delegates to the CharStream (if the input is a subclass of CharStream) and 
> otherwise returns an uncorrected offset. Normally it is enough to change all 
> occurrences of input.correctOffset() to this.correctOffset() in Tokenizers. It 
> should also be checked whether custom Tokenizers in Solr correct their offsets.
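As a rough, self-contained sketch of the delegation contract described above (hypothetical stand-in classes, not Lucene's actual CharStream/Tokenizer API): a filter that removes characters records how much it removed, and correctOffset() maps offsets in the filtered text back to the original input.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal illustration of offset correction. StrippingFilter is a
 * hypothetical stand-in for a CharFilter that deletes '&' characters
 * and remembers how many were removed before each surviving position;
 * correctOffset() plays the role of Tokenizer.correctOffset(int).
 */
public class OffsetCorrectionSketch {
    static final class StrippingFilter {
        final String filtered;
        final List<Integer> cumulativeRemoved = new ArrayList<>();

        StrippingFilter(String original) {
            StringBuilder sb = new StringBuilder();
            int removed = 0;
            for (char c : original.toCharArray()) {
                if (c == '&') {
                    removed++;              // character dropped from output
                } else {
                    sb.append(c);
                    cumulativeRemoved.add(removed);
                }
            }
            filtered = sb.toString();
        }

        // Map an offset in the filtered text back to the original input.
        int correctOffset(int filteredOffset) {
            if (filteredOffset >= cumulativeRemoved.size()) {
                // end offset just past the last surviving character
                return filteredOffset + (cumulativeRemoved.isEmpty() ? 0
                        : cumulativeRemoved.get(cumulativeRemoved.size() - 1));
            }
            return filteredOffset + cumulativeRemoved.get(filteredOffset);
        }
    }

    public static void main(String[] args) {
        StrippingFilter f = new StrippingFilter("a&&bc");
        // Filtered text is "abc"; 'b' sits at filtered offset 1,
        // but at offset 3 in the original "a&&bc".
        System.out.println(f.filtered);         // abc
        System.out.println(f.correctOffset(1)); // 3
    }
}
```

Without the correction step, a highlighter working on the original text would receive the filtered offsets, which is exactly how uncorrected offsets turn into wrong highlights.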




[jira] Commented: (SOLR-1423) Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others

2009-09-17 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756923#action_12756923
 ] 

Koji Sekiguchi commented on SOLR-1423:
--

The patch looks good! Will commit shortly.

> Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & 
> others
> 
>
> Key: SOLR-1423
> URL: https://issues.apache.org/jira/browse/SOLR-1423
> Project: Solr
>  Issue Type: Task
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Uwe Schindler
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
> Attachments: SOLR-1423-FieldType.patch, 
> SOLR-1423-fix-empty-tokens.patch, SOLR-1423-fix-empty-tokens.patch, 
> SOLR-1423-with-empty-tokens.patch, SOLR-1423.patch, SOLR-1423.patch, 
> SOLR-1423.patch
>
>




[jira] Updated: (SOLR-1423) Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others

2009-09-14 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1423:
-

Attachment: SOLR-1423.patch

This is Uwe's patch with the split()/group() methods replaced.

bq. Why does the PatternTokenizer not have the methods newToken and so on in 
its own class
Yeah, I realized that immediately after posting the patch, but I was about to 
go out.

And thank you for adapting it to the new TokenStream API.

bq. I searched for setOffset() in the Solr source code and found one additional 
occurrence of it without offset correction in FieldType.java. This patch fixes 
this.
Good catch, Uwe! I had missed it.

I think the empty tokens are a bug and should be omitted in this patch.

bq. A second thing: Lucene has a new BaseTokenStreamTest class for checking 
tokens without Token instances (which would no longer work when Lucene 3.0 
switches to Attributes only). Maybe you should update these tests and use 
assertAnalyzesTo from the new base class instead.
Very nice! Can you open a separate ticket?

> Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & 
> others
> 
>
> Key: SOLR-1423
> URL: https://issues.apache.org/jira/browse/SOLR-1423
> Project: Solr
>  Issue Type: Task
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Uwe Schindler
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
> Attachments: SOLR-1423-FieldType.patch, SOLR-1423.patch, 
> SOLR-1423.patch, SOLR-1423.patch
>
>




[jira] Updated: (SOLR-1423) Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others

2009-09-13 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1423:
-

Attachment: SOLR-1423.patch

I wanted to call tokenizer.correctOffset() in the newToken() method, but I 
couldn't because the method is protected. In this patch, I converted the 
anonymous Tokenizer class to PatternTokenizer, and PatternTokenizer has the 
following:

{code}
public int correct( int currentOffset ) {
  return correctOffset( currentOffset );
}
{code}
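For context, the group-matching mode this patch touches can be sketched with plain java.util.regex (an illustration, not Solr's actual PatternTokenizer): each match of the pattern becomes a token, and its Matcher.start()/end() positions are exactly the offsets that need to pass through correctOffset() before being stored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Illustrative sketch of pattern-based "group" tokenization: every match
 * of the pattern is emitted as a token. In a real Tokenizer, the offsets
 * from m.start()/m.end() would be passed through correctOffset() so that
 * they refer to the original (pre-CharFilter) input.
 */
public class PatternTokenSketch {
    static List<String> tokenize(String text, String regex) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            // correctOffset(m.start()) / correctOffset(m.end()) would go here
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo, bar,baz", "[^,\\s]+")); // [foo, bar, baz]
    }
}
```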


> Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & 
> others
> 
>
> Key: SOLR-1423
> URL: https://issues.apache.org/jira/browse/SOLR-1423
> Project: Solr
>  Issue Type: Task
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Uwe Schindler
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
> Attachments: SOLR-1423.patch
>
>




[jira] Updated: (SOLR-1423) Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & others

2009-09-11 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1423:
-

Affects Version/s: 1.4
Fix Version/s: 1.4
 Assignee: Koji Sekiguchi

I'd like to check it before the 1.4 release. I'll look into it once RC4 is 
checked into Solr.

> Lucene 2.9 RC4 may need some changes in Solr Analyzers using CharStream & 
> others
> 
>
> Key: SOLR-1423
> URL: https://issues.apache.org/jira/browse/SOLR-1423
> Project: Solr
>  Issue Type: Task
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Uwe Schindler
>Assignee: Koji Sekiguchi
> Fix For: 1.4
>
>




[jira] Commented: (SOLR-1404) Random failures with highlighting

2009-09-10 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753686#action_12753686
 ] 

Koji Sekiguchi commented on SOLR-1404:
--

bq. A better fix, perhaps, would be implementing reset(CharStream input) in 
CharTokenizer in Lucene. 

Will LUCENE-1906 fix it (in an alternate way)?

> Random failures with highlighting
> -
>
> Key: SOLR-1404
> URL: https://issues.apache.org/jira/browse/SOLR-1404
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis, highlighter
>Affects Versions: 1.4
>Reporter: Anders Melchiorsen
> Fix For: 1.4
>
> Attachments: SOLR-1404.patch
>
>
> With a recent Solr nightly, we started getting errors when highlighting.
> I have not been able to reduce our real setup to a minimal one that is 
> failing, but the same error seems to pop up with the configuration below. 
> Note that the QUERY will mostly fail, but it will work sometimes. Notably, 
> after running "java -jar start.jar", the QUERY will work the first time, but 
> then start failing for a while. Seems that something is not being reset 
> properly.
> The example uses the deprecated HTMLStripWhitespaceTokenizerFactory but the 
> problem apparently also exists with other tokenizers; I was just unable to 
> create a minimal example with other configurations.
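The "works the first time, then fails" pattern described above is typical of a reusable component whose per-stream state is not cleared between uses. A minimal, hypothetical Java sketch of that failure mode (not Solr's actual analyzer or highlighter code):

```java
/**
 * Hypothetical illustration of the "works once, then fails" symptom: a
 * reusable tokenizer-like object whose running offset is not cleared on
 * reuse, so offsets from the second run exceed the text length -- the
 * same shape as the InvalidTokenOffsetsException reported below.
 */
public class StaleStateSketch {
    private int offset;           // per-stream state
    private String text;

    void setInput(String text) {  // note: forgets to reset 'offset'
        this.text = text;
    }

    int nextTokenEnd() {          // pretend each call consumes the whole text
        offset += text.length();
        return offset;
    }

    public static void main(String[] args) {
        StaleStateSketch t = new StaleStateSketch();
        t.setInput("test");
        System.out.println(t.nextTokenEnd()); // 4: within bounds
        t.setInput("test");                   // reused without a reset
        System.out.println(t.nextTokenEnd()); // 8: exceeds text length 4
    }
}
```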
> SCHEMA
> (schema.xml snippet lost in the archive; it defined an analyzer using the
> deprecated HTMLStripWhitespaceTokenizerFactory and set uniqueKey to "id")
> INDEX
> URL=http://localhost:8983/solr/update
> curl $URL --data-binary '<add><doc><field name="id">1</field><field 
> name="test">test</field></doc></add>' -H 'Content-type:text/xml; 
> charset=utf-8'
> curl $URL --data-binary '<commit/>' -H 'Content-type:text/xml; charset=utf-8'
> QUERY
> curl 'http://localhost:8983/solr/select/?hl.fl=test&hl=true&q=id:1'
> ERROR
> org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test 
> exceeds length of provided text sized 4
> org.apache.solr.common.SolrException: 
> org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token test 
> exceeds length of provided text sized 4
>   at 
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:328)
>   at 
> org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:89)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>   at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>   at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>   at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>   at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>   at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>   at 
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>   at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>   at org.mortbay.jetty.Server.handle(Server.java:285)
>   at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>   at 
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
>   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
>   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
>   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>   at 
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>   at 
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
> Caused by: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: 
> Token test exceeds length of provided text sized 4
>   at 
> org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:254)
>   at 
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:321)
>   ... 23 more




[jira] Resolved: (SOLR-1398) PatternTokenizerFactory ignores offset corrections

2009-09-05 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1398.
--

Resolution: Fixed

Committed revision 811753.

> PatternTokenizerFactory ignores offset corrections
> --
>
> Key: SOLR-1398
> URL: https://issues.apache.org/jira/browse/SOLR-1398
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Anders Melchiorsen
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1398.patch, SOLR-1398.patch
>
>
> I have an analyzer with a MappingCharFilterFactory followed by a 
> PatternTokenizerFactory. This causes wrong offsets, and thus wrong highlights.
> Replacing the tokenizer with WhitespaceTokenizerFactory gives correct 
> offsets, so I expect the problem to be with PatternTokenizerFactory.




[jira] Updated: (SOLR-1398) PatternTokenizerFactory ignores offset corrections

2009-09-05 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1398:
-

Attachment: SOLR-1398.patch

A new patch with a test case. Will commit shortly.

> PatternTokenizerFactory ignores offset corrections
> --
>
> Key: SOLR-1398
> URL: https://issues.apache.org/jira/browse/SOLR-1398
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Anders Melchiorsen
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1398.patch, SOLR-1398.patch
>
>




[jira] Commented: (SOLR-1398) PatternTokenizerFactory ignores offset corrections

2009-09-02 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750369#action_12750369
 ] 

Koji Sekiguchi commented on SOLR-1398:
--

Anders, thank you for testing the patch and reporting the result. Yes, I think 
the error is a separate issue. Can you show the procedure (schema.xml, indexed 
data, and request parameters) to reproduce it? I tried indexing "G&uuml;nther 
G&uuml;nther is here" and searching for "Günther", but I got a highlighted 
result successfully.

> PatternTokenizerFactory ignores offset corrections
> --
>
> Key: SOLR-1398
> URL: https://issues.apache.org/jira/browse/SOLR-1398
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Anders Melchiorsen
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1398.patch
>
>




[jira] Updated: (SOLR-1398) PatternTokenizerFactory ignores offset corrections

2009-08-31 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1398:
-

 Priority: Minor  (was: Major)
Fix Version/s: 1.4
 Assignee: Koji Sekiguchi

> PatternTokenizerFactory ignores offset corrections
> --
>
> Key: SOLR-1398
> URL: https://issues.apache.org/jira/browse/SOLR-1398
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Anders Melchiorsen
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1398.patch
>
>




[jira] Updated: (SOLR-1398) PatternTokenizerFactory ignores offset corrections

2009-08-31 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1398:
-

Attachment: SOLR-1398.patch

Anders, can you apply the patch and see the highlighted result?

> PatternTokenizerFactory ignores offset corrections
> --
>
> Key: SOLR-1398
> URL: https://issues.apache.org/jira/browse/SOLR-1398
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Anders Melchiorsen
> Attachments: SOLR-1398.patch
>
>




[jira] Commented: (SOLR-1398) PatternTokenizerFactory ignores offset corrections

2009-08-31 Thread Koji Sekiguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749493#action_12749493
 ] 

Koji Sekiguchi commented on SOLR-1398:
--

Anders, thank you for reporting the problem. Can you show a concrete case so I 
can reproduce the problem?

> PatternTokenizerFactory ignores offset corrections
> --
>
> Key: SOLR-1398
> URL: https://issues.apache.org/jira/browse/SOLR-1398
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Anders Melchiorsen
>




[jira] Resolved: (SOLR-1370) call CharFilters in FieldAnalysisRequestHandler

2009-08-19 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1370.
--

Resolution: Fixed

Thanks Erik! Committed revision 805880.

> call CharFilters in FieldAnalysisRequestHandler
> ---
>
> Key: SOLR-1370
> URL: https://issues.apache.org/jira/browse/SOLR-1370
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1370.patch
>
>
> Currently, FieldAnalysisRequestHandler doesn't call CharFilters even if 
> CharFilters are defined for the fields.




[jira] Updated: (SOLR-1370) call CharFilters in FieldAnalysisRequestHandler

2009-08-19 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1370:
-

Fix Version/s: 1.4

> call CharFilters in FieldAnalysisRequestHandler
> ---
>
> Key: SOLR-1370
> URL: https://issues.apache.org/jira/browse/SOLR-1370
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1370.patch
>
>




[jira] Assigned: (SOLR-1370) call CharFilters in FieldAnalysisRequestHandler

2009-08-19 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reassigned SOLR-1370:


Assignee: Koji Sekiguchi

> call CharFilters in FieldAnalysisRequestHandler
> ---
>
> Key: SOLR-1370
> URL: https://issues.apache.org/jira/browse/SOLR-1370
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1370.patch
>
>




[jira] Updated: (SOLR-1370) call CharFilters in FieldAnalysisRequestHandler

2009-08-19 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-1370:
-

Attachment: SOLR-1370.patch

The fix and test code.

> call CharFilters in FieldAnalysisRequestHandler
> ---
>
> Key: SOLR-1370
> URL: https://issues.apache.org/jira/browse/SOLR-1370
> Project: Solr
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Priority: Minor
> Attachments: SOLR-1370.patch
>
>




[jira] Created: (SOLR-1370) call CharFilters in FieldAnalysisRequestHandler

2009-08-18 Thread Koji Sekiguchi (JIRA)
call CharFilters in FieldAnalysisRequestHandler
---

 Key: SOLR-1370
 URL: https://issues.apache.org/jira/browse/SOLR-1370
 Project: Solr
  Issue Type: Bug
  Components: Analysis
Affects Versions: 1.4
Reporter: Koji Sekiguchi
Priority: Minor


Currently, FieldAnalysisRequestHandler doesn't call CharFilters even if 
CharFilters are defined for the fields.




[jira] Resolved: (SOLR-1347) StatsComponent throws error for single-valued numeric fields.

2009-08-07 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi resolved SOLR-1347.
--

Resolution: Invalid

> StatsComponent throws error for single-valued numeric fields.
> -
>
> Key: SOLR-1347
> URL: https://issues.apache.org/jira/browse/SOLR-1347
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.4
> Environment: MAC OSX
>Reporter: sumit biyani
>
> Hi,
> The stats component is throwing an incorrect error when running the query 
> below on the sample data.
> http://localhost:8983/solr/select?q=*:*&rows=0&indent=true&stats=on&stats.field=price
> HTTP ERROR: 400
> Stats are valid for single valued numeric values.  not: 
> price[float{class=org.apache.solr.schema.TrieFloatField,analyzer=org.apache.solr.analysis.TokenizerChain,args={precisionStep=0,
>  positionIncrementGap=0, omitNorms=true}}]
> Here, price is a single-valued float field.
>
> I also tried changing the type to "pfloat", but that gave an error in the 
> parseDouble method.
> This was run using the Solr nightly build from 07-Aug.
> Please check and let me know if I am missing something here.
> Thanks & Regards,
> Sumit.



