[jira] Updated: (SOLR-486) Support binary formats for QueryresponseWriter

2008-06-16 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-486:


Attachment: SOLR-486.patch

This includes the changes that make the binary format the default for SolrJ and the 
changes for optimized writing of field names in documents. So, for a response 
with 5 fields and 10 records, only 5 names are written instead of 50. There is 
an overhead of one extra byte per unique string (5 bytes in total in this case).
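
To make the 5-versus-50 arithmetic concrete, here is a minimal sketch of the idea: each distinct name is written once, and repeats become a small back-reference. It is an illustration only. The class name FieldNameTableSketch, its methods and the use of DataOutputStream are assumptions made for this example, and the byte layout is not the one used by the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; this is not the wire format used by the patch. */
final class FieldNameTableSketch {

  /** Encodes the names; literalCount[0] is incremented for every name written in full. */
  static byte[] encode(List<String> names, int[] literalCount) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      Map<String, Integer> seen = new HashMap<String, Integer>();
      out.writeInt(names.size());
      for (String name : names) {
        Integer idx = seen.get(name);
        if (idx == null) {
          out.writeInt(0);                  // 0 means "a literal string follows"
          out.writeUTF(name);
          seen.put(name, seen.size() + 1);  // 1-based index of this name
          literalCount[0]++;
        } else {
          out.writeInt(idx);                // reference to a name written earlier
        }
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Rebuilds the original sequence of names from the encoded bytes. */
  static List<String> decode(byte[] data) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
      List<String> table = new ArrayList<String>();
      List<String> names = new ArrayList<String>();
      int count = in.readInt();
      for (int i = 0; i < count; i++) {
        int idx = in.readInt();
        if (idx == 0) {
          String s = in.readUTF();
          table.add(s);
          names.add(s);
        } else {
          names.add(table.get(idx - 1));
        }
      }
      return names;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Field names, in write order, of `docs` documents that all share the same fields. */
  static List<String> namesFor(String[] fields, int docs) {
    List<String> names = new ArrayList<String>();
    for (int d = 0; d < docs; d++) {
      for (String f : fields) names.add(f);
    }
    return names;
  }
}
```

With 5 fields and 10 documents, namesFor yields 50 names, of which encode writes only 5 in full; the other 45 are references.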

 Support binary formats for QueryresponseWriter
 --

 Key: SOLR-486
 URL: https://issues.apache.org/jira/browse/SOLR-486
 Project: Solr
  Issue Type: Improvement
  Components: clients - java, search
Reporter: Noble Paul
Assignee: Yonik Seeley
 Fix For: 1.3

 Attachments: SOLR-486.patch, solr-486.patch, SOLR-486.patch, 
 SOLR-486.patch, SOLR-486.patch, SOLR-486.patch, SOLR-486.patch, 
 SOLR-486.patch, SOLR-486.patch, SOLR-486.patch


 QueryResponseWriter only allows text data to be written, 
 so it is not possible to implement a binary protocol. Create another 
 interface which has a method 
 write(OutputStream os, SolrQueryRequest request, SolrQueryResponse response)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-563) Contrib area for Solr

2008-06-16 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605256#action_12605256
 ] 

Shalin Shekhar Mangar commented on SOLR-563:


Otis, we can work on the maven issue separately. I've tested the current patch 
both with and without the DataImportHandler contrib patches and it works fine. 
At the very least, it doesn't break any of the existing functionality. So, we 
should be ok to commit this.

We can work on the maven issue separately as part of SOLR-586 once this gets 
committed.

 Contrib area for Solr
 -

 Key: SOLR-563
 URL: https://issues.apache.org/jira/browse/SOLR-563
 Project: Solr
  Issue Type: Task
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
Assignee: Otis Gospodnetic
 Fix For: 1.3

 Attachments: SOLR-563.patch


 Add a contrib area for Solr and modify existing build.xml to build, package 
 and distribute contrib projects also.




[jira] Updated: (SOLR-469) Data Import RequestHandler

2008-06-16 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-469:
---

Attachment: SOLR-469-contrib.patch

*Changes*
* Updated the build.xml to compile Solr before building DataImportHandler and 
to place DataImportHandler's javadoc jar in the solr/dist folder so that the 
javadocs are available in Solr nightly builds
* Removed @author Javadoc tags from all source files in accordance with Solr 
coding conventions
* Improved Javadocs for a lot of classes, especially the public interfaces
* Formatted code using the Eclipse codestyle XML given on the HowToContribute 
wiki page
* Added @since solr 1.3 to all source files
* I've verified that the Apache license text is present in all the source files

No changes have been made to the code (in terms of functionality)

Note -- The SOLR-563 patch must be applied before this patch to build Solr with 
DataImportHandler as a contrib project.

A lot of people are using this patch, and it would be easier for them if 
DataImportHandler were available in the nightly builds. Also, this patch has 
become huge, and enhancements and bug fixes would be easier if it were 
committed. Grant -- we feel that this is ready to be committed now, whenever you 
can take a look.

 Data Import RequestHandler
 --

 Key: SOLR-469
 URL: https://issues.apache.org/jira/browse/SOLR-469
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.3
Reporter: Noble Paul
Assignee: Grant Ingersoll
 Fix For: 1.3

 Attachments: SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
 SOLR-469-contrib.patch, SOLR-469-contrib.patch, SOLR-469.patch, 
 SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, 
 SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch


 We need a RequestHandler which can import data from a DB or other data sources 
 into the Solr index. Think of it as an advanced form of the SqlUpload Plugin 
 (SOLR-103).
 The way it works is as follows.
 * Provide a configuration file (XML) to the handler which takes in the 
 necessary SQL queries and mappings to a Solr schema
   - It also takes in a properties file for the data source 
 configuration
 * Given the configuration, it can also generate the Solr schema.xml
 * It is registered as a RequestHandler which can take two commands: 
 do-full-import and do-delta-import
   - do-full-import - dumps all the data from the database into the 
 index (based on the SQL query in the configuration)
   - do-delta-import - dumps all the data that has changed since the last 
 import. (We assume a modified-timestamp column in the tables.)
 * It provides an admin page
   - where we can schedule it to be run automatically at regular 
 intervals
   - It shows the status of the handler (idle, full-import, 
 delta-import)
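
The difference between the two commands can be sketched as follows. This is not DataImportHandler code: the class ImportCommandsSketch, the Row type and the in-memory list standing in for a database table are all hypothetical, and the only thing taken from the description is the assumption of a modified-timestamp column.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of full versus delta import over an in-memory "table". */
final class ImportCommandsSketch {

  /** One database row: a key plus the assumed modified-timestamp column. */
  static final class Row {
    final String id;
    final long modifiedMillis;

    Row(String id, long modifiedMillis) {
      this.id = id;
      this.modifiedMillis = modifiedMillis;
    }
  }

  /** do-full-import: every row is sent to the index. */
  static List<String> fullImport(List<Row> table) {
    List<String> ids = new ArrayList<String>();
    for (Row row : table) ids.add(row.id);
    return ids;
  }

  /** do-delta-import: only rows modified after the last import are sent. */
  static List<String> deltaImport(List<Row> table, long lastImportMillis) {
    List<String> ids = new ArrayList<String>();
    for (Row row : table) {
      if (row.modifiedMillis > lastImportMillis) ids.add(row.id);
    }
    return ids;
  }
}
```

In a real setup the filter would presumably be expressed in the SQL query rather than in Java; the sketch only shows which rows each command selects.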




[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-06-16 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605293#action_12605293
 ] 

Grant Ingersoll commented on SOLR-572:
--

OK, I'd like to commit this tomorrow or Wednesday.  I am going to open another 
issue to bring LUCENE-1297 into the configuration.

 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as they are
 * Never give duplicate words in a suggestion
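
One way to read the consistency criteria above is sketched below. It is not the component's implementation: ConsistentSuggestionSketch, the dictionary set and the map of ranked candidates are hypothetical stand-ins for the spell index and for whatever the Lucene SpellChecker would return per token.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: builds one suggestion for a multi-word query. */
final class ConsistentSuggestionSketch {

  static String suggest(String query, Set<String> dictionary,
                        Map<String, List<String>> candidates) {
    String[] tokens = query.trim().split("\\s+");

    // Correct words are preserved, so reserve them up front to avoid duplicates.
    Set<String> used = new HashSet<String>();
    for (String token : tokens) {
      if (dictionary.contains(token)) used.add(token);
    }

    List<String> out = new ArrayList<String>();
    for (String token : tokens) {
      String chosen = token;                      // correct words stay as they are
      if (!dictionary.contains(token)) {
        List<String> ranked = candidates.get(token);
        if (ranked != null) {
          for (String candidate : ranked) {
            if (used.add(candidate)) {            // skip words already in the suggestion
              chosen = candidate;
              break;
            }
          }
        }
      }
      out.add(chosen);
    }
    return String.join(" ", out);
  }
}
```

A misspelled token with no usable candidate is left unchanged in this sketch; the issue text does not say what should happen in that case.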




[jira] Commented: (SOLR-486) Support binary formats for QueryresponseWriter

2008-06-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605306#action_12605306
 ] 

Yonik Seeley commented on SOLR-486:
---

Thanks Noble, this looks pretty good.

I had previously considered caching strings via some kind of sliding window: 
keep track of the last 100 or so string values written under a certain 
length, and then if you see a string again in that window, write a reference 
(an index that says how many values ago it was seen).
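
A rough sketch of the sliding-window idea, for comparison with the string-table approach. Nothing here comes from a patch: SlidingWindowSketch is a hypothetical name, tokens are modelled as plain objects (a String for a literal, an Integer for "seen N values ago") rather than bytes, and the window size and length limit are parameters.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of back-references into a window of recently written strings. */
final class SlidingWindowSketch {

  /** Returns how many values ago s was last seen (1 = previous value), or 0 if absent. */
  private static int distanceOf(Deque<String> window, String s) {
    int pos = 0;
    for (String w : window) {        // iteration runs from most recent to oldest
      pos++;
      if (w.equals(s)) return pos;
    }
    return 0;
  }

  /** Adds s to the window unless it is too long, evicting the oldest entry if needed. */
  private static void remember(Deque<String> window, String s, int windowSize, int maxLength) {
    if (s.length() > maxLength) return;
    window.addFirst(s);
    if (window.size() > windowSize) window.removeLast();
  }

  /** Each output token is either the literal String or an Integer distance. */
  static List<Object> encode(List<String> values, int windowSize, int maxLength) {
    Deque<String> window = new ArrayDeque<String>();
    List<Object> out = new ArrayList<Object>();
    for (String s : values) {
      int distance = distanceOf(window, s);
      out.add(distance > 0 ? (Object) Integer.valueOf(distance) : s);
      remember(window, s, windowSize, maxLength);
    }
    return out;
  }

  /** Mirrors encode: the decoder maintains the same window to resolve distances. */
  static List<String> decode(List<Object> tokens, int windowSize, int maxLength) {
    Deque<String> window = new ArrayDeque<String>();
    List<String> out = new ArrayList<String>();
    for (Object token : tokens) {
      String s;
      if (token instanceof Integer) {
        int distance = (Integer) token;
        Iterator<String> it = window.iterator();
        s = it.next();
        for (int i = 1; i < distance; i++) s = it.next();
      } else {
        s = (String) token;
      }
      out.add(s);
      remember(window, s, windowSize, maxLength);
    }
    return out;
  }
}
```

Unlike a string table, the window forgets old values, so memory stays bounded on both sides, at the cost of a linear scan per value in this naive version.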

For Solr responses in general, it seems like the main duplication will be in 
field names (which you have taken care of).  The only other duplication I can 
think of would be the id field values (used as a key in other maps such as 
highlighting), and any duplication that is custom to the collection (such as 
string values for a type field, etc).

Thoughts?  I'd be happy to commit this version, or give you time to try out an 
alternative if you think it might be worth it (but I don't currently have time 
myself to implement the alternative).




[jira] Updated: (SOLR-599) Lightweight SolrJ client

2008-06-16 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-599:
---

Description: 
SolrJ provides a SolrServer implementation backed by commons-httpclient, which 
introduces many dependency jars (commons-codec, commons-io and 
commons-logging). Apart from that, SolrJ also uses the StAX API for XML parsing, 
which introduces dependencies like stax-api, stax and stax-utils.

This enhancement will add a SolrServer implementation backed by 
java.net.HttpURLConnection and will use BinaryResponseParser as the default 
response parser. Using this basic implementation out of the box would require 
no dependencies on either commons-httpclient or StAX. The only dependencies would 
be on solr-commons and commons-logging, making this a very lightweight and 
distribution-friendly Java client for Solr.

  was:
SolrJ provides a SolrServer implementation backed by commons-httpclient which 
introduces many dependency jars (commons-codec, commons-io and 
commons-logging). Apart from that SolrJ also uses StAX API for XML parsing 
which introduces dependencies like stax-api, stax and stax-utils.

This enhancement will add a SolrServer implementation backed by 
java.net.HttpUrlConnection and will use BinaryResponseParser as the default 
response parser. Using this basic implementation out of the box would require 
no dependencies on either commons-httpclient or StAX. The only dependency would 
be on solr-commons making this a very lightweight and distribution friendly 
Java client for Solr.


 Lightweight SolrJ client
 

 Key: SOLR-599
 URL: https://issues.apache.org/jira/browse/SOLR-599
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.3


 SolrJ provides a SolrServer implementation backed by commons-httpclient, which 
 introduces many dependency jars (commons-codec, commons-io and 
 commons-logging). Apart from that, SolrJ also uses the StAX API for XML parsing, 
 which introduces dependencies like stax-api, stax and stax-utils.
 This enhancement will add a SolrServer implementation backed by 
 java.net.HttpURLConnection and will use BinaryResponseParser as the default 
 response parser. Using this basic implementation out of the box would require 
 no dependencies on either commons-httpclient or StAX. The only dependencies 
 would be on solr-commons and commons-logging, making this a very lightweight 
 and distribution-friendly Java client for Solr.
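
As a rough idea of what a JDK-only client looks like, the sketch below issues a query with java.net.HttpURLConnection and returns the raw response stream. It is not the proposed SolrServer implementation: LightweightSolrClientSketch, the /select path, the example base URL and the wt=javabin parameter are assumptions made for illustration, and decoding the binary response is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

/** Hypothetical sketch of a query over plain java.net, with no third-party jars. */
final class LightweightSolrClientSketch {

  /** Builds the request URL; the path and the wt value are assumptions. */
  static String buildSelectUrl(String baseUrl, String query) {
    try {
      return baseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8") + "&wt=javabin";
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }

  /** Opens the connection and returns the response body for a parser to consume. */
  static InputStream openResponse(String baseUrl, String query) throws IOException {
    URL url = new URL(buildSelectUrl(baseUrl, query));
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    conn.setConnectTimeout(5000);   // milliseconds
    conn.setReadTimeout(5000);
    int status = conn.getResponseCode();
    if (status != HttpURLConnection.HTTP_OK) {
      conn.disconnect();
      throw new IOException("Unexpected HTTP status " + status);
    }
    return conn.getInputStream();   // caller closes the stream
  }
}
```

For example, buildSelectUrl("http://localhost:8983/solr", "title:solr rocks") produces a URL whose query part is q=title%3Asolr+rocks&wt=javabin.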




[jira] Commented: (SOLR-599) Lightweight SolrJ client

2008-06-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605325#action_12605325
 ] 

Yonik Seeley commented on SOLR-599:
---

I thought commons-logging was only a dependency because HTTPClient used it.




protected QParser.parse() and subclasses

2008-06-16 Thread Julien PIQUOT
Hi everyone,
I have the following class:

package org.apache.solr.search;

class MyQParser extends QParser
{
    protected Query parse()
    {
        // do stuff
        Query q = subQuery(qs, QParserPlugin.DEFAULT_QTYPE).parse();
        // do stuff
    }
}

As QParser.parse is protected and QParser.subQuery is public, everything works 
fine when I run parse() myself (through unit tests). But when I try to run it 
through a Solr server, I get:
method: parse signature: ()Lorg/apache/lucene/search/Query;) Bad access to 
protected data java.lang.VerifyError
This is expected, since the class loaders of QParser and MyQParser are different 
(MyQParser is inside the lib/ directory), so at runtime the two classes are not 
in the same package even though the package name matches.
Making QParser.parse() public would resolve this issue. What do you 
think about it?

Julien


[jira] Commented: (SOLR-486) Support binary formats for QueryresponseWriter

2008-06-16 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605329#action_12605329
 ] 

Noble Paul commented on SOLR-486:
-

Another level of efficiency can be brought in by preloading the string table 
with well-known strings like responseHeader, QTime, etc. That did not look 
very elegant to me.

I guess this can go into trunk, which would give it enough time to 'settle' 
before the release.




[jira] Issue Comment Edited: (SOLR-486) Support binary formats for QueryresponseWriter

2008-06-16 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605126#action_12605126
 ] 

noble.paul edited comment on SOLR-486 at 6/16/08 9:04 AM:
--

If we take a look at the data that is written out by NamedListCodec, there are 
a lot of names which are repeated. If we could avoid the repetitions, we could 
achieve better optimization. 
Can we have another type, EXTERN_STRING? 
The NamedListCodec maintains a Map<String,Integer> of EXTERN_STRING vs. index 
as it is written out. When the same string is written again, it is looked up in 
the Map to check whether it already has a reference.

While decoding, all the EXTERN_STRING values are copied into a List<String>. 
When an EXTERN_STRING with an index comes, it is copied from the List.

{code:title=NamedListCodec.java}
  private int stringsCount = 0;
  private Map<String, Integer> stringsMap;
  private List<String> stringsList;

  public void writeExternString(String s) throws IOException {
    if (s == null) {
      writeTag(NULL);
      return;
    }
    Integer idx = stringsMap == null ? null : stringsMap.get(s);
    if (idx == null) idx = 0;
    writeTag(EXTERN_STRING, idx);
    if (idx == 0) {
      writeStr(s);
      if (stringsMap == null) stringsMap = new HashMap<String, Integer>();
      stringsMap.put(s, ++stringsCount);
    }
  }

  public String readExternString(FastInputStream fis) throws IOException {
    int idx = readSize(fis);
    if (idx != 0) { // idx != 0 is the index of the extern string
      return stringsList.get(idx - 1);
    } else { // idx == 0 means it has a string value
      String s = (String) readVal(fis);
      if (stringsList == null) stringsList = new ArrayList<String>();
      stringsList.add(s);
      return s;
    }
  }
{code}

  was (Author: noble.paul):
 If we take a look at the data that is written down by NamedListCodec there 
are a lot of names which are repeated.  If we could avoid the repetitions we 
can achieve better optimization. 
Can we have another type EXTERN_STRING 
The NamedListCodec maintains a Map<String,Integer> of EXTERN_STRING vs index 
as it is written out. When the same string is written it checks up in the List 
whether it already has a reference.

While decoding all the EXTERN_STRING values are copied into a List<String>. 
When an EXTERN_STRING with an index comes it is copied from the List.

{code:title=NamedListCodec.java}
  private int stringsCount = -1;
  private Map<String, Integer> stringsMap;
  private List<String> stringsList;

  public void writeExternString(String s) throws IOException {
    writeTag(EXTERN_STRING);
    if (s == null) {
      writeTag(NULL);
      return;
    }
    if (stringsMap.containsKey(s)) {
      writeInt(stringsMap.get(s));
    } else {
      writeStr(s);
      stringsCount++;
      if (stringsMap == null) stringsMap = new HashMap<String, Integer>();
      stringsMap.put(s, stringsCount);
    }
  }

  public String readExternString(FastInputStream fis) throws IOException {
    Object o = readVal(fis);
    if (o == null) return null;
    if (o instanceof String) {
      String s = (String) o;
      if (stringsList == null) stringsList = new ArrayList<String>();
      stringsList.add(s);
      return s;
    } else { // this must be an integer
      int index = (Integer) o;
      return stringsList.get(index);
    }
  }

{code}
  



[jira] Commented: (SOLR-599) Lightweight SolrJ client

2008-06-16 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605335#action_12605335
 ] 

Shalin Shekhar Mangar commented on SOLR-599:


Right, I'll edit the description (again)




[jira] Updated: (SOLR-599) Lightweight SolrJ client

2008-06-16 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-599:
---

Description: 
SolrJ provides a SolrServer implementation backed by commons-httpclient, which 
introduces many dependency jars (commons-codec, commons-io and 
commons-logging). Apart from that, SolrJ also uses the StAX API for XML parsing, 
which introduces dependencies like stax-api, stax and stax-utils.

This enhancement will add a SolrServer implementation backed by 
java.net.HttpURLConnection and will use BinaryResponseParser as the default 
response parser. Using this basic implementation out of the box would require 
no dependencies on either commons-httpclient or StAX. The only dependency would 
be on solr-commons, making this a very lightweight and distribution-friendly 
Java client for Solr.

  was:
SolrJ provides a SolrServer implementation backed by commons-httpclient which 
introduces many dependency jars (commons-codec, commons-io and 
commons-logging). Apart from that SolrJ also uses StAX API for XML parsing 
which introduces dependencies like stax-api, stax and stax-utils.

This enhancement will add a SolrServer implementation backed by 
java.net.HttpUrlConnection and will use BinaryResponseParser as the default 
response parser. Using this basic implementation out of the box would require 
no dependencies on either commons-httpclient or StAX. The only dependency would 
be on solr-commons and commons-logging making this a very lightweight and 
distribution friendly Java client for Solr.


 Lightweight SolrJ client
 

 Key: SOLR-599
 URL: https://issues.apache.org/jira/browse/SOLR-599
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.3


 SolrJ provides a SolrServer implementation backed by commons-httpclient, which 
 introduces many dependency jars (commons-codec, commons-io and 
 commons-logging). Apart from that, SolrJ also uses the StAX API for XML parsing, 
 which introduces dependencies like stax-api, stax and stax-utils.
 This enhancement will add a SolrServer implementation backed by 
 java.net.HttpURLConnection and will use BinaryResponseParser as the default 
 response parser. Using this basic implementation out of the box would require 
 no dependencies on either commons-httpclient or StAX. The only dependency 
 would be on solr-commons, making this a very lightweight and 
 distribution-friendly Java client for Solr.




[jira] Issue Comment Edited: (SOLR-486) Support binary formats for QueryresponseWriter

2008-06-16 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605329#action_12605329
 ] 

noble.paul edited comment on SOLR-486 at 6/16/08 9:13 AM:
--

Another level of efficiency can be brought in by preloading the string table 
with well-known strings like responseHeader, QTime, etc. That did not look 
very elegant to me.

The sliding window approach is also good, but we do not have too many repeated 
strings unless we use highlighting, etc.

I guess this can go into trunk, which would give it enough time to 'settle' 
before the release.

  was (Author: noble.paul):
Another  level of efficiency can be brought in by preloading the string 
table with well known strings like responseHeader , QTime etc . That did not 
look very elegant to me.  

I guess this can go into trunk and give it enough time to 'settle' before the 
release.
  



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-06-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605367#action_12605367
 ] 

Yonik Seeley commented on SOLR-572:
---

For those who are just casually following this issue, is there a good summary 
of the current input options and example output?




Re: [jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-06-16 Thread Shalin Shekhar Mangar
Grant created a wiki page at
http://wiki.apache.org/solr/SpellCheckComponent which has some
documentation on the configuration. I'll try to add more
documentation when I try this out tomorrow.

-- 
Regards,
Shalin Shekhar Mangar.


[jira] Commented: (SOLR-14) Add the ability to preserve the original term when using WordDelimiterFilter

2008-06-16 Thread Mike Klaas (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605403#action_12605403
 ] 

Mike Klaas commented on SOLR-14:


Note that it is very easy to use an external TokenFilter, so you could just cp 
WDF into your own class and make the changes.

(Though I'm not saying that this _shouldn't_ make it in for 1.3)

 Add the ability to preserve the original term when using WordDelimiterFilter
 

 Key: SOLR-14
 URL: https://issues.apache.org/jira/browse/SOLR-14
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Richard Trey Hyde
 Attachments: TokenizerFactory.java, WordDelimiterFilter.patch, 
 WordDelimiterFilter.patch


 When doing prefix searching, you need to hang on to the original term; 
 otherwise you'll miss many matches you should be making.
 Data: ABC-12345
 WordDelimiterFilter may change this into
 ABC 12345 ABC12345
 A user may enter a search such as 
  ABC\-123*
 which will fail to find a match given the above scenario.
 The attached patch will allow the use of the preserveOriginal option to 
 WordDelimiterFilter and will analyse as
 ABC 12345 ABC12345  ABC-12345 
 in which case we will get a positive match.
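
The scenario above can be reproduced with a much simplified stand-in for the filter. PreserveOriginalSketch is hypothetical and only splits on non-alphanumeric characters and adds the catenated form, which is enough for the ABC-12345 example; the real WordDelimiterFilter has more rules and options.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for word-delimiter analysis. */
final class PreserveOriginalSketch {

  /** Splits on delimiters, adds the catenated form, and optionally the original term. */
  static List<String> analyze(String term, boolean preserveOriginal) {
    List<String> tokens = new ArrayList<String>();
    StringBuilder catenated = new StringBuilder();
    for (String part : term.split("[^A-Za-z0-9]+")) {
      if (!part.isEmpty()) {
        tokens.add(part);
        catenated.append(part);
      }
    }
    if (tokens.size() > 1) tokens.add(catenated.toString());
    if (preserveOriginal && !tokens.contains(term)) tokens.add(term);
    return tokens;
  }

  /** A prefix query matches if any indexed token starts with the prefix. */
  static boolean prefixMatches(List<String> indexedTokens, String prefix) {
    for (String token : indexedTokens) {
      if (token.startsWith(prefix)) return true;
    }
    return false;
  }
}
```

The query ABC\-123* corresponds to the prefix ABC-123 once the escape is removed. Without preserveOriginal, none of ABC, 12345 or ABC12345 starts with that prefix; with it, ABC-12345 does.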




[jira] Commented: (SOLR-14) Add the ability to preserve the original term when using WordDelimiterFilter

2008-06-16 Thread Mike Klaas (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605410#action_12605410
 ] 

Mike Klaas commented on SOLR-14:


Also, voting for an issue is a good way to increase its visibility




[jira] Commented: (SOLR-243) Create a hook to allow custom code to create custom IndexReaders

2008-06-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605424#action_12605424
 ] 

Hoss Man commented on SOLR-243:
---

bq. Hoss has marked the issue for 1.3, so it will be in the release.

for the record, i marked it as 1.3 because it would be nice to see this in 1.3 
... but as i said in my 2008-03-13 comment: we need unit tests and example 
configuration before i'm willing to commit.



 Create a hook to allow custom code to create custom IndexReaders
 

 Key: SOLR-243
 URL: https://issues.apache.org/jira/browse/SOLR-243
 Project: Solr
  Issue Type: Improvement
  Components: search
 Environment: Solr core
Reporter: John Wang
Assignee: Hoss Man
 Fix For: 1.3

 Attachments: indexReaderFactory.patch, indexReaderFactory.patch, 
 indexReaderFactory.patch, indexReaderFactory.patch, indexReaderFactory.patch, 
 indexReaderFactory.patch, indexReaderFactory.patch


 I have a customized IndexReader and I want to write a Solr plugin to use my 
 derived IndexReader implementation. Currently IndexReader instantiation is 
 hard coded to be: 
 IndexReader.open(path)
 It would be really useful if this were done through a pluggable factory that can be 
 configured, e.g. IndexReaderFactory:
 interface IndexReaderFactory {
   IndexReader newReader(String name, String path);
 }
 The default implementation would just return IndexReader.open(path).
 The newSearcher and getSearcher methods in the SolrCore class can then call the 
 current factory implementation to get the IndexReader instance and then build 
 the SolrIndexSearcher by passing in the reader.
 It would be really nice to add this improvement soon (This seems to be a 
 trivial addition) as our project really depends on this.
 Thanks
 -John
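
A rough sketch of how the proposed hook might fit together, assuming the interface from the description. Everything besides the IndexReaderFactory interface and the IndexReader.open(path) call is hypothetical: IndexReader here is a tiny stand-in so the example compiles on its own (in Solr it would be org.apache.lucene.index.IndexReader), and DefaultIndexReaderFactory, CustomIndexReader, CustomIndexReaderFactory and SearcherSketch are illustrative names, not classes from the attached patch.

```java
// Sketch only; see the note above about which names are assumptions.
final class IndexReaderFactorySketch {
    public static void main(String[] args) {
        // No factory configured: falls back to the default behaviour.
        IndexReader standard = new SearcherSketch(null).openReader("main", "/tmp/index");
        // Custom factory configured: the derived reader is used instead.
        IndexReader custom =
            new SearcherSketch(new CustomIndexReaderFactory()).openReader("main", "/tmp/index");
        System.out.println(standard.getClass().getSimpleName());
        System.out.println(custom.getClass().getSimpleName());
    }
}

// Stand-in for org.apache.lucene.index.IndexReader.
class IndexReader {
    final String path;
    IndexReader(String path) { this.path = path; }
    static IndexReader open(String path) { return new IndexReader(path); }
}

// The interface as proposed in the issue description.
interface IndexReaderFactory {
    IndexReader newReader(String name, String path);
}

// Default implementation: the call that is hard coded today.
class DefaultIndexReaderFactory implements IndexReaderFactory {
    @Override
    public IndexReader newReader(String name, String path) {
        return IndexReader.open(path);
    }
}

// Hypothetical user-supplied reader and the factory that creates it.
class CustomIndexReader extends IndexReader {
    CustomIndexReader(String path) { super(path); }
}

class CustomIndexReaderFactory implements IndexReaderFactory {
    @Override
    public IndexReader newReader(String name, String path) {
        return new CustomIndexReader(path);
    }
}

// Hypothetical stand-in for the place in SolrCore that opens readers.
class SearcherSketch {
    private final IndexReaderFactory factory;

    SearcherSketch(IndexReaderFactory factory) {
        this.factory = (factory != null) ? factory : new DefaultIndexReaderFactory();
    }

    IndexReader openReader(String name, String path) {
        return factory.newReader(name, path);
    }
}
```

The point of routing every open through the factory is that code building the searcher never names a concrete reader class, so a plugin only has to supply a different factory.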




[jira] Commented: (SOLR-243) Create a hook to allow custom code to create custom IndexReaders

2008-06-16 Thread John Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12605447#action_12605447
 ] 

John Wang commented on SOLR-243:


Sorry, I didn't see Hoss's earlier comments.
Thanks

-John







Re: protected QParser.parse() and subclasses

2008-06-16 Thread Chris Hostetter

: As QParser.parse is protected and QParser.subQuery is public, everything 
: works fine when I run parse() myself (through unit tests). But when I 
: try to run it through a Solr server, I get :

all of the concrete impls of QParser in the solr code base declare the 
parse() method as public ... i'm not sure why it's protected in the 
abstract class ... seems wrong to me.
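
For what it's worth, Java does allow a subclass to widen a protected method to public, which is the pattern described above. A small sketch with made-up stand-in classes (AbstractParserSketch and ConcreteParserSketch are not the real QParser types, and the String return value is just a placeholder for a query object):

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch only: the parser classes below are hypothetical stand-ins.
final class ParseVisibilitySketch {

    // Reports whether the no-arg method declared on the given class is public.
    static boolean isPublic(Class<?> type, String methodName) {
        try {
            Method m = type.getDeclaredMethod(methodName);
            return Modifier.isPublic(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isPublic(AbstractParserSketch.class, "parse")); // false
        System.out.println(isPublic(ConcreteParserSketch.class, "parse")); // true
        System.out.println(new ConcreteParserSketch("foo:bar").parse());
    }
}

// parse() is protected here, as in the abstract class under discussion.
abstract class AbstractParserSketch {
    protected final String queryString;

    protected AbstractParserSketch(String queryString) {
        this.queryString = queryString;
    }

    protected abstract String parse();
}

// The concrete implementation overrides parse() and widens it to public.
class ConcreteParserSketch extends AbstractParserSketch {
    ConcreteParserSketch(String queryString) {
        super(queryString);
    }

    @Override
    public String parse() {
        return "parsed(" + queryString + ")";
    }
}
```

The catch is that the wider visibility only helps callers holding a reference of the concrete type; code in another package that only sees the abstract type still cannot call parse().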


-Hoss



[jira] Created: (SOLR-600) XML parser stops working under heavy load

2008-06-16 Thread John Smith (JIRA)
XML parser stops working under heavy load
-

 Key: SOLR-600
 URL: https://issues.apache.org/jira/browse/SOLR-600
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 1.3
 Environment: Linux 2.6.19.7-ss0 #4 SMP Wed Mar 12 02:56:42 GMT 2008 
x86_64 Intel(R) Xeon(R) CPU X5450 @ 3.00GHz GenuineIntel GNU/Linux
Tomcat 6.0.16
SOLR nightly 16 Jun 2008, and versions prior
JRE 1.6.0
Reporter: John Smith


Under heavy load, the following is spat out for every update:

org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
at java.util.AbstractList$SimpleListIterator.hasNext(Unknown Source)
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:225)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:66)
at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:196)
at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:965)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:735)
