[jira] Created: (SOLR-1847) Solrj doesn't know if PDF was actually parsed by Tika

2010-03-26 Thread elsadek (JIRA)
Solrj doesn't know if PDF was actually parsed by Tika
-

 Key: SOLR-1847
 URL: https://issues.apache.org/jira/browse/SOLR-1847
 Project: Solr
  Issue Type: Bug
  Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 1.5
 Environment: TOMCAT 6.0.24, SOLR 1.5Dev, Solrj1.5Dev Tika
Reporter: elsadek


When posting PDF files using SolrJ, the only response we get from Solr is the
server response status, so we never know whether the PDF was actually parsed.
Checking the log, I found that Tika failed on some PDF files, either because of
the nature of their content (text in images only) or because they are
corrupted:

 25 mars 2010 14:54:07 org.apache.pdfbox.util.PDFStreamEngine 
processOperator
 INFO: unsupported/disabled operation: EI
   
 25 mars 2010 14:54:02 org.apache.pdfbox.filter.FlateFilter decode
 GRAVE: Stop reading corrupt stream


The question is: how can I catch these kinds of exceptions through SolrJ?



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: build.xml and lucene test code

2010-03-26 Thread Mark Miller
Yeah, there are things we can do to improve some of this (uptodate task
or something?) - Uwe had some ideas the other day.


- Mark

http://www.lucidimagination.com (mobile)

On Mar 26, 2010, at 1:53 AM, Robert Muir rcm...@gmail.com wrote:


I noticed that, for whatever reason, Solr's build.xml doesn't detect if
Lucene's test code is out of date.

(I am fooling around with LUCENE-1709, where we will try to do the same
parallel test execution for Lucene as in Solr, and was moving the
special formatter to Lucene when I noticed this.)

I don't have any ideas how to fix it, but just wanted to mention it so it's
not forgotten.

Worst case, if/when we resolve LUCENE-1709, you will have to run ant
clean first... but I am sure there is some better ant trickery to
detect this situation, maybe just another task dependency.

--
Robert Muir
rcm...@gmail.com


RE: ZooKeeper Logging

2010-03-26 Thread Zhou, Yaning
Yes,

That's exactly what we did here:)

Yaning

-Original Message-
From: Igor Motov [mailto:imo...@gmail.com] 
Sent: Thursday, March 25, 2010 8:13 PM
To: solr-dev@lucene.apache.org
Subject: Re: ZooKeeper Logging

I agree, this is concerning. There is a similar situation with JCL in SOLR
1.4 (http://www.mail-archive.com/solr-dev@lucene.apache.org/msg13600.html).
With log4j-over-slf4j-X.Y.Z.jar in the .war file, whoever will need to
switch the logging system will have to make sure that an appropriate set of
*-over-slf4j and slf4j-* files is in the classpath.

So, it might make sense to change the last paragraph of
http://wiki.apache.org/solr/SolrLogging to something like this:


Users who want an alternate logging implementation (log4j, logback, jcl,
etc) will need to repackage the .war file and replace slf4j-jdk14-X.Y.Z.jar
with an alternate implementation. They should also remove
jcl-over-slf4j-X.Y.Z.jar if they are switching to JCL and
log4j-over-slf4j-X.Y.Z.jar if they are using cloud branch and switching to
log4j. Having both jcl-over-slf4j and slf4j-jcl.jar or log4j-over-slf4j.jar
and slf4j-log4j.jar in the classpath may result in an infinite loop.


There is already a link to slf4j site on the page. But, it might be also
useful to add a link directly to the page that describes SLF4J bridges:
http://www.slf4j.org/legacy.html
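
The loop condition described above can be checked mechanically. The following is only a minimal sketch, assuming the jar names on the classpath are available as plain strings; the class name Slf4jBridgeCheck and the method findLoops are hypothetical and not part of Solr or SLF4J, and the version numbers in the sample are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Slf4jBridgeCheck {
    // Each pair is a bridge (X-over-slf4j) and the binding (slf4j-X)
    // that would route log events straight back into that bridge.
    private static final String[][] LOOPING_PAIRS = {
        {"jcl-over-slf4j", "slf4j-jcl"},
        {"log4j-over-slf4j", "slf4j-log4j"},
    };

    // Hypothetical helper: returns one entry per conflicting pair found.
    public static List<String> findLoops(List<String> jarNames) {
        List<String> problems = new ArrayList<>();
        for (String[] pair : LOOPING_PAIRS) {
            boolean hasBridge = jarNames.stream().anyMatch(n -> n.startsWith(pair[0]));
            boolean hasBinding = jarNames.stream().anyMatch(n -> n.startsWith(pair[1]));
            if (hasBridge && hasBinding) {
                problems.add(pair[0] + " + " + pair[1]);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> libs = Arrays.asList(
            "slf4j-api-1.5.5.jar",
            "log4j-over-slf4j-1.5.5.jar",
            "slf4j-log4j12-1.5.5.jar");
        System.out.println(findLoops(libs)); // prints [log4j-over-slf4j + slf4j-log4j]
    }
}
```

A check like this could run as part of repackaging the .war, before the conflicting jars ever reach a running server.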

Igor

On Thu, Mar 25, 2010 at 4:26 PM, Mark Miller markrmil...@gmail.com wrote:

 On 03/25/2010 11:27 AM, Mark Miller wrote:

 On 03/25/2010 11:22 AM, Igor Motov wrote:

 I wonder if it would make sense to replace log4j-1.2.15.jar in the lib
 directory of the cloud branch with log4j-over-slf4j-1.5.5.jar. SOLR is
 already using SLF4J for its logging and log4j-over-slf4j bridge would
 redirect all ZooKeeper log messages into the same SLF4J logging stream. I
 did it in my setup and it helped me a lot when I ran into issues with my
 ZooKeeper configuration.

 Thank you,

 Igor

  Yeah, this is a great idea - been meaning to look into this since Uwe
 mentioned to me that it was possible the other week. Hadn't realized these
 bridge jars existed previously. Yonik had started looking into getting the
 ZooKeeper project to switch to SLF4J, but this is a great solution for now.


 Hmm - I've done this for now - but this is going to need some good
 documentation at the least. It works great for Solr in its default state
 (using util logging) - but you don't want this jar if you choose to use the
 log4j connector. Otherwise they loop back and forth with each other.

 --
 - Mark

 http://www.lucidimagination.com






[jira] Updated: (SOLR-1395) Integrate Katta

2010-03-26 Thread Thomas Koch (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Koch updated SOLR-1395:
--

Attachment: solr-1395-1431-katta0.6.patch

This patch implements searching over a set of indices specified by a regular 
expression (in the shards= parameter of the query). For this patch to work, you 
also need to patch katta: http://oss.101tec.com/jira/browse/KATTA-91
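
To illustrate the idea of selecting indices by regular expression, here is a minimal sketch. It is not taken from the patch: the class ShardRegexSketch, the method selectIndices, and the index names are hypothetical, and it assumes the expression must match the whole index name, which the patch may or may not require.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ShardRegexSketch {
    // Hypothetical helper: keep the deployed index names that fully match
    // the expression passed in the shards= parameter.
    public static List<String> selectIndices(String shardsPattern, List<String> deployed) {
        Pattern pattern = Pattern.compile(shardsPattern);
        List<String> selected = new ArrayList<>();
        for (String name : deployed) {
            if (pattern.matcher(name).matches()) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> deployed = Arrays.asList("logs-2010-01", "logs-2010-02", "products");
        System.out.println(selectIndices("logs-2010-.*", deployed));
        // prints [logs-2010-01, logs-2010-02]
    }
}
```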

 Integrate Katta
 ---

 Key: SOLR-1395
 URL: https://issues.apache.org/jira/browse/SOLR-1395
 Project: Solr
  Issue Type: New Feature
Affects Versions: 1.4
Reporter: Jason Rutherglen
Priority: Minor
 Fix For: 1.5

 Attachments: hadoop-core-0.19.0.jar, katta-core-0.6-dev.jar, 
 katta.node.properties, katta.zk.properties, log4j-1.2.13.jar, 
 solr-1395-1431-3.patch, solr-1395-1431-4.patch, 
 solr-1395-1431-katta0.6.patch, solr-1395-1431-katta0.6.patch, 
 solr-1395-1431.patch, SOLR-1395.patch, SOLR-1395.patch, SOLR-1395.patch, 
 test-katta-core-0.6-dev.jar, zkclient-0.1-dev.jar, zookeeper-3.2.1.jar

   Original Estimate: 336h
  Remaining Estimate: 336h

 We'll integrate Katta into Solr so that:
 * Distributed search uses Hadoop RPC
 * Shard/SolrCore distribution and management
 * Zookeeper based failover
 * Indexes may be built using Hadoop




[jira] Commented: (SOLR-1672) RFE: facet reverse sort count

2010-03-26 Thread Peter Sturge (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850159#action_12850159
 ] 

Peter Sturge commented on SOLR-1672:


I agree there's some refactoring to do to bring it in line with current 
FacetParams conventions. At the same time, it would be good to look at wrapping 
up the functionality into a method, and covering all the code paths in the way 
you describe.

I've been wanting to finish off this patch, but I'm in the throes of 
a product release myself, so I've not had many spare cycles.

You mention termenum, fieldcache, uninverted - presumably, these are among the 
code paths that need to cater for facet counts. If you know them, could you add a 
comment here that lists all the areas that need to be catered for (if there are 
more than those 3), so that none are left out?

Thanks!
Peter


 RFE: facet reverse sort count
 -

 Key: SOLR-1672
 URL: https://issues.apache.org/jira/browse/SOLR-1672
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
 Environment: Java, Solrj, http
Reporter: Peter Sturge
Priority: Minor
 Attachments: SOLR-1672.patch

   Original Estimate: 0h
  Remaining Estimate: 0h

 As suggested by Chris Hostetter, I have added an optional Comparator to the 
 BoundedTreeSetLong in the UnInvertedField class.
 This optional comparator is used when a new (and also optional) field facet 
 parameter called 'facet.sortorder' is set to the string 'dsc' 
 (e.g. f.facetname.facet.sortorder=dsc for per field, or 
 facet.sortorder=dsc for all facets).
 Note that this parameter has no effect if facet.method=enum.
 Any value other than 'dsc' (including no value) reverts the BoundedTreeSet to 
 its default behaviour.
  
 This change affects 2 source files:

  UnInvertedField.java
 [line 438] The getCounts() method signature is modified to add the 
 'facetSortOrder' parameter value to the end of the argument list.
  
 DIFF UnInvertedField.java:
 - public NamedList getCounts(SolrIndexSearcher searcher, DocSet baseDocs,
       int offset, int limit, Integer mincount, boolean missing, String sort,
       String prefix) throws IOException {
 + public NamedList getCounts(SolrIndexSearcher searcher, DocSet baseDocs,
       int offset, int limit, Integer mincount, boolean missing, String sort,
       String prefix, String facetSortOrder) throws IOException {

 [line 556] The getCounts() method is modified to create an overridden 
 BoundedTreeSetLong(int, Comparator) if the 'facetSortOrder' parameter 
 equals 'dsc'.

 DIFF UnInvertedField.java:
 - final BoundedTreeSetLong queue = new BoundedTreeSetLong(maxsize);
 + final BoundedTreeSetLong queue = (sort.equals("count") || sort.equals("true"))
 +     ? (facetSortOrder.equals("dsc")
 +         ? new BoundedTreeSetLong(maxsize, new Comparator() {
 +             @Override
 +             public int compare(Object o1, Object o2) {
 +               if (o1 == null || o2 == null)
 +                 return 0;
 +               int result = ((Long) o1).compareTo((Long) o2);
 +               return (result != 0 ? result > 0 ? -1 : 1 : 0); //lowest number first sort
 +             }})
 +         : new BoundedTreeSetLong(maxsize))
 +     : null;

  SimpleFacets.java
 [line 221] A getFieldParam(field, "facet.sortorder", "asc"); is added to 
 retrieve the new parameter, if present. 'asc' used as a default value.

 DIFF SimpleFacets.java:
 + String facetSortOrder = params.getFieldParam(field, "facet.sortorder", "asc");
  
 [line 253] The call to uif.getCounts() in the getTermCounts() method is 
 modified to pass the 'facetSortOrder' value string.

 DIFF SimpleFacets.java:
 - counts = uif.getCounts(searcher, base, offset, limit, 
       mincount,missing,sort,prefix);
 + counts = uif.getCounts(searcher, base, offset, limit, 
       mincount,missing,sort,prefix, facetSortOrder);
 Implementation Notes:
 I have noted in testing that I was not able to retrieve any '0' counts as I 
 had expected.
 I believe this could be because there appear to be some optimizations in 
 SimpleFacets/count caching such that zero counts are not iterated (at least 
 not by default)
 as a performance enhancement.
 I could be wrong about this, and zero counts may appear under some other as 
 yet untested circumstances. Perhaps an expert familiar with this part of the 
 code can clarify.
 In fact, this is not such a bad thing (at least for my requirements), as a 
 whole bunch of zero counts is not necessarily useful (for my requirements, 
 starting at '1' is just right).
  
 There may, however, be instances where someone *will* want zero counts - e.g. 
 searching for zero product stock counts (e.g. 'what have we run out of'). I 
 was envisioning the facet.mincount field
 being the preferred place to set where the 'lowest value' begins (e.g. 0 or 1 
 or possibly higher), but because of the caching/optimization, the behaviour 
 is somewhat different than expected.
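
The effect of an optional comparator on a bounded sorted set can be sketched in isolation. This is not the Solr BoundedTreeSetLong class: BoundedSetSketch and bounded() are hypothetical names, plain counts stand in for whatever values Solr actually stores in its queue, and equal values collapse because a TreeSet holds each value once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class BoundedSetSketch {
    // Keep at most maxSize values; the comparator decides which end of the
    // ordering survives, because the last element is evicted on overflow.
    public static List<Long> bounded(List<Long> counts, int maxSize, Comparator<Long> order) {
        TreeSet<Long> queue = new TreeSet<>(order);
        for (Long count : counts) {
            queue.add(count);
            if (queue.size() > maxSize) {
                queue.pollLast(); // drop whatever sorts last under 'order'
            }
        }
        return new ArrayList<>(queue);
    }

    public static void main(String[] args) {
        List<Long> counts = Arrays.asList(5L, 1L, 9L, 3L);
        System.out.println(bounded(counts, 2, Comparator.<Long>naturalOrder())); // prints [1, 3]
        System.out.println(bounded(counts, 2, Comparator.<Long>reverseOrder())); // prints [9, 5]
    }
}
```

Swapping the comparator changes both the iteration order and which entries are evicted, which is why a single optional constructor argument is enough to reverse the facet ordering.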


[jira] Commented: (SOLR-1804) Upgrade Carrot2 to 3.2.0

2010-03-26 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850204#action_12850204
 ] 

Grant Ingersoll commented on SOLR-1804:
---

We should be able to go through with this now, right?

 Upgrade Carrot2 to 3.2.0
 

 Key: SOLR-1804
 URL: https://issues.apache.org/jira/browse/SOLR-1804
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Clustering
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll

 http://project.carrot2.org/release-3.2.0-notes.html
 Carrot2 is now LGPL free, which means we should be able to bundle the binary!




[jira] Created: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Grant Ingersoll (JIRA)
Add example Query page to the example
-

 Key: SOLR-1848
 URL: https://issues.apache.org/jira/browse/SOLR-1848
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Trivial


I've wired up a static jetty context and hooked in a simple HTML page that 
shows off a bunch of the different types of queries people can do w/ the 
Example data.  Browse to it at http://localhost:8983/example/queries.html

Will commit shortly.




[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850226#action_12850226
 ] 

Grant Ingersoll commented on SOLR-1848:
---

Make that http://localhost:8983/solr/example/queries.html




[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850251#action_12850251
 ] 

Yonik Seeley commented on SOLR-1848:


What's the motivation for including them in the solr webapp?
Stuff like this works fine from the tutorial on the website, or from a wiki 
page.

And I've been trying to get rid of the extra directories in example, not add 
more :-)




[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850254#action_12850254
 ] 

Grant Ingersoll commented on SOLR-1848:
---

B/c you don't always have access to those.  This is nice and handy and concise 
and included in the example w/o having to go looking all around.  If anything, 
the tutorial should be shipped w/ the example.




[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850256#action_12850256
 ] 

Yonik Seeley commented on SOLR-1848:


I believe the tutorial is already shipped in the solr download.




[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850257#action_12850257
 ] 

Yonik Seeley commented on SOLR-1848:


This also complicates setting up with different servlet containers.  Someone 
can't drop the solr.war into tomcat, or in a different jetty container, and 
follow along with the tutorial anymore.  I think we should revert this and keep 
things simple.





[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850263#action_12850263
 ] 

Grant Ingersoll commented on SOLR-1848:
---

Seriously, Yonik?  This is worth the discussion?  It's a jetty context file and 
a static HTML page that contains some handy examples of how to work with Solr 
w/o going all over the place. 




[jira] Commented: (SOLR-1835) speed up and improve tests

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850267#action_12850267
 ] 

Yonik Seeley commented on SOLR-1835:


As a further attempt to clean up example and make it one server rather than 
many, I think it makes sense to remove the multicore directory.  Our standard 
example is now multicore enabled already.

This will also involve making the multicore example tests not depend on 
example, but on a test config (or making them create the cores dynamically 
from example).

 speed up and improve tests
 --

 Key: SOLR-1835
 URL: https://issues.apache.org/jira/browse/SOLR-1835
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
 Fix For: 3.1

 Attachments: SOLR-1835-ignoreExceptions.patch, 
 SOLR-1835-ignoreExceptions.patch, SOLR-1835.patch, 
 SOLR-1835_example_junit4.patch, SOLR-1835_parallel.patch, 
 SOLR-1835_parallel.patch, SOLR-1835_parallel.patch, SOLR-1835_parallel.patch


 General test improvements.
 We should use @BeforeClass where possible to avoid per test method overhead, 
 and reuse lucene test utils where possible.




[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850269#action_12850269
 ] 

Yonik Seeley commented on SOLR-1848:


If it's not worth the discussion, hopefully you won't mind if it's reverted 
then?




[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850272#action_12850272
 ] 

Grant Ingersoll commented on SOLR-1848:
---

Whatever.  Do as you wish.




[jira] Issue Comment Edited: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850272#action_12850272
 ] 

Grant Ingersoll edited comment on SOLR-1848 at 3/26/10 6:17 PM:


Whatever.  Do as you wish.  Just b/c you don't find something useful doesn't 
mean others won't.

  was (Author: gsingers):
Whatever.  Do as you wish.
  



[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850274#action_12850274
 ] 

Yonik Seeley commented on SOLR-1848:


I guess an argument could also be made for putting the whole tutorial page in 
the example server.
But it's certainly something that warrants discussion.




[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850276#action_12850276
 ] 

Yonik Seeley commented on SOLR-1848:


bq. Just b/c you don't find something useful doesn't mean others won't. 

Of course... but one could use such an argument to support anything.
In this specific case, it doesn't seem like there is enough benefit to outweigh 
the additional complexity.





[jira] Commented: (SOLR-1835) speed up and improve tests

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850289#action_12850289
 ] 

Yonik Seeley commented on SOLR-1835:


Mark pointed out that removing multicore is related to (and perhaps already 
implemented as part of)  https://issues.apache.org/jira/browse/SOLR-1770




[jira] Commented: (SOLR-1848) Add example Query page to the example

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850329#action_12850329
 ] 

Yonik Seeley commented on SOLR-1848:


OK, I've reverted this for now.
If people want changes to the current tutorial structure, we can have more 
discussion to hash out the best way to achieve that.




Making a 'isPartialResults' getter for ResponseBuilder / SolrQueryResponse

2010-03-26 Thread Kaktu Chakarabati
Hey,
While working with some custom query components, we came across a
situation that I thought might be worth a small patch.

Some components in the query handling chain might want to know whether the
results returned by the IndexSearcher are partial (e.g. because of the
timeAllowed parameter). After some tracing through the Solr code, I noticed
this is possible in a somewhat contrived way, e.g. using
rb.rsp.getResponseHeader().get("partialResults").

Does anyone else think it might be useful to add an 'isPartialResults()'
method to ResponseBuilder / QueryResponse?
This could be set e.g. in ResponseBuilder.setResult().

Thanks,
-Chak
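
The proposed getter amounts to wrapping the header lookup in one place. The following is a minimal sketch only: a plain Map stands in for the Solr response header, the class PartialResultsSketch is hypothetical, and it assumes the flag is stored as a Boolean under the key "partialResults".

```java
import java.util.HashMap;
import java.util.Map;

public class PartialResultsSketch {
    // Hypothetical getter: true only when the header carries a Boolean TRUE
    // under "partialResults"; a missing or non-Boolean entry counts as false.
    public static boolean isPartialResults(Map<String, Object> responseHeader) {
        return Boolean.TRUE.equals(responseHeader.get("partialResults"));
    }

    public static void main(String[] args) {
        Map<String, Object> header = new HashMap<>();
        System.out.println(isPartialResults(header)); // prints false
        header.put("partialResults", Boolean.TRUE);
        System.out.println(isPartialResults(header)); // prints true
    }
}
```

Treating a missing entry as false keeps callers from having to null-check the header value themselves.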


[jira] Commented: (SOLR-1849) ant luke target in Solr build no longer works

2010-03-26 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850357#action_12850357
 ] 

Grant Ingersoll commented on SOLR-1849:
---

For the record, the current workaround is to download the source and copy in 
the standalone luke into the luke directory.

 ant luke target in Solr build no longer works
 -

 Key: SOLR-1849
 URL: https://issues.apache.org/jira/browse/SOLR-1849
 Project: Solr
  Issue Type: Bug
Reporter: Grant Ingersoll
Priority: Trivial

 Here's a fix:
 {code}
 <property name="luke.version" value="1.0.0"/>
   <available file="luke/luke-${luke.version}.jar" property="luke.jar.exists" />
   <target name="luke-download" unless="luke.jar.exists" depends="proxy.setup">
     <mkdir dir="luke"/>
     <get src="http://luke.googlecode.com/files/luke-${luke.version}.jar"
         dest="luke/luke-${luke.version}.jar"/>
   </target>
   
   <target name="luke" depends="luke-download">
     <java fork="true" 
           classname="org.getopt.luke.Luke"
           logError="true"
           failonerror="true">
       <classpath>
         <fileset dir="luke">
           <include name="luke-${luke.version}.jar"/>
         </fileset>
         <path refid="lucene.classpath"/>
         <path refid="test.run.classpath"/>
       </classpath>
     </java>
   </target>
 {code}
 But it requires there to be a standalone, downloadable version of Luke w/o 
 any Lucene bundled in.




[jira] Updated: (SOLR-1849) ant luke target in Solr build no longer works

2010-03-26 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-1849:
--

Description: 
Here's a fix:
{code}
<property name="luke.version" value="1.0.0"/>
  <available file="luke/luke-${luke.version}.jar" property="luke.jar.exists" />
  <target name="luke-download" unless="luke.jar.exists" depends="proxy.setup">
    <mkdir dir="luke"/>
    <get src="http://luke.googlecode.com/files/luke-${luke.version}.jar"
        dest="luke/luke-${luke.version}.jar"/>
  </target>
  
  <target name="luke" depends="luke-download">
    <java fork="true" 
          classname="org.getopt.luke.Luke"
          logError="true"
          failonerror="true">
      <classpath>
        <fileset dir="luke">
          <include name="luke-${luke.version}.jar"/>
        </fileset>
        <path refid="lucene.classpath"/>
        <path refid="test.run.classpath"/>
      </classpath>
    </java>
  </target>
{code}

But it requires there to be a standalone, downloadable version of Luke w/o any 
Lucene bundled in.

  was:
Here's a fix:
{code}
<property name="luke.version" value="1.0.0"/>
  <available file="luke/luke-${luke.version}.jar" property="luke.jar.exists" />
  <target name="luke-download" unless="luke.jar.exists" depends="proxy.setup">
    <mkdir dir="luke"/>
    <get src="http://luke.googlecode.com/files/lukeall-${luke.version}.jar"
        dest="luke/luke-${luke.version}.jar"/>
  </target>
  
  <target name="luke" depends="luke-download">
    <java fork="true" 
          classname="org.getopt.luke.Luke"
          logError="true"
          failonerror="true">
      <classpath>
        <fileset dir="luke">
          <include name="luke-${luke.version}.jar"/>
        </fileset>
        <path refid="lucene.classpath"/>
        <path refid="test.run.classpath"/>
      </classpath>
    </java>
  </target>
{code}

But it requires there to be a standalone, downloadable version of Luke w/o any 
Lucene bundled in.


 ant luke target in Solr build no longer works
 -

 Key: SOLR-1849
 URL: https://issues.apache.org/jira/browse/SOLR-1849
 Project: Solr
  Issue Type: Bug
Reporter: Grant Ingersoll
Priority: Trivial




[jira] Created: (SOLR-1849) ant luke target in Solr build no longer works

2010-03-26 Thread Grant Ingersoll (JIRA)
ant luke target in Solr build no longer works
-

 Key: SOLR-1849
 URL: https://issues.apache.org/jira/browse/SOLR-1849
 Project: Solr
  Issue Type: Bug
Reporter: Grant Ingersoll
Priority: Trivial


Here's a fix:
{code}
<property name="luke.version" value="1.0.0"/>
<available file="luke/luke-${luke.version}.jar" property="luke.jar.exists"/>
<target name="luke-download" unless="luke.jar.exists" depends="proxy.setup">
  <mkdir dir="luke"/>
  <get src="http://luke.googlecode.com/files/lukeall-${luke.version}.jar"
       dest="luke/luke-${luke.version}.jar"/>
</target>

<target name="luke" depends="luke-download">
  <java fork="true"
        classname="org.getopt.luke.Luke"
        logError="true"
        failonerror="true">
    <classpath>
      <fileset dir="luke">
        <include name="luke-${luke.version}.jar"/>
      </fileset>
      <path refid="lucene.classpath"/>
      <path refid="test.run.classpath"/>
    </classpath>
  </java>
</target>
{code}

But it requires there to be a standalone, downloadable version of Luke w/o any 
Lucene bundled in.




[jira] Updated: (SOLR-1568) Implement Spatial Filter

2010-03-26 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-1568:
--

Attachment: SOLR-1568.patch

Compiles and is much closer, but still doesn't work exactly right.  Currently 
focusing on filtering for LatLonType.  Haven't tested filter creation for the 
other types yet.  Need to write unit tests.

 Implement Spatial Filter
 

 Key: SOLR-1568
 URL: https://issues.apache.org/jira/browse/SOLR-1568
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.5

 Attachments: CartesianTierQParserPlugin.java, 
 SOLR-1568.Mattmann.031010.patch.txt, SOLR-1568.patch, SOLR-1568.patch, 
 SOLR-1568.patch, SOLR-1568.patch


 Given an index with spatial information (either as a geohash, 
 SpatialTileField (see SOLR-1586) or just two lat/lon pairs), we should be 
 able to pass in a filter query that takes in the field name, lat, lon and 
 distance and produces an appropriate Filter (i.e. one that is aware of the 
 underlying field type) for use by Solr.
 The interface _could_ look like:
 {code}
 fq={!sfilt dist=20}location:49.32,-79.0
 {code}
 or it could be:
 {code}
 fq={!sfilt lat=49.32 lon=-79.0 f=location dist=20}
 {code}
 or:
 {code}
 fq={!sfilt p=49.32,-79.0 f=location dist=20}
 {code}
 or:
 {code}
 fq={!sfilt lat=49.32,-79.0 fl=lat,lon dist=20}
 {code}
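
 Whichever syntax is chosen, the filter ends up answering one question per
 document: is this point within dist of the query point? The sketch below
 shows one possible way to make that check. It is not taken from the attached
 patch. The class name SpatialFilterSketch, the use of the haversine
 great-circle formula, and the choice of kilometres as the unit for dist are
 all assumptions made for illustration; the issue text does not fix any of
 them.

```java
/** Hypothetical illustration only; not the SOLR-1568 implementation. */
final class SpatialFilterSketch {
    /** Mean Earth radius in kilometres (assumed unit for dist). */
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance between two lat/lon points, in kilometres. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        double a = sinLat * sinLat
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * sinLon * sinLon;
        // Clamp to 1.0 so rounding error cannot push asin out of its domain.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** True if (lat, lon) lies within distKm of the centre point. */
    static boolean within(double centerLat, double centerLon, double distKm,
                          double lat, double lon) {
        return haversineKm(centerLat, centerLon, lat, lon) <= distKm;
    }

    public static void main(String[] args) {
        // Centre and distance taken from the example queries above.
        System.out.println(within(49.32, -79.0, 20, 49.42, -79.0));
        System.out.println(within(49.32, -79.0, 20, 50.32, -79.0));
    }
}
```

 A real Filter would presumably avoid computing the distance for every
 document, for example by first narrowing candidates with a bounding box or
 tile lookup that depends on the field type.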




[jira] Created: (SOLR-1850) KeepWordFilter can be slow at query time if wordlist is large

2010-03-26 Thread John Wang (JIRA)
KeepWordFilter can be slow at query time if wordlist is large
-

 Key: SOLR-1850
 URL: https://issues.apache.org/jira/browse/SOLR-1850
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: John Wang


When the Set<String> words is large, constructing a KeepWordFilter at query 
time is very costly because the set is copied on construction, e.g.:

this.words = new CharArraySet(words, ignoreCase);

This call does an addAll on the set. It runs for every query and repeats the 
same work each time.

Suggestion: overload the constructor and expose the CharArraySet, e.g.:

  public KeepWordFilter(TokenStream in, CharArraySet words) {
    super(in);
    this.words = words;
    this.termAtt = (TermAttribute) addAttribute(TermAttribute.class);
  }

This allows the CharArraySet to be constructed once, statically, for the 
application instead of at query time.
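
The difference between the two constructors can be sketched without Lucene on
the classpath. The class below is a hypothetical stand-in, not the real
KeepWordFilter: it uses java.util.HashSet in place of CharArraySet, ignores
case handling, and the names KeepWordsSketch, copying, sharing and
sharesSetWith are invented for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a keep-word filter; not the Lucene/Solr class. */
final class KeepWordsSketch {
    private final Set<String> words;

    private KeepWordsSketch(Set<String> words) {
        this.words = words;
    }

    /** Mirrors the existing behaviour: copies the whole set, O(n) per call. */
    static KeepWordsSketch copying(Set<String> words) {
        return new KeepWordsSketch(new HashSet<>(words));
    }

    /** Mirrors the proposed overload: reuses a prebuilt set, O(1) per call. */
    static KeepWordsSketch sharing(Set<String> prebuilt) {
        return new KeepWordsSketch(prebuilt);
    }

    /** True if the token is in the keep list. */
    boolean keep(String token) {
        return words.contains(token);
    }

    /** True if this instance holds the very same set object as the caller. */
    boolean sharesSetWith(Set<String> other) {
        return words == other;
    }

    public static void main(String[] args) {
        // Built once for the application, then handed to every per-query filter.
        Set<String> shared = Collections.unmodifiableSet(
                new HashSet<>(Arrays.asList("solr", "lucene")));
        KeepWordsSketch perQuery = KeepWordsSketch.sharing(shared);
        System.out.println(perQuery.keep("solr"));
        System.out.println(perQuery.sharesSetWith(shared));
    }
}
```

Sharing is only safe if nothing mutates the set after it is handed out, which
is why the usage above wraps it as unmodifiable.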






[jira] Commented: (SOLR-1850) KeepWordFilter can be slow at query time if wordlist is large

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850367#action_12850367
 ] 

Yonik Seeley commented on SOLR-1850:


Thanks for catching this, John. Copying the whole set each time is bad enough 
that I'd be tempted to classify it as a bug.




Re: svn commit: r928069 - in /lucene/dev/trunk: lucene/ lucene/backwards/src/test/org/apache/lucene/util/ lucene/contrib/benchmark/src/test/org/apache/lucene/benchmark/ lucene/contrib/benchmark/src/

2010-03-26 Thread Robert Muir
heads up, if you have an old checkout you might want to run 'ant clean'.

this is because the Solr Junit formatter was moved here to lucene, but
I don't think Solr yet detects dependencies being uptodate correctly.

On Fri, Mar 26, 2010 at 5:55 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Fri Mar 26 21:55:57 2010
 New Revision: 928069

 URL: http://svn.apache.org/viewvc?rev=928069&view=rev
 Log:
 LUCENE-1709: Parallelize Tests

 Added:
    
 lucene/dev/trunk/lucene/backwards/src/test/org/apache/lucene/util/LuceneJUnitResultFormatter.java
      - copied, changed from r927697, 
 lucene/dev/trunk/solr/src/test/org/apache/solr/SolrJUnitResultFormatter.java
    
 lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/LuceneJUnitResultFormatter.java
      - copied, changed from r927697, 
 lucene/dev/trunk/solr/src/test/org/apache/solr/SolrJUnitResultFormatter.java
 Removed:
    
 lucene/dev/trunk/solr/src/test/org/apache/solr/SolrJUnitResultFormatter.java
 Modified:
    lucene/dev/trunk/lucene/build.xml
    lucene/dev/trunk/lucene/common-build.xml
    
 lucene/dev/trunk/lucene/contrib/benchmark/src/test/org/apache/lucene/benchmark/BenchmarkTestCase.java
    
 lucene/dev/trunk/lucene/contrib/benchmark/src/test/org/apache/lucene/benchmark/quality/TestQualityRun.java
    lucene/dev/trunk/solr/build.xml
    lucene/dev/trunk/solr/common-build.xml

 Copied: 
 lucene/dev/trunk/lucene/backwards/src/test/org/apache/lucene/util/LuceneJUnitResultFormatter.java
  (from r927697, 
 lucene/dev/trunk/solr/src/test/org/apache/solr/SolrJUnitResultFormatter.java)
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/backwards/src/test/org/apache/lucene/util/LuceneJUnitResultFormatter.java?p2=lucene/dev/trunk/lucene/backwards/src/test/org/apache/lucene/util/LuceneJUnitResultFormatter.java&p1=lucene/dev/trunk/solr/src/test/org/apache/solr/SolrJUnitResultFormatter.java&r1=927697&r2=928069&rev=928069&view=diff
 ==
 --- 
 lucene/dev/trunk/solr/src/test/org/apache/solr/SolrJUnitResultFormatter.java 
 (original)
 +++ 
 lucene/dev/trunk/lucene/backwards/src/test/org/apache/lucene/util/LuceneJUnitResultFormatter.java
  Fri Mar 26 21:55:57 2010
 @@ -16,8 +16,9 @@
  *
  */

 -package org.apache.solr;
 +package org.apache.lucene.util;

 +import java.io.File;
  import java.io.IOException;
  import java.io.OutputStream;
  import java.text.NumberFormat;
 @@ -25,6 +26,8 @@ import java.text.NumberFormat;
  import junit.framework.AssertionFailedError;
  import junit.framework.Test;

 +import org.apache.lucene.store.LockReleaseFailedException;
 +import org.apache.lucene.store.NativeFSLockFactory;
  import org.apache.tools.ant.taskdefs.optional.junit.JUnitResultFormatter;
  import org.apache.tools.ant.taskdefs.optional.junit.JUnitTest;
  import org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner;
 @@ -37,9 +40,11 @@ import org.apache.tools.ant.util.StringU
  * At this point, the output is written at once in synchronized fashion.
  * This way tests can run in parallel without interleaving output.
  */
 -public class SolrJUnitResultFormatter implements JUnitResultFormatter {
 +public class LuceneJUnitResultFormatter implements JUnitResultFormatter {
   private static final double ONE_SECOND = 1000.0;

 +  private NativeFSLockFactory lockFactory;
 +
   /** Where to write the log to. */
   private OutputStream out;

 @@ -55,8 +60,21 @@ public class SolrJUnitResultFormatter im
   /** Buffer output until the end of the test */
   private StringBuilder sb;

 +  private org.apache.lucene.store.Lock lock;
 +
   /** Constructor for SolrJUnitResultFormatter. */
 -  public SolrJUnitResultFormatter() {
 +  public LuceneJUnitResultFormatter() {
 +    File lockDir = new File(System.getProperty("java.io.tmpdir"), 
 "lucene_junit_lock");
 +    lockDir.mkdirs();
 +    if(!lockDir.exists()) {
 +      throw new RuntimeException("Could not make Lock directory: " + lockDir);
 +    }
 +    try {
 +      lockFactory = new NativeFSLockFactory(lockDir);
 +      lock = lockFactory.makeLock("junit_lock");
 +    } catch (IOException e) {
 +      throw new RuntimeException(e);
 +    }
     sb = new StringBuilder();
   }

 @@ -135,8 +153,17 @@ public class SolrJUnitResultFormatter im

     if (out != null) {
       try {
 -        out.write(sb.toString().getBytes());
 -        out.flush();
 +        lock.obtain(5000);
 +        try {
 +          out.write(sb.toString().getBytes());
 +          out.flush();
 +        } finally {
 +          try {
 +            lock.release();
 +          } catch(LockReleaseFailedException e) {
 +            // well lets pretend its released anyway
 +          }
 +        }
       } catch (IOException e) {
         throw new RuntimeException("unable to write results", e);
       } finally {
 @@ -227,3 +254,4 @@ public class SolrJUnitResultFormatter im
     sb.append(StringUtils.LINE_SEP);
   }
  }
 +

 Modified: lucene/dev/trunk/lucene/build.xml
 URL: 
 

[jira] Commented: (SOLR-1850) KeepWordFilter can be slow at query time if wordlist is large

2010-03-26 Thread John Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850377#action_12850377
 ] 

John Wang commented on SOLR-1850:
-

Hi Yonik:

 No problem! Do you think overloading the constructor is the right thing to 
do here?

-John




[jira] Commented: (SOLR-1850) KeepWordFilter can be slow at query time if wordlist is large

2010-03-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850392#action_12850392
 ] 

Yonik Seeley commented on SOLR-1850:


Yes, that's definitely the way to go.




[jira] Resolved: (SOLR-1850) KeepWordFilter can be slow at query time if wordlist is large

2010-03-26 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-1850.


   Resolution: Fixed
Fix Version/s: 3.1

Thanks John, I've committed this suggestion along with a testcase fix.
