[jira] Commented: (SOLR-1688) Inner class FieldCacheSources should be refactored into their own classes

2010-03-10 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843924#action_12843924
 ] 

Chris A. Mattmann commented on SOLR-1688:
-

Any comments on this guys? Compromise? Standoff? White flag? :P

> Inner class FieldCacheSources should be refactored into their own classes
> -
>
> Key: SOLR-1688
> URL: https://issues.apache.org/jira/browse/SOLR-1688
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
> Environment: indep. of env.
>Reporter: Chris A. Mattmann
> Fix For: 1.5
>
> Attachments: SOLR-1688.Mattmann.122609.patch.txt
>
>
> While working on SOLR-1586 I noticed that outside of class level access (or 
> package level), you can't really reference FieldCacheSources that are defined 
> inside of their FieldType constituents (e.g., in the case of StrFieldSource 
> as defined in StrField). What's more troubling is that the 
> FieldType/FieldCacheSources are defined in an inconsistent fashion: some are 
> done as inner classes, e.g., StrFieldSource and SortableFloatFieldSource, 
> while others are defined as individual classes (e.g., FloatFieldSource). This 
> patch will make them all consistent and define each FieldCacheSource as an 
> outside class, present in o.a.solr.search.function.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1802) Make Solr work with IndexReaderFactory implementations that return MultiReader

2010-03-10 Thread John Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843922#action_12843922
 ] 

John Wang commented on SOLR-1802:
-

Thanks Mark for the pointer!

> Make Solr work with IndexReaderFactory implementations that return MultiReader
> --
>
> Key: SOLR-1802
> URL: https://issues.apache.org/jira/browse/SOLR-1802
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: John Wang
>
> When an IndexReaderFactory returns an instance of MultiReader, various places 
> in Solr try to call reader.directory() and reader.getVersion(), which results 
> in an UnsupportedOperationException.




[jira] Commented: (SOLR-1642) Change name of SOLR749Test

2010-03-10 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843916#action_12843916
 ] 

Chris A. Mattmann commented on SOLR-1642:
-

Guys, any feedback on this?

> Change name of SOLR749Test
> --
>
> Key: SOLR-1642
> URL: https://issues.apache.org/jira/browse/SOLR-1642
> Project: Solr
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 1.3, 1.4
> Environment: My local MacBook pro.
>Reporter: Chris A. Mattmann
>Priority: Trivial
> Fix For: 1.5
>
> Attachments: SOLR-1642.Mattmann.121009.patch.txt
>
>
> The test class named SOLR749Test is inconsistent with all of the rest of the 
> unit tests, which aren't tied to their JIRA issue names. Some examples of 
> best practices:
> http://googletesting.blogspot.com/2007/02/tott-naming-unit-tests-responsibly.html
> http://blog.taragana.com/index.php/archive/java-unit-testing-best-practices/
> http://junit.sourceforge.net/doc/faq/faq.htm#best




[jira] Updated: (SOLR-1602) Refactor SOLR package structure to include o.a.solr.response and move QueryResponseWriters in there

2010-03-10 Thread Chris A. Mattmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann updated SOLR-1602:


Attachment: SOLR-1602.Mattmann.wrapup.031010.patch.txt

- so I finally found that bit of time I was looking for to wrap this up. Ryan, 
this should take care of 1 and 2 that we were waiting on to close this out. My 
+1 to wrap it up.

> Refactor SOLR package structure to include o.a.solr.response and move 
> QueryResponseWriters in there
> ---
>
> Key: SOLR-1602
> URL: https://issues.apache.org/jira/browse/SOLR-1602
> Project: Solr
>  Issue Type: Improvement
>  Components: Response Writers
>Affects Versions: 1.2, 1.3, 1.4
> Environment: independent of environment (code structure)
>Reporter: Chris A. Mattmann
>Assignee: Ryan McKinley
> Fix For: 1.5
>
> Attachments: SOLR-1602.Mattmann.112509.patch.txt, 
> SOLR-1602.Mattmann.112509_02.patch.txt, 
> SOLR-1602.Mattmann.wrapup.031010.patch.txt, upgrade_solr_config
>
>
> Currently all o.a.solr.request.QueryResponseWriter implementations are 
> curiously located in the o.a.solr.request package. Not only is this package 
> getting big (30+ classes), a lot of them are misplaced. There should be a 
> first-class o.a.solr.response package, and the response related classes 
> should be given a home there. Patch forthcoming.




[jira] Issue Comment Edited: (SOLR-1568) Implement Spatial Filter

2010-03-10 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843902#action_12843902
 ] 

Chris A. Mattmann edited comment on SOLR-1568 at 3/11/10 4:28 AM:
--

OK, this guy compiles, and I tried to guess in a couple areas (e.g., please 
look at Haversine) where variables were missing. One nice thing you can take 
out of this is the normalize functions for lat and lon in DistanceUtils -- 
those will probably be generally useful.
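
For readers who have not seen the terms before, here is a minimal sketch of a haversine distance and of lat/lon normalization in plain Java. It is not the code from the attached patch or from DistanceUtils: the class name GeoSketch, the method names, the Earth radius constant, and the choice to clamp latitude are all assumptions made for illustration.

```java
/** Illustrative sketch only; names and constants are hypothetical, not from the patch. */
class GeoSketch {
    /** Mean Earth radius in kilometers (assumed value for this sketch). */
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Wraps a longitude in degrees into the range [-180, 180). */
    static double normalizeLon(double lonDeg) {
        double wrapped = (lonDeg + 180.0) % 360.0;
        if (wrapped < 0) {
            wrapped += 360.0; // Java's % keeps the sign of the dividend
        }
        return wrapped - 180.0;
    }

    /** Clamps a latitude in degrees to [-90, 90]; a real normalizer might reflect over the pole instead. */
    static double clampLat(double latDeg) {
        return Math.max(-90.0, Math.min(90.0, latDeg));
    }

    /** Great-circle distance in kilometers between two points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        double sinPhi = Math.sin(dPhi / 2);
        double sinLambda = Math.sin(dLambda / 2);
        double a = sinPhi * sinPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinLambda * sinLambda;
        double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
        return EARTH_RADIUS_KM * c;
    }

    public static void main(String[] args) {
        System.out.println(normalizeLon(190.0));
        System.out.println(haversineKm(49.32, -79.0, 49.32, normalizeLon(281.0)));
    }
}
```

Wrapping longitude before computing the distance means that inputs such as 281.0 and -79.0 are treated as the same meridian.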

I'll also look to bring some of this over to SIS, as we start to flesh it out.

I saw an error during unit tests in 
org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest, but it seems 
unrelated (so suspicious -- is this a real bug?):

{noformat}
[chipotle:solr/build/test-results] mattmann% more 
TEST-org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest.txt 
Testsuite: org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec

Testcase: testContentStreamRequest took 0.003 sec
Caused an ERROR
Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.

[chipotle:solr/build/test-results] mattmann% 
{noformat}


  was (Author: chrismattmann):
OK, this guy compiles, and I tried to guess in a couple areas (e.g., please 
look at Haversine) where variables were missing. One nice thing you can take 
out of this is the normalize functions for lat and lon in DistanceUtils -- 
those will probably be generally useful.

I'll also look to bring some of this over to SIS, as we start to flesh it out.

I saw an error during unit tests in 
org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest, but it seems 
unrelated (so suspicious -- is this a real bug?):

[chipotle:solr/build/test-results] mattmann% more 
TEST-org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest.txt 
Testsuite: org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec

Testcase: testContentStreamRequest took 0.003 sec
Caused an ERROR
Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.

[chipotle:solr/build/test-results] mattmann% 

  
> Implement Spatial Filter
> 
>
> Key: SOLR-1568
> URL: https://issues.apache.org/jira/browse/SOLR-1568
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
> Attachments: CartesianTierQParserPlugin.java, 
> SOLR-1568.Mattmann.031010.patch.txt, SOLR-1568.patch, SOLR-1568.patch
>
>
> Given an index with spatial information (either as a geohash, 
> SpatialTileField (see SOLR-1586) or just two lat/lon pairs), we should be 
> able to pass in a filter query that takes in the field name, lat, lon and 
> distance and produces an appropriate Filter (i.e. one that is aware of the 
> underlying field type) for use by Solr. 
> The interface _could_ look like:
> {code}
> &fq={!sfilt dist=20}location:49.32,-79.0
> {code}
> or it could be:
> {code}
> &fq={!sfilt lat=49.32 lon=-79.0 f=location dist=20}
> {code}
> or:
> {code}
> &fq={!sfilt p=49.32,-79.0 f=location dist=20}
> {code}
> or:
> {code}
> &fq={!sfilt lat=49.32,-79.0 fl=lat,lon dist=20}
> {code}




[jira] Updated: (SOLR-1568) Implement Spatial Filter

2010-03-10 Thread Chris A. Mattmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann updated SOLR-1568:


Attachment: SOLR-1568.Mattmann.031010.patch.txt

OK, this guy compiles, and I tried to guess in a couple areas (e.g., please 
look at Haversine) where variables were missing. One nice thing you can take 
out of this is the normalize functions for lat and lon in DistanceUtils -- 
those will probably be generally useful.

I'll also look to bring some of this over to SIS, as we start to flesh it out.

I saw an error during unit tests in 
org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest, but it seems 
unrelated (so suspicious -- is this a real bug?):

[chipotle:solr/build/test-results] mattmann% more 
TEST-org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest.txt 
Testsuite: org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest
Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec

Testcase: testContentStreamRequest took 0.003 sec
Caused an ERROR
Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.

[chipotle:solr/build/test-results] mattmann% 


> Implement Spatial Filter
> 
>
> Key: SOLR-1568
> URL: https://issues.apache.org/jira/browse/SOLR-1568
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
> Attachments: CartesianTierQParserPlugin.java, 
> SOLR-1568.Mattmann.031010.patch.txt, SOLR-1568.patch, SOLR-1568.patch
>
>
> Given an index with spatial information (either as a geohash, 
> SpatialTileField (see SOLR-1586) or just two lat/lon pairs), we should be 
> able to pass in a filter query that takes in the field name, lat, lon and 
> distance and produces an appropriate Filter (i.e. one that is aware of the 
> underlying field type) for use by Solr. 
> The interface _could_ look like:
> {code}
> &fq={!sfilt dist=20}location:49.32,-79.0
> {code}
> or it could be:
> {code}
> &fq={!sfilt lat=49.32 lon=-79.0 f=location dist=20}
> {code}
> or:
> {code}
> &fq={!sfilt p=49.32,-79.0 f=location dist=20}
> {code}
> or:
> {code}
> &fq={!sfilt lat=49.32,-79.0 fl=lat,lon dist=20}
> {code}




[jira] Commented: (SOLR-1568) Implement Spatial Filter

2010-03-10 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843883#action_12843883
 ] 

Chris A. Mattmann commented on SOLR-1568:
-

Hey Grant:

I'll take a look at your latest patch and try to iterate on it (at least make 
sure it compiles, and take a pass with javadocs, run the unit tests, etc.). 
Should have something up in the next few hours.

Cheers,
Chris


> Implement Spatial Filter
> 
>
> Key: SOLR-1568
> URL: https://issues.apache.org/jira/browse/SOLR-1568
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
> Attachments: CartesianTierQParserPlugin.java, SOLR-1568.patch, 
> SOLR-1568.patch
>
>
> Given an index with spatial information (either as a geohash, 
> SpatialTileField (see SOLR-1586) or just two lat/lon pairs), we should be 
> able to pass in a filter query that takes in the field name, lat, lon and 
> distance and produces an appropriate Filter (i.e. one that is aware of the 
> underlying field type) for use by Solr. 
> The interface _could_ look like:
> {code}
> &fq={!sfilt dist=20}location:49.32,-79.0
> {code}
> or it could be:
> {code}
> &fq={!sfilt lat=49.32 lon=-79.0 f=location dist=20}
> {code}
> or:
> {code}
> &fq={!sfilt p=49.32,-79.0 f=location dist=20}
> {code}
> or:
> {code}
> &fq={!sfilt lat=49.32,-79.0 fl=lat,lon dist=20}
> {code}




[jira] Updated: (SOLR-1815) SolrJ doesn't preserve the order of facet queries returned from solr

2010-03-10 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1815:
---

Description: 
Using Solrj, I wanted to sort the response of a range query based on some 
specific labels. For instance, using the query:

{noformat}
facet=true
&facet.query={!key= Less than 100}[* TO 99]
&facet.query={!key=100 - 200}[100 TO 200]
&facet.query={!key=200 +}[201 TO *]
{noformat}

I wanted to display the response in the following order:

{noformat}
Less than 100 (x)
100 - 200 (y)
201 + (z)
{noformat}

independently of the values of x, y, z, which are the numbers of documents 
retrieved for each range.

While Solr itself correctly produces the desired order (as specified in my 
query), SolrJ doesn't preserve it. 

RE: Yonik, a solution could be just to change
{code}
_facetQuery = new HashMap();
...to...
_facetQuery = new LinkedHashMap();
{code}
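
To illustrate why the suggested change matters, here is a minimal, self-contained sketch in plain Java. It does not reproduce SolrJ's response classes: the class FacetOrderSketch, its methods, and the sample counts are hypothetical. It only relies on the documented behavior that java.util.LinkedHashMap iterates in insertion order, while HashMap makes no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not SolrJ code. */
class FacetOrderSketch {
    /** Collects facet query counts, keeping the order in which they were added. */
    static Map<String, Integer> collect(String[] labels, int[] counts) {
        // LinkedHashMap iterates in insertion order; a plain HashMap would not guarantee this.
        Map<String, Integer> facetQuery = new LinkedHashMap<>();
        for (int i = 0; i < labels.length; i++) {
            facetQuery.put(labels[i], counts[i]);
        }
        return facetQuery;
    }

    /** Returns the labels in the map's iteration order. */
    static List<String> labelsInOrder(Map<String, Integer> facetQuery) {
        return new ArrayList<>(facetQuery.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> facets = collect(
                new String[] {"Less than 100", "100 - 200", "200 +"},
                new int[] {12, 7, 3}); // sample counts, made up for the sketch
        for (Map.Entry<String, Integer> e : facets.entrySet()) {
            System.out.println(e.getKey() + " (" + e.getValue() + ")");
        }
    }
}
```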
 


  was:
Using Solrj, I wanted to sort the response of a range query based on some 
specific labels. For instance, using the query:

facet=true
&facet.query={!key= Less than 100}[* TO 99]
&facet.query={!key=100 - 200}[100 TO 200]
&facet.query={!key=200 +}[201 TO *]

I wanted to display the response in the following order:

Less than 100 (x)
100 - 200 (y)
201 + (z)

independently of the values of x, y, z, which are the numbers of documents 
retrieved for each range.

While Solr itself correctly produces the desired order (as specified in my 
query), SolrJ doesn't preserve it. 

RE: Yonik, a solution could be just to change

_facetQuery = new HashMap();
to
_facetQuery = new LinkedHashMap();

 


 Issue Type: Bug  (was: Improvement)
Summary: SolrJ doesn't preserve the order of facet queries returned 
from solr  (was: Sorting range queries: SolrJ doesn't preserve the order 
produced by Solr)

revising summary to clarify the problem, reclassifying as a bug, reformatting 
description to include noformat & code tags so it doesn't try to render 
emoticons.

> SolrJ doesn't preserve the order of facet queries returned from solr
> 
>
> Key: SOLR-1815
> URL: https://issues.apache.org/jira/browse/SOLR-1815
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.4
>Reporter: Steve Radhouani
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Using Solrj, I wanted to sort the response of a range query based on some 
> specific labels. For instance, using the query:
> {noformat}
> facet=true
> &facet.query={!key= Less than 100}[* TO 99]
> &facet.query={!key=100 - 200}[100 TO 200]
> &facet.query={!key=200 +}[201 TO *]
> {noformat}
> I wanted to display the response in the following order:
> {noformat}
> Less than 100 (x)
> 100 - 200 (y)
> 201 + (z)
> {noformat}
> independently of the values of x, y, z, which are the numbers of documents 
> retrieved for each range.
> While Solr itself correctly produces the desired order (as specified in my 
> query), SolrJ doesn't preserve it. 
> RE: Yonik, a solution could be just to change
> {code}
> _facetQuery = new HashMap();
> ...to...
> _facetQuery = new LinkedHashMap();
> {code}
>  




[jira] Commented: (SOLR-1802) Make Solr work with IndexReaderFactory implementations that return MultiReader

2010-03-10 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843855#action_12843855
 ] 

Mark Miller commented on SOLR-1802:
---

Hey John,

Depending on what you are trying to do, you may want to look at the workaround 
that was used in SOLR-1366. It's not generic, but it may work for your use case.

> Make Solr work with IndexReaderFactory implementations that return MultiReader
> --
>
> Key: SOLR-1802
> URL: https://issues.apache.org/jira/browse/SOLR-1802
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: John Wang
>
> When an IndexReaderFactory returns an instance of MultiReader, various places 
> in Solr try to call reader.directory() and reader.getVersion(), which results 
> in an UnsupportedOperationException.




Re: Spatial work

2010-03-10 Thread David Smiley @MITRE.org

Another question...
  How do I query for documents that have a point in a lat-lon box (no
scoring/ranking requirements)?  If my documents had no more than one point
then I could use current Solr features with a pair of range queries each
going over a float lat field and a lon field.  But my documents have
multiple points so this won't work because the two filters wouldn't
necessarily correspond to the same point.  Is there a solution for this
problem in place with the spatial work committed already?  If so what would
the query look like for this?
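
The mismatch described above can be shown with a small sketch in plain Java. It does not use any Solr or Lucene API: the class MultiPointBoxSketch, its methods, and the sample coordinates are hypothetical. One method mimics two independent range filters over multi-valued lat and lon fields; the other checks each point as a pair.

```java
/** Illustrative sketch only; models a document as an array of {lat, lon} pairs. */
class MultiPointBoxSketch {
    /** Mimics two independent range filters: any lat in range AND any lon in range. */
    static boolean independentRanges(double[][] points,
                                     double minLat, double maxLat,
                                     double minLon, double maxLon) {
        boolean latHit = false;
        boolean lonHit = false;
        for (double[] p : points) {
            if (p[0] >= minLat && p[0] <= maxLat) latHit = true;
            if (p[1] >= minLon && p[1] <= maxLon) lonHit = true;
        }
        return latHit && lonHit;
    }

    /** Correct check: at least one single point has both lat and lon inside the box. */
    static boolean anyPointInBox(double[][] points,
                                 double minLat, double maxLat,
                                 double minLon, double maxLon) {
        for (double[] p : points) {
            if (p[0] >= minLat && p[0] <= maxLat
                    && p[1] >= minLon && p[1] <= maxLon) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two points: the first matches only the lat range, the second only the lon range.
        double[][] doc = { {10.0, 50.0}, {40.0, 20.0} };
        System.out.println(independentRanges(doc, 5.0, 15.0, 15.0, 25.0));
        System.out.println(anyPointInBox(doc, 5.0, 15.0, 15.0, 25.0));
    }
}
```

With the sample document, the independent range check reports a match even though neither point lies inside the box, which is the false positive the question is about.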

None of the filters here quite address this (though they would if my filter query were a
circle): http://wiki.apache.org/solr/SpatialSearch#Filtering

~ David Smiley
-- 
View this message in context: 
http://old.nabble.com/Spatial-work-tp27817321p27857162.html
Sent from the Solr - Dev mailing list archive at Nabble.com.



[jira] Updated: (SOLR-64) strict hierarchical facets

2010-03-10 Thread Benjamin Armintor (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Armintor updated SOLR-64:
--

Attachment: SOLR-64.patch

If token streams are being re-used, the token streams produced by the 
HierarchicalTokenFilterFactory need to respond to being reset().

In particular, the StringBuilder it uses to build the facet values needs to be 
cleared on reset.  I'm attaching a patch that resets the StringBuilder and the 
delegate TokenStream.  It also includes some junit tests that cover a basic 
hierarchical facet search, and the handling of facet.depth and facet.prefix.  
It builds and tests against trunk rev 921562.  I think this fixes the 1.4 issue.
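
As a rough illustration of the reset problem, here is a sketch in plain Java that deliberately avoids the Lucene TokenStream API. The class HierarchyPathSketch, the '/' delimiter, and the shape of the emitted values are assumptions; the attached patch's actual token format is not shown in this message. The point is only that a reused instance must clear its StringBuilder on reset, or state from the previous value leaks into the next one.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the HierarchicalTokenFilterFactory implementation. */
class HierarchyPathSketch {
    /** Hypothetical level delimiter for this sketch. */
    static final char DELIMITER = '/';

    /** Accumulated path so far; this is the state that must be cleared on reuse. */
    private final StringBuilder path = new StringBuilder();

    /** Appends one level and returns the accumulated path, e.g. "a", then "a/b". */
    String next(String level) {
        if (path.length() > 0) {
            path.append(DELIMITER);
        }
        path.append(level);
        return path.toString();
    }

    /** Clears accumulated state so the instance can be reused for a new value. */
    void reset() {
        path.setLength(0);
    }

    /** Expands "a/b/c" into ["a", "a/b", "a/b/c"], resetting the builder first. */
    static List<String> expand(HierarchyPathSketch builder, String value) {
        builder.reset(); // without this, a reused builder would prepend the previous value
        List<String> out = new ArrayList<>();
        for (String level : value.split(String.valueOf(DELIMITER))) {
            out.add(builder.next(level));
        }
        return out;
    }

    public static void main(String[] args) {
        HierarchyPathSketch builder = new HierarchyPathSketch();
        System.out.println(expand(builder, "books/fiction"));
        System.out.println(expand(builder, "music"));
    }
}
```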

> strict hierarchical facets
> --
>
> Key: SOLR-64
> URL: https://issues.apache.org/jira/browse/SOLR-64
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Yonik Seeley
> Fix For: 1.5
>
> Attachments: SOLR-64.patch, SOLR-64.patch, SOLR-64.patch, 
> SOLR-64.patch
>
>
> Strict Facet Hierarchies... each tag has at most one parent (a tree).




[jira] Updated: (SOLR-1659) Get off deprecated Lucene API's to clear the way for a move to Lucene 3.0 +

2010-03-10 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-1659:
--

Attachment: SOLR-1659.patch

To trunk - updates things that have changed, and removes some previous workarounds 
that are not needed with Lucene 3.0.1 as opposed to 3.0

> Get off deprecated Lucene API's to clear the way for a move to Lucene 3.0 +
> ---
>
> Key: SOLR-1659
> URL: https://issues.apache.org/jira/browse/SOLR-1659
> Project: Solr
>  Issue Type: Task
>Reporter: Mark Miller
> Attachments: SOLR-1659.patch, SOLR-1659.patch
>
>





[jira] Commented: (SOLR-1814) select count(distinct fieldname) in SOLR

2010-03-10 Thread Ted Dunning (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843632#action_12843632
 ] 

Ted Dunning commented on SOLR-1814:
---


Trove is LGPL.

The Mahout project has a partial set of replacements for Trove collections in 
case you want to go forward with this.  Our plan is to consider breaking out 
the collections package from Mahout at some point in case you don't want to 
drag along the rest of Mahout.


> select count(distinct fieldname) in SOLR
> 
>
> Key: SOLR-1814
> URL: https://issues.apache.org/jira/browse/SOLR-1814
> Project: Solr
>  Issue Type: New Feature
>  Components: SearchComponents - other
>Affects Versions: 1.4, 1.5, 1.6, 2.0
>Reporter: Marcus Herou
> Fix For: 1.4, 1.5, 1.6, 2.0
>
> Attachments: CountComponent.java
>
>
> I have seen questions on the mailing list about having the functionality for 
> counting distinct values of a field. We at Tailsweep also want to do that in, 
> for example, our blog search.
> Example:
> "You had 1345 hits on 244 blogs"
> The 244 part is not possible in SOLR today (correct me if I am wrong). So 
> I've written a component which does this. Attaching it.




[jira] Resolved: (SOLR-1813) Support Arabic PDF extraction

2010-03-10 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-1813.
---

   Resolution: Fixed
Fix Version/s: 1.5

> Support Arabic PDF extraction
> -
>
> Key: SOLR-1813
> URL: https://issues.apache.org/jira/browse/SOLR-1813
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - Solr Cell (Tika extraction)
>Affects Versions: 1.4
>Reporter: Robert Muir
>Assignee: Grant Ingersoll
> Fix For: 1.5
>
> Attachments: arabic.pdf, icu4j-4_2_1.jar, SOLR-1813.patch
>
>
> Extraction of Arabic text from PDF files is supported by tika/pdfbox, but we 
> don't have the optional dependency to do it.




[jira] Created: (SOLR-1815) Sorting range queries: SolrJ doesn't preserve the order produced by Solr

2010-03-10 Thread Steve Radhouani (JIRA)
Sorting range queries: SolrJ doesn't preserve the order produced by Solr


 Key: SOLR-1815
 URL: https://issues.apache.org/jira/browse/SOLR-1815
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 1.4
Reporter: Steve Radhouani


Using Solrj, I wanted to sort the response of a range query based on some 
specific labels. For instance, using the query:

facet=true
&facet.query={!key= Less than 100}[* TO 99]
&facet.query={!key=100 - 200}[100 TO 200]
&facet.query={!key=200 +}[201 TO *]

I wanted to display the response in the following order:

Less than 100 (x)
100 - 200 (y)
201 + (z)

independently of the values of x, y, z, which are the numbers of documents 
retrieved for each range.

While Solr itself correctly produces the desired order (as specified in my 
query), SolrJ doesn't preserve it. 

RE: Yonik, a solution could be just to change

_facetQuery = new HashMap();
to
_facetQuery = new LinkedHashMap();

 





[jira] Commented: (SOLR-1674) improve analysis tests, cut over to new API

2010-03-10 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843611#action_12843611
 ] 

Mark Miller commented on SOLR-1674:
---

I've committed the speed up patch, thanks Robert!

Leaving open for posInc tests

> improve analysis tests, cut over to new API
> ---
>
> Key: SOLR-1674
> URL: https://issues.apache.org/jira/browse/SOLR-1674
> Project: Solr
>  Issue Type: Test
>  Components: Schema and Analysis
>Reporter: Robert Muir
>Assignee: Mark Miller
> Attachments: SOLR-1674.patch, SOLR-1674.patch, SOLR-1674_speedup.patch
>
>
> This patch
> * converts all analysis tests to use the new tokenstream api
> * converts most tests to use the more stringent assertion mechanisms from 
> lucene
> * adds new tests to improve coverage
> Most bugs found by more stringent testing have been fixed, with the exception 
> of SynonymFilter.
> The problems with this filter are more serious, the previous tests were 
> essentially a no-op.
> The new tests for SynonymFilter test the current behavior, but have FIXMEs 
> with what I think the old test wanted to expect in the comments.




[jira] Assigned: (SOLR-1813) Support Arabic PDF extraction

2010-03-10 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll reassigned SOLR-1813:
-

Assignee: Grant Ingersoll

> Support Arabic PDF extraction
> -
>
> Key: SOLR-1813
> URL: https://issues.apache.org/jira/browse/SOLR-1813
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - Solr Cell (Tika extraction)
>Affects Versions: 1.4
>Reporter: Robert Muir
>Assignee: Grant Ingersoll
> Attachments: arabic.pdf, icu4j-4_2_1.jar, SOLR-1813.patch
>
>
> Extraction of Arabic text from PDF files is supported by tika/pdfbox, but we 
> don't have the optional dependency to do it.




[jira] Commented: (SOLR-1814) select count(distinct fieldname) in SOLR

2010-03-10 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843574#action_12843574
 ] 

Erik Hatcher commented on SOLR-1814:


I'm a bit confused here, but maybe don't quite understand what you've 
implemented.  Doesn't faceting give you the counts you're after here?   I'm 
assuming "blogs" in your example is a value of a "type" field or something like 
that.  Faceting on the type field would give you that count, or doing a 
facet.query=type:blogs would give you just that count (for any arbitrary query).



> select count(distinct fieldname) in SOLR
> 
>
> Key: SOLR-1814
> URL: https://issues.apache.org/jira/browse/SOLR-1814
> Project: Solr
>  Issue Type: New Feature
>  Components: SearchComponents - other
>Affects Versions: 1.4, 1.5, 1.6, 2.0
>Reporter: Marcus Herou
> Fix For: 1.4, 1.5, 1.6, 2.0
>
> Attachments: CountComponent.java
>
>
> I have seen questions on the mailing list about having the functionality for 
> counting distinct values of a field. We at Tailsweep also want to do that in, 
> for example, our blog search.
> Example:
> "You had 1345 hits on 244 blogs"
> The 244 part is not possible in SOLR today (correct me if I am wrong). So 
> I've written a component which does this. Attaching it.




[jira] Updated: (SOLR-1814) select count(distinct fieldname) in SOLR

2010-03-10 Thread Marcus Herou (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Herou updated SOLR-1814:
---

Fix Version/s: 1.4
   2.0
   1.6
   1.5
Affects Version/s: 2.0
   1.6
   1.5

> select count(distinct fieldname) in SOLR
> 
>
> Key: SOLR-1814
> URL: https://issues.apache.org/jira/browse/SOLR-1814
> Project: Solr
>  Issue Type: New Feature
>  Components: SearchComponents - other
>Affects Versions: 1.4, 1.5, 1.6, 2.0
>Reporter: Marcus Herou
> Fix For: 1.4, 1.5, 1.6, 2.0
>
> Attachments: CountComponent.java
>
>
> I have seen questions on the mailing list about having the functionality for 
> counting distinct values of a field. We at Tailsweep also want to do that in, 
> for example, our blog search.
> Example:
> "You had 1345 hits on 244 blogs"
> The 244 part is not possible in SOLR today (correct me if I am wrong). So 
> I've written a component which does this. Attaching it.




Re: CountComponent

2010-03-10 Thread Marcus Herou
Filed it here.
https://issues.apache.org/jira/browse/SOLR-1814

I did not change the priority from Major, since I guess that is up to the community.

Cheers

/Marcus


On Tue, Mar 9, 2010 at 7:48 PM, Steven A Rowe  wrote:

> Hi Marcus,
>
> http://wiki.apache.org/solr/HowToContribute
>
> Or were you asking a different question?
>
> Steve
>
> On 03/09/2010 at 10:52 AM, Marcus Herou wrote:
> > Hi.
> >
> > I've developed a SearchComponent named "CountComponent" which emulates
> > the SQL equiv select count(distinct field)... I think that it perhaps
> > should be put in contrib or such. How can I get this piece of code into
> > Solr ?
> >
> > Cheers
> >
> > //Marcus Herou
> >
> > --
> > Marcus Herou CTO and co-founder Tailsweep AB
> > +46702561312
> > marcus.he...@tailsweep.com
> > http://www.tailsweep.com/
>
>
>


-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.he...@tailsweep.com
http://www.tailsweep.com/


[jira] Updated: (SOLR-1814) select count(distinct fieldname) in SOLR

2010-03-10 Thread Marcus Herou (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Herou updated SOLR-1814:
---

Attachment: CountComponent.java

It depends on GNU Trove (tested against v2.0.2):
http://sourceforge.net/projects/trove4j/files/trove/archived/trove-2.0.2/trove-2.0.2.tar.gz/download

Trove has more memory-efficient data structures, so I used those instead. 
Perhaps that dependency should be broken out.

solrconfig.xml


  count   






> select count(distinct fieldname) in SOLR
> 
>
> Key: SOLR-1814
> URL: https://issues.apache.org/jira/browse/SOLR-1814
> Project: Solr
>  Issue Type: New Feature
>  Components: SearchComponents - other
>Affects Versions: 1.4
>Reporter: Marcus Herou
> Attachments: CountComponent.java
>
>
> I have seen questions on the mailing list about having the functionality for 
> counting distinct values of a field. We at Tailsweep also want to do that in, 
> for example, our blog search.
> Example:
> "You had 1345 hits on 244 blogs"
> The 244 part is not possible in SOLR today (correct me if I am wrong). So 
> I've written a component which does this. Attaching it.




[jira] Created: (SOLR-1814) select count(distinct fieldname) in SOLR

2010-03-10 Thread Marcus Herou (JIRA)
select count(distinct fieldname) in SOLR


 Key: SOLR-1814
 URL: https://issues.apache.org/jira/browse/SOLR-1814
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other
Affects Versions: 1.4
Reporter: Marcus Herou


I have seen questions on the mailing list about having the functionality for 
counting distinct values of a field. We at Tailsweep also want to do that in, 
for example, our blog search.

Example:
"You had 1345 hits on 244 blogs"

The 244 part is not possible in SOLR today (correct me if I am wrong). So I've 
written a component which does this. Attaching it.
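
The attached CountComponent.java is not reproduced in this message, so the following is only a sketch of the idea in plain Java, under stated assumptions: the class DistinctCountSketch and its method are hypothetical, each hit is represented by the value of one field, and java.util.HashSet stands in for the Trove collections the component is said to use.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch only; not the attached CountComponent. */
class DistinctCountSketch {
    /**
     * Returns a two-element array: the number of hits, and the number of
     * distinct field values among those hits.
     */
    static int[] hitsAndDistinct(String[] fieldValuePerHit) {
        Set<String> distinct = new HashSet<>();
        for (String value : fieldValuePerHit) {
            distinct.add(value); // duplicates are ignored by the set
        }
        return new int[] { fieldValuePerHit.length, distinct.size() };
    }

    public static void main(String[] args) {
        // Sample data made up for the sketch: one blog identifier per hit.
        String[] blogPerHit = { "blogA", "blogB", "blogA", "blogC", "blogB" };
        int[] result = hitsAndDistinct(blogPerHit);
        System.out.println("You had " + result[0] + " hits on " + result[1] + " blogs");
    }
}
```

This is the same shape as the "1345 hits on 244 blogs" example: the first number is the hit count, the second is the count of distinct values of the chosen field.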

