[jira] Created: (LUCENE-1791) Enhance QueryUtils and CheckHits to wrap everything they check in MultiReader/MultiSearcher

2009-08-07 Thread Hoss Man (JIRA)
Enhance QueryUtils and CheckHits to wrap everything they check in 
MultiReader/MultiSearcher
---

 Key: LUCENE-1791
 URL: https://issues.apache.org/jira/browse/LUCENE-1791
 Project: Lucene - Java
  Issue Type: Test
Reporter: Hoss Man


methods in CheckHits & QueryUtils are in a good position to take any Searcher 
they are given and not only test it, but also test MultiReader & MultiSearcher 
constructs built around them.
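
As a rough sketch of the idea (not the eventual test code -- it only compares hit counts, where the real CheckHits/QueryUtils helpers would re-run their full checks), using 2.9-era APIs:

{code}
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.TopDocs;

public class WrappedSearcherCheck {
  /** Run the same query against the given searcher and against
   *  MultiReader/MultiSearcher constructs built around it. */
  public static void checkWrapped(Query q, IndexSearcher s) throws Exception {
    TopDocs direct = s.search(q, 100);

    // wrap the underlying reader in a (single sub-reader) MultiReader
    IndexReader wrapped = new MultiReader(new IndexReader[] { s.getIndexReader() });
    TopDocs viaMultiReader = new IndexSearcher(wrapped).search(q, 100);

    // wrap the searcher itself in a MultiSearcher
    Searcher multi = new MultiSearcher(new Searchable[] { s });
    TopDocs viaMultiSearcher = multi.search(q, 100);

    // the real helpers would repeat the full hit/explanation checks here
    if (direct.totalHits != viaMultiReader.totalHits
        || direct.totalHits != viaMultiSearcher.totalHits) {
      throw new AssertionError("hit counts differ between wrapped and unwrapped searchers");
    }
  }
}
{code}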

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1781) Large distances in Spatial go beyond Prime Meridian

2009-08-07 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740439#action_12740439
 ] 

Bill Bell commented on LUCENE-1781:
---

Michael - Please rerun your tests.

The two normalization functions are probably no longer needed, but they are there 
as an added check...

I am using the algorithm from "Destination point given distance and bearing 
from start point" at http://www.movable-type.co.uk/scripts/latlong.html 

Thanks.
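
For reference, the formula from that page, as a standalone sketch (this is the standard spherical formula, not the attached patch; the Earth-radius constant is an assumption):

{code}
public final class DestinationPoint {
  private static final double EARTH_RADIUS_MILES = 3958.8; // approximate mean radius

  /** Returns {lat, lng} (degrees) of the point distanceMiles away from
   *  (lat, lng) along the given bearing (degrees clockwise from north). */
  public static double[] destination(double lat, double lng,
                                     double bearingDeg, double distanceMiles) {
    double phi1 = Math.toRadians(lat);
    double lambda1 = Math.toRadians(lng);
    double theta = Math.toRadians(bearingDeg);
    double delta = distanceMiles / EARTH_RADIUS_MILES; // angular distance

    double phi2 = Math.asin(Math.sin(phi1) * Math.cos(delta)
        + Math.cos(phi1) * Math.sin(delta) * Math.cos(theta));
    double lambda2 = lambda1 + Math.atan2(
        Math.sin(theta) * Math.sin(delta) * Math.cos(phi1),
        Math.cos(delta) - Math.sin(phi1) * Math.sin(phi2));

    // asin keeps the latitude in [-90, 90]; only the longitude needs wrapping
    double lngOut = (Math.toDegrees(lambda2) + 540) % 360 - 180;
    return new double[] { Math.toDegrees(phi2), lngOut };
  }
}
{code}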

 Large distances in Spatial go beyond Prime Meridian
 ---

 Key: LUCENE-1781
 URL: https://issues.apache.org/jira/browse/LUCENE-1781
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Affects Versions: 2.9
 Environment: All
Reporter: Bill Bell
Assignee: Michael McCandless
 Fix For: 3.1

 Attachments: LLRect.java, LLRect.java, LUCENE-1781.patch


 http://amidev.kaango.com/solr/core0/select?fl=*&json.nl=map&wt=json&radius=5000&rows=20&lat=39.5500507&q=honda&qt=geo&long=-105.7820674
 Get an error when using Solr when distance is calculated for the boundary box 
 past 90 degrees.
 Aug 4, 2009 1:54:00 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.IllegalArgumentException: Illegal lattitude value 
 93.1558669413734
 at 
 org.apache.lucene.spatial.geometry.FloatLatLng.<init>(FloatLatLng.java:26)
 at 
 org.apache.lucene.spatial.geometry.shape.LLRect.createBox(LLRect.java:93)
 at 
 org.apache.lucene.spatial.tier.DistanceUtils.getBoundary(DistanceUtils.java:50)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoxShape(CartesianPolyFilterBuilder.java:47)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoundingArea(CartesianPolyFilterBuilder.java:109)
 at 
 org.apache.lucene.spatial.tier.DistanceQueryBuilder.<init>(DistanceQueryBuilder.java:61)
 at 
 com.pjaol.search.solr.component.LocalSolrQueryComponent.prepare(LocalSolrQueryComponent.java:151)
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
 at 
 org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
 at 
 org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
 at 
 org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
 at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (LUCENE-1781) Large distances in Spatial go beyond Prime Meridian

2009-08-07 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740439#action_12740439
 ] 

Bill Bell edited comment on LUCENE-1781 at 8/7/09 1:03 AM:
---

Michael - Please rerun your tests.

The two normalization functions are probably no longer needed, but they are there 
as an added check...

This algorithm is standard; several web sites use it, derived from the haversine 
formula. One example is "Destination point given distance and bearing from start point" 
at http://www.movable-type.co.uk/scripts/latlong.html 

Thanks.

  was (Author: billnbell):
Michael - Please rerun your tests.

The two normalization functions are probably no longer needed, but they are there 
as an added check...

I am using the algorithm from "Destination point given distance and bearing 
from start point" at http://www.movable-type.co.uk/scripts/latlong.html 

Thanks.
  
 Large distances in Spatial go beyond Prime Meridian
 ---

 Key: LUCENE-1781
 URL: https://issues.apache.org/jira/browse/LUCENE-1781
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Affects Versions: 2.9
 Environment: All
Reporter: Bill Bell
Assignee: Michael McCandless
 Fix For: 3.1

 Attachments: LLRect.java, LLRect.java, LUCENE-1781.patch


 http://amidev.kaango.com/solr/core0/select?fl=*&json.nl=map&wt=json&radius=5000&rows=20&lat=39.5500507&q=honda&qt=geo&long=-105.7820674
 Get an error when using Solr when distance is calculated for the boundary box 
 past 90 degrees.
 Aug 4, 2009 1:54:00 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.IllegalArgumentException: Illegal lattitude value 
 93.1558669413734
 at 
 org.apache.lucene.spatial.geometry.FloatLatLng.<init>(FloatLatLng.java:26)
 at 
 org.apache.lucene.spatial.geometry.shape.LLRect.createBox(LLRect.java:93)
 at 
 org.apache.lucene.spatial.tier.DistanceUtils.getBoundary(DistanceUtils.java:50)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoxShape(CartesianPolyFilterBuilder.java:47)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoundingArea(CartesianPolyFilterBuilder.java:109)
 at 
 org.apache.lucene.spatial.tier.DistanceQueryBuilder.<init>(DistanceQueryBuilder.java:61)
 at 
 com.pjaol.search.solr.component.LocalSolrQueryComponent.prepare(LocalSolrQueryComponent.java:151)
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
 at 
 org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
 at 
 org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
 at 
 org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
 at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1781) Large distances in Spatial go beyond Prime Meridian

2009-08-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740478#action_12740478
 ] 

Michael McCandless commented on LUCENE-1781:


Thanks for the updated patch Bill!

That's a good improvement (taking into account the varying miles per lng, 
depending on lat), but isn't that fix orthogonal to the normalization issue?  
Ie, one could still easily overflow lat or lng with a large enough distance.  EG, 
I added 6000 miles as a test case in TestCartesian, and if I turn off the 
normalization, it hits the same exception (Illegal lattitude value 
94.77745787739758).

And I'm still concerned that the normalization fails to properly cross the 
north (or south) pole (which would require flipping the lng whenever the lat is 
too high); instead it seems to incorrectly bounce the point back?  Am I missing 
something?
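
For illustration, a normalization that does cross the pole would look roughly like this (a sketch of the behaviour described above, not the code in the attached patch; it assumes at most one crossing):

{code}
public final class LatLngNormalizer {
  /** Returns {lat, lng} with lat reflected across the pole it crossed
   *  (which also shifts the lng by 180 degrees) and lng wrapped to [-180, 180). */
  public static double[] normalize(double lat, double lng) {
    if (lat > 90) {          // crossed the north pole
      lat = 180 - lat;
      lng += 180;
    } else if (lat < -90) {  // crossed the south pole
      lat = -180 - lat;
      lng += 180;
    }
    lng = ((lng + 180) % 360 + 360) % 360 - 180; // wrap longitude
    return new double[] { lat, lng };
  }
}
{code}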

 Large distances in Spatial go beyond Prime Meridian
 ---

 Key: LUCENE-1781
 URL: https://issues.apache.org/jira/browse/LUCENE-1781
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Affects Versions: 2.9
 Environment: All
Reporter: Bill Bell
Assignee: Michael McCandless
 Fix For: 3.1

 Attachments: LLRect.java, LLRect.java, LUCENE-1781.patch


 http://amidev.kaango.com/solr/core0/select?fl=*&json.nl=map&wt=json&radius=5000&rows=20&lat=39.5500507&q=honda&qt=geo&long=-105.7820674
 Get an error when using Solr when distance is calculated for the boundary box 
 past 90 degrees.
 Aug 4, 2009 1:54:00 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.IllegalArgumentException: Illegal lattitude value 
 93.1558669413734
 at 
 org.apache.lucene.spatial.geometry.FloatLatLng.<init>(FloatLatLng.java:26)
 at 
 org.apache.lucene.spatial.geometry.shape.LLRect.createBox(LLRect.java:93)
 at 
 org.apache.lucene.spatial.tier.DistanceUtils.getBoundary(DistanceUtils.java:50)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoxShape(CartesianPolyFilterBuilder.java:47)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoundingArea(CartesianPolyFilterBuilder.java:109)
 at 
 org.apache.lucene.spatial.tier.DistanceQueryBuilder.<init>(DistanceQueryBuilder.java:61)
 at 
 com.pjaol.search.solr.component.LocalSolrQueryComponent.prepare(LocalSolrQueryComponent.java:151)
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
 at 
 org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
 at 
 org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
 at 
 org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
 at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1782) Rename OriginalQueryParserHelper

2009-08-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740481#action_12740481
 ] 

Michael McCandless commented on LUCENE-1782:


bq. I did not see the readme.txt for the StandardSyntaxParser.jj, but 
everything else looks good

It's README.javacc, under contrib/queryparser.

OK I'll commit shortly!

 Rename OriginalQueryParserHelper
 

 Key: LUCENE-1782
 URL: https://issues.apache.org/jira/browse/LUCENE-1782
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/*
Affects Versions: 2.9
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1782.patch, LUCENE-1782.patch


 We should rename the new QueryParser so it's clearer that it's
 Lucene's default QueryParser, going forward, and not just a temporary
 bridge to a future new QueryParser.
 How about we rename oal.queryParser.original -->
 oal.queryParser.standard (can't use default: it's a Java keyword)?
 Then, leave the OriginalQueryParserHelper under that package, but
 simply rename it to QueryParser?
 This way if we create other sub-packages in the future, eg
 ComplexPhraseQueryParser, they too can have a QueryParser class under
 them, to make it clear that's the top class you use to parse
 queries.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1628) Persian Analyzer

2009-08-07 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740482#action_12740482
 ] 

Robert Muir commented on LUCENE-1628:
-

I have been looking this over and I think this one is ready. Any 
comments/concerns? 


 Persian Analyzer
 

 Key: LUCENE-1628
 URL: https://issues.apache.org/jira/browse/LUCENE-1628
 Project: Lucene - Java
  Issue Type: New Feature
  Components: contrib/analyzers
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1628.patch, LUCENE-1628.patch, LUCENE-1628.patch, 
 LUCENE-1628.patch, LUCENE-1628.patch, LUCENE-1628.txt


 A simple Persian analyzer.
 I measured TREC scores with the benchmark package below against 
 http://ece.ut.ac.ir/DBRG/Hamshahri/ :
 SimpleAnalyzer:
 SUMMARY
   Search Seconds: 0.012
   DocName Seconds:0.020
   Num Points:   981.015
   Num Good Points:   33.738
   Max Good Points:   36.185
   Average Precision:  0.374
   MRR:0.667
   Recall: 0.905
   Precision At 1: 0.585
   Precision At 2: 0.531
   Precision At 3: 0.513
   Precision At 4: 0.496
   Precision At 5: 0.486
   Precision At 6: 0.487
   Precision At 7: 0.479
   Precision At 8: 0.465
   Precision At 9: 0.458
   Precision At 10:0.460
   Precision At 11:0.453
   Precision At 12:0.453
   Precision At 13:0.445
   Precision At 14:0.438
   Precision At 15:0.438
   Precision At 16:0.438
   Precision At 17:0.429
   Precision At 18:0.429
   Precision At 19:0.419
   Precision At 20:0.415
 PersianAnalyzer:
 SUMMARY
   Search Seconds: 0.004
   DocName Seconds:0.011
   Num Points:   987.692
   Num Good Points:   36.123
   Max Good Points:   36.185
   Average Precision:  0.481
   MRR:0.833
   Recall: 0.998
   Precision At 1: 0.754
   Precision At 2: 0.715
   Precision At 3: 0.646
   Precision At 4: 0.646
   Precision At 5: 0.631
   Precision At 6: 0.621
   Precision At 7: 0.593
   Precision At 8: 0.577
   Precision At 9: 0.573
   Precision At 10:0.566
   Precision At 11:0.572
   Precision At 12:0.562
   Precision At 13:0.554
   Precision At 14:0.549
   Precision At 15:0.542
   Precision At 16:0.538
   Precision At 17:0.533
   Precision At 18:0.527
   Precision At 19:0.525
   Precision At 20:0.518

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740492#action_12740492
 ] 

Michael McCandless commented on LUCENE-1789:


It is nice that DocValues gives us the freedom to do this, but I'm
not sure we should, because it's a sizable performance trap.

Ie, we'll be silently inserting a call to ReaderUtil.subSearcher on
every doc value lookup (vs previously when it was a single top-level
array lookup).

While client code that has relied on this in the past will nicely
continue to function properly, if we make this change, its performance
is going to silently take a [possibly sizable] hit.

In general, with Lucene, we can do the per-segment switching up high
(which is what the core now does, exclusively), or we can do it down
low (creating MultiTermDocs, MultiTermEnum, MultiTermPositions,
MultiDocValues, etc.), which has sizable performance costs.  It's also
costly for us because we'll have N different places where we must
create & maintain a MultiXXX class.  I would love to someday deprecate
all of the down low switching classes :)

In the core I think we should always switch up high.  We've already
done this w/ searching and collection/sorting.  In LUCENE-1771 we're
fixing IndexSearcher.explain to do so as well.

With external code, I'd like over time to strongly encourage only
switching up high as well.

Maybe it'd be best if we could somehow allow this down low switching
for 2.9, but 1) warn that you'll see a performance hit right off, 2)
deprecate it, and 3) and somehow state that in 3.0 you'll have to send
only a SegmentReader to this API, instead.

EG, imagine an app that created an external custom HitCollector that
calls, say, FloatFieldSource on the top reader in order to use a
float value per doc in each collect() call.  On upgrading to 2.9, this
app will already have to make the switch to the Collector API, which'd
be a great time for them to also then switch to pulling these float
values per-segment.  But, if we make the proposed change here, the app
could in fact just keep working off the top-level values (eg if the
ctor in their class is pulling these values), thinking everything is
fine when in fact there is a sizable, silent perf hit.  I'd prefer in
2.9 for them to also switch their DocValues lookup to be per segment.

[Aside: once we gain clarity on LUCENE-831, hopefully we can do away
with oal.search.function.FieldCacheSource,
{Byte,Short,Int,Ord,ReverseOrd}FieldSource, etc.  Ie these classes
basically copy what FieldCache does, but expose a per-doc method call
instead of a fixed array lookup.]
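
For reference, the down low switching being discussed would look roughly like this -- the per-lookup binary search over the sub-readers' doc starts (what ReaderUtil does in the core) is the cost being flagged.  Method names follow oal.search.function.DocValues loosely; treat the signatures as illustrative:

{code}
import org.apache.lucene.search.function.DocValues;

class MultiDocValuesSketch extends DocValues {
  private final DocValues[] sub;   // one DocValues per sub-reader, in order
  private final int[] docStarts;   // docBase of each sub-reader

  MultiDocValuesSketch(DocValues[] sub, int[] docStarts) {
    this.sub = sub;
    this.docStarts = docStarts;
  }

  public float floatVal(int doc) {
    // extra work on *every* lookup, vs a single top-level array access
    int lo = 0, hi = sub.length - 1;
    while (lo < hi) {                       // find largest docStarts[i] <= doc
      int mid = (lo + hi + 1) >>> 1;
      if (docStarts[mid] <= doc) lo = mid; else hi = mid - 1;
    }
    return sub[lo].floatVal(doc - docStarts[lo]);
  }

  public String toString(int doc) {
    return Float.toString(floatVal(doc));
  }
}
{code}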



 getDocValues should provide a MultiReader DocValues abstraction
 ---

 Key: LUCENE-1789
 URL: https://issues.apache.org/jira/browse/LUCENE-1789
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9


 When scoring a ValueSourceQuery, the scoring code calls 
 ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValue 
 instances are backed by the individual FieldCache entries of the subreaders 
 -- but if client code were to inadvertently call getValues() on a 
 MultiReader (or DirectoryReader) they would wind up using the outer 
 FieldCache.
 Since getValues(IndexReader) returns DocValues, we have an advantage here 
 that we don't have with FieldCache API (which is required to provide direct 
 array access). getValues(IndexReader) could be implemented so that *IF* 
 a caller inadvertently passes in a reader with non-null subReaders, getValues 
 could generate a DocValues instance for each of the subReaders, and then wrap 
 them in a composite MultiDocValues.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740494#action_12740494
 ] 

Michael McCandless commented on LUCENE-1789:


Or... how about if we made a separate helper class, whose purpose
was to accept a top-level reader and do down low switching to this
new MultiDocValues class.  This class would be deprecated, ie, exist
only in 2.9 to help external usage of the DocValues API migrate to up
high switching.

However, you'd have to explicitly create this class.  EG, in the
normal DocValues classes we throw an exception if you pass in a
top-level reader, noting clearly that you could 1) switch to this
helper class (at a sizable per-lookup performance hit), or 2) switch
to looking up your values per-segment?

This way at least it'd be much clearer to the external consumer what the
cost of using the down low switching class is.  It'd make the decision
explicit, not silent, on upgrading to 2.9.
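
A minimal sketch of that guard, assuming IndexReader.getSequentialSubReaders() (non-null for composite readers) is the test:

{code}
import org.apache.lucene.index.IndexReader;

final class DocValuesReaderGuard {
  static void checkNotComposite(IndexReader reader) {
    if (reader.getSequentialSubReaders() != null) {
      throw new IllegalArgumentException(
          "getValues() was passed a composite (top-level) reader; either "
          + "look up values per segment, or explicitly use the deprecated "
          + "top-level helper and accept a per-lookup performance cost");
    }
  }
}
{code}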


 getDocValues should provide a MultiReader DocValues abstraction
 ---

 Key: LUCENE-1789
 URL: https://issues.apache.org/jira/browse/LUCENE-1789
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9


 When scoring a ValueSourceQuery, the scoring code calls 
 ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValue 
 instances are backed by the individual FieldCache entries of the subreaders 
 -- but if client code were to inadvertently call getValues() on a 
 MultiReader (or DirectoryReader) they would wind up using the outer 
 FieldCache.
 Since getValues(IndexReader) returns DocValues, we have an advantage here 
 that we don't have with FieldCache API (which is required to provide direct 
 array access). getValues(IndexReader) could be implemented so that *IF* 
 a caller inadvertently passes in a reader with non-null subReaders, getValues 
 could generate a DocValues instance for each of the subReaders, and then wrap 
 them in a composite MultiDocValues.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-1782) Rename OriginalQueryParserHelper

2009-08-07 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-1782.


Resolution: Fixed

 Rename OriginalQueryParserHelper
 

 Key: LUCENE-1782
 URL: https://issues.apache.org/jira/browse/LUCENE-1782
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/*
Affects Versions: 2.9
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1782.patch, LUCENE-1782.patch


 We should rename the new QueryParser so it's clearer that it's
 Lucene's default QueryParser, going forward, and not just a temporary
 bridge to a future new QueryParser.
 How about we rename oal.queryParser.original -->
 oal.queryParser.standard (can't use default: it's a Java keyword)?
 Then, leave the OriginalQueryParserHelper under that package, but
 simply rename it to QueryParser?
 This way if we create other sub-packages in the future, eg
 ComplexPhraseQueryParser, they too can have a QueryParser class under
 them, to make it clear that's the top class you use to parse
 queries.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-07 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740328#action_12740328
 ] 

Hoss Man edited comment on LUCENE-1789 at 8/7/09 7:16 AM:
--

This idea originated in LUCENE-1749; see these comments...

https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740155#action_12740155
https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740256#action_12740256
https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740278#action_12740278


I've marked this for 2.9 for now & I think it's a nice-to-have in 2.9, 
because unlike general FieldCache usage, the API is abstract enough that we can 
protect our users from mistakes; but I don't personally think it's critical 
that we do this if no one else wants to take a stab at it.

(EDIT: shorter versions of URLs to prevent horizontal scroll)

  was (Author: hossman):
This idea originated in LUCENE-1749; see these comments...

https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740155&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12740155
https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740256&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12740256
https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740278&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12740278


I've marked this for 2.9 for now & I think it's a nice-to-have in 2.9, 
because unlike general FieldCache usage, the API is abstract enough that we can 
protect our users from mistakes; but I don't personally think it's critical 
that we do this if no one else wants to take a stab at it.
  
 getDocValues should provide a MultiReader DocValues abstraction
 ---

 Key: LUCENE-1789
 URL: https://issues.apache.org/jira/browse/LUCENE-1789
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9


 When scoring a ValueSourceQuery, the scoring code calls 
 ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValue 
 instances are backed by the individual FieldCache entries of the subreaders 
 -- but if client code were to inadvertently call getValues() on a 
 MultiReader (or DirectoryReader) they would wind up using the outer 
 FieldCache.
 Since getValues(IndexReader) returns DocValues, we have an advantage here 
 that we don't have with FieldCache API (which is required to provide direct 
 array access). getValues(IndexReader) could be implemented so that *IF* 
 a caller inadvertently passes in a reader with non-null subReaders, getValues 
 could generate a DocValues instance for each of the subReaders, and then wrap 
 them in a composite MultiDocValues.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



JDK 1.5 in Analyzers

2009-08-07 Thread Grant Ingersoll
Looks like more 1.5 in contrib/analyzers, even though the smartcn  
build says 1.4:


compile-core:
[mkdir] Created dir: /lucene/java/lucene-clean/build/contrib/ 
analyzers/smartcn/classes/java
[javac] Compiling 18 source files to /lucene/java/lucene-clean/ 
build/contrib/analyzers/smartcn/classes/java
[javac] /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/ 
java/org/apache/lucene/analysis/cn/smart/hhmm/SegToken.java:94:  
hashCode() in java.lang.Object cannot be applied to (char[])

[javac] result = prime * result + Arrays.hashCode(charArray);
[javac] ^
[javac] /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/ 
java/org/apache/lucene/analysis/cn/smart/hhmm/SegToken.java:94:  
incompatible types

[javac] found   : java.lang.String
[javac] required: int
[javac] result = prime * result + Arrays.hashCode(charArray);
[javac] ^
[javac] /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/ 
java/org/apache/lucene/analysis/cn/smart/hhmm/SegTokenPair.java:54:  
hashCode() in java.lang.Object cannot be applied to (char[])

[javac] result = prime * result + Arrays.hashCode(charArray);
[javac] ^
[javac] /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/ 
java/org/apache/lucene/analysis/cn/smart/hhmm/SegTokenPair.java:54:  
incompatible types

[javac] found   : java.lang.String
[javac] required: int
[javac] result = prime * result + Arrays.hashCode(charArray);
[javac] ^
[javac] Note: /lucene/java/lucene-clean/contrib/analyzers/smartcn/ 
src/java/org/apache/lucene/analysis/cn/SmartChineseAnalyzer.java uses  
or overrides a deprecated API.

[javac] Note: Recompile with -deprecation for details.
[javac] 4 errors


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1768) NumericRange support for new query parser

2009-08-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740575#action_12740575
 ] 

Yonik Seeley commented on LUCENE-1768:
--

It feels like going that route would add a lot of code and complexity.

If the user already knows how to create a range query in code, it's much more 
straightforward to just do

{code}
if (money.equals(field)) return new NumericRangeQuery(field,...)
else return super.getRangeQuery(field,...)
{code}
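
Expanded into a full subclass of the existing (non-contrib) query parser, that looks roughly like the following.  The "money" field name, the double-valued range, and the "*" open-end handling are assumptions for illustration only:

{code}
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;

public class MoneyAwareQueryParser extends QueryParser {
  public MoneyAwareQueryParser(String defaultField, Analyzer analyzer) {
    super(defaultField, analyzer);
  }

  protected Query getRangeQuery(String field, String part1, String part2,
                                boolean inclusive) throws ParseException {
    if ("money".equals(field)) {
      // field was indexed with NumericField, so build a NumericRangeQuery
      Double lower = "*".equals(part1) ? null : Double.valueOf(part1);
      Double upper = "*".equals(part2) ? null : Double.valueOf(part2);
      return NumericRangeQuery.newDoubleRange(field, lower, upper, inclusive, inclusive);
    }
    return super.getRangeQuery(field, part1, part2, inclusive);
  }
}
{code}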

 NumericRange support for new query parser
 -

 Key: LUCENE-1768
 URL: https://issues.apache.org/jira/browse/LUCENE-1768
 Project: Lucene - Java
  Issue Type: New Feature
  Components: QueryParser
Affects Versions: 2.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 2.9


 It would be good to specify some type of schema for the query parser in 
 future, to automatically create NumericRangeQuery for different numeric 
 types? It would then be possible to index a numeric value 
 (double,float,long,int) using NumericField and then the query parser knows, 
 which type of field this is and so it correctly creates a NumericRangeQuery 
 for strings like [1.567..*] or (1.787..19.5].
 There is currently no way to extract if a field is numeric from the index, so 
 the user will have to configure the FieldConfig objects in the ConfigHandler. 
 But if this is done, it will not be that difficult to implement the rest.
 The only difference between the current handling of RangeQuery is then the 
 instantiation of the correct Query type and conversion of the entered numeric 
 values (simple Number.valueOf(...) cast of the user entered numbers). 
 Everything else is identical, NumericRangeQuery also supports the MTQ 
 rewrite modes (as it is a MTQ).
 Another thing is a change in Date semantics. There are some strange flags in 
 the current parser that tell it how to handle dates.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-07 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740574#action_12740574
 ] 

Hoss Man commented on LUCENE-1789:
--

{quote}
While client code that has relied on this in the past will nicely
continue to function properly, if we make this change, its performance
is going to silently take a [possibly sizable] hit.
{quote}

Correct: a change like this could cause 2.9 to introduce a _time_ based 
performance hit from the added method call to resolve the sub(reader|docvalue) 
on each method call ... but if we don't have a change like this, 2.9 could 
introduce a _memory_ based performance hit from the other FieldCache changes, as 
client code accessing DocValues for the top level reader will create a 
duplicate of the whole array.

Incidentally: I'm willing to believe you that the time based perf hit would be 
high, but my instinct is that it wouldn't be that bad: the DocValues API 
already introduces at least one method call per doc lookup (two depending on 
datatype).  Adding a second method call to delegate to a sub-DocValues instance 
doesn't seem that bad (especially since a new MultiDocValues class could get the 
subReader list and compute the docId offsets on init, and then reuse them on 
each method call)

bq. In the core I think we should always switch up high.

(In case there is any confusion: I wasn't suggesting that we stop using up high 
switching on DocValues in code included in the Lucene dist, I was suggesting 
that if someone uses DocValues directly in their code (against a top level 
reader) then we help them out by giving them the down low switching ... so 
expected usages wouldn't pay the added time based hit, just unexpected 
usages (which would be saved from the memory hit))

{quote}
Maybe it'd be best if we could somehow allow this down low switching
for 2.9, but 1) warn that you'll see a performance hit right off, 2)
deprecate it, and 3) and somehow state that in 3.0 you'll have to send
only a SegmentReader to this API, instead.
{quote}

That would get into really sticky territory for people writing custom 
IndexReaders (or using FilterIndexReader).

bq. But, if we make the proposed change here, the app could in fact just keep 
working off the top-level values (eg if the ctor in their class is pulling 
these values), thinking everything is fine when in fact there is a sizable, 
silent perf hit.

I agree ... but unless I'm missing something about the code on the trunk, that 
situation already exists: the developer might switch to using the Collector 
API, but nothing about the current trunk will prevent/warn him that this...

{code}
ValueSource vs = new ValueSource(aFieldIAlsoSortOn);
IndexReader r = getCurrentReaderThatCouldBeAMultiReader();
DocValues vals = vs.getDocValues(r);
{code}

...could have a sizable, silent, _memory_ perf hit in 2.9

(ValueSource.getValues has a javadoc indicating that caching will be done on 
the IndexReader passed in, but your comment suggests that if 2.9 were released 
today (with the current trunk) people upgrading would have some obvious way of 
noticing that they need to pass a sub reader to getValues)






 getDocValues should provide a MultiReader DocValues abstraction
 ---

 Key: LUCENE-1789
 URL: https://issues.apache.org/jira/browse/LUCENE-1789
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9


 When scoring a ValueSourceQuery, the scoring code calls 
 ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValue 
 instances are backed by the individual FieldCache entries of the subreaders 
 -- but if client code were to inadvertently call getValues() on a 
 MultiReader (or DirectoryReader) they would wind up using the outer 
 FieldCache.
 Since getValues(IndexReader) returns DocValues, we have an advantage here 
 that we don't have with FieldCache API (which is required to provide direct 
 array access). getValues(IndexReader) could be implemented so that *IF* 
 a caller inadvertently passes in a reader with non-null subReaders, getValues 
 could generate a DocValues instance for each of the subReaders, and then wrap 
 them in a composite MultiDocValues.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1790) Boosting Function Term Query

2009-08-07 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated LUCENE-1790:


Description: Similar to the BoostingTermQuery, the 
BoostingFunctionTermQuery is a SpanTermQuery, but the difference is the payload 
score for a doc is not the average of all the payloads, but applies a function 
to them instead.  BoostingTermQuery becomes a BoostingFunctionTermQuery with an 
AveragePayloadFunction applied to it.  (was: Similar to the BoostingTermQuery, 
the BoostingMaxTermQuery is a SpanTermQuery, but the difference is the payload 
score for a doc is not the average of all the payloads, but the maximum 
instead.)
Summary: Boosting Function Term Query  (was: Boosting Max Term Query)

 Boosting Function Term Query
 

 Key: LUCENE-1790
 URL: https://issues.apache.org/jira/browse/LUCENE-1790
 Project: Lucene - Java
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1790.patch


 Similar to the BoostingTermQuery, the BoostingFunctionTermQuery is a 
 SpanTermQuery, but the difference is the payload score for a doc is not the 
 average of all the payloads, but applies a function to them instead.  
 BoostingTermQuery becomes a BoostingFunctionTermQuery with an 
 AveragePayloadFunction applied to it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1790) Boosting Function Term Query

2009-08-07 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated LUCENE-1790:


Attachment: LUCENE-1790.patch

Refactors BoostingTermQuery to be a BoostingFunctionTermQuery.  Adds in several 
PayloadFunction implementations.  All tests pass.

Will commit today or tomorrow.
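
Conceptually (a hypothetical interface for illustration, NOT the API in the attached patch), the pluggable function amounts to:

{code}
interface PayloadScoreFunction {
  /** fold the score of one more payload into the running value */
  float currentScore(float runningScore, float payloadScore);
  /** produce the final per-document payload score */
  float docScore(float runningScore, int numPayloadsSeen);
}

// average: the historical BoostingTermQuery behaviour
class AverageFunction implements PayloadScoreFunction {
  public float currentScore(float runningScore, float payloadScore) {
    return runningScore + payloadScore;
  }
  public float docScore(float runningScore, int numPayloadsSeen) {
    return numPayloadsSeen == 0 ? 1.0f : runningScore / numPayloadsSeen;
  }
}

// max: what the original "Boosting Max Term Query" title asked for
class MaxFunction implements PayloadScoreFunction {
  public float currentScore(float runningScore, float payloadScore) {
    return Math.max(runningScore, payloadScore);
  }
  public float docScore(float runningScore, int numPayloadsSeen) {
    return numPayloadsSeen == 0 ? 1.0f : runningScore;
  }
}
{code}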

 Boosting Function Term Query
 

 Key: LUCENE-1790
 URL: https://issues.apache.org/jira/browse/LUCENE-1790
 Project: Lucene - Java
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1790.patch, LUCENE-1790.patch


 Similar to the BoostingTermQuery, the BoostingFunctionTermQuery is a 
 SpanTermQuery, but the difference is the payload score for a doc is not the 
 average of all the payloads, but applies a function to them instead.  
 BoostingTermQuery becomes a BoostingFunctionTermQuery with an 
 AveragePayloadFunction applied to it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740604#action_12740604
 ] 

Michael McCandless commented on LUCENE-1789:



bq. Correct: a change like this could cause 2.9 to introduce a time based 
performance hit from the added method call to resolve the sub(reader|docvalue) 
on each method call ... but if we don't have a change like this, 2.9 could 
introduce a memory based performance hit from the other FieldCache changes as 
client code accessing DocValues for the top level reader will create a 
duplication of the whole array.

True, and of the two, I agree a hidden time cost is the lesser evil.

But I'd prefer to not hide the cost, ie, encourage/force an explicit
choice when users upgrade to 2.9.  If we can't think of some realistic
way to do that, then I agree we should go forward with the current
approach...

bq. Incidentally: I'm willing to believe you that the time based perf hit would 
be high, but my instinct is that it wouldn't be that bad: the DocValues API 
already introduces at least one method call per doc lookup (two depending on 
datatype). Adding a second method call to delegate to a sub-DocValues instance 
doesn't seem that bad (especially since a new MultiDocValues class could get the 
subReader list and compute the docId offsets on init, and then reuse them on 
each method call)

It's the added binary search in ReaderUtil.subSearcher that worries
me.

{quote} 
bq. In the core I think we should always switch up high.

(In case there is any confusion: I wasn't suggesting that we stop using up high 
switching on DocValues in code included in the Lucene dist, I was suggesting 
that if someone uses DocValues directly in their code (against a top level 
reader) then we help them out by giving them the down low switching ... so 
expected usages wouldn't pay the added time based hit, just unexpected 
usages (which would be saved from the memory hit))
{quote}

Understood.  We are only talking about external usages of these APIs,
and even then, exceptionally advanced usage.  Ie, users who make their
own ValueSourceQuery and then run it against an IndexSearcher will be
fine.  It's only people who directly invoke getValues, w/ some random
reader, that hit the hidden cost.

{quote}
bq. But, if we make the proposed change here, the app could in fact just keep 
working off the top-level values (eg if the ctor in their class is pulling 
these values), thinking everything is fine when in fact there is a sizable, 
silent perf hit.

I agree ... but unless I'm missing something about the code on the trunk, that 
situation already exists: the developer might switch to using the Collector 
API, but nothing about the current trunk will prevent/warn him that this...

ValueSource vs = new ValueSource(aFieldIAlsoSortOn);
IndexReader r = getCurrentReaderThatCouldBeAMultiReader();
DocValues vals = vs.getDocValues(r);
...could have a sizable, silent, memory perf hit in 2.9

(ValueSource.getValues has a javadoc indicating that caching will be done on 
the IndexReader passed in, but your comment suggests that if 2.9 were released 
today (with the current trunk) people upgrading would have some obvious way of 
noticing that they need to pass a sub reader to getValues)
{quote}

How about this: we add a new param to the ctors of the value sources,
called (say) acceptMultiReader.  It has 3 values:

  - NO means an exception is thrown on seeing a top reader (where top
reader means any reader whose getSequentialSubReaders is
non-null)

  - YES_BURN_TIME means accept the top reader and make a
 MultiDocValues

  - YES_BURN_MEMORY means use the top reader against the field cache

We deprecate the existing ctors, so on moving to 3.0 you have to make
an explicit choice, but default it to YES_BURN_TIME.

One benefit of making the choice explicit is for those apps that have
memory to burn they may in fact choose to burn it.

Would this give a clean migration path forward?
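
A sketch of how such a knob might look (names and int constants are illustrative only; int constants rather than an enum to match the 1.4-era core):

{code}
public class FieldCacheSourceSketch {
  public static final int NO = 0;              // throw on a top-level reader
  public static final int YES_BURN_TIME = 1;   // wrap each sub-reader's DocValues (per-lookup cost)
  public static final int YES_BURN_MEMORY = 2; // use the top-level FieldCache entry (duplicated arrays)

  private final int acceptMultiReader;

  public FieldCacheSourceSketch(int acceptMultiReader) {
    this.acceptMultiReader = acceptMultiReader;
  }

  // called from getValues(reader) when the reader has sub-readers
  void onTopLevelReader() {
    switch (acceptMultiReader) {
      case NO:
        throw new IllegalArgumentException("pass a segment reader, not a composite reader");
      case YES_BURN_TIME:
        // build one DocValues per sub-reader and wrap them (hidden time cost)
        break;
      case YES_BURN_MEMORY:
        // fall back to the top-level FieldCache entry (hidden memory cost)
        break;
    }
  }
}
{code}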


 getDocValues should provide a MultiReader DocValues abstraction
 ---

 Key: LUCENE-1789
 URL: https://issues.apache.org/jira/browse/LUCENE-1789
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9


 When scoring a ValueSourceQuery, the scoring code calls 
 ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValue 
 instances are backed by the individual FieldCache entries of the subreaders 
 -- but if client code were to inadvertently call getValues() on a 
 MultiReader (or DirectoryReader) they would wind up using the outer 
 FieldCache.
 Since getValues(IndexReader) returns DocValues, we have an advantage here 
 that we don't have with FieldCache API (which is required to provide direct 
 

[jira] Commented: (LUCENE-1768) NumericRange support for new query parser

2009-08-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740608#action_12740608
 ] 

Michael McCandless commented on LUCENE-1768:


bq. You could still do something similar by simply overriding 
RangeQueryNodeBuilder.build(QueryNode queryNode), but this is not clean (it is 
kind of a hack).

What's the cleaner way to do this?  EG could I make my own 
ParametricRangeQueryNodeProcessor, subclassing the current one in the 
standard.processors package, that overrides postProcessNode to do its own 
conversion?

 NumericRange support for new query parser
 -

 Key: LUCENE-1768
 URL: https://issues.apache.org/jira/browse/LUCENE-1768
 Project: Lucene - Java
  Issue Type: New Feature
  Components: QueryParser
Affects Versions: 2.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 2.9


 It would be good to specify some type of schema for the query parser in 
 future, to automatically create NumericRangeQuery for different numeric 
 types? It would then be possible to index a numeric value 
 (double,float,long,int) using NumericField and then the query parser knows, 
 which type of field this is and so it correctly creates a NumericRangeQuery 
 for strings like [1.567..*] or (1.787..19.5].
 There is currently no way to extract if a field is numeric from the index, so 
 the user will have to configure the FieldConfig objects in the ConfigHandler. 
 But if this is done, it will not be that difficult to implement the rest.
 The only difference between the current handling of RangeQuery is then the 
 instantiation of the correct Query type and conversion of the entered numeric 
 values (simple Number.valueOf(...) cast of the user entered numbers). 
 Everything else is identical, NumericRangeQuery also supports the MTQ 
 rewrite modes (as it is a MTQ).
 Another thing is a change in Date semantics. There are some strange flags in 
 the current parser that tell it how to handle dates.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: JDK 1.5 in Analyzers

2009-08-07 Thread Michael McCandless
I am SO looking forward to 3.0 ;)

I'll fix.

Mike

On Fri, Aug 7, 2009 at 10:34 AM, Grant Ingersoll <gsing...@apache.org> wrote:
 Looks like more 1.5 in contrib/analyzers, even though the smartcn build says
 1.4:

 compile-core:
    [mkdir] Created dir:
 /lucene/java/lucene-clean/build/contrib/analyzers/smartcn/classes/java
    [javac] Compiling 18 source files to
 /lucene/java/lucene-clean/build/contrib/analyzers/smartcn/classes/java
    [javac]
 /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/SegToken.java:94:
 hashCode() in java.lang.Object cannot be applied to (char[])
    [javac]     result = prime * result + Arrays.hashCode(charArray);
    [javac]                                     ^
    [javac]
 /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/SegToken.java:94:
 incompatible types
    [javac] found   : java.lang.String
    [javac] required: int
    [javac]     result = prime * result + Arrays.hashCode(charArray);
    [javac]                             ^
    [javac]
 /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/SegTokenPair.java:54:
 hashCode() in java.lang.Object cannot be applied to (char[])
    [javac]     result = prime * result + Arrays.hashCode(charArray);
    [javac]                                     ^
    [javac]
 /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/SegTokenPair.java:54:
 incompatible types
    [javac] found   : java.lang.String
    [javac] required: int
    [javac]     result = prime * result + Arrays.hashCode(charArray);
    [javac]                             ^
    [javac] Note:
 /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/SmartChineseAnalyzer.java
 uses or overrides a deprecated API.
    [javac] Note: Recompile with -deprecation for details.
    [javac] 4 errors


 -
 To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: java-dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740624#action_12740624
 ] 

Michael McCandless commented on LUCENE-1749:


Maybe we should simply print a warning, eg to System.err, on detecting that 2X 
RAM usage has occurred, pointing people to the sanity checker?  We could eg do 
it once only so we don't spam the stderr logs...
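
A minimal warn-once sketch of that idea (the detection itself would come from the sanity checker this issue adds):

{code}
final class FieldCacheSanityWarning {
  private static boolean warned = false;

  static synchronized void warnOnce(String details) {
    if (!warned) {
      warned = true;
      System.err.println("WARNING: FieldCache appears to hold duplicate entries "
          + "(possible 2X RAM usage): " + details
          + " -- see the FieldCache sanity checker (LUCENE-1749)");
    }
  }
}
{code}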

 FieldCache introspection API
 

 Key: LUCENE-1749
 URL: https://issues.apache.org/jira/browse/LUCENE-1749
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9

 Attachments: fieldcache-introspection.patch, 
 LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
 LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
 LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
 LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch


 FieldCache should expose an Expert level API for runtime introspection of the 
 FieldCache to provide info about what is in the FieldCache at any given 
 moment.  We should also provide utility methods for sanity checking that the 
 FieldCache doesn't contain anything odd...
* entries for the same reader/field with different types/parsers
    * entries for the same field/type/parser in a reader and its subreader(s)
* etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: Sorting cleanup and FieldCacheImpl.Entry confusion

2009-08-07 Thread Michael McCandless
I don't know why Entry has int type and String locale, either.  I
agree it'd be cleaner for FieldSortedHitQueue to store these on its
own, privately.

Note that FieldSortedHitQueue is deprecated in favor of
FieldValueHitQueue, and that FieldValueHitQueue doesn't cache
comparators anymore.

Mike
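
For illustration, the private key class Hoss describes below might look something like this (a sketch only, not a patch):

  // key for FieldSortedHitQueue's comparator cache: (field, SortField type, locale),
  // so FieldCacheImpl.Entry no longer needs the type/locale members
  final class ComparatorCacheKey {
    final String field;
    final int sortFieldType;       // SortField.INT, SortField.STRING, ...
    final java.util.Locale locale; // may be null

    ComparatorCacheKey(String field, int sortFieldType, java.util.Locale locale) {
      this.field = field;
      this.sortFieldType = sortFieldType;
      this.locale = locale;
    }

    public boolean equals(Object o) {
      if (!(o instanceof ComparatorCacheKey)) return false;
      ComparatorCacheKey other = (ComparatorCacheKey) o;
      return field.equals(other.field)
          && sortFieldType == other.sortFieldType
          && (locale == null ? other.locale == null : locale.equals(other.locale));
    }

    public int hashCode() {
      int h = field.hashCode() ^ sortFieldType;
      return locale == null ? h : h ^ locale.hashCode();
    }
  }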

On Thu, Aug 6, 2009 at 8:07 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:

 Hey everybody, over in LUCENE-1749 I'm trying to make sanity checking of the
 FieldCache possible, and I'm banging my head into a few walls, and hoping
 people can help me fill in the gaps about how sorting w/FieldCache is
 *supposed* to work.

 For starters: I was getting confused about why some debugging code wasn't showing
 the Locale specified when getting the String[] cache for Locale.US.

 Looking at FieldSortedHitQueue.comparatorStringLocale, I see that it calls
 FieldCache.DEFAULT.getStrings(reader, field) and doesn't pass the Locale at
 all -- which makes me wonder why FieldCacheImpl.Entry bothers having a
 locale member at all? ... it seems like the only purpose is so
 FieldSortedHitQueue can abuse the Entry object as a key for its own static
 final FieldCacheImpl.Cache Comparators ... but couldn't it just use its own
 key object and keep FieldCacheImpl.Entry simpler?

 Ditto for the int type property of FieldCacheImpl.Entry, which has the
 comment // which SortField type ... it's used by FieldSortedHitQueue in
 its Comparators cache (and getCachedComparator), but FieldCacheImpl never
 uses it; by the time the FieldCache is accessed, the type has already been
 translated into the appropriate method (getInts, getBytes, etc...)


 if FieldSortedHitQueue used its own private inner class for its comparator
 cache, the FieldCacheImpl.Entry code could eliminate a lot of cruft, and the
 class would get much simpler.

 Does anyone know a good reason *why* it's implemented the way it currently
 is? Or is this simply the end result of code gradually being refactored out
 of FieldCacheImpl and into FieldSortedHitQueue?




 -Hoss


 -
 To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: java-dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: JDK 1.5 in Analyzers

2009-08-07 Thread Simon Willnauer
On Fri, Aug 7, 2009 at 6:17 PM, Michael
McCandless <luc...@mikemccandless.com> wrote:
 I am SO looking forward to 3.0 ;)
Oh man! Me too!


 I'll fix.

 Mike

 On Fri, Aug 7, 2009 at 10:34 AM, Grant Ingersoll <gsing...@apache.org> wrote:
 Looks like more 1.5 in contrib/analyzers, even though the smartcn build says
 1.4:

 compile-core:
    [mkdir] Created dir:
 /lucene/java/lucene-clean/build/contrib/analyzers/smartcn/classes/java
    [javac] Compiling 18 source files to
 /lucene/java/lucene-clean/build/contrib/analyzers/smartcn/classes/java
    [javac]
 /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/SegToken.java:94:
 hashCode() in java.lang.Object cannot be applied to (char[])
    [javac]     result = prime * result + Arrays.hashCode(charArray);
    [javac]                                     ^
    [javac]
 /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/SegToken.java:94:
 incompatible types
    [javac] found   : java.lang.String
    [javac] required: int
    [javac]     result = prime * result + Arrays.hashCode(charArray);
    [javac]                             ^
    [javac]
 /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/SegTokenPair.java:54:
 hashCode() in java.lang.Object cannot be applied to (char[])
    [javac]     result = prime * result + Arrays.hashCode(charArray);
    [javac]                                     ^
    [javac]
 /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm/SegTokenPair.java:54:
 incompatible types
    [javac] found   : java.lang.String
    [javac] required: int
    [javac]     result = prime * result + Arrays.hashCode(charArray);
    [javac]                             ^
    [javac] Note:
 /lucene/java/lucene-clean/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/SmartChineseAnalyzer.java
 uses or overrides a deprecated API.
    [javac] Note: Recompile with -deprecation for details.
    [javac] 4 errors


 -
 To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: java-dev-h...@lucene.apache.org



 -
 To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: java-dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



RE: svn commit: r802085 - in /lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/smart/hhmm: SegToken.java SegTokenPair.java

2009-08-07 Thread Uwe Schindler
By the way: o.a.l.util.ArrayUtil contains a hashCode impl for char arrays.
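
Assuming that helper has the (char[], start, end) form, the same contribution could be computed as, roughly:

  // serves the same purpose as the manual loops in the commit quoted below
  static int charArrayHash(char[] charArray) {
    return org.apache.lucene.util.ArrayUtil.hashCode(charArray, 0, charArray.length);
  }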

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

 -Original Message-
 From: mikemcc...@apache.org [mailto:mikemcc...@apache.org]
 Sent: Friday, August 07, 2009 6:48 PM
 To: java-comm...@lucene.apache.org
 Subject: svn commit: r802085 - in
 /lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/an
 alysis/cn/smart/hhmm: SegToken.java SegTokenPair.java
 
 Author: mikemccand
 Date: Fri Aug  7 16:48:09 2009
 New Revision: 802085
 
 URL: http://svn.apache.org/viewvc?rev=802085view=rev
 Log:
 fix smartcn to be JDK 1.4 only
 
 Modified:
 
 lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/ana
 lysis/cn/smart/hhmm/SegToken.java
 
 lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/ana
 lysis/cn/smart/hhmm/SegTokenPair.java
 
 Modified:
 lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/ana
 lysis/cn/smart/hhmm/SegToken.java
 URL:
 http://svn.apache.org/viewvc/lucene/java/trunk/contrib/analyzers/smartcn/s
 rc/java/org/apache/lucene/analysis/cn/smart/hhmm/SegToken.java?rev=802085
 r1=802084r2=802085view=diff
 ==
 
 ---
 lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/ana
 lysis/cn/smart/hhmm/SegToken.java (original)
 +++
 lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/ana
 lysis/cn/smart/hhmm/SegToken.java Fri Aug  7 16:48:09 2009
 @@ -91,7 +91,9 @@
public int hashCode() {
  final int prime = 31;
  int result = 1;
 -result = prime * result + Arrays.hashCode(charArray);
 +for(int i=0;i<charArray.length;i++) {
 +  result = prime * result + charArray[i];
 +}
  result = prime * result + endOffset;
  result = prime * result + index;
  result = prime * result + startOffset;
 
 Modified:
 lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/ana
 lysis/cn/smart/hhmm/SegTokenPair.java
 URL:
 http://svn.apache.org/viewvc/lucene/java/trunk/contrib/analyzers/smartcn/s
 rc/java/org/apache/lucene/analysis/cn/smart/hhmm/SegTokenPair.java?rev=802
 085r1=802084r2=802085view=diff
 ==
 
 ---
 lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/ana
 lysis/cn/smart/hhmm/SegTokenPair.java (original)
 +++
 lucene/java/trunk/contrib/analyzers/smartcn/src/java/org/apache/lucene/ana
 lysis/cn/smart/hhmm/SegTokenPair.java Fri Aug  7 16:48:09 2009
 @@ -51,7 +51,9 @@
public int hashCode() {
  final int prime = 31;
  int result = 1;
 -result = prime * result + Arrays.hashCode(charArray);
 +for(int i=0;i<charArray.length;i++) {
 +  result = prime * result + charArray[i];
 +}
  result = prime * result + from;
  result = prime * result + to;
  long temp;
 



-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1607) String.intern() faster alternative

2009-08-07 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740638#action_12740638
 ] 

Uwe Schindler commented on LUCENE-1607:
---

Committed rev 802095.

 String.intern() faster alternative
 --

 Key: LUCENE-1607
 URL: https://issues.apache.org/jira/browse/LUCENE-1607
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Earwin Burrfoot
Assignee: Yonik Seeley
 Fix For: 2.9

 Attachments: intern.patch, LUCENE-1607-contrib.patch, 
 LUCENE-1607.patch, LUCENE-1607.patch, LUCENE-1607.patch, LUCENE-1607.patch, 
 LUCENE-1607.patch, LUCENE-1607.patch, LUCENE-1607.patch, LUCENE-1607.patch, 
 LUCENE-1607.patch, LUCENE-1607.patch


 By using our own interned string pool on top of default, String.intern() can 
 be greatly optimized.
 On my setup (java 6) this alternative runs ~15.8x faster for already interned 
 strings, and ~2.2x faster for 'new String(interned)'
 For java 5 and 4 speedup is lower, but still considerable.
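
(For illustration only: a minimal sketch of the general idea in the
description above, i.e. a small cache consulted before falling back to
String.intern(). The class and field names are illustrative and this is not
the committed patch.)

{code}
// Illustrative only: a tiny interned-string pool layered on top of the
// default. A cache hit avoids the relatively expensive String.intern() call;
// a stale or missing slot just falls back to it, so races are harmless.
public class CachingInternerSketch {
  private final String[] cache;
  private final int mask;

  public CachingInternerSketch(int sizePowerOfTwo) {
    cache = new String[sizePowerOfTwo];
    mask = sizePowerOfTwo - 1;
  }

  public String intern(String s) {
    int slot = s.hashCode() & mask;
    String cached = cache[slot];
    if (cached != null && cached.equals(s)) {
      return cached;              // fast path: cached canonical instance
    }
    String interned = s.intern(); // slow path: the JVM's intern table
    cache[slot] = interned;
    return interned;
  }
}
{code}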

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1768) NumericRange support for new query parser

2009-08-07 Thread Luis Alves (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740643#action_12740643
 ] 

Luis Alves commented on LUCENE-1768:


Hi Yonik,

As I said before you can do that in the RangeQueryNodeBuilder.build(QueryNode 
queryNode),
but it's ugly and this is not what we intended when using the new flexible 
query parser.

The new flexible query parser does not follow the concept of method 
overriding as the old one did.
So solutions that worked in the old query parser, like overriding a method, 
have to be implemented programmatically.

Your approach requires creating a new class and overriding a method;
you still need to create an instance of your QueryParser, and it is not 
reusable.

Here is a sample of what your approach looks like:
{code}

class YonikQueryParser extends QueryParser {

  Query getRangeQuery(field, ...) {
    if ("money".equals(field)) return new NumericRangeQuery(field, ...);
    else return super.getRangeQuery(field, ...);
  }
}

...
QueryParser yqp = new YonikQueryParser(...);
yqp.parse(query);
{code}

 vs

What I am proposing:

{code}
Map<CharSequence, RangeTools.Type> rangeTypes = new HashMap<CharSequence, 
RangeTools.Type>();

rangeTypes.put("money", RangeUtils.getType(RangeUtils.NUMERIC, 
RangeUtils.NumericType.Type.FLOAT, NumericUtils.PRECISION_STEP_DEFAULT));

StandardQueryParser qp = new StandardQueryParser();
qp.setRangeTypes(rangeTypes);

qp.parse(query);
{code}

The second approach is programmatic: it does not require a new class
or overriding a method, it is reusable by other users, and it is
backward compatible, meaning we can integrate this into the current
flexible query parser and deliver this feature in 2.9 without affecting
any current use case.

Your approach is not compatible, it does require a new class, and it is not
programmatic. It's not reusable by other users (we can't commit your code to
Lucene), since the fields are hard-coded.

Also, the approach I am proposing is very similar to setFieldsBoost and
setDateResolution, which are already available on the old QP and the new
flexible query parser.

I also want to say that extending the old QP and extending the new flexible
query parser are never going to be similar approaches; they are completely
different implementations.


 NumericRange support for new query parser
 -

 Key: LUCENE-1768
 URL: https://issues.apache.org/jira/browse/LUCENE-1768
 Project: Lucene - Java
  Issue Type: New Feature
  Components: QueryParser
Affects Versions: 2.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 2.9


 It would be good to specify some type of schema for the query parser in 
 future, to automatically create NumericRangeQuery for different numeric 
 types? It would then be possible to index a numeric value 
 (double,float,long,int) using NumericField and then the query parser knows, 
 which type of field this is and so it correctly creates a NumericRangeQuery 
 for strings like [1.567..*] or (1.787..19.5].
 There is currently no way to extract if a field is numeric from the index, so 
 the user will have to configure the FieldConfig objects in the ConfigHandler. 
 But if this is done, it will not be that difficult to implement the rest.
 The only difference between the current handling of RangeQuery is then the 
 instantiation of the correct Query type and conversion of the entered numeric 
 values (simple Number.valueOf(...) cast of the user entered numbers). 
 Everything else is identical, NumericRangeQuery also supports the MTQ 
 rewrite modes (as it is a MTQ).
 Another thing is a change in Date semantics. There are some strange flags in 
 the current parser that tells it how to handle dates.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1768) NumericRange support for new query parser

2009-08-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740659#action_12740659
 ] 

Yonik Seeley commented on LUCENE-1768:
--

bq. It's not reusable by other users (we can't commit your code to lucene)

Neither is your version with rangeTypes.put(money, 
RangeUtils.getType(RangeUtils.NUMERIC...
That's the application specific configuration code and doesn't need (or want) 
to be committed.

Directly instantiating the query you want is simple, ultimately configurable, 
and avoids adding a ton of unnecessary classes or methods that need to be kept 
in sync with everything that a user *may* want to do.

Is there a simple way to provide a custom QueryBuilder for range queries (or 
any other query type)?  I'm sure there must be, but there are so many classes 
in the new QP that I'm having a little difficulty finding my way around.


 NumericRange support for new query parser
 -

 Key: LUCENE-1768
 URL: https://issues.apache.org/jira/browse/LUCENE-1768
 Project: Lucene - Java
  Issue Type: New Feature
  Components: QueryParser
Affects Versions: 2.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 2.9


 It would be good to specify some type of schema for the query parser in 
 future, to automatically create NumericRangeQuery for different numeric 
 types? It would then be possible to index a numeric value 
 (double,float,long,int) using NumericField and then the query parser knows, 
 which type of field this is and so it correctly creates a NumericRangeQuery 
 for strings like [1.567..*] or (1.787..19.5].
 There is currently no way to extract if a field is numeric from the index, so 
 the user will have to configure the FieldConfig objects in the ConfigHandler. 
 But if this is done, it will not be that difficult to implement the rest.
 The only difference between the current handling of RangeQuery is then the 
 instantiation of the correct Query type and conversion of the entered numeric 
 values (simple Number.valueOf(...) cast of the user entered numbers). 
 Everything else is identical, NumericRangeQuery also supports the MTQ 
 rewrite modes (as it is a MTQ).
 Another thing is a change in Date semantics. There are some strange flags in 
 the current parser that tells it how to handle dates.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1768) NumericRange support for new query parser

2009-08-07 Thread Luis Alves (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740662#action_12740662
 ] 

Luis Alves commented on LUCENE-1768:


{quote}
What's the cleaner way to do this? EG could I make my own 
ParametricRangeQueryNodeProcessor, subclassing the current one in the 
standard.processors package, that overrides postProcessNode to do its own 
conversion?
{quote}

For Yonik's simple requirement, you could do one of the following.

Option 1 (more flexible):
- make your own ParametricRangeQueryNodeProcessor, subclassing the current one, 
returning NumericQueryNodes where needed
- create a NumericQueryNode that extends RangeQueryNode (no extra code needed)
- create a NumericQueryNodeBuilder that handles NumericQueryNodes, and set the 
map in StandardQueryTreeBuilder, e.g. setBuilder(NumericQueryNode.class, new 
NumericQueryNodeBuilder()). RangeQueryNodes will still be handled normally by 
the RangeQueryNodeBuilder.

Option 2 (less flexible):
- make your own RangeQueryNodeBuilder subclass (e.g. NumericQueryNodeBuilder), 
and set the map in StandardQueryTreeBuilder, e.g. 
setBuilder(RangeQueryNode.class, new NumericQueryNodeBuilder())

Option 1 implements the correct usage of the APIs: it is more flexible and the 
dirty work is done in the processors pipeline (a rough outline follows below).
Option 2 is not the correct use case for the APIs; it requires less code and it 
will work, but the builder will be performing the tasks the Processor should be 
doing.
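
(For illustration, a rough outline of the Option 1 wiring described above. It
reuses only names already mentioned in this thread; NumericQueryNode and
NumericQueryNodeBuilder are the proposed classes and do not exist yet, so this
is an outline, not working Lucene code.)

{code}
// Rough outline only; the proposed classes are referenced by name and their
// bodies are omitted, so this does not compile as-is.
StandardQueryParser qp = new StandardQueryParser();

// 1. subclass ParametricRangeQueryNodeProcessor so configured fields come out
//    of the pipeline as NumericQueryNode instead of plain RangeQueryNode
// 2. let NumericQueryNode extend RangeQueryNode (no extra code needed)
// 3. register a builder for the new node type; RangeQueryNode keeps the
//    default RangeQueryNodeBuilder
QueryTreeBuilder builder = (QueryTreeBuilder) qp.getQueryBuilder();
builder.setBuilder(NumericQueryNode.class, new NumericQueryNodeBuilder());
{code}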


 NumericRange support for new query parser
 -

 Key: LUCENE-1768
 URL: https://issues.apache.org/jira/browse/LUCENE-1768
 Project: Lucene - Java
  Issue Type: New Feature
  Components: QueryParser
Affects Versions: 2.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 2.9


 It would be good to specify some type of schema for the query parser in 
 future, to automatically create NumericRangeQuery for different numeric 
 types? It would then be possible to index a numeric value 
 (double,float,long,int) using NumericField and then the query parser knows, 
 which type of field this is and so it correctly creates a NumericRangeQuery 
 for strings like [1.567..*] or (1.787..19.5].
 There is currently no way to extract if a field is numeric from the index, so 
 the user will have to configure the FieldConfig objects in the ConfigHandler. 
 But if this is done, it will not be that difficult to implement the rest.
 The only difference between the current handling of RangeQuery is then the 
 instantiation of the correct Query type and conversion of the entered numeric 
 values (simple Number.valueOf(...) cast of the user entered numbers). 
 Everything else is identical, NumericRangeQuery also supports the MTQ 
 rewrite modes (as it is a MTQ).
 Another thing is a change in Date semantics. There are some strange flags in 
 the current parser that tells it how to handle dates.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1771) Using explain may double ram reqs for fieldcaches when using ValueSourceQuery/CustomScoreQuery or for ConstantScoreQuerys that use a caching Filter.

2009-08-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740667#action_12740667
 ] 

Michael McCandless commented on LUCENE-1771:


Patch looks good!:

  * Looks like you need to svn rm
src/java/org/apache/lucene/search/QueryWeight.java

  * Some javadocs still reference QueryWeight

  * Why do we need this in Weight?
{code}
public Explanation explain(IndexReader reader, int doc) throws IOException {
  return explain(null, reader, doc);
}
{code}
Ie, do we think there are places outside of Lucene that invoke
Weight.explain directly?


 Using explain may double ram reqs for fieldcaches when using 
 ValueSourceQuery/CustomScoreQuery or for ConstantScoreQuerys that use a 
 caching Filter.
 

 Key: LUCENE-1771
 URL: https://issues.apache.org/jira/browse/LUCENE-1771
 Project: Lucene - Java
  Issue Type: Bug
  Components: Search
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 2.9

 Attachments: LUCENE-1771.bc-tests.patch, LUCENE-1771.patch, 
 LUCENE-1771.patch, LUCENE-1771.patch, LUCENE-1771.patch, LUCENE-1771.patch, 
 LUCENE-1771.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1790) Boosting Function Term Query

2009-08-07 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated LUCENE-1790:


Attachment: LUCENE-1790.patch

Next take on this:

1. Added an includeSpanScore flag, which allows you to ignore the score from the 
TermQuery part and only count the payload.

2. Deprecated Similarity.scorePayload(String fieldName, ...) in favor of a 
similar method that also takes in the doc id (see the sketch after this list).  
Now, in theory, you could have different scoring for payloads based on different 
documents, fields, etc.  The old method just calls the new one and passes in a 
NO_DOC_ID_PROVIDED value (-1).

3. Added a marker interface named PayloadQuery and marked the various 
PayloadQueries.  This could be useful for Queries that work with other 
PayloadQueries (more exclusive than the fact that they are SpanQueries).
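
(For illustration, a hedged sketch of the delegation described in point 2. The
real Similarity.scorePayload parameter list is elided above, so the payload
byte[]/offset/length parameters below are hypothetical placeholders.)

{code}
// Sketch of the old-method-forwards-to-new-method pattern from point 2;
// parameter names after fieldName are placeholders, not the real signature.
public class PayloadScoringSketch {
  public static final int NO_DOC_ID_PROVIDED = -1;

  /** Old form without a doc id; deprecated, forwards to the new method. */
  public float scorePayload(String fieldName, byte[] payload, int offset, int length) {
    return scorePayload(NO_DOC_ID_PROVIDED, fieldName, payload, offset, length);
  }

  /** New form: the doc id allows payload scoring to vary per document. */
  public float scorePayload(int docId, String fieldName, byte[] payload, int offset, int length) {
    return 1.0f; // neutral default: the payload does not change the score
  }
}
{code}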

I really do intend to commit this :-)

 Boosting Function Term Query
 

 Key: LUCENE-1790
 URL: https://issues.apache.org/jira/browse/LUCENE-1790
 Project: Lucene - Java
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1790.patch, LUCENE-1790.patch, LUCENE-1790.patch


 Similar to the BoostingTermQuery, the BoostingFunctionTermQuery is a 
 SpanTermQuery, but the difference is the payload score for a doc is not the 
 average of all the payloads, but applies a function to them instead.  
 BoostingTermQuery becomes a BoostingFunctionTermQuery with an 
 AveragePayloadFunction applied to it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1790) Add Boosting Function Term Query and Some Payload Query refactorings

2009-08-07 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated LUCENE-1790:


  Description: 
Similar to the BoostingTermQuery, the BoostingFunctionTermQuery is a 
SpanTermQuery, but the difference is the payload score for a doc is not the 
average of all the payloads, but applies a function to them instead.  
BoostingTermQuery becomes a BoostingFunctionTermQuery with an 
AveragePayloadFunction applied to it.

Also add marker interface to indicate PayloadQuery types.  Refactor 
Similarity.scorePayload to also take in the doc id.

  was:Similar to the BoostingTermQuery, the BoostingFunctionTermQuery is a 
SpanTermQuery, but the difference is the payload score for a doc is not the 
average of all the payloads, but applies a function to them instead.  
BoostingTermQuery becomes a BoostingFunctionTermQuery with an 
AveragePayloadFunction applied to it.

Lucene Fields: [Patch Available]  (was: [Patch Available, New])
  Summary: Add Boosting Function Term Query and Some Payload Query 
refactorings  (was: Boosting Function Term Query)

 Add Boosting Function Term Query and Some Payload Query refactorings
 

 Key: LUCENE-1790
 URL: https://issues.apache.org/jira/browse/LUCENE-1790
 Project: Lucene - Java
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1790.patch, LUCENE-1790.patch, LUCENE-1790.patch


 Similar to the BoostingTermQuery, the BoostingFunctionTermQuery is a 
 SpanTermQuery, but the difference is the payload score for a doc is not the 
 average of all the payloads, but applies a function to them instead.  
 BoostingTermQuery becomes a BoostingFunctionTermQuery with an 
 AveragePayloadFunction applied to it.
 Also add marker interface to indicate PayloadQuery types.  Refactor 
 Similarity.scorePayload to also take in the doc id.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1768) NumericRange support for new query parser

2009-08-07 Thread Luis Alves (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740718#action_12740718
 ] 

Luis Alves commented on LUCENE-1768:


{quote}
Neither is your version with rangeTypes.put(money, 
RangeUtils.getType(RangeUtils.NUMERIC...
That's the application specific configuration code and doesn't need (or want) 
to be committed.
{quote}
You are correct, I was describing the use case from the user's perspective. 
That code was an example of how to use the APIs if we implement them in the 
future; those APIs are not currently available.

{quote}
Directly instantiating the query you want is simple, ultimately configurable, 
and avoids adding a ton of unnecessary classes or methods that need to be kept 
in sync with everything that a user may want to do.
{quote}

I'm not sure what to say here, so I'll point to the documentation that we 
currently have:
You can read 
https://issues.apache.org/jira/secure/attachment/12410046/QueryParser_restructure_meetup_june2009_v2.pdf
and the javadocs for 
package org.apache.lucene.queryParser.core 
class org.apache.lucene.queryParser.standard.StandardQueryParser

You can also look at the TestSpanQueryParserSimpleSample JUnit test for another 
example of how the APIs can be used, in a completely different way.

The new QueryParser was designed to be extensible,
to allow the implementation of language extensions or different languages,
and to have reusable components like the processors and builders.

We use SyntaxParsers, Processors and Builders; all are replaceable components 
at runtime.
Any user can build their own pipeline and create new processors, builders and 
query nodes, and integrate them with the existing ones to create the features 
they require. 

Some of the features are:
- Syntax tree optimization
- Syntax tree expansion
- Syntax tree validation and error reporting
- Tokenization and normalization of the query
- Makes it easy to create extensions
- Support for translation of error messages
- Allows users to plug and play processors and builders, without having to 
modify Lucene code.
- Allows Lucene users to implement features much faster.
- Allows users to change default behavior in an easy way without having to 
modify Lucene code.

{quote}
Is there a simple way to provide a custom QueryBuilder for range queries (or 
any other query type?) I'm sure there must be, but there are so many classes in 
the new QP, I'm having a little difficulty finding my way around.
{quote}



{code}
class NumericQueryNodeBuilder extends RangeQueryNodeBuilder {
  public TermRangeQuery build(QueryNode queryNode) throws QueryNodeException {
    RangeQueryNode rangeNode = (RangeQueryNode) queryNode;

    if (rangeNode.getField().toString().equals("money")) {
      // do whatever you need here with queryNode.
      return new NumericRangeQuery(field, ...);
    } else {
      return super.build(queryNode);
    }
  }
}

public void testNewRangeQueryBuilder() throws Exception {
  StandardQueryParser qp = new StandardQueryParser();
  QueryTreeBuilder builder = (QueryTreeBuilder) qp.getQueryBuilder();
  builder.setBuilder(RangeQueryNode.class, new NumericQueryNodeBuilder());

  String startDate = getLocalizedDate(2002, 1, 1, false);
  String endDate = getLocalizedDate(2002, 1, 4, false);

  StandardAnalyzer oneStopAnalyzer = new StandardAnalyzer();
  qp.setAnalyzer(oneStopAnalyzer);

  Query a = qp.parse("date:[" + startDate + " TO " + endDate + "]", null);
  System.out.print(a);
}
{code}

 NumericRange support for new query parser
 -

 Key: LUCENE-1768
 URL: https://issues.apache.org/jira/browse/LUCENE-1768
 Project: Lucene - Java
  Issue Type: New Feature
  Components: QueryParser
Affects Versions: 2.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 2.9


 It would be good to specify some type of schema for the query parser in 
 future, to automatically create NumericRangeQuery for different numeric 
 types? It would then be possible to index a numeric value 
 (double,float,long,int) using NumericField and then the query parser knows, 
 which type of field this is and so it correctly creates a NumericRangeQuery 
 for strings like [1.567..*] or (1.787..19.5].
 There is currently no way to extract if a field is numeric from the index, so 
 the user will have to configure the FieldConfig objects in the ConfigHandler. 
 But if this is done, it will not be that difficult to implement the rest.
 The only difference between the current handling of RangeQuery is then the 
 instantiation of the correct Query type and conversion of the entered numeric 
 values (simple Number.valueOf(...) cast of the user entered numbers). 
 Everything else is identical, NumericRangeQuery also supports the MTQ 
 rewrite modes 

[jira] Issue Comment Edited: (LUCENE-1768) NumericRange support for new query parser

2009-08-07 Thread Luis Alves (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740718#action_12740718
 ] 

Luis Alves edited comment on LUCENE-1768 at 8/7/09 1:43 PM:


{quote}
Neither is your version with rangeTypes.put(money, 
RangeUtils.getType(RangeUtils.NUMERIC...
That's the application specific configuration code and doesn't need (or want) 
to be committed.
{quote}
You are correct, I was describing the use case from the user perspective. 
That code was a example how to use the API's if we implement them in the 
future, those API's are not currently available.

{quote}
Directly instantiating the query you want is simple, ultimately configurable, 
and avoids adding a ton of unnecessary classes or methods that need to be kept 
in sync with everything that a user may want to do.
{quote}

I'm not sure what to say here. So I'll point to the documentation that we 
currently have:
You can read 
https://issues.apache.org/jira/secure/attachment/12410046/QueryParser_restructure_meetup_june2009_v2.pdf
and the java docs  for 
package org.apache.lucene.queryParser.core 
class org.apache.lucene.queryParser.standard.StandardQueryParser

You can also look at TestSpanQueryParserSimpleSample junit for another example 
how the API's can be used,
in a completely different way.

The new QueryParser was designed to be extensible,
allow the implementation of languages extensions or different languages,
and have reusable components like the processors and builders

We use SyntaxParsers, Processors and Builders, all are replaceable components 
at runtime.
Any user can build it's own pipeline and create new processors, builders, 
querynodes and integrate them
with the existing ones to create the features they require. 

Some of the features are:
- Syntax Tree optimization
- Syntax Tree expansion
- Syntax Tree validation and error reporting
- Tokenization and normalization of the query
- Makes it easy to create extensions
- Support for translation of error messages
- Allows users to plug and play processors and builders, without having to 
modify lucene code.
- Allow lucene users to implement features much faster
- Allow users to change default behavior in a easy way without having to modify 
lucene code.

{quote}
Is there a simple way to provide a custom QueryBuilder for range queries (or 
any other query type?) I'm sure there must be, but there are so many classes in 
the new QP, I'm having a little difficulty finding my way around.
{quote}

Below is the Java code for option 2. It's not the recommended way to use the 
new query parser, but it is the shortest way to do what you want.

{code}
class NumericQueryNodeBuilder extends RangeQueryNodeBuilder {
  public TermRangeQuery build(QueryNode queryNode) throws QueryNodeException {
    RangeQueryNode rangeNode = (RangeQueryNode) queryNode;

    if (rangeNode.getField().toString().equals("money")) {
      // do whatever you need here with queryNode.
      return new NumericRangeQuery(field, ...);
    } else {
      return super.build(queryNode);
    }
  }
}

public void testNewRangeQueryBuilder() throws Exception {
  StandardQueryParser qp = new StandardQueryParser();
  QueryTreeBuilder builder = (QueryTreeBuilder) qp.getQueryBuilder();
  builder.setBuilder(RangeQueryNode.class, new NumericQueryNodeBuilder());

  String startDate = getLocalizedDate(2002, 1, 1, false);
  String endDate = getLocalizedDate(2002, 1, 4, false);

  StandardAnalyzer oneStopAnalyzer = new StandardAnalyzer();
  qp.setAnalyzer(oneStopAnalyzer);

  Query a = qp.parse("date:[" + startDate + " TO " + endDate + "]", null);
  System.out.print(a);
}
{code}

  was (Author: lafa):
{quote}
Neither is your version with rangeTypes.put(money, 
RangeUtils.getType(RangeUtils.NUMERIC...
That's the application specific configuration code and doesn't need (or want) 
to be committed.
{quote}
You are correct, I was describing the use case from the user perspective. 
That code was a example how to use the API's if we implement them in the 
future, those API's are not currently available.

{quote}
Directly instantiating the query you want is simple, ultimately configurable, 
and avoids adding a ton of unnecessary classes or methods that need to be kept 
in sync with everything that a user may want to do.
{quote}

I'm not sure what to say here. So I'll point to the documentation that we 
currently have:
You can read 
https://issues.apache.org/jira/secure/attachment/12410046/QueryParser_restructure_meetup_june2009_v2.pdf
and the java docs  for 
package org.apache.lucene.queryParser.core 
class org.apache.lucene.queryParser.standard.StandardQueryParser

You can also look at TestSpanQueryParserSimpleSample junit for another example 
how the API's can be used,
in a completely different way.

The new QueryParser was designed to be extensible,
allow the 

[jira] Commented: (LUCENE-1768) NumericRange support for new query parser

2009-08-07 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740728#action_12740728
 ] 

Uwe Schindler commented on LUCENE-1768:
---

To go back to the idea of why I opened the issue (and I think this is also 
Mike's intention):

From what you see on java-user, where users ask questions about how to use 
Lucene:
Most users are not aware of the fact that they can create Query classes 
themselves. Most example code on the list is just: I have such a query string, 
I pass it to Lucene, and it does not work as expected. It is hard to explain to 
them that they should simply not use a query parser for their queries and just 
instantiate the query classes directly. For such users it is even harder to 
customize this query parser.

My intention is: make the RangeQueryNodeBuilder somehow configurable, like 
Luis proposed, so that you can set the type of a field (which we do not have 
in Lucene currently). If the type is undefined or explicitly set to 
"string"/"term", create a TermRangeQuery. If it is set to any numeric type, 
create a NumericRangeQuery.newXxxRange(field, ...).

The same can currently be done by the original Lucene query parser, but only 
for dates (and it is really a hack using this DateField class). I simply want 
to extend it so that you can say: this field is of type 'int', and the correct 
range query is created for it automatically. Because the old query parser is 
now deprecated, I want to do it for the new one. This would also be an 
incentive for new users to throw away the old parser and use the new one, 
because it can be configured easily to create numeric ranges in addition to 
term ranges. A rough sketch of this dispatch follows below.
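
(For illustration, a hedged sketch of the dispatch described above, with a
plain Map standing in for whatever FieldConfig would hold; the TermRangeQuery
and NumericRangeQuery calls are written against the 2.9 API from memory and
should be treated as assumptions, not the proposed parser change.)

{code}
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermRangeQuery;

// Sketch only: fields with no configured type (or type "string"/"term") keep
// getting a TermRangeQuery; numeric fields get the matching NumericRangeQuery
// factory method.
public class RangeDispatchSketch {
  private final Map fieldTypes = new HashMap(); // field name -> "int", "float", ...

  public void setType(String field, String type) {
    fieldTypes.put(field, type);
  }

  public Query buildRange(String field, String lower, String upper) {
    String type = (String) fieldTypes.get(field);
    if (type == null || "string".equals(type) || "term".equals(type)) {
      return new TermRangeQuery(field, lower, upper, true, true);
    }
    if ("int".equals(type)) {
      return NumericRangeQuery.newIntRange(field,
          Integer.valueOf(lower), Integer.valueOf(upper), true, true);
    }
    if ("float".equals(type)) {
      return NumericRangeQuery.newFloatRange(field,
          Float.valueOf(lower), Float.valueOf(upper), true, true);
    }
    throw new IllegalArgumentException("unsupported range type: " + type);
  }
}
{code}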

 NumericRange support for new query parser
 -

 Key: LUCENE-1768
 URL: https://issues.apache.org/jira/browse/LUCENE-1768
 Project: Lucene - Java
  Issue Type: New Feature
  Components: QueryParser
Affects Versions: 2.9
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 2.9


 It would be good to specify some type of schema for the query parser in 
 future, to automatically create NumericRangeQuery for different numeric 
 types? It would then be possible to index a numeric value 
 (double,float,long,int) using NumericField and then the query parser knows, 
 which type of field this is and so it correctly creates a NumericRangeQuery 
 for strings like [1.567..*] or (1.787..19.5].
 There is currently no way to extract if a field is numeric from the index, so 
 the user will have to configure the FieldConfig objects in the ConfigHandler. 
 But if this is done, it will not be that difficult to implement the rest.
 The only difference between the current handling of RangeQuery is then the 
 instantiation of the correct Query type and conversion of the entered numeric 
 values (simple Number.valueOf(...) cast of the user entered numbers). 
 Everything else is identical, NumericRangeQuery also supports the MTQ 
 rewrite modes (as it is a MTQ).
 Another thing is a change in Date semantics. There are some strange flags in 
 the current parser that tells it how to handle dates.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: SpanQuery and Spans optimizations

2009-08-07 Thread Grant Ingersoll


On Aug 6, 2009, at 5:09 PM, Grant Ingersoll wrote:



On Aug 6, 2009, at 5:06 PM, Shai Erera wrote:

Only w/ ScoreDocs we reuse the same instance. So I guess we'd like  
to do the same here.


Seems like providing a TopSpansCollector is what you want, only  
unlike TopFieldCollector which populates the fields post search,  
you'd like to do it during search.


Bingo, but I think the collection functionality needs to be on  
Collector, as I'd hate to have to lose out on functionality that the  
other impls have to offer, or have to recreate them.




Hmm, maybe I can get at this info from the setScorer capabilities.  
Then I would just need a place to hang the data... Maybe it would just  
take having the SpanScorer implementation provide a wee bit more  
access to its structures...


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-08-07 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740798#action_12740798
 ] 

Bill Bell commented on LUCENE-1781:
---

Everything is working except when you use a large area like 1 miles. I get 
no results at this distance when crossing the anti-meridian (180 degrees).

Most of the time this is fine, but specifically when -181 becomes 178 there 
appears to be an issue somewhere else in the code and nothing is returned. I 
believe this code is good; the issue is somewhere else. Maybe lower left is no 
longer lower left, and upper right is no longer upper right? The box is 
probably too big for the other algorithms. Not sure what else to check. How is 
it being used? Regardless, this section appears right.

Start here: ctr 39.3209801,-111.0937311
Distance: 7200

boxCorners: before norm 22.100623434197477,21.15746490712925
boxCorners: normLng 22.100623434197477,21.15746490712925
boxCorners: distance: d 7200.0
boxCorners: ctr 39.3209801,-111.0937311
boxCorners: normLat 22.100623434197477,21.15746490712925
boxCorners: before norm -43.22565169384456,-181.34791600031286  -- note -181
boxCorners: normLng -43.22565169384456,178.65208399968714 -- Note 178
boxCorners: distance: d 7200.0
boxCorners: ctr 39.3209801,-111.0937311
boxCorners: normLat -43.22565169384456,178.65208399968714
corner 1054.4155877284288

I do get results from Hawaii crossing this at 10,000 miles.

boxCorners: before norm 6.201324582593365,-0.012709669713800501
boxCorners: normLng 6.201324582593365,-0.012709669713800501
boxCorners: distance: d 1.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat 6.201324582593365,-0.012709669713800501
boxCorners: before norm -41.508634930577436,-302.4840293070323 -- note -302
boxCorners: normLng -41.508634930577436,57.5159706929677 -- note 57
boxCorners: distance: d 1.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat -41.508634930577436,57.5159706929677
corner 1464.4660940672625
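
(For reference, a hedged sketch of the longitude normalization the "before
norm" / "normLng" pairs above show, e.g. -181.35 wrapping to 178.65 and
-302.48 to 57.52. Plain Java, not the contrib/spatial code.)

{code}
// Wrap any longitude into [-180, 180): -181.35 becomes 178.65 and -302.48
// becomes 57.52, matching the log lines above.
public final class LngNormSketch {
  public static double normalizeLng(double lng) {
    double norm = lng % 360.0;
    if (norm < -180.0) {
      norm += 360.0;
    } else if (norm >= 180.0) {
      norm -= 360.0;
    }
    return norm;
  }
}
{code}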






 Large distances in Spatial go beyond Prime MEridian
 ---

 Key: LUCENE-1781
 URL: https://issues.apache.org/jira/browse/LUCENE-1781
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Affects Versions: 2.9
 Environment: All
Reporter: Bill Bell
Assignee: Michael McCandless
 Fix For: 3.1

 Attachments: LLRect.java, LLRect.java, LUCENE-1781.patch


 http://amidev.kaango.com/solr/core0/select?fl=*json.nl=mapwt=jsonradius=5000rows=20lat=39.5500507q=hondaqt=geolong=-105.7820674
 Get an error when using Solr when distance is calculated for the boundary box 
 past 90 degrees.
 Aug 4, 2009 1:54:00 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.IllegalArgumentException: Illegal lattitude value 
 93.1558669413734
 at 
 org.apache.lucene.spatial.geometry.FloatLatLng.init(FloatLatLng.java:26)
 at 
 org.apache.lucene.spatial.geometry.shape.LLRect.createBox(LLRect.java:93)
 at 
 org.apache.lucene.spatial.tier.DistanceUtils.getBoundary(DistanceUtils.java:50)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoxShape(CartesianPolyFilterBuilder.java:47)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoundingArea(CartesianPolyFilterBuilder.java:109)
 at 
 org.apache.lucene.spatial.tier.DistanceQueryBuilder.init(DistanceQueryBuilder.java:61)
 at 
 com.pjaol.search.solr.component.LocalSolrQueryComponent.prepare(LocalSolrQueryComponent.java:151)
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
 at 
 

[jira] Updated: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-08-07 Thread Bill Bell (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bell updated LUCENE-1781:
--

Attachment: (was: LLRect.java)

 Large distances in Spatial go beyond Prime MEridian
 ---

 Key: LUCENE-1781
 URL: https://issues.apache.org/jira/browse/LUCENE-1781
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Affects Versions: 2.9
 Environment: All
Reporter: Bill Bell
Assignee: Michael McCandless
 Fix For: 3.1

 Attachments: LLRect.java, LLRect.java, LUCENE-1781.patch


 http://amidev.kaango.com/solr/core0/select?fl=*json.nl=mapwt=jsonradius=5000rows=20lat=39.5500507q=hondaqt=geolong=-105.7820674
 Get an error when using Solr when distance is calculated for the boundary box 
 past 90 degrees.
 Aug 4, 2009 1:54:00 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.IllegalArgumentException: Illegal lattitude value 
 93.1558669413734
 at 
 org.apache.lucene.spatial.geometry.FloatLatLng.init(FloatLatLng.java:26)
 at 
 org.apache.lucene.spatial.geometry.shape.LLRect.createBox(LLRect.java:93)
 at 
 org.apache.lucene.spatial.tier.DistanceUtils.getBoundary(DistanceUtils.java:50)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoxShape(CartesianPolyFilterBuilder.java:47)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoundingArea(CartesianPolyFilterBuilder.java:109)
 at 
 org.apache.lucene.spatial.tier.DistanceQueryBuilder.init(DistanceQueryBuilder.java:61)
 at 
 com.pjaol.search.solr.component.LocalSolrQueryComponent.prepare(LocalSolrQueryComponent.java:151)
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
 at 
 org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
 at 
 org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
 at 
 org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
 at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-08-07 Thread Bill Bell (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bell updated LUCENE-1781:
--

Attachment: LLRect.java

Added flipping for > 90 degrees if needed. See comment.

 Large distances in Spatial go beyond Prime MEridian
 ---

 Key: LUCENE-1781
 URL: https://issues.apache.org/jira/browse/LUCENE-1781
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Affects Versions: 2.9
 Environment: All
Reporter: Bill Bell
Assignee: Michael McCandless
 Fix For: 3.1

 Attachments: LLRect.java, LLRect.java, LUCENE-1781.patch


 http://amidev.kaango.com/solr/core0/select?fl=*json.nl=mapwt=jsonradius=5000rows=20lat=39.5500507q=hondaqt=geolong=-105.7820674
 Get an error when using Solr when distance is calculated for the boundary box 
 past 90 degrees.
 Aug 4, 2009 1:54:00 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.IllegalArgumentException: Illegal lattitude value 
 93.1558669413734
 at 
 org.apache.lucene.spatial.geometry.FloatLatLng.init(FloatLatLng.java:26)
 at 
 org.apache.lucene.spatial.geometry.shape.LLRect.createBox(LLRect.java:93)
 at 
 org.apache.lucene.spatial.tier.DistanceUtils.getBoundary(DistanceUtils.java:50)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoxShape(CartesianPolyFilterBuilder.java:47)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoundingArea(CartesianPolyFilterBuilder.java:109)
 at 
 org.apache.lucene.spatial.tier.DistanceQueryBuilder.init(DistanceQueryBuilder.java:61)
 at 
 com.pjaol.search.solr.component.LocalSolrQueryComponent.prepare(LocalSolrQueryComponent.java:151)
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
 at 
 org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
 at 
 org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
 at 
 org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
 at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-08-07 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740798#action_12740798
 ] 

Bill Bell edited comment on LUCENE-1781 at 8/7/09 5:48 PM:
---

Everything is working except when you use a large area like 1 miles. I get 
no results at this distance when crossing the anti-meridian (180 degrees).

Most of the time this is fine, but specifically when -181 becomes 178 there 
appears to be an issue somewhere else in the code and nothing is returned. I 
believe this code is good, the issue is somewhere else. Maybe lower left is no 
longer lower left, and upper right is no longer upper right? The box is 
probably too big for the other algorithms. Not sure what else to check. How it 
is being used? Regardless this section appears right.

Start here: ctr 39.3209801,-111.0937311
Distance: 7200

boxCorners: before norm 22.100623434197477,21.15746490712925
boxCorners: normLng 22.100623434197477,21.15746490712925
boxCorners: distance: d 7200.0
boxCorners: ctr 39.3209801,-111.0937311
boxCorners: normLat 22.100623434197477,21.15746490712925
boxCorners: before norm -43.22565169384456,-181.34791600031286   note -181
boxCorners: normLng -43.22565169384456,178.65208399968714 Note 178
boxCorners: distance: d 7200.0
boxCorners: ctr 39.3209801,-111.0937311
boxCorners: normLat -43.22565169384456,178.65208399968714
corner 1054.4155877284288

I do get results from Hawaii crossing this at 10,000 miles. This works:

boxCorners: before norm 6.201324582593365,-0.012709669713800501
boxCorners: normLng 6.201324582593365,-0.012709669713800501
boxCorners: distance: d 1.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat 6.201324582593365,-0.012709669713800501
boxCorners: before norm -41.508634930577436,-302.4840293070323 note -302
boxCorners: normLng -41.508634930577436,57.5159706929677 note 57
boxCorners: distance: d 1.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat -41.508634930577436,57.5159706929677
corner 1464.4660940672625






  was (Author: billnbell):
Everything is working except when you use a large area like 1 miles. I 
get no results at this distance when crossing the anti-meridian (180 degrees).

Most of the time this is fine, but specifically when -181 becomes 178 there 
appears to be an issue somewhere else in the code and nothing is returned. I 
believe this code is good, the issue is somewhere else. Maybe lower left is no 
longer lower left, and upper right is no longer upper right? The box is 
probably too big for the other algorithms. Not sure what else to check. How it 
is being used? Regardless this section appears right.

Start here: ctr 39.3209801,-111.0937311
Distance: 7200

boxCorners: before norm 22.100623434197477,21.15746490712925
boxCorners: normLng 22.100623434197477,21.15746490712925
boxCorners: distance: d 7200.0
boxCorners: ctr 39.3209801,-111.0937311
boxCorners: normLat 22.100623434197477,21.15746490712925
boxCorners: before norm -43.22565169384456,-181.34791600031286  -- note -181
boxCorners: normLng -43.22565169384456,178.65208399968714 -- Note 178
boxCorners: distance: d 7200.0
boxCorners: ctr 39.3209801,-111.0937311
boxCorners: normLat -43.22565169384456,178.65208399968714
corner 1054.4155877284288

I do get results from Hawaii crossing this at 10,000 miles.

boxCorners: before norm 6.201324582593365,-0.012709669713800501
boxCorners: normLng 6.201324582593365,-0.012709669713800501
boxCorners: distance: d 1.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat 6.201324582593365,-0.012709669713800501
boxCorners: before norm -41.508634930577436,-302.4840293070323 -- note -302
boxCorners: normLng -41.508634930577436,57.5159706929677 -- note 57
boxCorners: distance: d 1.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat -41.508634930577436,57.5159706929677
corner 1464.4660940672625





  
 Large distances in Spatial go beyond Prime MEridian
 ---

 Key: LUCENE-1781
 URL: https://issues.apache.org/jira/browse/LUCENE-1781
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Affects Versions: 2.9
 Environment: All
Reporter: Bill Bell
Assignee: Michael McCandless
 Fix For: 3.1

 Attachments: LLRect.java, LLRect.java, LUCENE-1781.patch


 http://amidev.kaango.com/solr/core0/select?fl=*json.nl=mapwt=jsonradius=5000rows=20lat=39.5500507q=hondaqt=geolong=-105.7820674
 Get an error when using Solr when distance is calculated for the boundary box 
 past 90 degrees.
 Aug 4, 2009 1:54:00 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.IllegalArgumentException: Illegal lattitude value 
 93.1558669413734
 at 
 org.apache.lucene.spatial.geometry.FloatLatLng.init(FloatLatLng.java:26)
 at 
 

[jira] Issue Comment Edited: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-08-07 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740798#action_12740798
 ] 

Bill Bell edited comment on LUCENE-1781 at 8/7/09 5:59 PM:
---

Everything is working except when you use a large area like 1 miles. I get 
no results at this distance when crossing the anti-meridian (180 degrees).

Most of the time this is fine, but specifically when -181 becomes 178 there 
appears to be an issue somewhere else in the code and nothing is returned. I 
believe this code is good, the issue is somewhere else. Maybe lower left is no 
longer lower left, and upper right is no longer upper right? The box is 
probably too big for the other algorithms. Not sure what else to check. How it 
is being used? Regardless this section appears right.

Start here: ctr 39.3209801,-111.0937311
Distance: 7200

boxCorners: before norm 22.100623434197477,21.15746490712925
boxCorners: normLng 22.100623434197477,21.15746490712925
boxCorners: distance: d 7200.0
boxCorners: ctr 39.3209801,-111.0937311
boxCorners: normLat 22.100623434197477,21.15746490712925
boxCorners: before norm -43.22565169384456,-181.34791600031286   note -181
boxCorners: normLng -43.22565169384456,178.65208399968714 Note 178
boxCorners: distance: d 7200.0
boxCorners: ctr 39.3209801,-111.0937311
boxCorners: normLat -43.22565169384456,178.65208399968714
corner 1054.4155877284288

I do get results from Hawaii crossing this at 10,000 miles. This works:

boxCorners: before norm 6.201324582593365,-0.012709669713800501
boxCorners: normLng 6.201324582593365,-0.012709669713800501
boxCorners: distance: d 1.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat 6.201324582593365,-0.012709669713800501
boxCorners: before norm -41.508634930577436,-302.4840293070323 note -302
boxCorners: normLng -41.508634930577436,57.5159706929677 note 57
boxCorners: distance: d 1.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat -41.508634930577436,57.5159706929677
corner 1464.4660940672625

Note: This does not get any results. Note the 4.815339955430126 difference. 
Very weird.

boxCorners: distance: d 10500.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat 0.8114618951495843,4.815339955430126
boxCorners: before norm -37.88735182208723,-310.6222696081052
boxCorners: normLng -37.88735182208723,49.37773039189477
boxCorners: distance: d 10500.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat -37.88735182208723,49.37773039189477
corner 1537.6893987706253






  was (Author: billnbell):
Everything is working except when you use a large area like 1 miles. I 
get no results at this distance when crossing the anti-meridian (180 degrees).

Most of the time this is fine, but specifically when -181 becomes 178 there 
appears to be an issue somewhere else in the code and nothing is returned. I 
believe this code is good, the issue is somewhere else. Maybe lower left is no 
longer lower left, and upper right is no longer upper right? The box is 
probably too big for the other algorithms. Not sure what else to check. How it 
is being used? Regardless this section appears right.

Start here: ctr 39.3209801,-111.0937311
Distance: 7200

boxCorners: before norm 22.100623434197477,21.15746490712925
boxCorners: normLng 22.100623434197477,21.15746490712925
boxCorners: distance: d 7200.0
boxCorners: ctr 39.3209801,-111.0937311
boxCorners: normLat 22.100623434197477,21.15746490712925
boxCorners: before norm -43.22565169384456,-181.34791600031286   note -181
boxCorners: normLng -43.22565169384456,178.65208399968714 Note 178
boxCorners: distance: d 7200.0
boxCorners: ctr 39.3209801,-111.0937311
boxCorners: normLat -43.22565169384456,178.65208399968714
corner 1054.4155877284288

I do get results from Hawaii crossing this at 10,000 miles. This works:

boxCorners: before norm 6.201324582593365,-0.012709669713800501
boxCorners: normLng 6.201324582593365,-0.012709669713800501
boxCorners: distance: d 1.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat 6.201324582593365,-0.012709669713800501
boxCorners: before norm -41.508634930577436,-302.4840293070323 note -302
boxCorners: normLng -41.508634930577436,57.5159706929677 note 57
boxCorners: distance: d 1.0
boxCorners: ctr 19.8986819,-155.6658568
boxCorners: normLat -41.508634930577436,57.5159706929677
corner 1464.4660940672625





  
 Large distances in Spatial go beyond Prime MEridian
 ---

 Key: LUCENE-1781
 URL: https://issues.apache.org/jira/browse/LUCENE-1781
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Affects Versions: 2.9
 Environment: All
Reporter: Bill Bell
Assignee: Michael McCandless
 Fix For: 3.1

 Attachments: LLRect.java, LLRect.java, 

[jira] Commented: (LUCENE-1781) Large distances in Spatial go beyond Prime MEridian

2009-08-07 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740831#action_12740831
 ] 

Bill Bell commented on LUCENE-1781:
---

I did some additional research. The current Spatial contrib ONLY works for one 
hemisphere at a time. It does a simple min/max on lat/long measurements, which 
makes the whole solution not useful for boxes that span from one hemisphere to 
another. Specifically, Rectangle.java, getBoundary, etc. need to work on a 
circle. The first step is to build a rectangle that is valid when lat goes from 
-90 to +89 and long goes from -180 to +179, etc.

 new Rectangle(ll.getLng(), ll.getLat(), ur.getLng(), ur.getLat())

At least LLRect appears correct now... Next step is to fix the 
CartesianPolyFilterBuilder.
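
(For illustration only: a hedged sketch of one way a bounding box could span
hemispheres; if the normalized west edge ends up numerically greater than the
east edge, the box is split in two at the anti-meridian. Plain Java, not a
patch for CartesianPolyFilterBuilder.)

{code}
// If the box wraps the anti-meridian (west edge greater than the east edge
// after normalization), query it as two boxes instead of one.
// Each double[] is {south, west, north, east} in degrees.
public final class AntiMeridianSplitSketch {
  public static double[][] split(double south, double west,
                                 double north, double east) {
    if (west <= east) {
      return new double[][] { { south, west, north, east } };  // no wrap
    }
    return new double[][] {
      { south, west, north, 180.0 },   // from the west edge up to +180
      { south, -180.0, north, east },  // and from -180 on to the east edge
    };
  }
}
{code}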




 Large distances in Spatial go beyond Prime MEridian
 ---

 Key: LUCENE-1781
 URL: https://issues.apache.org/jira/browse/LUCENE-1781
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Affects Versions: 2.9
 Environment: All
Reporter: Bill Bell
Assignee: Michael McCandless
 Fix For: 3.1

 Attachments: LLRect.java, LLRect.java, LUCENE-1781.patch


 http://amidev.kaango.com/solr/core0/select?fl=*json.nl=mapwt=jsonradius=5000rows=20lat=39.5500507q=hondaqt=geolong=-105.7820674
 Get an error when using Solr when distance is calculated for the boundary box 
 past 90 degrees.
 Aug 4, 2009 1:54:00 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.IllegalArgumentException: Illegal lattitude value 
 93.1558669413734
 at 
 org.apache.lucene.spatial.geometry.FloatLatLng.init(FloatLatLng.java:26)
 at 
 org.apache.lucene.spatial.geometry.shape.LLRect.createBox(LLRect.java:93)
 at 
 org.apache.lucene.spatial.tier.DistanceUtils.getBoundary(DistanceUtils.java:50)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoxShape(CartesianPolyFilterBuilder.java:47)
 at 
 org.apache.lucene.spatial.tier.CartesianPolyFilterBuilder.getBoundingArea(CartesianPolyFilterBuilder.java:109)
 at 
 org.apache.lucene.spatial.tier.DistanceQueryBuilder.init(DistanceQueryBuilder.java:61)
 at 
 com.pjaol.search.solr.component.LocalSolrQueryComponent.prepare(LocalSolrQueryComponent.java:151)
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
 at 
 org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
 at 
 org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
 at 
 org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
 at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org