date:20120923

David Smiley created LUCENE-4418:


 Summary: Improve RecursivePrefixTreeFilter's performance heuristic 
tunables
 Key: LUCENE-4418
 URL: https://issues.apache.org/jira/browse/LUCENE-4418
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor


RecursivePrefixTreeFilter recursively decomposes grid cells until it gets to a 
threshold grid level (e.g. 4 away from max levels), at which point it does a 
brute force scan because scanning is faster once the number of terms is small.  So 
if max levels is 10 and the threshold is 4, it will switch to scanning 
at level 6.  Ideally, the filter would know exactly how many terms there are in that 
grid cell -- i.e. given a hi & lo term, determine how many indexed terms are 
in between without actually iterating to find out.  

Instead, it could use the # docs that a grid cell has as a heuristic.  It's not 
perfect, because many documents could refer to the same indexed point, or a few 
documents with multi-valued data could refer to many indexed points.  But I 
think it's much better because it adapts to the density of the actual indexed 
data.
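
To make the arithmetic concrete, here is a minimal Java sketch of the two rules: the fixed level cutoff and the doc-count cutoff. The class, method, and parameter names (ScanHeuristicSketch, levelsFromMax, docCountThreshold) are hypothetical and are not taken from RecursivePrefixTreeFilter. The doc-count rule and its threshold are assumptions, since no default has been chosen yet.

```java
// Hypothetical sketch; none of these names come from RecursivePrefixTreeFilter.
final class ScanHeuristicSketch {

    /** Fixed rule: the grid level at which recursion gives way to a brute force scan. */
    static int scanLevel(int maxLevels, int levelsFromMax) {
        return Math.max(0, maxLevels - levelsFromMax);
    }

    /** Fixed rule applied to one cell: scan once the cell is at or past the cutoff level. */
    static boolean scanByLevel(int cellLevel, int maxLevels, int levelsFromMax) {
        return cellLevel >= scanLevel(maxLevels, levelsFromMax);
    }

    /** Assumed dynamic rule: scan when the cell holds no more docs than the threshold. */
    static boolean scanByDocCount(int docsInCell, int docCountThreshold) {
        return docsInCell <= docCountThreshold;
    }
}
```

With maxLevels=10 and levelsFromMax=4, scanLevel returns 6, which matches the example in the description.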

Before I do this, I need to re-invigorate my testing efforts so I can come up 
with a default threshold.  The right value also depends on things like query 
shape complexity. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-2255) local params are not parsed in facet.pivot parameter


[ 
https://issues.apache.org/jira/browse/SOLR-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461469#comment-13461469
 ] 

David Smiley commented on SOLR-2255:


Thanks for your examination Yonik.  I plan to commit this to the 4x branch 
Monday, and it should eventually show up in v4.1.  For the changes.txt entry 
I'll say this:

SOLR-2255: Enhanced pivot faceting to use local-params in the same way that 
regular field value faceting can.  This means support for excluding a filter 
query, using a different output key, and specifying 'threads' to do 
facet.method=fcs concurrently.  PivotFacetHelper now extends SimpleFacets and 
the getFacetImplementation() extension hook was removed.  (dsmiley)

 local params are not parsed in facet.pivot parameter
 

 Key: SOLR-2255
 URL: https://issues.apache.org/jira/browse/SOLR-2255
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.0-ALPHA
Reporter: Julien Lirochon
Assignee: David Smiley
 Attachments: SOLR-2255_local-param_support_for_pivot_faceting.patch, 
 SOLR-2255_local-param_support_for_pivot_faceting.patch


 ...facet=true&facet.pivot={!ex=category}category_id,subcategory_id&fq={!tag=category}category_id=42
 generates the following error : undefined field {!ex=category}category_id
 If you filter on subcategory_id, the facet.pivot result will contain only 
 results from this subcategory. It's a loss of function since you can't alter 
 this behavior with local params.
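
For illustration, a request of the kind described here could be assembled as in the Java sketch below. It tags the filter query with {!tag=...} and excludes that tag from the pivot facet with {!ex=...}. The class and method names are hypothetical, the q=*:* parameter is an assumption, and the filter is written in field:value form (category_id:42) rather than the category_id=42 shown in the report.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; not part of Solr or SolrJ.
final class PivotLocalParamsSketch {

    /** URL-encodes one parameter value as UTF-8. */
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    /** Builds a query string whose pivot facet excludes the filter tagged with the given name. */
    static String pivotQuery(String tag, String pivotFields, String filter) {
        return "q=" + enc("*:*")
            + "&facet=true"
            + "&facet.pivot=" + enc("{!ex=" + tag + "}" + pivotFields)
            + "&fq=" + enc("{!tag=" + tag + "}" + filter);
    }
}
```

Calling pivotQuery("category", "category_id,subcategory_id", "category_id:42") yields a query string in which the braces, "!", "=", "," and ":" inside the values are percent-encoded, so the local params survive transport intact.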


[jira] [Commented] (SOLR-2255) local params are not parsed in facet.pivot parameter

2012-09-23 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461493#comment-13461493
 ] 

Yonik Seeley commented on SOLR-2255:


The patch had minimal changes to SimpleFacets (making private protected and one 
variable name change).  I'd be comfortable with this being committed to the 4.0 
branch also.


Re: svn commit: r1389091 - /lucene/dev/trunk/lucene/misc/src/java/org/apache/lucene/util/fst/ListOfOutputs.java

2012-09-23 Thread Michael McCandless

Woops, thanks!

Mike McCandless

http://blog.mikemccandless.com


On Sun, Sep 23, 2012 at 12:45 PM,  rm...@apache.org wrote:
 Author: rmuir
 Date: Sun Sep 23 16:45:53 2012
 New Revision: 1389091

 URL: http://svn.apache.org/viewvc?rev=1389091&view=rev
 Log:
 clear javadocs warning

 Modified:
 
 lucene/dev/trunk/lucene/misc/src/java/org/apache/lucene/util/fst/ListOfOutputs.java

 Modified: 
 lucene/dev/trunk/lucene/misc/src/java/org/apache/lucene/util/fst/ListOfOutputs.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/misc/src/java/org/apache/lucene/util/fst/ListOfOutputs.java?rev=1389091&r1=1389090&r2=1389091&view=diff
 ==
 --- 
 lucene/dev/trunk/lucene/misc/src/java/org/apache/lucene/util/fst/ListOfOutputs.java
  (original)
 +++ 
 lucene/dev/trunk/lucene/misc/src/java/org/apache/lucene/util/fst/ListOfOutputs.java
  Sun Sep 23 16:45:53 2012
 @@ -23,6 +23,7 @@ import java.util.List;

  import org.apache.lucene.store.DataInput;
  import org.apache.lucene.store.DataOutput;
 +import org.apache.lucene.util.IntsRef; // javadocs

  /**
   * Wraps another Outputs implementation and encodes one or




Re: svn commit: r1389091 - /lucene/dev/trunk/lucene/misc/src/java/org/apache/lucene/util/fst/ListOfOutputs.java

2012-09-23 Thread Robert Muir

Not really your problem... I can only catch these things with the
eclipse validator at the moment. I gotta get LUCENE-4409 up and going
so javadocs-lint will too.

-- 
lucidworks.com


[jira] [Created] (LUCENE-4419) Test RecursivePrefixTree indexing non-point data

David Smiley created LUCENE-4419:


 Summary: Test RecursivePrefixTree indexing non-point data
 Key: LUCENE-4419
 URL: https://issues.apache.org/jira/browse/LUCENE-4419
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley


RecursivePrefixTreeFilter was modified in ~July 2011 to support spatial 
filtering of non-point indexed shapes.  It seems to work when playing with the 
capability but it isn't tested.  It really needs to be as this is a major 
feature.

I imagine an approach in which some randomly generated rectangles are indexed 
and then a randomly generated rectangle is queried.  The right answer can be 
calculated brute-force and then compared with the filter.  In order to deal 
with shape imprecision, the randomly generated shapes could be generated to fit 
a coarse grid (e.g. round everything to a 1 degree interval).
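
A rough Java sketch of the brute-force side of such a test follows. Everything in it is hypothetical: the Rect class stands in for whatever shape type the test would use, coordinates are whole degrees to model the coarse grid, and dateline wrapping is ignored. The comparison against the actual filter is left out because it depends on the test harness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the brute-force oracle; not Lucene spatial code.
final class RandomRectSketch {

    /** Axis-aligned rectangle in whole degrees; does not handle dateline wrapping. */
    static final class Rect {
        final int minX, maxX, minY, maxY;

        Rect(int minX, int maxX, int minY, int maxY) {
            this.minX = minX;
            this.maxX = maxX;
            this.minY = minY;
            this.maxY = maxY;
        }

        /** True if the two rectangles overlap or touch. */
        boolean intersects(Rect o) {
            return minX <= o.maxX && o.minX <= maxX && minY <= o.maxY && o.minY <= maxY;
        }
    }

    /** Random rectangle snapped to a 1 degree grid within [-180,180] x [-90,90]. */
    static Rect randomRect(Random r) {
        int x1 = r.nextInt(361) - 180, x2 = r.nextInt(361) - 180;
        int y1 = r.nextInt(181) - 90, y2 = r.nextInt(181) - 90;
        return new Rect(Math.min(x1, x2), Math.max(x1, x2), Math.min(y1, y2), Math.max(y1, y2));
    }

    /** Indexes (positions in the list) of all rectangles that intersect the query. */
    static List<Integer> bruteForce(List<Rect> indexed, Rect query) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < indexed.size(); i++) {
            if (indexed.get(i).intersects(query)) {
                hits.add(i);
            }
        }
        return hits;
    }
}
```

A test would generate the indexed rectangles and the query from a seeded Random, index them, run the filter, and assert that the filter's matches equal the list returned by bruteForce.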


[jira] [Commented] (SOLR-2255) local params are not parsed in facet.pivot parameter


[ 
https://issues.apache.org/jira/browse/SOLR-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461512#comment-13461512
 ] 

David Smiley commented on SOLR-2255:


Yonik, so are you proposing the whole thing be committed to 4.0 or just the 
SimpleFacets changes?

If the whole patch doesn't make it into 4.0, I propose that a deprecation 
warning be added to PivotFacetHelper.getFacetImplementation() in 4.0.


Re: need best solution for indexing and searching multiple, related database tables

2012-09-23 Thread Biff Baxter

So far no responses.  I did search the existing posts and found some related
topics but nothing as specific as I was looking for.

Biff



--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-best-solution-for-indexing-and-searching-multiple-related-database-tables-tp4009676p4009733.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


[jira] [Created] (SOLR-3869) A PeerSync attempt to its replicas by a candidate leader should not fail on o.a.http.conn.ConnectTimeoutException

Mark Miller created SOLR-3869:
-

 Summary: A PeerSync attempt to its replicas by a candidate leader 
should not fail on o.a.http.conn.ConnectTimeoutException
 Key: SOLR-3869
 URL: https://issues.apache.org/jira/browse/SOLR-3869
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Blocker
 Fix For: 4.0, 5.0


I'd like to fix this for 4 - it's a simple fix.


[jira] [Resolved] (SOLR-3861) regression of SOLR-2008 - updateHandler should be closed before searcherExecutor


 [ 
https://issues.apache.org/jira/browse/SOLR-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-3861.
---

Resolution: Won't Fix

Hossman - closing this as I don't think it requires a change. If you disagree, 
please reopen.

 regression of SOLR-2008 - updateHandler should be closed before 
 searcherExecutor
 ---

 Key: SOLR-3861
 URL: https://issues.apache.org/jira/browse/SOLR-3861
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0-ALPHA, 4.0-BETA
Reporter: Hoss Man
Assignee: Mark Miller
Priority: Blocker
 Fix For: 4.0, 5.0


 SOLR-2008 fixed a possible RejectedExecutionException by ensuring that 
 SolrCore closed the updateHandler before the searcherExecutor.
 [~markrmil...@gmail.com] re-flipped this logic in r1159378, which is 
 annotated as fixing both SOLR-2654 and SOLR-2654 (dup typo i guess) but it's 
 not clear why - pretty sure this means that the risk of a Rejected exception 
 is back in 4.0-BETA...
 https://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/core/SolrCore.java?r1=1146905&r2=1159378
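
The general mechanism behind the concern can be shown with plain java.util.concurrent: a task submitted to an executor that has already been shut down is rejected. The sketch below is only an illustration of that mechanism. The class name is hypothetical, the variable name merely echoes the issue text, and it says nothing about whether SolrCore's actual close sequence can hit this case.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical demo; not SolrCore code.
final class CloseOrderSketch {

    /** Returns true if a task submitted after shutdown() is rejected. */
    static boolean submitAfterShutdownIsRejected() {
        ExecutorService searcherExecutor = Executors.newSingleThreadExecutor();
        searcherExecutor.shutdown(); // executor closed first
        try {
            // A component that is still open tries to hand the executor more work.
            searcherExecutor.submit(() -> { });
            return false;
        } catch (RejectedExecutionException expected) {
            return true;
        }
    }
}
```

Closing the component that submits work before shutting down the executor avoids this rejection, which is the ordering the issue title asks for.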


[jira] [Created] (SOLR-3870) SyncStrategy should have a close so it can abort earlier on shutdown.

Mark Miller created SOLR-3870:
-

 Summary: SyncStrategy should have a close so it can abort earlier 
on shutdown.
 Key: SOLR-3870
 URL: https://issues.apache.org/jira/browse/SOLR-3870
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
 Environment: most useful for tests
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.1, 5.0





[JENKINS] Lucene-Solr-4.x-Windows ([[ Exception while replacing ENV. Please report this as a bug. ]]

2012-09-23 Thread Policeman Jenkins Server

{{ java.lang.NullPointerException }})
 - Build # 896 - Failure!

Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/896/
Java: [[ Exception while replacing ENV. Please report this as a bug. ]]
{{ java.lang.NullPointerException }}

No tests ran.

Build Log:
[...truncated 233 lines...]
FATAL: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
    at hudson.remoting.Request.call(Request.java:174)
    at hudson.remoting.Channel.call(Channel.java:664)
    at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
    at $Proxy71.join(Unknown Source)
    at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:861)
    at hudson.Launcher$ProcStarter.join(Launcher.java:345)
    at hudson.tasks.Ant.perform(Ant.java:217)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:717)
    at hudson.model.Build$BuildExecution.build(Build.java:199)
    at hudson.model.Build$BuildExecution.doRun(Build.java:160)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
    at hudson.model.Run.execute(Run.java:1502)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:236)
Caused by: hudson.remoting.RequestAbortedException: java.net.SocketException: Connection reset
    at hudson.remoting.Request.abort(Request.java:299)
    at hudson.remoting.Channel.terminate(Channel.java:724)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:168)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2248)
    at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2541)
    at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2551)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1296)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at hudson.remoting.Command.readFrom(Command.java:90)
    at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)



RE: [JENKINS] Lucene-Solr-4.x-Windows ([[ Exception while replacing ENV. Please report this as a bug. ]]

2012-09-23 Thread Uwe Schindler

Sorry, my fault! Updates of Windows...

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


[jira] [Resolved] (SOLR-3869) A PeerSync attempt to its replicas by a candidate leader should not fail on o.a.http.conn.ConnectTimeoutException


 [ 
https://issues.apache.org/jira/browse/SOLR-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-3869.
---

Resolution: Fixed

Committed.


[jira] [Commented] (SOLR-3869) A PeerSync attempt to its replicas by a candidate leader should not fail on o.a.http.conn.ConnectTimeoutException


[ 
https://issues.apache.org/jira/browse/SOLR-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461537#comment-13461537
 ] 

Mark Miller commented on SOLR-3869:
---

{code}Modified: 
lucene/dev/branches/lucene_solr_4_0/solr/core/src/java/org/apache/solr/update/PeerSync.java
URL: 
http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_4_0/solr/core/src/java/org/apache/solr/update/PeerSync.java?rev=1389162&r1=1389161&r2=1389162&view=diff
==
--- 
lucene/dev/branches/lucene_solr_4_0/solr/core/src/java/org/apache/solr/update/PeerSync.java
 (original)
+++ 
lucene/dev/branches/lucene_solr_4_0/solr/core/src/java/org/apache/solr/update/PeerSync.java
 Sun Sep 23 23:14:14 2012
@@ -28,6 +28,7 @@ import java.util.Set;

 import org.apache.http.NoHttpResponseException;
 import org.apache.http.client.HttpClient;
+import org.apache.http.conn.ConnectTimeoutException;
 import org.apache.lucene.util.BytesRef;
 import org.apache.solr.client.solrj.SolrServerException;
 import org.apache.solr.client.solrj.impl.HttpClientUtil;
@@ -299,7 +300,7 @@ public class PeerSync  {
    if (cantReachIsSuccess && sreq.purpose == 1 && srsp.getException() instanceof SolrServerException) {
      Throwable solrException = ((SolrServerException) srsp.getException()).getRootCause();
-     if (solrException instanceof ConnectException
+     if (solrException instanceof ConnectException || solrException instanceof ConnectTimeoutException
          || solrException instanceof NoHttpResponseException) {
        log.warn(msg() + " couldn't connect to " + srsp.getShardAddress() + ", counting as success");{code}


[JENKINS] Lucene-Solr-4.x-Windows (64bit/jdk1.6.0_35) - Build # 897 - Still Failing!

2012-09-23 Thread Policeman Jenkins Server

Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/897/
Java: 64bit/jdk1.6.0_35 -XX:+UseSerialGC

No tests ran.

Build Log:
[...truncated 69 lines...]
BUILD FAILED
C:\Jenkins\workspace\Lucene-Solr-4.x-Windows\build.xml:32: The following error 
occurred while executing this line:
C:\Jenkins\workspace\Lucene-Solr-4.x-Windows\lucene\build.xml:49: The following 
error occurred while executing this line:
C:\Jenkins\workspace\Lucene-Solr-4.x-Windows\lucene\common-build.xml:336: Ivy 
is not available

Total time: 2 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Description set: Java: 64bit/jdk1.6.0_35 -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure




[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.7.0_07) - Build # 899 - Failure!

2012-09-23 Thread Policeman Jenkins Server

Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows/899/
Java: 32bit/jdk1.7.0_07 -client -XX:+UseParallelGC

No tests ran.

Build Log:
[...truncated 64 lines...]
BUILD FAILED
C:\Jenkins\workspace\Lucene-Solr-trunk-Windows\build.xml:32: The following 
error occurred while executing this line:
C:\Jenkins\workspace\Lucene-Solr-trunk-Windows\lucene\build.xml:49: The 
following error occurred while executing this line:
C:\Jenkins\workspace\Lucene-Solr-trunk-Windows\lucene\common-build.xml:336: Ivy 
is not available

Total time: 1 second
Build step 'Invoke Ant' marked build as failure
Recording test results
Description set: Java: 32bit/jdk1.7.0_07 -client -XX:+UseParallelGC
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Created] (SOLR-3871) SyncStrategy should use an executor for the threads it creates to request recoveries.

Mark Miller created SOLR-3871:
-

 Summary: SyncStrategy should use an executor for the threads it 
creates to request recoveries.
 Key: SOLR-3871
 URL: https://issues.apache.org/jira/browse/SOLR-3871
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
 Environment: This will improve our tests and shutdown since we can 
then shut down the executor and interrupt long running threads.
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.1, 5.0
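
As a sketch of why an executor helps here, the Java snippet below runs a long task on a pool and then calls shutdownNow(), which interrupts the running worker thread. It uses only java.util.concurrent. The class and variable names are hypothetical, the sleeping task is a stand-in for a slow recovery request, and nothing here is taken from SyncStrategy.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo; not SyncStrategy code.
final class RecoveryExecutorSketch {

    /** Returns true if shutdownNow() interrupted the long task and the pool terminated. */
    static boolean shutdownInterruptsLongTask() {
        ExecutorService recoveryExecutor = Executors.newCachedThreadPool();
        final CountDownLatch started = new CountDownLatch(1);
        final CountDownLatch interrupted = new CountDownLatch(1);

        recoveryExecutor.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000); // stand-in for a slow recovery request
            } catch (InterruptedException e) {
                interrupted.countDown(); // the task observed the interrupt
            }
        });

        try {
            started.await();
            recoveryExecutor.shutdownNow(); // interrupts running worker threads
            return interrupted.await(5, TimeUnit.SECONDS)
                && recoveryExecutor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            recoveryExecutor.shutdownNow();
            return false;
        }
    }
}
```

With bare threads there is no single handle to stop them all. With an executor, one shutdownNow() call on close reaches every in-flight task that responds to interruption.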





[jira] [Created] (SOLR-3872) When an update succeeds locally but fails on a replica, we ask that replica to recover - this should be done asynchronously.

Mark Miller created SOLR-3872:
-

 Summary: When an update succeeds locally but fails on a replica, 
we ask that replica to recover - this should be done asynchronously.
 Key: SOLR-3872
 URL: https://issues.apache.org/jira/browse/SOLR-3872
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.1, 5.0





[jira] [Commented] (SOLR-3871) SyncStrategy should use an executor for the threads it creates to request recoveries.


[ 
https://issues.apache.org/jira/browse/SOLR-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461543#comment-13461543
 ] 

Mark Miller commented on SOLR-3871:
---

This will depend on SOLR-3870 to close down the executor.


[jira] [Commented] (SOLR-3871) SyncStrategy should use an executor for the threads it creates to request recoveries.


[ 
https://issues.apache.org/jira/browse/SOLR-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461547#comment-13461547
 ] 

Mark Miller commented on SOLR-3871:
---

Due to the impact on jenkins tests, I'd actually like to put this straight to 
4.0 before the RC. I'd be more comfortable if 4.0 ran smoothly on freebsd 
jenkins. The changes themselves are fairly simple and easy to review. Patch in 
a moment.


[jira] [Commented] (LUCENE-4419) Test RecursivePrefixTree indexing non-point data

2012-09-23 Thread Chris Male (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461548#comment-13461548
]

Chris Male commented on LUCENE-4419:

I really don't see the benefit of randomly generating Shapes. There isn't much
to be revealed with a rectangle that, say, covers one small part of the Pacific
Ocean and another rectangle which covers another small part. The number of
possible Shapes is just too massive to ever reveal anything.

What I feel would be better is if we defined Shapes that test particularly
troublesome areas. Datelines, equators, poles. We can also include massive
Shapes and tiny Shapes, circles, points, and whatever else we end up supporting.

Having this standardized Shape suite would be a big benefit to testing all the
Strategies. I don't think it would be particularly difficult to create and once
created, it wouldn't require much maintenance at all.
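
One way such a suite could look, as a minimal sketch: TestRect and
TroublesomeShapes are hypothetical names, and a plain-Java rectangle in degrees
is assumed here in place of the spatial module's own Shape types. The entries
cover the areas listed above (dateline, equator, poles) plus a massive and a
tiny shape.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical rectangle in degrees; minX > maxX is taken to mean "crosses the dateline".
final class TestRect {
  final String name;
  final double minX, maxX, minY, maxY;

  TestRect(String name, double minX, double maxX, double minY, double maxY) {
    this.name = name;
    this.minX = minX;
    this.maxX = maxX;
    this.minY = minY;
    this.maxY = maxY;
  }

  boolean crossesDateline() {
    return minX > maxX;
  }
}

// A fixed, named set of shapes aimed at known trouble spots.
final class TroublesomeShapes {
  static final List<TestRect> SUITE = Collections.unmodifiableList(Arrays.asList(
      new TestRect("dateline-crossing", 170, -170, -10, 10),
      new TestRect("equator-straddling", -5, 5, -1, 1),
      new TestRect("north-pole-cap", -180, 180, 80, 90),
      new TestRect("south-pole-cap", -180, 180, -90, -80),
      new TestRect("whole-world", -180, 180, -90, 90),
      new TestRect("tiny", 10.0, 10.0001, 20.0, 20.0001),
      new TestRect("point", 45, 45, 45, 45)));
}
```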

Test RecursivePrefixTree indexing non-point data

Key: LUCENE-4419
URL: https://issues.apache.org/jira/browse/LUCENE-4419
Project: Lucene - Core
Issue Type: Improvement
Components: modules/spatial
Reporter: David Smiley

RecursivePrefixTreeFilter was modified in ~July 2011 to support spatial
filtering of non-point indexed shapes. It seems to work when playing with
the capability but it isn't tested. It really needs to be as this is a major
feature.
I imagine an approach in which some randomly generated rectangles are indexed
and then a randomly generated rectangle is queried. The right answer can be
calculated brute-force and then compared with the filter. In order to deal
with shape imprecision, the randomly generated shapes could be generated to
fit a coarse grid (e.g. round everything to a 1 degree interval).
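
A minimal sketch of that approach, under stated assumptions: GridRect and
BruteForceOracle are hypothetical helpers, rectangles are snapped to whole
degrees, and dateline wrapping is ignored. The brute-force result would then be
compared with whatever the filter returns for the same query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical rectangle whose corners lie on a 1-degree grid (no dateline wrap).
final class GridRect {
  final int minX, maxX, minY, maxY;

  GridRect(int minX, int maxX, int minY, int maxY) {
    this.minX = minX;
    this.maxX = maxX;
    this.minY = minY;
    this.maxY = maxY;
  }

  // Random rectangle with longitude in [-180, 180] and latitude in [-90, 90].
  static GridRect random(Random r) {
    int x1 = r.nextInt(361) - 180, x2 = r.nextInt(361) - 180;
    int y1 = r.nextInt(181) - 90, y2 = r.nextInt(181) - 90;
    return new GridRect(Math.min(x1, x2), Math.max(x1, x2),
                        Math.min(y1, y2), Math.max(y1, y2));
  }

  // Closed-interval overlap test: rectangles that merely touch count as intersecting.
  boolean intersects(GridRect o) {
    return minX <= o.maxX && o.minX <= maxX && minY <= o.maxY && o.minY <= maxY;
  }
}

final class BruteForceOracle {
  // Returns the positions of all indexed rectangles that intersect the query.
  static List<Integer> expectedHits(List<GridRect> indexed, GridRect query) {
    List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < indexed.size(); i++) {
      if (indexed.get(i).intersects(query)) {
        hits.add(i);
      }
    }
    return hits;
  }
}
```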


[jira] [Updated] (SOLR-3870) SyncStrategy should have a close so it can abort earlier on shutdown.


 [ 
https://issues.apache.org/jira/browse/SOLR-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-3870:
--

Fix Version/s: (was: 4.1)
   4.0

 SyncStrategy should have a close so it can abort earlier on shutdown.
 -

 Key: SOLR-3870
 URL: https://issues.apache.org/jira/browse/SOLR-3870
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
 Environment: most useful for tests
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.0, 5.0





[jira] [Updated] (SOLR-3871) SyncStrategy should use an executor for the threads it creates to request recoveries.


 [ 
https://issues.apache.org/jira/browse/SOLR-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-3871:
--

Fix Version/s: (was: 4.1)
   4.0

 SyncStrategy should use an executor for the threads it creates to request 
 recoveries.
 -

 Key: SOLR-3871
 URL: https://issues.apache.org/jira/browse/SOLR-3871
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
 Environment: This will improve our tests and shutdown since we can 
 then shut down the executor and interrupt long running threads.
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.0, 5.0





[jira] [Updated] (SOLR-3871) SyncStrategy should use an executor for the threads it creates to request recoveries.


 [ 
https://issues.apache.org/jira/browse/SOLR-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-3871:
--

Attachment: SOLR-3871SOLR-3870.patch

 SyncStrategy should use an executor for the threads it creates to request 
 recoveries.
 -

 Key: SOLR-3871
 URL: https://issues.apache.org/jira/browse/SOLR-3871
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
 Environment: This will improve our tests and shutdown since we can 
 then shut down the executor and interrupt long running threads.
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.0, 5.0

 Attachments: SOLR-3871SOLR-3870.patch





[jira] [Resolved] (SOLR-3871) SyncStrategy should use an executor for the threads it creates to request recoveries.


 [ 
https://issues.apache.org/jira/browse/SOLR-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-3871.
---

Resolution: Fixed

 SyncStrategy should use an executor for the threads it creates to request 
 recoveries.
 -

 Key: SOLR-3871
 URL: https://issues.apache.org/jira/browse/SOLR-3871
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
 Environment: This will improve our tests and shutdown since we can 
 then shut down the executor and interrupt long running threads.
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.0, 5.0

 Attachments: SOLR-3871SOLR-3870.patch





[jira] [Resolved] (SOLR-3870) SyncStrategy should have a close so it can abort earlier on shutdown.


 [ 
https://issues.apache.org/jira/browse/SOLR-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-3870.
---

Resolution: Fixed

 SyncStrategy should have a close so it can abort earlier on shutdown.
 -

 Key: SOLR-3870
 URL: https://issues.apache.org/jira/browse/SOLR-3870
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
 Environment: most useful for tests
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Minor
 Fix For: 4.0, 5.0





[jira] [Commented] (SOLR-2255) local params are not parsed in facet.pivot parameter

2012-09-23 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461553#comment-13461553
 ] 

Yonik Seeley commented on SOLR-2255:


Yes, the whole thing.

 local params are not parsed in facet.pivot parameter
 

 Key: SOLR-2255
 URL: https://issues.apache.org/jira/browse/SOLR-2255
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.0-ALPHA
Reporter: Julien Lirochon
Assignee: David Smiley
 Attachments: SOLR-2255_local-param_support_for_pivot_faceting.patch, 
 SOLR-2255_local-param_support_for_pivot_faceting.patch


 ...facet=true&facet.pivot={!ex=category}category_id,subcategory_id&fq={!tag=category}category_id=42
 generates the following error : undefined field {!ex=category}category_id
 If you filter on subcategory_id, the facet.pivot result will contain only 
 results from this subcategory. It's a loss of function since you can't alter 
 this behavior with local params.
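
To make the failure mode concrete, here is a simplified sketch of splitting a
facet.pivot value into its local-params prefix and its field list. PivotParam
is a hypothetical class, not Solr's parser, and it ignores quoting and
parameter dereferencing. Without a step like this, the whole string
"{!ex=category}category_id" ends up being looked up as a field name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for a parsed facet.pivot value; simplified local-params handling.
final class PivotParam {
  final Map<String, String> localParams = new LinkedHashMap<>();
  final String fields;

  PivotParam(String raw) {
    String rest = raw;
    if (raw.startsWith("{!")) {
      int end = raw.indexOf('}');
      if (end < 0) {
        throw new IllegalArgumentException("unterminated local params: " + raw);
      }
      // Split "k=v k2=v2" on whitespace; a bare token is treated as the type.
      for (String pair : raw.substring(2, end).trim().split("\\s+")) {
        if (pair.isEmpty()) {
          continue;
        }
        int eq = pair.indexOf('=');
        if (eq < 0) {
          localParams.put("type", pair);
        } else {
          localParams.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
      }
      rest = raw.substring(end + 1);
    }
    this.fields = rest;
  }
}
```

For the request above, new PivotParam("{!ex=category}category_id,subcategory_id")
would yield ex=category as a local param and "category_id,subcategory_id" as the
pivot fields.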


Re: need best solution for indexing and searching multiple, related database tables

2012-09-23 Thread Jack Krupansky

Sorry, but you should be pursuing this on the solr-user list, not the dev 
list.


-- Jack Krupansky

-Original Message- 
From: Biff Baxter

Sent: Sunday, September 23, 2012 5:25 PM
To: dev@lucene.apache.org
Subject: Re: need best solution for indexing and searching multiple, related 
database tables


So far no responses.  I did search the existing posts and found some related
topics but nothing as specific as I was looking for.

Biff



--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-best-solution-for-indexing-and-searching-multiple-related-database-tables-tp4009676p4009733.html

Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


[jira] [Commented] (LUCENE-4419) Test RecursivePrefixTree indexing non-point data

[
https://issues.apache.org/jira/browse/LUCENE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461561#comment-13461561
]

David Smiley commented on LUCENE-4419:
--

I'm all for what you suggest -- a test that could be used by multiple
strategies. We're doing that already in fact in PortedSolr3Test. And the
StrategyTestCase has methods that facilitate using test files of sample data,
which is used by several tests such as TestPointVectorStrategy.

bq. I really don't see the benefit of randomly generating Shapes.

I could have sworn you told me we should add that to the Spatial4j todo list.

I like randomized tests because it can catch errors that a static test simply
didn't test for. This helped out tremendously when I worked out the bugs in
Circle-Rectangle intersection in Spatial4j.

Test RecursivePrefixTree indexing non-point data

Key: LUCENE-4419
URL: https://issues.apache.org/jira/browse/LUCENE-4419
Project: Lucene - Core
Issue Type: Improvement
Components: modules/spatial
Reporter: David Smiley


[jira] [Commented] (LUCENE-4419) Test RecursivePrefixTree indexing non-point data

2012-09-23 Thread Chris Male (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461564#comment-13461564
]

Chris Male commented on LUCENE-4419:

bq. I'm all for what you suggest – a test that could be used by multiple
strategies

I didn't suggest that. I suggested a common suite of Shapes. I don't like the
idea of having a single test for all Strategies since they work in different
ways and support different things.

bq. I like randomized tests because it can catch errors that a static test
simply didn't test for

There's a difference between randomized tests and randomized Shape generation
(again, I didn't suggest we stop randomized testing). The world is massive,
much of it isn't remotely interesting or challenging to our spatial
implementations. Just generating arbitrary Shapes somewhere on the globe seems
a total waste of time.

If we have a standard set of Shapes then we can use randomized testing to
handle the permutations between them, but we shouldn't waste days waiting for
tests to hit an interesting Shape.
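
A small sketch of what "randomized over a fixed suite" could mean in practice.
ShapePairPicker and the shape names are hypothetical placeholders; the point is
only that the randomness selects among curated shapes, and that a seed makes
any failing combination reproducible.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class ShapePairPicker {
  // Hypothetical names standing in for entries of a curated Shape suite.
  static final List<String> SUITE = Arrays.asList(
      "dateline-rect", "equator-rect", "north-pole-circle", "tiny-rect", "world-rect");

  // Picks an (indexed, query) pair; the same seed always yields the same pair.
  static String[] pick(long seed) {
    Random r = new Random(seed);
    String indexed = SUITE.get(r.nextInt(SUITE.size()));
    String query = SUITE.get(r.nextInt(SUITE.size()));
    return new String[] {indexed, query};
  }
}
```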

Test RecursivePrefixTree indexing non-point data

Key: LUCENE-4419
URL: https://issues.apache.org/jira/browse/LUCENE-4419
Project: Lucene - Core
Issue Type: Improvement
Components: modules/spatial
Reporter: David Smiley


[jira] [Updated] (LUCENE-4409) implement javadocs linting with eclipse ecj compiler


 [ 
https://issues.apache.org/jira/browse/LUCENE-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4409:


Attachment: LUCENE-4409.patch

Almost got this working, two bugs to resolve:
# a bug in the eclipse compiler (imo): I tell it to create no class files, but it's 
creating some for spatial (package-info.class processing) because it uses 
package-info.java instead of package.html. I'll make the macro use a throwaway 
directory and delete it.
# a bug in solrj javadocs: it links to the lucene queryparser syntax 
incorrectly. I don't know why this is working with 'ant javadocs', but it 
really shouldn't, since lucene queryparser should not be in its compile 
classpath. I'll fix it to use docRoot.


 implement javadocs linting with eclipse ecj compiler
 

 Key: LUCENE-4409
 URL: https://issues.apache.org/jira/browse/LUCENE-4409
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
 Attachments: LUCENE-4409.patch


 today we have a lot of custom python scripts checking javadocs (checking for 
 missing stuff too).
 Most of this is implemented by parsing html etc (some of this should stay 
 this way, like broken-link detection)
 But actually the eclipse compiler can do most of this type of linting, and 
 has a lot of options for it. We can pull it via ivy and run it from the 
 command-line.
 I tested this manually by adding a bogus throws clause to Codec.java, 
 downloading the ecj.jar from maven and running it manually:
 {noformat}
 rmuir@beast:~/workspace/lucene-trunk/lucene/core/src/java$ java -cp 
 ~/Downloads/ecj-3.7.2.jar org.eclipse.jdt.internal.compiler.batch.Main 
 -source 1.6 -d none -enableJavadoc -properties 
 ~/workspace/lucene-trunk/dev-tools/eclipse/.settings/org.eclipse.jdt.core.prefs
  .
 ...
 --
 120. ERROR in 
 /home/rmuir/workspace/lucene-trunk/lucene/core/src/java/./org/apache/lucene/codecs/Codec.java
  (at line 59)
   * @throws IOException */
 ^^^
 Javadoc: Exception IOException is not declared
 --
 {noformat}
 here i specified -d none (don't generate class files), and essentially told 
 it to read the compiler warnings/errors options set in the dev-tools config. 
 For javadocs-lint we would want our own separate properties file that 
 disables the ordinary java warnings (because eclipse can warn/error/ignore on 
 lots of things, not just javadocs, and does by default).
 Separately we could also use this to check/fail/warn on other things besides 
 javadoc...
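
 For reference, the kind of mismatch reported above looks roughly like the
 following. JavadocLintExample is a hypothetical class, not Codec.java: the
 Javadoc names an exception that the method signature does not declare. javac
 compiles this without complaint by default, which is why a separate lint pass
 is needed to catch it.

```java
// Hypothetical example class; illustrates a Javadoc tag with no matching throws clause.
final class JavadocLintExample {
  /**
   * Returns a fixed name.
   *
   * @throws java.io.IOException documented here, but the method does not declare it
   */
  static String name() {
    return "example";
  }
}
```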


[jira] [Updated] (LUCENE-4409) implement javadocs linting with eclipse ecj compiler

[
https://issues.apache.org/jira/browse/LUCENE-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Robert Muir updated LUCENE-4409:

Attachment: LUCENE-4409.patch

updated patch: everything is passing.

I'll run precommit and get this thing in (trunk/4x only): I spent a lot of time
cleaning up docs and want to keep the bar high.

we can adjust the properties as needed later as more cleanup happens, but I
don't want to let them get any worse.

implement javadocs linting with eclipse ecj compiler

Key: LUCENE-4409
URL: https://issues.apache.org/jira/browse/LUCENE-4409
Project: Lucene - Core
Issue Type: Task
Components: general/build
Reporter: Robert Muir
Attachments: LUCENE-4409.patch, LUCENE-4409.patch



[jira] [Commented] (SOLR-3653) Custom bigramming filter for to handle Smart Chinese edge cases

[
https://issues.apache.org/jira/browse/SOLR-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461572#comment-13461572
]

Lance Norskog commented on SOLR-3653:
-

I ran some counts on a database of 300k Chinese legal documents. The index has
a unigram field based on the StandardAnalyzer, a bigram field based on the CJK
analyzer, and a Smart Chinese field. I pulled the terms for all of them and
filtered for Chinese ideograms only. These are text unigrams, with

* The unigram field had 55k terms.
* The bigram field had 1.8 million terms.
* The Smart Chinese field had 417k terms:
** unigrams: 9.6k
** bigrams: 40k
** trigrams: 14.6k
** four: 5.6k
** five: 300
** six: 70
** seven: 51
** eight: 19
** nine: 7
** ten: 2
** eleven: 3
** twelve: 2
** thirteen: 3

The 4+ ngrams are essentially parsing failures by the Smart Chinese tokenizer.
I have attached three Google Translate versions of the longer ngrams.
'translations_first_500.trigrams.txt' and 'translations_first_500.quad.txt' are
the most common 3-ideogram and 4-ideogram terms. They have a lot of phrases
which should have been split. 'translations_450.five2thirteen.txt' are 450
ngrams which are 5 ideograms or longer. The longer ones have a lot of formal
geographical names, government organization names and official propaganda
phrases, more as the length increases.

For this corpus, based on the above breakdown and on other experience:
# CJK is a waste of disk space. Bigrams introduce a ton of noise.
# Unigrams might work well if you only do strict phrase searches. But searching
for A, B, and C separately when given ABC is useless.
# If you search for raw country names, Smart Chinese lets you down when the
document uses the formal name.

Smart Chinese really does need to be split into bigrams. To cut bigram noise, I
would take the database of bigrams that it generates, and then use these to
guide splitting 3+ grams into bigrams. That is, if it ever generates AB, then
the splitter turns ABCD into (AB CD). BC would be considered 'bigram noise'.
Similarly, if Smart Chinese generates EF, then DEFG would become (D EF G).
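
The splitting rule can be sketched as below. This is one reading of the two
examples, not the attached patch: known bigrams act as anchors, and whatever
lies between anchors is chunked into pairs from the left (a leftover single
ideogram stays on its own). GuidedBigramSplitter is a hypothetical name, Latin
letters stand in for ideograms as in the examples, and each ideogram is assumed
to be a single Java char.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical splitter guided by the bigrams that the tokenizer has already produced.
final class GuidedBigramSplitter {
  private final Set<String> knownBigrams;

  GuidedBigramSplitter(Set<String> knownBigrams) {
    this.knownBigrams = knownBigrams;
  }

  List<String> split(String term) {
    List<String> out = new ArrayList<>();
    StringBuilder gap = new StringBuilder(); // characters not covered by a known bigram
    int i = 0;
    while (i < term.length()) {
      if (i + 2 <= term.length() && knownBigrams.contains(term.substring(i, i + 2))) {
        flush(gap, out);                     // emit what came before the anchor
        out.add(term.substring(i, i + 2));   // emit the known bigram itself
        i += 2;
      } else {
        gap.append(term.charAt(i));
        i++;
      }
    }
    flush(gap, out);
    return out;
  }

  // Chunk the uncovered run into pairs; an odd trailing character is emitted alone.
  private static void flush(StringBuilder gap, List<String> out) {
    for (int j = 0; j < gap.length(); j += 2) {
      out.add(gap.substring(j, Math.min(j + 2, gap.length())));
    }
    gap.setLength(0);
  }
}
```

With {AB} as the known set, split("ABCD") gives [AB, CD] and never emits BC;
with {EF}, split("DEFG") gives [D, EF, G].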

However, a good fallback would be to have two fields, Smart Chinese and
unigrams, with Smart Chinese boosted upwards and unigrams only with strict
phrase search. With a high term count, bigrams are not helpful. You might even
want to search Smart Chinese first, and then do unigram loose phrase search
only if the recall is too low or the user is unhappy with the Smart Chinese
results.
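
The "Smart Chinese first, unigrams only if needed" fallback could be wired up
roughly like this. FallbackSearch is hypothetical, the two searchers are passed
in as plain functions rather than real Solr calls, and "recall is too low" is
reduced to a simple hit-count threshold for the sake of the sketch.

```java
import java.util.List;
import java.util.function.Function;

final class FallbackSearch {
  // Runs the Smart Chinese search first; falls back to the unigram phrase
  // search only when the first search returns fewer than minHits results.
  static List<String> search(String query,
                             Function<String, List<String>> smartChinese,
                             Function<String, List<String>> unigramPhrase,
                             int minHits) {
    List<String> hits = smartChinese.apply(query);
    if (hits.size() >= minHits) {
      return hits;
    }
    return unigramPhrase.apply(query);
  }
}
```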

Custom bigramming filter for to handle Smart Chinese edge cases
---

The Smart Simplified Chinese toolkit in lucene/analysis/smartcn does not
work in some edge cases. It fails to split certain words which were not part
of the dictionary or training corpus.
This patch supplies a bigramming class to handle these occasional mistakes.
The algorithm creates bigrams out of all words longer than two ideograms.


[jira] [Updated] (SOLR-3653) Custom bigramming filter for to handle Smart Chinese edge cases


 [ 
https://issues.apache.org/jira/browse/SOLR-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lance Norskog updated SOLR-3653:


Attachment: translations_450.five2thirteen.txt
translations_first_500.trigrams.txt
translations_first_500.quad.txt

 Custom bigramming filter for to handle Smart Chinese edge cases
 ---

 Key: SOLR-3653
 URL: https://issues.apache.org/jira/browse/SOLR-3653
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Reporter: Lance Norskog
 Attachments: SmartChineseType.pdf, SOLR-3653.patch, 
 translations_450.five2thirteen.txt, translations_first_500.quad.txt, 
 translations_first_500.trigrams.txt


 The Smart Simplified Chinese toolkit in lucene/analysis/smartcn does not 
 work in some edge cases. It fails to split certain words which were not part 
 of the dictionary or training corpus. 
 This patch supplies a bigramming class to handle these occasional mistakes. 
 The algorithm creates bigrams out of all words longer than two ideograms.


[jira] [Commented] (SOLR-3653) Custom bigramming filter for to handle Smart Chinese edge cases

[
https://issues.apache.org/jira/browse/SOLR-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461573#comment-13461573
]

Lance Norskog commented on SOLR-3653:
-

Another note: one trigram is the number 15. There are several conventions for
representing integers, including regional quirks. There is no 'number
canonicalizer' in the Smart Chinese toolkit. This could be a problem with
formal documents: historical, government docs, treaties and the like.

[http://en.wikipedia.org/wiki/Chinese_numerals#Whole_numbers]
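
To show what a 'number canonicalizer' might involve, here is a deliberately
small sketch. ChineseNumeralSketch is hypothetical and handles only values
below 10,000 written with the everyday digits, a few financial forms, and the
ten/hundred/thousand units. For example, 十五, 一十五 and 壹拾伍 are three
spellings of 15 that it maps to the same integer. The source uses Unicode
escapes so that it compiles regardless of file encoding.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical canonicalizer for small Chinese numerals; not part of the Smart Chinese toolkit.
final class ChineseNumeralSketch {
  private static final Map<Character, Integer> DIGITS = new HashMap<>();
  private static final Map<Character, Integer> UNITS = new HashMap<>();

  static {
    // Everyday digits zero through nine, in order.
    String digits = "\u96f6\u4e00\u4e8c\u4e09\u56db\u4e94\u516d\u4e03\u516b\u4e5d";
    for (int i = 0; i < digits.length(); i++) {
      DIGITS.put(digits.charAt(i), i);
    }
    DIGITS.put('\u58f9', 1);   // financial form of one
    DIGITS.put('\u4f0d', 5);   // financial form of five
    UNITS.put('\u5341', 10);   // ten
    UNITS.put('\u62fe', 10);   // financial form of ten
    UNITS.put('\u767e', 100);  // hundred
    UNITS.put('\u5343', 1000); // thousand
  }

  // Converts a numeral string to an int; a unit with no preceding digit counts as 1 x unit.
  static int parse(String s) {
    int total = 0;
    int pending = 0;
    boolean havePending = false;
    for (char c : s.toCharArray()) {
      Integer d = DIGITS.get(c);
      Integer u = UNITS.get(c);
      if (d != null) {
        pending = d;
        havePending = true;
      } else if (u != null) {
        total += (havePending ? pending : 1) * u;
        pending = 0;
        havePending = false;
      } else {
        throw new IllegalArgumentException("not a supported numeral character: " + c);
      }
    }
    return total + pending;
  }
}
```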

Custom bigramming filter for to handle Smart Chinese edge cases
---

Key: SOLR-3653
URL: https://issues.apache.org/jira/browse/SOLR-3653
Project: Solr
Issue Type: New Feature
Components: Schema and Analysis
Reporter: Lance Norskog
Attachments: SmartChineseType.pdf, SOLR-3653.patch,
translations_450.five2thirteen.txt, translations_first_500.quad.txt,
translations_first_500.trigrams.txt


[jira] [Resolved] (LUCENE-4409) implement javadocs linting with eclipse ecj compiler


 [ 
https://issues.apache.org/jira/browse/LUCENE-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4409.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.1

 implement javadocs linting with eclipse ecj compiler
 

 Key: LUCENE-4409
 URL: https://issues.apache.org/jira/browse/LUCENE-4409
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4409.patch, LUCENE-4409.patch




[jira] [Created] (LUCENE-4420) add solr changes.html testing to smokeTester

Robert Muir created LUCENE-4420:
---

 Summary: add solr changes.html testing to smokeTester
 Key: LUCENE-4420
 URL: https://issues.apache.org/jira/browse/LUCENE-4420
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
 Fix For: 4.1


Currently it only expects a changes/ with html in the lucene/ directory.

But now we have a changes2html running for solr, we should add the same checks.

Also need to fix the fake-release-building in top-level build.xml to include 
this like it does for lucene.



[jira] [Created] (SOLR-3873) solr/ 'documentation-lint' could be confusing if run manually.

Robert Muir created SOLR-3873:
-

 Summary: solr/ 'documentation-lint' could be confusing if run 
manually.
 Key: SOLR-3873
 URL: https://issues.apache.org/jira/browse/SOLR-3873
 Project: Solr
  Issue Type: Task
  Components: Build
Reporter: Robert Muir
 Fix For: 4.1


if you run 'precommit' etc from the top-level, everything is fine.

but if you were to run 'documentation-lint' straight from solr, without 
generating lucene's documentation, you will get a false broken link (since the 
lucene index.html is not generated and solr links to it).

This could confuse committers.


[jira] [Updated] (LUCENE-4420) add solr changes.html testing to smokeTester


 [ 
https://issues.apache.org/jira/browse/LUCENE-4420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4420:


Attachment: LUCENE-4420.patch

untested patch. looks like today, if a changes/ exists for solr in an RC we 
will test it, but the patch fixes the checker to require it exists, and we add 
it to the fake release in nightly-smoke.

 add solr changes.html testing to smokeTester
 

 Key: LUCENE-4420
 URL: https://issues.apache.org/jira/browse/LUCENE-4420
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
 Fix For: 4.1

 Attachments: LUCENE-4420.patch


 Currently it only expects a changes/ with html in the lucene/ directory.
 But now we have a changes2html running for solr, we should add the same 
 checks.
 Also need to fix the fake-release-building in top-level build.xml to include 
 this like it does for lucene.


VOTE: release 4.0

2012-09-23 Thread Robert Muir

Artifacts are here: http://s.apache.org/lusolr40rc0

Thanks,
Robert

-- 
lucidworks.com


[jira] [Commented] (LUCENE-4420) add solr changes.html testing to smokeTester


[ 
https://issues.apache.org/jira/browse/LUCENE-4420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461596#comment-13461596
 ] 

Robert Muir commented on LUCENE-4420:
-

local nightly-smoke passed with the patch:

{noformat}
...
 [exec] Test Solr...
 [exec]   test basics...
 [exec]   get KEYS
 [exec] 0.1 MB
 [exec]   check changes HTML...
...
{noformat}

Will commit soon.

 add solr changes.html testing to smokeTester
 

 Key: LUCENE-4420
 URL: https://issues.apache.org/jira/browse/LUCENE-4420
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
 Fix For: 4.1

 Attachments: LUCENE-4420.patch


 Currently it only expects a changes/ with html in the lucene/ directory.
 But now we have a changes2html running for solr, we should add the same 
 checks.
 Also need to fix the fake-release-building in top-level build.xml to include 
 this like it does for lucene.


[jira] [Resolved] (LUCENE-4420) add solr changes.html testing to smokeTester


 [ 
https://issues.apache.org/jira/browse/LUCENE-4420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4420.
-

   Resolution: Fixed
Fix Version/s: 5.0

 add solr changes.html testing to smokeTester
 

 Key: LUCENE-4420
 URL: https://issues.apache.org/jira/browse/LUCENE-4420
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
 Fix For: 4.1, 5.0

 Attachments: LUCENE-4420.patch


 Currently it only expects a changes/ with html in the lucene/ directory.
 But now we have a changes2html running for solr, we should add the same 
 checks.
 Also need to fix the fake-release-building in top-level build.xml to include 
 this like it does for lucene.


[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module


[ 
https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461598#comment-13461598
 ] 

Lance Norskog commented on LUCENE-2510:
---

bq. We should open new issues for:
* Update the goddamn wiki
* Add support to solr.class for classes under org.apache.lucene

If you're going to move the walls, please update the blueprints :)

 migrate solr analysis factories to analyzers module
 ---

 Key: LUCENE-2510
 URL: https://issues.apache.org/jira/browse/LUCENE-2510
 Project: Lucene - Core
  Issue Type: Task
  Components: modules/analysis
Affects Versions: 4.0-ALPHA
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.0-BETA, 5.0

 Attachments: LUCENE-2510-movefactories.sh, 
 LUCENE-2510-movefactories.sh, LUCENE-2510-multitermcomponent.patch, 
 LUCENE-2510-multitermcomponent.patch, LUCENE-2510-parent-classes.patch, 
 LUCENE-2510-parent-classes.patch, LUCENE-2510-parent-classes.patch, 
 LUCENE-2510.patch, LUCENE-2510.patch, LUCENE-2510.patch, 
 LUCENE-2510-resourceloader-bw.patch, LUCENE-2510-simplify-tests.patch


 In LUCENE-2413 all TokenStreams were consolidated into the analyzers module.
 This is a good step, but I think the next step is to put the Solr factories 
 into the analyzers module, too.
 This would make analyzers artifacts plugins to both lucene and solr, with 
 benefits such as:
 * users could use the old analyzers module with solr, too. This is a good 
 step to use real library versions instead of Version for backwards compat.
 * analyzers modules such as smartcn and icu, that aren't currently available 
 to solr users due to large file sizes or dependencies, would be simple 
 optional plugins to solr and easily available to users that want them.
 Rough sketch in this thread: 
 http://www.lucidimagination.com/search/document/3465a0e55ba94d58/solr_and_analyzers_module
 Practically, I haven't looked much and don't really have a plan for how this 
 will work yet, so ideas are very welcome.


[jira] [Comment Edited] (LUCENE-2510) migrate solr analysis factories to analyzers module


[ 
https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461598#comment-13461598
 ] 

Lance Norskog edited comment on LUCENE-2510 at 9/24/12 3:47 PM:


bq. We should open new issues for:
* Update the goddamn wiki

If you're going to move the walls, please update the blueprints :)

  was (Author: lancenorskog):
bq. We should open new issues for:
* Update the goddamn wiki
* Add support to solr.class for classes under org.apache.lucene

If you're going to move the walls, please update the blueprints :)
  


Re: VOTE: release 4.0

2012-09-23 Thread Mark Miller

Sweet, thanks!

Mark

On Mon, Sep 24, 2012 at 12:11 AM, Robert Muir rcm...@gmail.com wrote:
 Artifacts are here: http://s.apache.org/lusolr40rc0

 Thanks,
 Robert

 --
 lucidworks.com





-- 
- Mark


[jira] [Updated] (LUCENE-2510) migrate solr analysis factories to analyzers module