[jira] [Updated] (SOLR-4288) Improve logging for FileDataSource (basePath, relative resources).
[ https://issues.apache.org/jira/browse/SOLR-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss updated SOLR-4288:
------------------------------
    Summary: Improve logging for FileDataSource (basePath, relative resources).
    (was: FileDataSource with an empty basePath and a relative resource is broken.)

> Improve logging for FileDataSource (basePath, relative resources).
> ------------------------------------------------------------------
>
>                 Key: SOLR-4288
>                 URL: https://issues.apache.org/jira/browse/SOLR-4288
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Dawid Weiss
>            Priority: Minor
>             Fix For: 4.2, 5.0
>
> In fact, the logic is broken:
> {code}
> if (!file.isAbsolute()) file = new File(basePath + query);
> {code}
> because basePath is null, so 'null' is concatenated with the query string (path), resulting in an invalid path. It should check whether basePath is null and, if so, default to "."; then resolve the relative location as:
> {code}
> new File(basePathFile, query);
> {code}
> I'd also change the log so that the absolute path is included in the warning message; otherwise it's really hard to figure out what's going on.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
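The fix suggested in the issue can be sketched as follows (a minimal, illustrative sketch only, not the actual SOLR-4288 patch; the `BasePathResolver` class and `resolve` method names are hypothetical):

```java
import java.io.File;

// Hypothetical helper illustrating null-safe resolution of a relative
// resource against an optional basePath (the bug above concatenated a
// null basePath, yielding invalid paths like "null/foo.xml").
public class BasePathResolver {
    static File resolve(String basePath, String query) {
        File file = new File(query);
        if (file.isAbsolute()) {
            return file;                       // absolute paths win outright
        }
        // Default to the current working directory when basePath is absent.
        File base = (basePath == null || basePath.isEmpty())
                ? new File(".")
                : new File(basePath);
        // Resolve relative to the base *file*, not by string concatenation.
        return new File(base, query);
    }
}
```

Logging `file.getAbsolutePath()` in the warning message, as the issue suggests, makes a misconfigured path immediately visible.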
[jira] [Updated] (SOLR-4288) Improve logging for FileDataSource (basePath, relative resources).
[ https://issues.apache.org/jira/browse/SOLR-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss updated SOLR-4288:
------------------------------
    Attachment: SOLR-4288.patch

I'd like to squeeze this one in for 4.1. It's a fairly trivial patch and improves the user experience if somebody needs to configure DIH with a relative basePath. I'll commit to trunk -- if there are no objections, could you merge it into the release branch, Steve?
[jira] [Commented] (SOLR-2963) Improving the performance of group.ngroups=true when there are a lot of unique groups
[ https://issues.apache.org/jira/browse/SOLR-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554832#comment-13554832 ]

Mickael Magniez commented on SOLR-2963:
---------------------------------------

I tried various values; no performance improvement with any percent value.

> Improving the performance of group.ngroups=true when there are a lot of unique groups
> -------------------------------------------------------------------------------------
>
>                 Key: SOLR-2963
>                 URL: https://issues.apache.org/jira/browse/SOLR-2963
>             Project: Solr
>          Issue Type: Improvement
>          Components: SearchComponents - other
>    Affects Versions: 3.5
>         Environment: Linux (Debian 6) 64bit, Java 6, 21GB RAM (Xmx), Solr 3.5
>            Reporter: Michael Jakl
>
> The performance of computing the total number of groups (by setting group.ngroups=true) degrades badly when there are many unique groups. It would be very useful to have an adequate number of groups to provide good means for paging through the results etc.
[jira] [Resolved] (SOLR-4288) Improve logging for FileDataSource (basePath, relative resources).
[ https://issues.apache.org/jira/browse/SOLR-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss resolved SOLR-4288.
-------------------------------
    Resolution: Fixed
[jira] [Updated] (SOLR-4288) Improve logging for FileDataSource (basePath, relative resources).
[ https://issues.apache.org/jira/browse/SOLR-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss updated SOLR-4288:
------------------------------
    Fix Version/s:     (was: 4.2)
                   4.1
         Assignee: Dawid Weiss
[jira] [Commented] (SOLR-4288) Improve logging for FileDataSource (basePath, relative resources).
[ https://issues.apache.org/jira/browse/SOLR-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554839#comment-13554839 ]

Commit Tag Bot commented on SOLR-4288:
--------------------------------------

[trunk commit] Dawid Weiss
http://svn.apache.org/viewvc?view=revision&revision=1433849

SOLR-4288: Improve logging for FileDataSource (basePath, relative resources).
One more issue to squeeze for 4.1
I committed this one to the trunk: https://issues.apache.org/jira/browse/SOLR-4288 and set the fix version on the issue to 4.1. This is a trivial patch that improves logging. Steve, could you merge it to the 4.1 release branch (and branch_4x)? It's on trunk, -r1433849.

Dawid
Re: problem with 'ant precommit'
Thanks Steve. I'm on the latest version, SR12. I installed the security libraries from here:
https://www14.software.ibm.com/webapp/iwm/web/preLogin.do?source=jcesdk
and it now works.

Shai

On Wed, Jan 16, 2013 at 8:01 AM, Steve Rowe sar...@gmail.com wrote:

> Shai, Uwe dealt with this exact (apparently J9-specific) issue on his Jenkins box a couple of months ago:
> http://lucene.472066.n3.nabble.com/JENKINS-Lucene-Solr-trunk-Linux-64bit-ibm-j9-jdk6-Build-1895-Failure-td4014958.html
>
> Steve
>
> On Jan 16, 2013, at 12:52 AM, Shai Erera ser...@gmail.com wrote:
>
>> I always get this error in the end:
>>
>> lucene\common-build.xml:1990: javax.net.ssl.SSLKeyException: RSA premaster secret error
>>     at com.ibm.jsse2.jb.init(jb.java:22)
>>     at com.ibm.jsse2.lb.a(lb.java:208)
>>     at com.ibm.jsse2.lb.a(lb.java:459)
>>     at com.ibm.jsse2.kb.s(kb.java:11)
>>     at com.ibm.jsse2.kb.a(kb.java:394)
>>     at com.ibm.jsse2.SSLSocketImpl.a(SSLSocketImpl.java:44)
>>     at com.ibm.jsse2.SSLSocketImpl.h(SSLSocketImpl.java:496)
>>     at com.ibm.jsse2.SSLSocketImpl.a(SSLSocketImpl.java:528)
>>     at com.ibm.jsse2.SSLSocketImpl.startHandshake(SSLSocketImpl.java:505)
>>     at com.ibm.net.ssl.www2.protocol.https.c.afterConnect(c.java:83)
>>     at com.ibm.net.ssl.www2.protocol.https.d.connect(d.java:31)
>>     at com.ibm.net.ssl.www2.protocol.https.b.connect(b.java:31)
>>     at org.apache.tools.ant.taskdefs.Get$GetThread.openConnection(Get.java:660)
>>     at org.apache.tools.ant.taskdefs.Get$GetThread.get(Get.java:579)
>>     at org.apache.tools.ant.taskdefs.Get$GetThread.run(Get.java:569)
>> Caused by: java.security.InvalidKeyException: Illegal key size or default parameters
>>     at javax.crypto.Cipher.a(Unknown Source)
>>     at javax.crypto.Cipher.a(Unknown Source)
>>     at javax.crypto.Cipher.a(Unknown Source)
>>     at javax.crypto.Cipher.init(Unknown Source)
>>     at com.ibm.jsse2.jb.init(jb.java:105)
>>     ... 14 more
>>
>> Anyone know what I need to fix on my system?
>>
>> Shai
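The "Illegal key size or default parameters" cause in the trace above is the classic symptom of limited-strength JCE policy files. A quick, generic way to check which policy is in effect (this uses the standard `javax.crypto.Cipher` API and is a general diagnostic, not something specific to this build):

```java
import javax.crypto.Cipher;

// Prints the maximum AES key length the installed JCE policy allows.
// 128 means the limited (default) policy is in effect; large values
// (e.g. Integer.MAX_VALUE) mean the unlimited-strength policy files
// are installed.
public class JcePolicyCheck {
    public static void main(String[] args) throws Exception {
        int max = Cipher.getMaxAllowedKeyLength("AES");
        System.out.println("Max allowed AES key length: " + max + " bits");
    }
}
```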
[jira] [Commented] (LUCENE-4602) Use DocValues to store per-doc facet ord
[ https://issues.apache.org/jira/browse/LUCENE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554878#comment-13554878 ]

Shai Erera commented on LUCENE-4602:
------------------------------------

I've decided to add the migration code to trunk as well, because 5.0 is supposed to handle 4.x indexes too. Anyway, it doesn't hurt that it's there. I improved the testing further and added a static utility method which will make doing the migration easier. I beasted some, 'precommit' is happy, so I will commit it shortly.

> Use DocValues to store per-doc facet ord
> ----------------------------------------
>
>                 Key: LUCENE-4602
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4602
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Michael McCandless
>         Attachments: FacetsPayloadMigrationReader.java, LUCENE-4602.patch, LUCENE-4602.patch, LUCENE-4602.patch, LUCENE-4602.patch, LUCENE-4602.patch, TestFacetsPayloadMigrationReader.java
>
> Spinoff from LUCENE-4600.
> DocValues can be used to hold the byte[] encoding all facet ords for the document, instead of payloads. I made a hacked-up approximation of in-RAM DV (see CachedCountingFacetsCollector in the patch) and the gains were somewhat surprisingly large:
> {noformat}
> Task      QPS base  StdDev  QPS comp  StdDev     Pct diff
> HighTerm      0.53  (0.9%)      1.00  (2.5%)    87.3% (  83% -  91%)
> LowTerm       7.59  (0.6%)     26.75 (12.9%)   252.6% ( 237% - 267%)
> MedTerm       3.35  (0.7%)     12.71  (9.0%)   279.8% ( 268% - 291%)
> {noformat}
> I didn't think payloads were THAT slow; I think it must be the advance implementation? We need to separately test on-disk DV to make sure it's at least on par with payloads (but hopefully faster), and if so we should cut facets over to using DV.
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0-ea-b65) - Build # 3804 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/3804/
Java: 64bit/jdk1.8.0-ea-b65 -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 23740 lines...]
  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.7
  [javadoc] Loading source files for package org.apache.lucene...
  [javadoc] Loading source files for package org.apache.lucene.analysis...
  [javadoc] Loading source files for package org.apache.lucene.analysis.tokenattributes...
  [javadoc] Loading source files for package org.apache.lucene.codecs...
  [javadoc] Loading source files for package org.apache.lucene.codecs.compressing...
  [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40...
  [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40.values...
  [javadoc] Loading source files for package org.apache.lucene.codecs.lucene41...
  [javadoc] Loading source files for package org.apache.lucene.codecs.perfield...
  [javadoc] Loading source files for package org.apache.lucene.document...
  [javadoc] Loading source files for package org.apache.lucene.index...
  [javadoc] Loading source files for package org.apache.lucene.search...
  [javadoc] Loading source files for package org.apache.lucene.search.payloads...
  [javadoc] Loading source files for package org.apache.lucene.search.similarities...
  [javadoc] Loading source files for package org.apache.lucene.search.spans...
  [javadoc] Loading source files for package org.apache.lucene.store...
  [javadoc] Loading source files for package org.apache.lucene.util...
  [javadoc] Loading source files for package org.apache.lucene.util.automaton...
  [javadoc] Loading source files for package org.apache.lucene.util.fst...
  [javadoc] Loading source files for package org.apache.lucene.util.mutable...
  [javadoc] Loading source files for package org.apache.lucene.util.packed...
  [javadoc] Constructing Javadoc information...
  [javadoc] Standard Doclet version 1.8.0-ea
  [javadoc] Building tree for all the packages and classes...
  [javadoc] Generating /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/docs/core/org/apache/lucene/search/package-summary.html...
  [javadoc] Copying file /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-1.png to directory /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files...
  [javadoc] Copying file /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-2.png to directory /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files...
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...
  [javadoc] Generating /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/docs/core/help-doc.html...
  [javadoc] 1 warning
[...truncated 45 lines...]
  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] Loading source files for package org.apache.lucene.analysis.ar...
  [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.7
  [javadoc] Loading source files for package org.apache.lucene.analysis.bg...
  [javadoc] Loading source files for package org.apache.lucene.analysis.br...
  [javadoc] Loading source files for package org.apache.lucene.analysis.ca...
  [javadoc] Loading source files for package org.apache.lucene.analysis.charfilter...
  [javadoc] Loading source files for package org.apache.lucene.analysis.cjk...
  [javadoc] Loading source files for package org.apache.lucene.analysis.commongrams...
  [javadoc] Loading source files for package org.apache.lucene.analysis.compound...
  [javadoc] Loading source files for package org.apache.lucene.analysis.compound.hyphenation...
  [javadoc] Loading source files for package org.apache.lucene.analysis.core...
  [javadoc] Loading source files for package org.apache.lucene.analysis.cz...
  [javadoc] Loading source files for package org.apache.lucene.analysis.da...
  [javadoc] Loading source files for package org.apache.lucene.analysis.de...
  [javadoc] Loading source files for package org.apache.lucene.analysis.el...
  [javadoc] Loading source files for package org.apache.lucene.analysis.en...
  [javadoc] Loading source files for package org.apache.lucene.analysis.es...
  [javadoc] Loading source files for package org.apache.lucene.analysis.eu...
  [javadoc] Loading source files for package org.apache.lucene.analysis.fa...
  [javadoc] Loading source files for package org.apache.lucene.analysis.fi...
  [javadoc] Loading source files for package org.apache.lucene.analysis.fr...
  [javadoc] Loading source files for package org.apache.lucene.analysis.ga...
  [javadoc] Loading source files for
[jira] [Commented] (LUCENE-4602) Use DocValues to store per-doc facet ord
[ https://issues.apache.org/jira/browse/LUCENE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554881#comment-13554881 ]

Commit Tag Bot commented on LUCENE-4602:
----------------------------------------

[trunk commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1433869

LUCENE-4602: migrate facets to DocValues
[jira] [Resolved] (LUCENE-4602) Use DocValues to store per-doc facet ord
[ https://issues.apache.org/jira/browse/LUCENE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-4602.
--------------------------------
       Resolution: Fixed
    Fix Version/s: 5.0
                   4.2
         Assignee: Shai Erera
    Lucene Fields: New,Patch Available  (was: New)

Committed to trunk and 4x. For 4x I had to add @SuppressCodecs("Lucene3x").
[jira] [Commented] (LUCENE-4602) Use DocValues to store per-doc facet ord
[ https://issues.apache.org/jira/browse/LUCENE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554900#comment-13554900 ]

Shai Erera commented on LUCENE-4602:
------------------------------------

Thanks Mike. This is one great and important milestone for facets!
[jira] [Commented] (LUCENE-4602) Use DocValues to store per-doc facet ord
[ https://issues.apache.org/jira/browse/LUCENE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554902#comment-13554902 ]

Commit Tag Bot commented on LUCENE-4602:
----------------------------------------

[branch_4x commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1433878

LUCENE-4602: migrate facets to DocValues
[jira] [Created] (LUCENE-4687) Lazily initialize TermsEnum in BloomFilterPostingsFormat
Simon Willnauer created LUCENE-4687:
---------------------------------------

             Summary: Lazily initialize TermsEnum in BloomFilterPostingsFormat
                 Key: LUCENE-4687
                 URL: https://issues.apache.org/jira/browse/LUCENE-4687
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/codecs
    Affects Versions: 4.0, 4.1
            Reporter: Simon Willnauer
             Fix For: 4.2, 5.0

BloomFilteringPostingsFormat initializes its delegate TermsEnum directly inside the Terms#iterator() call, which can be a pretty heavy operation if executed thousands of times. I suspect that bloom filter postings are mainly used for primary keys etc., which in turn mostly means seekExact. Given that, most of the time we don't even need the delegate TermsEnum, since most segments won't contain the key and the bloom filter will likely return false from seekExact without consulting the delegate.
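The lazy-initialization idea described in the issue can be sketched generically as follows (simplified and hypothetical: the real Lucene TermsEnum API is much larger, and `BloomFilter`, `TermsEnumLike`, and `LazyFilteredTermsEnum` are stand-in names, not Lucene classes):

```java
import java.util.function.Supplier;

// Sketch of the lazy-delegate pattern: the expensive delegate enum is
// only constructed once a term actually passes the bloom filter, so
// segments that never contain the key never pay the construction cost.
class LazyFilteredTermsEnum {
    interface TermsEnumLike { boolean seekExact(String term); }
    interface BloomFilter  { boolean mightContain(String term); }

    private final BloomFilter bloom;
    private final Supplier<TermsEnumLike> delegateFactory;
    private TermsEnumLike delegate;            // created on demand

    LazyFilteredTermsEnum(BloomFilter bloom, Supplier<TermsEnumLike> factory) {
        this.bloom = bloom;
        this.delegateFactory = factory;
    }

    boolean seekExact(String term) {
        // Fast path: "definitely absent" answers never touch the delegate.
        if (!bloom.mightContain(term)) {
            return false;
        }
        if (delegate == null) {
            delegate = delegateFactory.get();  // lazy initialization
        }
        return delegate.seekExact(term);
    }
}
```

The key property is that a negative bloom-filter answer short-circuits before `delegateFactory.get()` is ever called, which matches the primary-key lookup pattern the issue describes.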
[jira] [Updated] (LUCENE-4687) Lazily initialize TermsEnum in BloomFilterPostingsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-4687:
------------------------------------
    Attachment: LUCENE-4687.patch

Here is a patch. I also removed the IOException from Terms#comparator() to make it consistent with TermsEnum#comparator().
[jira] [Updated] (LUCENE-4687) Lazily initialize TermsEnum in BloomFilterPostingsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-4687:
------------------------------------
    Issue Type: Improvement  (was: Bug)
[jira] [Assigned] (LUCENE-4687) Lazily initialize TermsEnum in BloomFilterPostingsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer reassigned LUCENE-4687:
---------------------------------------
    Assignee: Simon Willnauer
[jira] [Commented] (SOLR-4307) eDismax cross-core query support (and scoring)
[ https://issues.apache.org/jira/browse/SOLR-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554944#comment-13554944 ]

David vandendriessche commented on SOLR-4307:
---------------------------------------------

{!join fromIndex=PageCore from=docId to=fileId}{!edismax qf=pageTxt}little red

seems to get me better results. Is this the correct way to query with a join and use edismax?

> eDismax cross-core query support (and scoring)
> ----------------------------------------------
>
>                 Key: SOLR-4307
>                 URL: https://issues.apache.org/jira/browse/SOLR-4307
>             Project: Solr
>          Issue Type: Wish
>          Components: multicore, query parsers
>         Environment: I'm using Solr 4.0.0
>            Reporter: David vandendriessche
>              Labels: java, solr
>
> I would like to have cross-core eDismax query support (for the fromIndex query). Example:
> q={!join fromIndex=PageCore from=docId to=fileId}pageTxt:little red riding hood
> defType=edismax
> qf=pageTxt
> When this query is entered it only queries pageTxt:little, even when I set the defType to edismax. I know I could change the query to:
> (pageTxt:little) AND (pageTxt:red) AND (pageTxt:riding) AND (pageTxt:hood)
> But as far as I know this doesn't score documents, etc.
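For readers hitting the same issue: a sketch of how such a request might be kept readable using Solr's local-parameter dereferencing (hedged; the core and field names are taken from the issue, and whether this fully restores edismax scoring across the join is exactly the open question above):

```
q={!join fromIndex=PageCore from=docId to=fileId v=$joinq}
joinq={!edismax qf=pageTxt}little red riding hood
```

Here `v=$joinq` tells the join parser to take its inner query from the separate `joinq` request parameter, so the edismax local params do not have to be inlined after the join local params.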
[JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.7.0_10) - Build # 2412 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/2412/ Java: 64bit/jdk1.7.0_10 -XX:+UseG1GC All tests passed Build Log: [...truncated 25905 lines...] -documentation-lint: [echo] checking for broken html... [jtidy] Checking for broken html (such as invalid tags)... [delete] Deleting directory C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\lucene\build\jtidy_tmp [echo] Checking for broken links... [exec] [exec] Crawl/parse... [exec] [exec] Verify... [echo] Checking for missing docs... [exec] [exec] build/docs\suggest\org\apache\lucene\search\spell/package-summary.html [exec] missing: DirectSpellChecker.ScoreTerm [exec] [exec] Missing javadocs were found! BUILD FAILED C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:60: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\lucene\build.xml:245: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\lucene\common-build.xml:1972: exec returned: 1 Total time: 62 minutes 56 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 64bit/jdk1.7.0_10 -XX:+UseG1GC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.8.0-ea-b65) - Build # 3785 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/3785/ Java: 64bit/jdk1.8.0-ea-b65 -XX:+UseG1GC All tests passed Build Log: [...truncated 23679 lines...] [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.7 [javadoc] Loading source files for package org.apache.lucene... [javadoc] Loading source files for package org.apache.lucene.analysis... [javadoc] Loading source files for package org.apache.lucene.analysis.tokenattributes... [javadoc] Loading source files for package org.apache.lucene.codecs... [javadoc] Loading source files for package org.apache.lucene.codecs.compressing... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene3x... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40.values... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene41... [javadoc] Loading source files for package org.apache.lucene.codecs.perfield... [javadoc] Loading source files for package org.apache.lucene.document... [javadoc] Loading source files for package org.apache.lucene.index... [javadoc] Loading source files for package org.apache.lucene.search... [javadoc] Loading source files for package org.apache.lucene.search.payloads... [javadoc] Loading source files for package org.apache.lucene.search.similarities... [javadoc] Loading source files for package org.apache.lucene.search.spans... [javadoc] Loading source files for package org.apache.lucene.store... [javadoc] Loading source files for package org.apache.lucene.util... [javadoc] Loading source files for package org.apache.lucene.util.automaton... [javadoc] Loading source files for package org.apache.lucene.util.fst... [javadoc] Loading source files for package org.apache.lucene.util.mutable... [javadoc] Loading source files for package org.apache.lucene.util.packed... 
[javadoc] Constructing Javadoc information... [javadoc] Standard Doclet version 1.8.0-ea [javadoc] Building tree for all the packages and classes... [javadoc] Generating /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/package-summary.html... [javadoc] Copying file /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-1.png to directory /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files... [javadoc] Copying file /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-2.png to directory /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files... [javadoc] Building index for all the packages and classes... [javadoc] Building index for all classes... [javadoc] Generating /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/help-doc.html... [javadoc] 1 warning [...truncated 45 lines...] [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.7 [javadoc] Loading source files for package org.apache.lucene.analysis.ar... [javadoc] Loading source files for package org.apache.lucene.analysis.bg... [javadoc] Loading source files for package org.apache.lucene.analysis.br... [javadoc] Loading source files for package org.apache.lucene.analysis.ca... [javadoc] Loading source files for package org.apache.lucene.analysis.charfilter... [javadoc] Loading source files for package org.apache.lucene.analysis.cjk... [javadoc] Loading source files for package org.apache.lucene.analysis.cn... [javadoc] Loading source files for package org.apache.lucene.analysis.commongrams... [javadoc] Loading source files for package org.apache.lucene.analysis.compound... 
[javadoc] Loading source files for package org.apache.lucene.analysis.compound.hyphenation... [javadoc] Loading source files for package org.apache.lucene.analysis.core... [javadoc] Loading source files for package org.apache.lucene.analysis.cz... [javadoc] Loading source files for package org.apache.lucene.analysis.da... [javadoc] Loading source files for package org.apache.lucene.analysis.de... [javadoc] Loading source files for package org.apache.lucene.analysis.el... [javadoc] Loading source files for package org.apache.lucene.analysis.en... [javadoc] Loading source files for package org.apache.lucene.analysis.es... [javadoc] Loading source files for package org.apache.lucene.analysis.eu... [javadoc] Loading source files for package org.apache.lucene.analysis.fa... [javadoc] Loading source files for package org.apache.lucene.analysis.fi... [javadoc] Loading source files for package
[jira] [Commented] (LUCENE-4687) Lazily initialize TermsEnum in BloomFilterPostingsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13554983#comment-13554983 ] Robert Muir commented on LUCENE-4687: - can the reset() method return void? Lazily initialize TermsEnum in BloomFilterPostingsFormat Key: LUCENE-4687 URL: https://issues.apache.org/jira/browse/LUCENE-4687 Project: Lucene - Core Issue Type: Improvement Components: core/codecs Affects Versions: 4.0, 4.1 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.2, 5.0 Attachments: LUCENE-4687.patch BloomFilteringPostingsFormat initializes its delegate TermsEnum directly inside the Terms#iterator() call which can be a pretty heavy operation if executed thousands of times. I suspect that bloom filter postings are mainly used for primary keys etc. which in turn is mostly a seekExact. Given that, most of the time we don't even need the delegate termsenum since most of the segments won't contain the key and the bloomfilter will likely return false from seekExact without consulting the delegate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4300) Possible race condition in CoreContainer.getCore() when lazily loading cores.
[ https://issues.apache.org/jira/browse/SOLR-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554991#comment-13554991 ] Erick Erickson commented on SOLR-4300: -- Duh. Thanks Steve! Some days I'm the pigeon and some days I'm the statue... Sighhh. Possible race condition in CoreContainer.getCore() when lazily loading cores. - Key: SOLR-4300 URL: https://issues.apache.org/jira/browse/SOLR-4300 Project: Solr Issue Type: Bug Affects Versions: 4.1, 5.0 Reporter: Erick Erickson Assignee: Steve Rowe Priority: Blocker Fix For: 4.1, 5.0 Attachments: SOLR-4300.patch Yonik pointed out in SOLR-1028 that there is a possible race condition here; he's right, not to my surprise. Calling it a blocker for now so we make a decision on it rather than let it fall through the cracks. I should be able to get a patch up tonight (Sunday). That said, there's potential here to introduce deadlocks, is it worth rushing into 4.1? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
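The usual way to close this kind of check-then-create race is to funnel lazy creation through a single atomic operation, e.g. ConcurrentHashMap.computeIfAbsent, which invokes the loader at most once per key. This is a generic sketch only — LazyCoreRegistry is a made-up name, not Solr's actual CoreContainer code (which, at this point in time, also predates Java 8):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class LazyCoreRegistry<K, V> {
    private final ConcurrentHashMap<K, V> cores = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public LazyCoreRegistry(Function<K, V> loader) {
        this.loader = loader;
    }

    // computeIfAbsent is atomic per key: two threads racing on the same
    // name observe exactly one loader invocation, eliminating the window
    // between "is it loaded?" and "load it".
    public V getCore(K name) {
        return cores.computeIfAbsent(name, loader);
    }
}
```

One caveat that echoes the deadlock worry in the comment above: the loader function must not itself try to update the same map, since computeIfAbsent holds an internal lock while it runs.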
[jira] [Commented] (LUCENE-4687) Lazily initialize TermsEnum in BloomFilterPostingsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554990#comment-13554990 ] Simon Willnauer commented on LUCENE-4687: - bq. can the reset() method return void? hmm not sure, I can try but it's hard... Lazily initialize TermsEnum in BloomFilterPostingsFormat Key: LUCENE-4687 URL: https://issues.apache.org/jira/browse/LUCENE-4687 Project: Lucene - Core Issue Type: Improvement Components: core/codecs Affects Versions: 4.0, 4.1 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.2, 5.0 Attachments: LUCENE-4687.patch BloomFilteringPostingsFormat initializes its delegate TermsEnum directly inside the Terms#iterator() call which can be a pretty heavy operation if executed thousands of times. I suspect that bloom filter postings are mainly used for primary keys etc. which in turn is mostly a seekExact. Given that, most of the time we don't even need the delegate termsenum since most of the segments won't contain the key and the bloomfilter will likely return false from seekExact without consulting the delegate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4620) Explore IntEncoder/Decoder bulk API
[ https://issues.apache.org/jira/browse/LUCENE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4620: --- Attachment: LUCENE-4620.patch Attached patch: * Inlines VInt8 encode/decode in relevant encoders/decoders. * Marks encoders/decoders final. * Gets rid of the decode() + doDecode(). It was nice while I wrote it, but I figure that this is hot code, and every method call counts, especially when called for few values usually. * Decoders no longer mess w/ bytes.offset (now that the decoding is inlined). * Removed VInt8 class and test. Mike, would you like to run luceneutil with this patch? Explore IntEncoder/Decoder bulk API --- Key: LUCENE-4620 URL: https://issues.apache.org/jira/browse/LUCENE-4620 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 4.1, 5.0 Attachments: LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch Today, IntEncoder/Decoder offer a streaming API, where you can encode(int) and decode(int). Originally, we believed that this layer can be useful for other scenarios, but in practice it's used only for writing/reading the category ordinals from payload/DV. Therefore, Mike and I would like to explore a bulk API, something like encode(IntsRef, BytesRef) and decode(BytesRef, IntsRef). Perhaps the Encoder can still be streaming (as we don't know in advance how many ints will be written), dunno. Will figure this out as we go. One thing to check is whether the bulk API can work w/ e.g. facet associations, which can write arbitrary byte[], and so may decoding to an IntsRef won't make sense. This too we'll figure out as we go. I don't rule out that associations will use a different bulk API. At the end of the day, the requirement is for someone to be able to configure how ordinals are written (i.e. different encoding schemes: VInt, PackedInts etc.) 
and later read, with as little overhead as possible. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
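For readers unfamiliar with the VInt scheme the patch inlines: a variable-length int stores 7 payload bits per byte and sets the high bit as a continuation flag, so small values cost a single byte. The sketch below is illustrative only — VIntDemo and its method names are invented, not Lucene's IntEncoder/IntDecoder API:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class VIntDemo {
    // Emit each value low 7 bits first; every byte except the last
    // has its high (continuation) bit set.
    public static byte[] encode(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            while ((v & ~0x7F) != 0) {        // more than 7 bits remain
                out.write((v & 0x7F) | 0x80);
                v >>>= 7;
            }
            out.write(v);                     // final byte, high bit clear
        }
        return out.toByteArray();
    }

    public static int[] decode(byte[] bytes) {
        List<Integer> res = new ArrayList<>();
        int i = 0;
        while (i < bytes.length) {
            int b = bytes[i++] & 0xFF;
            int value = b & 0x7F;
            int shift = 7;
            while ((b & 0x80) != 0) {         // continuation bit set
                b = bytes[i++] & 0xFF;
                value |= (b & 0x7F) << shift;
                shift += 7;
            }
            res.add(value);
        }
        return res.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

The streaming-vs-bulk discussion above is about where these loops live: a bulk API runs them once over a whole IntsRef/BytesRef pair instead of paying a virtual call per value.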
[JENKINS] Lucene-Solr-Tests-4.x-java7 - Build # 900 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-java7/900/ All tests passed Build Log: [...truncated 25666 lines...] -documentation-lint: [echo] checking for broken html... [jtidy] Checking for broken html (such as invalid tags)... [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-java7/lucene/build/jtidy_tmp [echo] Checking for broken links... [exec] [exec] Crawl/parse... [exec] [exec] Verify... [echo] Checking for missing docs... [exec] [exec] build/docs/suggest/org/apache/lucene/search/spell/package-summary.html [exec] missing: DirectSpellChecker.ScoreTerm [exec] [exec] Missing javadocs were found! BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-java7/build.xml:60: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-java7/lucene/build.xml:245: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-java7/lucene/common-build.xml:1971: exec returned: 1 Total time: 49 minutes 25 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4620) Explore IntEncoder/Decoder bulk API
[ https://issues.apache.org/jira/browse/LUCENE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13555016#comment-13555016 ] Michael McCandless commented on LUCENE-4620: +1 It's much faster than I had tested before (maybe because of the DV cutover!?): {noformat} TaskQPS base StdDevQPS comp StdDev Pct diff PKLookup 181.98 (1.2%) 182.20 (1.3%) 0.1% ( -2% -2%) LowTerm 77.95 (2.0%) 83.59 (2.8%) 7.2% ( 2% - 12%) MedTerm 26.60 (3.3%) 31.46 (1.4%) 18.3% ( 13% - 23%) HighTerm 15.83 (3.9%) 19.35 (1.3%) 22.2% ( 16% - 28%) {noformat} Explore IntEncoder/Decoder bulk API --- Key: LUCENE-4620 URL: https://issues.apache.org/jira/browse/LUCENE-4620 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 4.1, 5.0 Attachments: LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch Today, IntEncoder/Decoder offer a streaming API, where you can encode(int) and decode(int). Originally, we believed that this layer can be useful for other scenarios, but in practice it's used only for writing/reading the category ordinals from payload/DV. Therefore, Mike and I would like to explore a bulk API, something like encode(IntsRef, BytesRef) and decode(BytesRef, IntsRef). Perhaps the Encoder can still be streaming (as we don't know in advance how many ints will be written), dunno. Will figure this out as we go. One thing to check is whether the bulk API can work w/ e.g. facet associations, which can write arbitrary byte[], and so may decoding to an IntsRef won't make sense. This too we'll figure out as we go. I don't rule out that associations will use a different bulk API. At the end of the day, the requirement is for someone to be able to configure how ordinals are written (i.e. different encoding schemes: VInt, PackedInts etc.) and later read, with as little overhead as possible. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4620) Explore IntEncoder/Decoder bulk API
[ https://issues.apache.org/jira/browse/LUCENE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13555023#comment-13555023 ] Shai Erera commented on LUCENE-4620: Could be DV helps some too. Also not calling decode() + reset() + doDecode() every time must help some too. Committed the changes to trunk, 4x and 4.1 branch. Explore IntEncoder/Decoder bulk API --- Key: LUCENE-4620 URL: https://issues.apache.org/jira/browse/LUCENE-4620 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 4.1, 5.0 Attachments: LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch Today, IntEncoder/Decoder offer a streaming API, where you can encode(int) and decode(int). Originally, we believed that this layer can be useful for other scenarios, but in practice it's used only for writing/reading the category ordinals from payload/DV. Therefore, Mike and I would like to explore a bulk API, something like encode(IntsRef, BytesRef) and decode(BytesRef, IntsRef). Perhaps the Encoder can still be streaming (as we don't know in advance how many ints will be written), dunno. Will figure this out as we go. One thing to check is whether the bulk API can work w/ e.g. facet associations, which can write arbitrary byte[], and so may decoding to an IntsRef won't make sense. This too we'll figure out as we go. I don't rule out that associations will use a different bulk API. At the end of the day, the requirement is for someone to be able to configure how ordinals are written (i.e. different encoding schemes: VInt, PackedInts etc.) and later read, with as little overhead as possible. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4620) Explore IntEncoder/Decoder bulk API
[ https://issues.apache.org/jira/browse/LUCENE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13555024#comment-13555024 ] Commit Tag Bot commented on LUCENE-4620: [trunk commit] Shai Erera http://svn.apache.org/viewvc?view=revisionrevision=1433926 LUCENE-4620: inline encoding/decoding Explore IntEncoder/Decoder bulk API --- Key: LUCENE-4620 URL: https://issues.apache.org/jira/browse/LUCENE-4620 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 4.1, 5.0 Attachments: LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch Today, IntEncoder/Decoder offer a streaming API, where you can encode(int) and decode(int). Originally, we believed that this layer can be useful for other scenarios, but in practice it's used only for writing/reading the category ordinals from payload/DV. Therefore, Mike and I would like to explore a bulk API, something like encode(IntsRef, BytesRef) and decode(BytesRef, IntsRef). Perhaps the Encoder can still be streaming (as we don't know in advance how many ints will be written), dunno. Will figure this out as we go. One thing to check is whether the bulk API can work w/ e.g. facet associations, which can write arbitrary byte[], and so may decoding to an IntsRef won't make sense. This too we'll figure out as we go. I don't rule out that associations will use a different bulk API. At the end of the day, the requirement is for someone to be able to configure how ordinals are written (i.e. different encoding schemes: VInt, PackedInts etc.) and later read, with as little overhead as possible. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.8.0-ea-b65) - Build # 3786 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/3786/ Java: 64bit/jdk1.8.0-ea-b65 -XX:+UseParallelGC All tests passed Build Log: [...truncated 23597 lines...] [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.7 [javadoc] Loading source files for package org.apache.lucene... [javadoc] Loading source files for package org.apache.lucene.analysis... [javadoc] Loading source files for package org.apache.lucene.analysis.tokenattributes... [javadoc] Loading source files for package org.apache.lucene.codecs... [javadoc] Loading source files for package org.apache.lucene.codecs.compressing... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene3x... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40.values... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene41... [javadoc] Loading source files for package org.apache.lucene.codecs.perfield... [javadoc] Loading source files for package org.apache.lucene.document... [javadoc] Loading source files for package org.apache.lucene.index... [javadoc] Loading source files for package org.apache.lucene.search... [javadoc] Loading source files for package org.apache.lucene.search.payloads... [javadoc] Loading source files for package org.apache.lucene.search.similarities... [javadoc] Loading source files for package org.apache.lucene.search.spans... [javadoc] Loading source files for package org.apache.lucene.store... [javadoc] Loading source files for package org.apache.lucene.util... [javadoc] Loading source files for package org.apache.lucene.util.automaton... [javadoc] Loading source files for package org.apache.lucene.util.fst... [javadoc] Loading source files for package org.apache.lucene.util.mutable... 
[javadoc] Loading source files for package org.apache.lucene.util.packed... [javadoc] Constructing Javadoc information... [javadoc] Standard Doclet version 1.8.0-ea [javadoc] Building tree for all the packages and classes... [javadoc] Generating /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/package-summary.html... [javadoc] Copying file /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-1.png to directory /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files... [javadoc] Copying file /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-2.png to directory /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files... [javadoc] Building index for all the packages and classes... [javadoc] Building index for all classes... [javadoc] Generating /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/help-doc.html... [javadoc] 1 warning [...truncated 45 lines...] [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] Loading source files for package org.apache.lucene.analysis.ar... [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.7 [javadoc] Loading source files for package org.apache.lucene.analysis.bg... [javadoc] Loading source files for package org.apache.lucene.analysis.br... [javadoc] Loading source files for package org.apache.lucene.analysis.ca... [javadoc] Loading source files for package org.apache.lucene.analysis.charfilter... [javadoc] Loading source files for package org.apache.lucene.analysis.cjk... [javadoc] Loading source files for package org.apache.lucene.analysis.cn... [javadoc] Loading source files for package org.apache.lucene.analysis.commongrams... 
[javadoc] Loading source files for package org.apache.lucene.analysis.compound... [javadoc] Loading source files for package org.apache.lucene.analysis.compound.hyphenation... [javadoc] Loading source files for package org.apache.lucene.analysis.core... [javadoc] Loading source files for package org.apache.lucene.analysis.cz... [javadoc] Loading source files for package org.apache.lucene.analysis.da... [javadoc] Loading source files for package org.apache.lucene.analysis.de... [javadoc] Loading source files for package org.apache.lucene.analysis.el... [javadoc] Loading source files for package org.apache.lucene.analysis.en... [javadoc] Loading source files for package org.apache.lucene.analysis.es... [javadoc] Loading source files for package org.apache.lucene.analysis.eu... [javadoc] Loading source files for package org.apache.lucene.analysis.fa... [javadoc] Loading source files for package org.apache.lucene.analysis.fi... [javadoc] Loading source files for package
[jira] [Commented] (LUCENE-4620) Explore IntEncoder/Decoder bulk API
[ https://issues.apache.org/jira/browse/LUCENE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13555025#comment-13555025 ] Commit Tag Bot commented on LUCENE-4620: [branch_4x commit] Shai Erera http://svn.apache.org/viewvc?view=revisionrevision=1433929 LUCENE-4620: inline encoding/decoding Explore IntEncoder/Decoder bulk API --- Key: LUCENE-4620 URL: https://issues.apache.org/jira/browse/LUCENE-4620 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 4.1, 5.0 Attachments: LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch, LUCENE-4620.patch Today, IntEncoder/Decoder offer a streaming API, where you can encode(int) and decode(int). Originally, we believed that this layer can be useful for other scenarios, but in practice it's used only for writing/reading the category ordinals from payload/DV. Therefore, Mike and I would like to explore a bulk API, something like encode(IntsRef, BytesRef) and decode(BytesRef, IntsRef). Perhaps the Encoder can still be streaming (as we don't know in advance how many ints will be written), dunno. Will figure this out as we go. One thing to check is whether the bulk API can work w/ e.g. facet associations, which can write arbitrary byte[], and so may decoding to an IntsRef won't make sense. This too we'll figure out as we go. I don't rule out that associations will use a different bulk API. At the end of the day, the requirement is for someone to be able to configure how ordinals are written (i.e. different encoding schemes: VInt, PackedInts etc.) and later read, with as little overhead as possible. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4686) Write a specialized DGapVIntEncoder/Decoder for facets
[ https://issues.apache.org/jira/browse/LUCENE-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4686: --- Attachment: LUCENE-4686.patch Adds DGapVInt8IntEncoder/Decoder. At least according to EncodingSpeed, it's faster than DGap(VInt8), but we should test w/ luceneutil. I set it as the default encoder/decoder in CLP, with a nocommit until luceneutil blesses it. Write a specialized DGapVIntEncoder/Decoder for facets -- Key: LUCENE-4686 URL: https://issues.apache.org/jira/browse/LUCENE-4686 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-4686.patch Today the default encoder/decoder for facets is DGap(VInt). That is a DGapEncoder wrapping a VIntEncoder. Instead of this wrapping, we can write a specialized DGapVIntEncoder which does it all in one call. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
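The fusion the issue proposes — compute the d-gap and VInt-encode it in the same loop, instead of a DGapEncoder wrapping a VIntEncoder — can be sketched like this. Names are invented for illustration and this is not Lucene's DGapVInt8IntEncoder/Decoder:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class DGapVIntDemo {
    // Sorted ordinals become deltas (d-gaps), and each delta is
    // VInt-encoded immediately, with no intermediate encoder layer.
    public static byte[] encode(int[] sortedOrdinals) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int ord : sortedOrdinals) {
            int gap = ord - prev;             // small for dense ordinal lists
            prev = ord;
            while ((gap & ~0x7F) != 0) {      // VInt: low 7 bits first
                out.write((gap & 0x7F) | 0x80);
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    public static int[] decode(byte[] bytes) {
        List<Integer> ords = new ArrayList<>();
        int i = 0, prev = 0;
        while (i < bytes.length) {
            int b = bytes[i++] & 0xFF;
            int gap = b & 0x7F;
            int shift = 7;
            while ((b & 0x80) != 0) {
                b = bytes[i++] & 0xFF;
                gap |= (b & 0x7F) << shift;
                shift += 7;
            }
            prev += gap;                      // prefix sum restores the ordinal
            ords.add(prev);
        }
        return ords.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

The win comes from clustered ordinals: large absolute values shrink to 1-byte gaps, and the fused loop avoids a per-value call through the wrapping encoder.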
[jira] [Commented] (LUCENE-4686) Write a specialized DGapVIntEncoder/Decoder for facets
[ https://issues.apache.org/jira/browse/LUCENE-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13555032#comment-13555032 ] Michael McCandless commented on LUCENE-4686: +1 {noformat} TaskQPS base StdDevQPS comp StdDev Pct diff PKLookup 181.92 (1.3%) 181.25 (2.0%) -0.4% ( -3% -2%) LowTerm 83.54 (2.0%) 85.61 (2.6%) 2.5% ( -2% -7%) MedTerm 31.53 (0.9%) 33.01 (1.7%) 4.7% ( 2% -7%) HighTerm 19.41 (0.8%) 20.57 (1.6%) 6.0% ( 3% -8%) {noformat} Write a specialized DGapVIntEncoder/Decoder for facets -- Key: LUCENE-4686 URL: https://issues.apache.org/jira/browse/LUCENE-4686 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-4686.patch Today the default encoder/decoder for facets is DGap(VInt). That is a DGapEncoder wrapping a VIntEncoder. Instead of this wrapping, we can write a specialized DGapVIntEncoder which does it all in one call. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-trunk-java7 - Build # 3648 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-java7/3648/ All tests passed Build Log: [...truncated 25904 lines...] -documentation-lint: [echo] checking for broken html... [jtidy] Checking for broken html (such as invalid tags)... [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-java7/lucene/build/jtidy_tmp [echo] Checking for broken links... [exec] [exec] Crawl/parse... [exec] [exec] Verify... [echo] Checking for missing docs... [exec] [exec] build/docs/suggest/org/apache/lucene/search/spell/package-summary.html [exec] missing: DirectSpellChecker.ScoreTerm [exec] [exec] Missing javadocs were found! BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-java7/build.xml:60: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-java7/lucene/build.xml:245: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-java7/lucene/common-build.xml:1972: exec returned: 1 Total time: 50 minutes 39 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4684) Allow DirectSpellChecker to be extended
[ https://issues.apache.org/jira/browse/LUCENE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555033#comment-13555033 ] Commit Tag Bot commented on LUCENE-4684:

[trunk commit] Martijn van Groningen
http://svn.apache.org/viewvc?view=revision&revision=1433932

LUCENE-4684: Added jdoc and made fields private again.

Allow DirectSpellChecker to be extended
---------------------------------------
                Key: LUCENE-4684
                URL: https://issues.apache.org/jira/browse/LUCENE-4684
            Project: Lucene - Core
         Issue Type: Improvement
         Components: modules/spellchecker
        Environment: Currently the suggestSimilar() that actually operates on the FuzzyTermy is private. Would be great if that would just be protected for extensions.
           Reporter: Martijn van Groningen
           Assignee: Martijn van Groningen
           Priority: Minor
            Fix For: 4.1
        Attachments: LUCENE-4684.patch
[jira] [Commented] (LUCENE-4684) Allow DirectSpellChecker to be extended
[ https://issues.apache.org/jira/browse/LUCENE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555036#comment-13555036 ] Commit Tag Bot commented on LUCENE-4684:

[branch_4x commit] Martijn van Groningen
http://svn.apache.org/viewvc?view=revision&revision=1433934

LUCENE-4684: Added jdoc and made fields private again.
[jira] [Commented] (LUCENE-4686) Write a specialized DGapVIntEncoder/Decoder for facets
[ https://issues.apache.org/jira/browse/LUCENE-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555037#comment-13555037 ] Commit Tag Bot commented on LUCENE-4686:

[trunk commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1433935

LUCENE-4686: Write a specialized DGapVIntEncoder/Decoder for facets
[jira] [Commented] (LUCENE-4684) Allow DirectSpellChecker to be extended
[ https://issues.apache.org/jira/browse/LUCENE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555038#comment-13555038 ] Commit Tag Bot commented on LUCENE-4684:

[trunk commit] Martijn van Groningen
http://svn.apache.org/viewvc?view=revision&revision=1433933

LUCENE-4684: Made fields private again (2).
[jira] [Resolved] (LUCENE-4686) Write a specialized DGapVIntEncoder/Decoder for facets
[ https://issues.apache.org/jira/browse/LUCENE-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-4686.
--------------------------------
    Resolution: Fixed
 Fix Version/s: 5.0
                4.2

Thanks Mike. Committed to trunk and 4x.
[jira] [Commented] (LUCENE-4686) Write a specialized DGapVIntEncoder/Decoder for facets
[ https://issues.apache.org/jira/browse/LUCENE-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555042#comment-13555042 ] Commit Tag Bot commented on LUCENE-4686:

[branch_4x commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1433938

LUCENE-4686: Write a specialized DGapVIntEncoder/Decoder for facets
[jira] [Resolved] (LUCENE-4623) facets should index drill-down fields using DOCS_ONLY
[ https://issues.apache.org/jira/browse/LUCENE-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-4623.
--------------------------------
       Resolution: Fixed
    Fix Version/s: 5.0
                   4.2
         Assignee: Shai Erera
    Lucene Fields: New,Patch Available  (was: New)

Fixed in LUCENE-4602.

facets should index drill-down fields using DOCS_ONLY
-----------------------------------------------------
                Key: LUCENE-4623
                URL: https://issues.apache.org/jira/browse/LUCENE-4623
            Project: Lucene - Core
         Issue Type: Improvement
         Components: modules/facet
           Reporter: Michael McCandless
           Assignee: Shai Erera
            Fix For: 4.2, 5.0

Today we index as DOCS_AND_POSITIONS, which is necessary because we stuff the payload into one of those tokens. If we indexed under two fields instead, then we could make the drill-down field DOCS_ONLY. But ... once/if we cutover to doc values then we could use one field again.
[jira] [Updated] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Kan updated SOLR-1604:
-----------------------------
    Attachment: ComplexPhrase_solr_3.4.zip

This is the ComplexPhrase project based on the version submitted on 21/Jul/11. It compiles and runs under Solr 3.4. I have uncommented the tests in /org/apache/solr/search/ComplexPhraseQParserPluginTest.java and they passed.

Wildcards, ORs etc inside Phrase Queries
----------------------------------------
                Key: SOLR-1604
                URL: https://issues.apache.org/jira/browse/SOLR-1604
            Project: Solr
         Issue Type: Improvement
         Components: query parsers, search
    Affects Versions: 1.4
           Reporter: Ahmet Arslan
           Priority: Minor
        Attachments: ASF.LICENSE.NOT.GRANTED--ComplexPhrase.zip, ComplexPhraseQueryParser.java, ComplexPhrase_solr_3.4.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch

Solr plugin for ComplexPhraseQueryParser (LUCENE-1486), which supports wildcards, ORs, ranges, and fuzzies inside phrase queries.
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b65) - Build # 3787 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/3787/
Java: 32bit/jdk1.8.0-ea-b65 -client -XX:+UseG1GC

1 tests failed.
REGRESSION:  org.apache.lucene.benchmark.byTask.TestPerfTasksLogic.testIndexingWithFacets

Error Message:
codec=Lucene3x does not support docValues: from docValuesFormat().docsConsumer(...) returned null; field=$facets

Stack Trace:
java.lang.IllegalStateException: codec=Lucene3x does not support docValues: from docValuesFormat().docsConsumer(...) returned null; field=$facets
	at __randomizedtesting.SeedInfo.seed([1C47E5B11BE8E612:3DE522FC0982D8F2]:0)
	at org.apache.lucene.index.DocFieldProcessor.docValuesConsumer(DocFieldProcessor.java:362)
	at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:274)
	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:250)
	at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:376)
	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1484)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1159)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1140)
	at org.apache.lucene.benchmark.byTask.tasks.AddDocTask.doLogic(AddDocTask.java:70)
	at org.apache.lucene.benchmark.byTask.tasks.AddFacetedDocTask.doLogic(AddFacetedDocTask.java:73)
	at org.apache.lucene.benchmark.byTask.tasks.PerfTask.runAndMaybeStats(PerfTask.java:132)
	at org.apache.lucene.benchmark.byTask.tasks.TaskSequence.doSerialTasks(TaskSequence.java:197)
	at org.apache.lucene.benchmark.byTask.tasks.TaskSequence.doLogic(TaskSequence.java:138)
	at org.apache.lucene.benchmark.byTask.tasks.PerfTask.runAndMaybeStats(PerfTask.java:143)
	at org.apache.lucene.benchmark.byTask.tasks.TaskSequence.doSerialTasks(TaskSequence.java:197)
	at org.apache.lucene.benchmark.byTask.tasks.TaskSequence.doLogic(TaskSequence.java:138)
	at org.apache.lucene.benchmark.byTask.tasks.PerfTask.runAndMaybeStats(PerfTask.java:143)
	at org.apache.lucene.benchmark.byTask.utils.Algorithm.execute(Algorithm.java:301)
	at org.apache.lucene.benchmark.byTask.Benchmark.execute(Benchmark.java:77)
	at org.apache.lucene.benchmark.BenchmarkTestCase.execBenchmark(BenchmarkTestCase.java:83)
	at org.apache.lucene.benchmark.byTask.TestPerfTasksLogic.testIndexingWithFacets(TestPerfTasksLogic.java:794)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:474)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
	at
Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b65) - Build # 3787 - Still Failing!
Tests that use docvalues in branch4x need @SuppressCodecs(Lucene3x)

On Wed, Jan 16, 2013 at 6:21 AM, Policeman Jenkins Server <jenk...@thetaphi.de> wrote:
> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/3787/
> Java: 32bit/jdk1.8.0-ea-b65 -client -XX:+UseG1GC
> [...]
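The suggested fix amounts to annotating the offending test class so the randomized test framework never picks the preflex codec. A sketch, assuming the Lucene 4.x `LuceneTestCase.SuppressCodecs` annotation (the class name below is illustrative, not the actual patch):

```java
import org.apache.lucene.util.LuceneTestCase;
import org.apache.lucene.util.LuceneTestCase.SuppressCodecs;

// Lucene3x cannot write doc values, so any 4.x test that indexes them
// (e.g. the $facets field) must exclude that codec from randomization.
@SuppressCodecs("Lucene3x")
public class TestPerfTasksLogicExample extends LuceneTestCase {
  // ... tests indexing doc values can now assume a capable codec ...
}
```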
[jira] [Commented] (SOLR-4288) Improve logging for FileDataSource (basePath, relative resources).
[ https://issues.apache.org/jira/browse/SOLR-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555070#comment-13555070 ] Steve Rowe commented on SOLR-4288:

bq. I'll commit to trunk – if there are no objections, could you merge-in to the release branch, Steve?

Yes, I'll merge into branch_4x and lucene_solr_4_1.

Improve logging for FileDataSource (basePath, relative resources).
------------------------------------------------------------------
                Key: SOLR-4288
                URL: https://issues.apache.org/jira/browse/SOLR-4288
            Project: Solr
         Issue Type: Bug
    Affects Versions: 4.0
           Reporter: Dawid Weiss
           Assignee: Dawid Weiss
           Priority: Minor
            Fix For: 4.1, 5.0
        Attachments: SOLR-4288.patch

In fact, the logic is broken:

{code}
if (!file.isAbsolute()) file = new File(basePath + query);
{code}

because basePath is null, so 'null' is concatenated with the query string (path), resulting in an invalid path. It should be checked if basePath is null; if so, default to '.'. Then resolve the relative location as:

{code}
new File(basePathFile, query);
{code}

I'd also say change the log so that the absolute path is also logged in the warning message, otherwise it's really hard to figure out what's going on.
[jira] [Commented] (SOLR-4288) Improve logging for FileDataSource (basePath, relative resources).
[ https://issues.apache.org/jira/browse/SOLR-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555080#comment-13555080 ] Dawid Weiss commented on SOLR-4288:

Thanks Steve!
[jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators
[ https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555085#comment-13555085 ] Shawn Heisey commented on SOLR-2649:

bq. Thank you! We have been waiting a long time for this fix.

I'm a little confused here. Were you talking to me? I don't have a fix, I was just saying that I'm having the same problem, and that my problem is not exactly like the initial description. The initial description says that when boolean operators are present, edismax behaves as if mm=100%. I'm seeing the opposite.

To summarize: When boolean operators are present in the query, two versions of Solr are behaving as if I did not have mm=100%, q.op=AND, or defaultOperator=AND in the schema. Both versions behave as if the default operator is OR. For 3.5, I have tried all three of those options simultaneously. For 4.1, I have tried just the first two, because defaultOperator is deprecated.

MM ignored in edismax queries with operators
--------------------------------------------
                Key: SOLR-2649
                URL: https://issues.apache.org/jira/browse/SOLR-2649
            Project: Solr
         Issue Type: Bug
         Components: query parsers
           Reporter: Magnus Bergmark
           Priority: Minor
            Fix For: 4.2, 5.0

Hypothetical scenario:
1. User searches for stocks oil gold with MM set to 50%
2. User adds -stockings to the query: stocks oil gold -stockings
3. User gets no hits since MM was ignored and all terms were AND-ed together

The behavior seems to be intentional, although the reason why is never explained:

{code}
// For correct lucene queries, turn off mm processing if there
// were explicit operators (except for AND).
boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0;
{code}

(lines 232-234, taken from tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)

This makes edismax unsuitable as a replacement for dismax; mm is one of the primary features of dismax.
[jira] [Commented] (SOLR-4288) Improve logging for FileDataSource (basePath, relative resources).
[ https://issues.apache.org/jira/browse/SOLR-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555088#comment-13555088 ] Commit Tag Bot commented on SOLR-4288:

[branch_4x commit] Steven Rowe
http://svn.apache.org/viewvc?view=revision&revision=1433957

SOLR-4288: Improve logging for FileDataSource (basePath, relative resources). (merged trunk r1433849)
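The resolution logic SOLR-4288 describes can be illustrated with a hedged, self-contained sketch. `RelativePathResolver` and its method are hypothetical names for illustration, not the actual FileDataSource code; the sketch only shows the two fixes the issue asks for (never string-concatenate a null basePath, and log the absolute path).

```java
import java.io.File;

// Hypothetical helper mirroring the fix: default a null/empty basePath
// to "." and resolve relative queries against it as File(parent, child).
public class RelativePathResolver {

  public static File resolve(String basePath, String query) {
    File file = new File(query);
    if (!file.isAbsolute()) {
      // Never concatenate: "null" + query would yield a bogus path like "nulldata.xml".
      File base = new File(basePath == null || basePath.isEmpty() ? "." : basePath);
      file = new File(base, query);
    }
    // Log the absolute path so misconfigurations are easy to diagnose.
    System.err.println("Resolved resource: " + file.getAbsolutePath());
    return file;
  }
}
```

With a null basePath, `resolve(null, "data-config.xml")` now yields a path under the current directory instead of an invalid `nulldata-config.xml`.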
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #217: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/217/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.SyncSliceTest.testDistribSearch

Error Message:
shard1 should have just been set up to be inconsistent - but it's still consistent

Stack Trace:
java.lang.AssertionError: shard1 should have just been set up to be inconsistent - but it's still consistent
	at __randomizedtesting.SeedInfo.seed([A7B402BA1BAF128E:26528CA26CF072B2]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.assertTrue(Assert.java:43)
	at org.junit.Assert.assertNotNull(Assert.java:526)
	at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:214)
	at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:794)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
[jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators
[ https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555096#comment-13555096 ] Thomas Egense commented on SOLR-2649:

Thanks for the clarification.
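To see why mm silently stops applying in the SOLR-2649 scenario, here is a toy, self-contained version of the operator-counting check. `MmDisableDemo` is a hypothetical illustration, heavily simplified from the real ExtendedDismaxQParserPlugin parsing; it only mirrors the quoted `doMinMatched` condition.

```java
// Toy model of the quoted edismax logic: any OR/NOT/+/- operator in the
// query disables min-should-match (mm) processing entirely.
public class MmDisableDemo {

  public static boolean mmApplies(String query) {
    int numOR = 0, numNOT = 0, numPluses = 0, numMinuses = 0;
    for (String tok : query.trim().split("\\s+")) {
      if (tok.equals("OR")) numOR++;
      else if (tok.equals("NOT")) numNOT++;
      else if (tok.startsWith("+")) numPluses++;
      else if (tok.startsWith("-")) numMinuses++;
    }
    // Mirrors: boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0;
    return (numOR + numNOT + numPluses + numMinuses) == 0;
  }
}
```

So `stocks oil gold` keeps mm processing, while adding `-stockings` (one minus) turns it off, which is exactly the surprise the issue reports.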
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_10) - Build # 3808 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/3808/ Java: 64bit/jdk1.7.0_10 -XX:+UseSerialGC All tests passed Build Log: [...truncated 25872 lines...] -documentation-lint: [echo] checking for broken html... [jtidy] Checking for broken html (such as invalid tags)... [delete] Deleting directory /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/jtidy_tmp [echo] Checking for broken links... [exec] [exec] Crawl/parse... [exec] [exec] Verify... [echo] Checking for missing docs... [exec] [exec] build/docs/suggest/org/apache/lucene/search/spell/DirectSpellChecker.ScoreTerm.html [exec] missing Fields: boost [exec] missing Fields: docfreq [exec] missing Fields: score [exec] missing Fields: term [exec] missing Fields: termAsString [exec] missing Constructors: DirectSpellChecker.ScoreTerm() [exec] [exec] Missing javadocs were found! BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:60: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:273: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:1972: exec returned: 1 Total time: 38 minutes 4 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 64bit/jdk1.7.0_10 -XX:+UseSerialGC Email was triggered for: Failure Sending email for trigger: Failure
[jira] [Created] (SOLR-4309) /browse: Improve JQuery autosuggest behavior
Jan Høydahl created SOLR-4309: - Summary: /browse: Improve JQuery autosuggest behavior Key: SOLR-4309 URL: https://issues.apache.org/jira/browse/SOLR-4309 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Reporter: Jan Høydahl Fix For: 4.2, 5.0 Three problems with current autosuggest JavaScript behavior in Velocity /browse: 1. The first entry in the list is pre-selected, so hitting ENTER searches the first suggestion instead of what the user entered 2. jQuery autosuggest's built-in cache is buggy, rendering funny suggestion lists 2nd time 3. Using the arrow buttons and hitting ENTER only fills in the selected item into the search box, does not perform the search
[jira] [Updated] (SOLR-4309) /browse: Improve JQuery autosuggest behavior
[ https://issues.apache.org/jira/browse/SOLR-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-4309: -- Attachment: SOLR-4309.patch The attached patch fixes all three issues /browse: Improve JQuery autosuggest behavior Key: SOLR-4309 URL: https://issues.apache.org/jira/browse/SOLR-4309 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Reporter: Jan Høydahl Labels: autocomplete, autosuggest Fix For: 4.2, 5.0 Attachments: SOLR-4309.patch Three problems with current autosuggest JavaScript behavior in Velocity /browse: 1. The first entry in the list is pre-selected, so hitting ENTER searches the first suggestion instead of what the user entered 2. jQuery autosuggest's built-in cache is buggy, rendering funny suggestion lists 2nd time 3. Using the arrow buttons and hitting ENTER only fills in the selected item into the search box, does not perform the search
[jira] [Created] (LUCENE-4688) Reuse TermsEnum in BlockTreeTermsReader
Simon Willnauer created LUCENE-4688: --- Summary: Reuse TermsEnum in BlockTreeTermsReader Key: LUCENE-4688 URL: https://issues.apache.org/jira/browse/LUCENE-4688 Project: Lucene - Core Issue Type: Improvement Components: core/codecs Affects Versions: 4.0, 4.1 Reporter: Simon Willnauer Fix For: 4.2, 5.0 Opening a TermsEnum comes with a significant cost at this point if done frequently like primary key lookups or if many segments are present. Currently we don't reuse it at all and create a lot of objects even if the enum is just used for a single seekExact (ie. TermQuery). Stressing the Terms#iterator(reuse) call shows significant gains with reuse...
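The kind of reuse being proposed can be sketched in plain Java (the Cursor and iterator names below are hypothetical stand-ins for TermsEnum and Terms#iterator(reuse); this is not Lucene code): the caller hands back its previous cursor, which is reset against the next segment instead of being reallocated on every lookup.

```java
import java.util.HashMap;
import java.util.Map;

public class ReuseSketch {
    // Minimal stand-in for a TermsEnum-like cursor (hypothetical, not Lucene's).
    static class Cursor {
        Map<String, Integer> segment;
        int resets;  // counts resets so we can see that no new object was allocated
        Cursor reset(Map<String, Integer> seg) { this.segment = seg; resets++; return this; }
        boolean seekExact(String term) { return segment.containsKey(term); }
    }

    // Terms#iterator(reuse)-style entry point: reuse the caller's cursor if given.
    static Cursor iterator(Cursor reuse, Map<String, Integer> seg) {
        return (reuse != null) ? reuse.reset(seg) : new Cursor().reset(seg);
    }

    public static void main(String[] args) {
        Map<String, Integer> seg1 = new HashMap<>(); seg1.put("id:1", 0);
        Map<String, Integer> seg2 = new HashMap<>(); seg2.put("id:2", 0);
        // Primary-key style lookup across segments, reusing one cursor throughout.
        Cursor cursor = iterator(null, seg1);
        boolean inSeg1 = cursor.seekExact("id:2");
        cursor = iterator(cursor, seg2);
        boolean inSeg2 = cursor.seekExact("id:2");
        System.out.println(inSeg1 + " " + inSeg2 + " resets=" + cursor.resets);  // false true resets=2
    }
}
```

Per-lookup allocation disappears in the loop, which is where the gains in the benchmark below come from; the hard part in the real patch is making reset() safe inside BlockTree's internals.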
[jira] [Commented] (SOLR-4309) /browse: Improve JQuery autosuggest behavior
[ https://issues.apache.org/jira/browse/SOLR-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555120#comment-13555120 ] Erik Hatcher commented on SOLR-4309: bq. The attached patch fixes all three issues +1 /browse: Improve JQuery autosuggest behavior Key: SOLR-4309 URL: https://issues.apache.org/jira/browse/SOLR-4309 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Reporter: Jan Høydahl Labels: autocomplete, autosuggest Fix For: 4.2, 5.0 Attachments: SOLR-4309.patch Three problems with current autosuggest JavaScript behavior in Velocity /browse: 1. The first entry in the list is pre-selected, so hitting ENTER searches the first suggestion instead of what the user entered 2. jQuery autosuggest's built-in cache is buggy, rendering funny suggestion lists 2nd time 3. Using the arrow buttons and hitting ENTER only fills in the selected item into the search box, does not perform the search
[jira] [Updated] (LUCENE-4688) Reuse TermsEnum in BlockTreeTermsReader
[ https://issues.apache.org/jira/browse/LUCENE-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-4688: Attachment: LUCENE-4688.patch here is an initial patch including my small benchmark that shows a pretty significant impact of reuse. the benchmark indexes 2 Million super small docs and checks for each doc if the ID has already been indexed. I use NRT manager to reopen the reader every second. the results are pretty significant IMO: {noformat} start benchmark run with reuse Run took: 24 seconds with reuse terms enum = [true] run without reuse Run took: 34 seconds with reuse terms enum = [false] {noformat} while all tests pass with that patch I really wanna ask somebody (mike? :) ) with more knowledge about the BlockTreeTermsReader to look at this patch!! I also run benchmarks with lucene util but didn't see any real gains with this change so far. Reuse TermsEnum in BlockTreeTermsReader --- Key: LUCENE-4688 URL: https://issues.apache.org/jira/browse/LUCENE-4688 Project: Lucene - Core Issue Type: Improvement Components: core/codecs Affects Versions: 4.0, 4.1 Reporter: Simon Willnauer Fix For: 4.2, 5.0 Attachments: LUCENE-4688.patch Opening a TermsEnum comes with a significant cost at this point if done frequently like primary key lookups or if many segments are present. Currently we don't reuse it at all and create a lot of objects even if the enum is just used for a single seekExact (ie. TermQuery). Stressing the Terms#iterator(reuse) call shows significant gains with reuse...
[jira] [Updated] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4165: -- Fix Version/s: (was: 4.1) 4.2 Queries blocked when stopping and starting a node - Key: SOLR-4165 URL: https://issues.apache.org/jira/browse/SOLR-4165 Project: Solr Issue Type: Bug Components: search, SolrCloud Affects Versions: 5.0 Environment: 5.0-SNAPSHOT 1366361:1420056M - markus - 2012-12-11 11:52:06 Reporter: Markus Jelsma Assignee: Mark Miller Priority: Critical Fix For: 4.2, 5.0 Our 10 node test cluster (10 shards, 20 cores) blocks incoming queries briefly when a node is stopped gracefully and again blocks queries for at least a few seconds when the node is started again. We're using siege to send roughly 10 queries per second to a pair of load balancers. Those load balancers ping (admin/ping) each node every few hundred milliseconds. The ping queries continue to operate normally while the requests to our main request handler are blocked. A manual request directly to a live Solr node is also blocked for the same duration. There are no errors logged. But it is clear that the entire cluster blocks queries as soon as the starting node is reading its config from Zookeeper, likely even slightly earlier. The blocking time when stopping a node varies between 1 or 5 seconds. The blocking time when starting a node varies between 10 up to 30 seconds. The blocked queries come rushing in again after a queue of ping requests are served. The ping request sets the main request handler via the qt parameter.
[jira] [Updated] (SOLR-4234) Add support for binary files in ZooKeeper.
[ https://issues.apache.org/jira/browse/SOLR-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4234: -- Fix Version/s: (was: 4.1) 4.2 Add support for binary files in ZooKeeper. -- Key: SOLR-4234 URL: https://issues.apache.org/jira/browse/SOLR-4234 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0 Reporter: Eric Pugh Assignee: Mark Miller Fix For: 4.2, 5.0 Attachments: binary_upload_download.patch, fix_show_file_handler_with_binaries.patch, SOLR4234_binary_files.patch, solr.png I was attempting to get the ShowFileHandler to show a .png file, and it was failing. But in non-ZK mode it worked just fine! It took a while, but it seems that we upload to zk as text, and download as text as well. I've attached a unit test that demonstrates the problem, and a fix. You have to have a binary file in the conf directory to make the test work, I put solr.png in the collection1/conf/velocity directory.
[jira] [Commented] (LUCENE-4688) Reuse TermsEnum in BlockTreeTermsReader
[ https://issues.apache.org/jira/browse/LUCENE-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555126#comment-13555126 ] Michael McCandless commented on LUCENE-4688: Awesome! I'll look at the patch. Reuse is important w/ BlockTree's TermsEnum ... Reuse TermsEnum in BlockTreeTermsReader --- Key: LUCENE-4688 URL: https://issues.apache.org/jira/browse/LUCENE-4688 Project: Lucene - Core Issue Type: Improvement Components: core/codecs Affects Versions: 4.0, 4.1 Reporter: Simon Willnauer Fix For: 4.2, 5.0 Attachments: LUCENE-4688.patch Opening a TermsEnum comes with a significant cost at this point if done frequently like primary key lookups or if many segments are present. Currently we don't reuse it at all and create a lot of objects even if the enum is just used for a single seekExact (ie. TermQuery). Stressing the Terms#iterator(reuse) call shows significant gains with reuse...
[jira] [Updated] (SOLR-4038) SolrCloud indexing blocks if shard is marked as DOWN
[ https://issues.apache.org/jira/browse/SOLR-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4038: -- Fix Version/s: (was: 4.1) 4.2 SolrCloud indexing blocks if shard is marked as DOWN Key: SOLR-4038 URL: https://issues.apache.org/jira/browse/SOLR-4038 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0 Reporter: Markus Jelsma Assignee: Mark Miller Fix For: 4.2, 5.0 See: http://lucene.472066.n3.nabble.com/SolrCloud-indexing-blocks-if-node-is-recovering-td4017827.html While indexing (without CloudSolrServer at that time) one node dies with an OOME perhaps because of the linked issue SOLR-4032. The OOME stack traces are varied but here are some ZK-related logs between the OOME stack traces: {code} 2012-11-02 14:14:37,126 INFO [solr.update.UpdateLog] - [RecoveryThread] - : Dropping buffered updates FSUpdateLog{state=BUFFERING, tlog=null} 2012-11-02 14:14:37,127 ERROR [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Recovery failed - trying again... (2) core=shard_e 2012-11-02 14:14:37,127 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Wait 8.0 seconds before trying to recover again (3) 2012-11-02 14:14:45,328 INFO [solr.cloud.ZkController] - [RecoveryThread] - : numShards not found on descriptor - reading it from system property 2012-11-02 14:14:45,363 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Starting Replication Recovery. core=shard_e 2012-11-02 14:14:45,363 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - : Creating new http client, config:maxConnections=128maxConnectionsPerHost=32followRedirects=false 2012-11-02 14:14:45,775 INFO [common.cloud.ZkStateReader] - [main-EventThread] - : A cluster state change has occurred - updating... (10) 2012-11-02 14:14:50,987 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Begin buffering updates. core=shard_e 2012-11-02 14:14:50,987 INFO [solr.update.UpdateLog] - [RecoveryThread] - : Starting to buffer updates. 
FSUpdateLog{state=ACTIVE, tlog=null} 2012-11-02 14:14:50,987 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Attempting to replicate from http://rot05.solrserver:8080/solr/shard_e/. core=shard_e 2012-11-02 14:14:50,987 INFO [solrj.impl.HttpClientUtil] - [RecoveryThread] - : Creating new http client, config:maxConnections=128maxConnectionsPerHost=32followRedirects=false 2012-11-02 14:15:03,303 INFO [solr.core.CachingDirectoryFactory] - [RecoveryThread] - : Releasing directory:/opt/solr/cores/shard_f/data/index 2012-11-02 14:15:03,303 INFO [solr.handler.SnapPuller] - [RecoveryThread] - : removing temporary index download directory files NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solr/cores/shard_f/data/index.20121102141424591 lockFactory=org.apache.lucene.store.SimpleFSLockFactory@1520a48c; maxCacheMB=48.0 maxMergeSizeMB=4.0) 2012-11-02 14:15:09,421 INFO [apache.zookeeper.ClientCnxn] - [main-SendThread(rot1.zkserver:2181)] - : Client session timed out, have not heard from server in 11873ms for sessionid 0x13abc504486000f, closing socket connection and attempting reconnect 2012-11-02 14:15:09,422 ERROR [solr.core.SolrCore] - [http-8080-exec-1] - : org.apache.solr.common.SolrException: Ping query caused exception: Java heap space . . 2012-11-02 14:15:09,867 INFO [common.cloud.ConnectionManager] - [main-EventThread] - : Watcher org.apache.solr.common.cloud.ConnectionManager@305e7020 name:ZooKeeperConnection Watcher:rot1.zkserver:2181,rot2.zkserver:2181 got event WatchedEvent state:Disconnected type:None path:null path:null type:None 2012-11-02 14:15:09,867 INFO [common.cloud.ConnectionManager] - [main-EventThread] - : zkClient has disconnected 2012-11-02 14:15:09,869 ERROR [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Error while trying to recover:java.lang.OutOfMemoryError: Java heap space . . 
2012-11-02 14:15:10,159 INFO [solr.update.UpdateLog] - [RecoveryThread] - : Dropping buffered updates FSUpdateLog{state=BUFFERING, tlog=null} 2012-11-02 14:15:10,159 ERROR [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Recovery failed - trying again... (3) core=shard_e 2012-11-02 14:15:10,159 INFO [solr.cloud.RecoveryStrategy] - [RecoveryThread] - : Wait 16.0 seconds before trying to recover again (4) 2012-11-02 14:15:09,878 INFO [solr.core.CachingDirectoryFactory] - [RecoveryThread] - : Releasing directory:/opt/solr/cores/shard_f/data/index.20121102141424591 2012-11-02 14:15:10,192 INFO [solr.core.CachingDirectoryFactory] - [RecoveryThread] - : Releasing directory:/opt/solr/cores/shard_f_f/data/index 2012-11-02 14:15:10,192 ERROR
[jira] [Updated] (LUCENE-4687) Lazily initialize TermsEnum in BloomFilterPostingsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-4687: Attachment: LUCENE-4687.patch new patch making reset return void... Lazily initialize TermsEnum in BloomFilterPostingsFormat Key: LUCENE-4687 URL: https://issues.apache.org/jira/browse/LUCENE-4687 Project: Lucene - Core Issue Type: Improvement Components: core/codecs Affects Versions: 4.0, 4.1 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.2, 5.0 Attachments: LUCENE-4687.patch, LUCENE-4687.patch BloomFilteringPostingsFormat initializes its delegate TermsEnum directly inside the Terms#iterator() call which can be a pretty heavy operation if executed thousands of times. I suspect that bloom filter postings are mainly used for primary keys etc. which in turn is mostly a seekExact. Given that, most of the time we don't even need the delegate termsenum since most of the segments won't contain the key and the bloomfilter will likely return false from seekExact without consulting the delegate.
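The lazy-initialization idea in LUCENE-4687 can be sketched in a few lines of plain Java (the Delegate and BloomEnum types are hypothetical; this is not the actual BloomFilteringPostingsFormat code): the cheap bloom check runs first, and the expensive delegate enum is only constructed the first time a term might actually be present.

```java
import java.util.BitSet;
import java.util.function.Supplier;

public class LazyDelegateSketch {
    static int delegatesCreated = 0;

    // Stand-in for the costly delegate TermsEnum (hypothetical).
    static class Delegate {
        Delegate() { delegatesCreated++; }
        boolean seekExact(String term) { return term.equals("id:42"); }
    }

    // Bloom-filtered enum: consult the cheap filter first, build the delegate lazily.
    static class BloomEnum {
        final BitSet bloom;
        final Supplier<Delegate> factory;
        Delegate delegate;  // created on demand only
        BloomEnum(BitSet bloom, Supplier<Delegate> factory) { this.bloom = bloom; this.factory = factory; }
        boolean seekExact(String term) {
            // Definite miss according to the filter: answer without any delegate.
            if (!bloom.get(Math.floorMod(term.hashCode(), bloom.size()))) return false;
            if (delegate == null) delegate = factory.get();  // lazy init on first possible hit
            return delegate.seekExact(term);
        }
    }

    public static void main(String[] args) {
        BitSet bloom = new BitSet(1024);
        bloom.set(Math.floorMod("id:42".hashCode(), bloom.size()));
        BloomEnum e = new BloomEnum(bloom, Delegate::new);
        boolean miss = e.seekExact("id:7");   // usually filtered out without touching the delegate
        boolean hit = e.seekExact("id:42");   // forces lazy delegate creation
        System.out.println(miss + " " + hit + " delegates=" + delegatesCreated);
    }
}
```

For primary-key lookups most segments answer "definitely absent" from the filter alone, so the delegate is never built for them; that is the cost being deferred here.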
[jira] [Commented] (LUCENE-4684) Allow DirectSpellChecker to be extended
[ https://issues.apache.org/jira/browse/LUCENE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555140#comment-13555140 ] Commit Tag Bot commented on LUCENE-4684: [branch_4x commit] Martijn van Groningen http://svn.apache.org/viewvc?view=revision&revision=1433992 LUCENE-4684: Fixed jdocs attempt 2 Allow DirectSpellChecker to be extended Key: LUCENE-4684 URL: https://issues.apache.org/jira/browse/LUCENE-4684 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Environment: Currently the suggestSimilar() that actually operates on the FuzzyTermy is private protected. Would be great if that would just be protected for extensions. Reporter: Martijn van Groningen Assignee: Martijn van Groningen Priority: Minor Fix For: 4.1 Attachments: LUCENE-4684.patch
[jira] [Commented] (LUCENE-4684) Allow DirectSpellChecker to be extended
[ https://issues.apache.org/jira/browse/LUCENE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555141#comment-13555141 ] Commit Tag Bot commented on LUCENE-4684: [trunk commit] Martijn van Groningen http://svn.apache.org/viewvc?view=revision&revision=1433993 LUCENE-4684: Made fields private again (2). Allow DirectSpellChecker to be extended Key: LUCENE-4684 URL: https://issues.apache.org/jira/browse/LUCENE-4684 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Environment: Currently the suggestSimilar() that actually operates on the FuzzyTermy is private protected. Would be great if that would just be protected for extensions. Reporter: Martijn van Groningen Assignee: Martijn van Groningen Priority: Minor Fix For: 4.1 Attachments: LUCENE-4684.patch
[jira] [Commented] (LUCENE-4642) TokenizerFactory should provide a create method with a given AttributeSource
[ https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555146#comment-13555146 ] Renaud Delbru commented on LUCENE-4642: --- Could someone from the team tell us if this patch may be considered for inclusion at some point? We currently need it in our project, and therefore it is kind of blocking us in our development. Thanks. TokenizerFactory should provide a create method with a given AttributeSource Key: LUCENE-4642 URL: https://issues.apache.org/jira/browse/LUCENE-4642 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.1 Reporter: Renaud Delbru Labels: analysis, attribute, tokenizer Fix For: 4.2, 5.0 Attachments: LUCENE-4642.patch All tokenizer implementations have a constructor that takes a given AttributeSource as parameter (LUCENE-1826). However, the TokenizerFactory does not provide an API to create tokenizers with a given AttributeSource. Side note: There are still a lot of tokenizers that do not provide constructors that take AttributeSource and AttributeFactory.
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555149#comment-13555149 ] Ken Ip commented on SOLR-2894: -- Hi Chris, Thanks for the patch. Any chance this can be applied to 4_0 or 4_1 branch? We have no problem applying it to trunk but it can't be applied to 4_0 nor 4_1. Appreciated. ➜ lucene_solr_4_1 patch -p0 -i SOLR-2894.patch --dry-run patching file solr/core/src/test/org/apache/solr/handler/component/DistributedFacetPivotTest.java patching file solr/core/src/test/org/apache/solr/SingleDocShardFeeder.java patching file solr/core/src/test/org/apache/solr/TestRefinementAndOverrequestingForFieldFacetCounts.java patching file solr/core/src/java/org/apache/solr/handler/component/EntryCountComparator.java patching file solr/core/src/java/org/apache/solr/handler/component/PivotNamedListCountComparator.java patching file solr/core/src/java/org/apache/solr/handler/component/PivotFacetHelper.java Hunk #1 FAILED at 16. Hunk #2 FAILED at 35. Hunk #5 succeeded at 287 with fuzz 2. 2 out of 5 hunks FAILED -- saving rejects to file solr/core/src/java/org/apache/solr/handler/component/PivotFacetHelper.java.rej patching file solr/core/src/java/org/apache/solr/handler/component/NullGoesLastComparator.java patching file solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java Hunk #1 FAILED at 17. Hunk #2 FAILED at 43. 
2 out of 11 hunks FAILED -- saving rejects to file solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java.rej patching file solr/core/src/java/org/apache/solr/util/PivotListEntry.java patching file solr/solrj/src/java/org/apache/solr/common/params/FacetParams.java Implement distributed pivot faceting Key: SOLR-2894 URL: https://issues.apache.org/jira/browse/SOLR-2894 Project: Solr Issue Type: Improvement Reporter: Erik Hatcher Fix For: 4.2, 5.0 Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch Following up on SOLR-792, pivot faceting currently only supports undistributed mode. Distributed pivot faceting needs to be implemented.
[jira] [Updated] (LUCENE-4642) TokenizerFactory should provide a create method with a given AttributeSource
[ https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated LUCENE-4642: --- Assignee: Steve Rowe Hi Renaud, I'll review some time in the next week. TokenizerFactory should provide a create method with a given AttributeSource Key: LUCENE-4642 URL: https://issues.apache.org/jira/browse/LUCENE-4642 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.1 Reporter: Renaud Delbru Assignee: Steve Rowe Labels: analysis, attribute, tokenizer Fix For: 4.2, 5.0 Attachments: LUCENE-4642.patch All tokenizer implementations have a constructor that takes a given AttributeSource as parameter (LUCENE-1826). However, the TokenizerFactory does not provide an API to create tokenizers with a given AttributeSource. Side note: There are still a lot of tokenizers that do not provide constructors that take AttributeSource and AttributeFactory.
[jira] [Commented] (LUCENE-4688) Reuse TermsEnum in BlockTreeTermsReader
[ https://issues.apache.org/jira/browse/LUCENE-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555164#comment-13555164 ] Robert Muir commented on LUCENE-4688: - Can we break this patch up... particularly i think we should look at the multitermquery API as a separate issue from BlockTree's impl. Reuse TermsEnum in BlockTreeTermsReader --- Key: LUCENE-4688 URL: https://issues.apache.org/jira/browse/LUCENE-4688 Project: Lucene - Core Issue Type: Improvement Components: core/codecs Affects Versions: 4.0, 4.1 Reporter: Simon Willnauer Fix For: 4.2, 5.0 Attachments: LUCENE-4688.patch Opening a TermsEnum comes with a significant cost at this point if done frequently like primary key lookups or if many segments are present. Currently we don't reuse it at all and create a lot of objects even if the enum is just used for a single seekExact (ie. TermQuery). Stressing the Terms#iterator(reuse) call shows significant gains with reuse...
[jira] [Commented] (SOLR-4308) We should remove log4j-over-slf4j.
[ https://issues.apache.org/jira/browse/SOLR-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555174#comment-13555174 ] Commit Tag Bot commented on SOLR-4308: -- [trunk commit] Mark Robert Miller http://svn.apache.org/viewvc?view=revision&revision=1434020 SOLR-4308: Remove the problematic and now unnecessary log4j-over-slf4j. We should remove log4j-over-slf4j. -- Key: SOLR-4308 URL: https://issues.apache.org/jira/browse/SOLR-4308 Project: Solr Issue Type: Task Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 4.1, 5.0 We don't need this anymore - Zk moved to SLF4j in 3.4: https://issues.apache.org/jira/browse/ZOOKEEPER-850 Since this jar causes nasty hard to diagnose problems if you ever get a log4j jar in the classpath and we have no current use for it, we should remove it.
[jira] [Commented] (SOLR-4308) We should remove log4j-over-slf4j.
[ https://issues.apache.org/jira/browse/SOLR-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555181#comment-13555181 ] Commit Tag Bot commented on SOLR-4308: -- [trunk commit] Mark Robert Miller http://svn.apache.org/viewvc?view=revision&revision=1434022 SOLR-4308: Remove license files and maven references We should remove log4j-over-slf4j. -- Key: SOLR-4308 URL: https://issues.apache.org/jira/browse/SOLR-4308 Project: Solr Issue Type: Task Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 4.1, 5.0 We don't need this anymore - Zk moved to SLF4j in 3.4: https://issues.apache.org/jira/browse/ZOOKEEPER-850 Since this jar causes nasty hard to diagnose problems if you ever get a log4j jar in the classpath and we have no current use for it, we should remove it.
[jira] [Commented] (SOLR-4308) We should remove log4j-over-slf4j.
[ https://issues.apache.org/jira/browse/SOLR-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555190#comment-13555190 ] Commit Tag Bot commented on SOLR-4308: -- [branch_4x commit] Mark Robert Miller http://svn.apache.org/viewvc?view=revision&revision=1434026 SOLR-4308: Remove the problematic and now unnecessary log4j-over-slf4j. We should remove log4j-over-slf4j. -- Key: SOLR-4308 URL: https://issues.apache.org/jira/browse/SOLR-4308 Project: Solr Issue Type: Task Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 4.1, 5.0 We don't need this anymore - Zk moved to SLF4j in 3.4: https://issues.apache.org/jira/browse/ZOOKEEPER-850 Since this jar causes nasty hard to diagnose problems if you ever get a log4j jar in the classpath and we have no current use for it, we should remove it.
[jira] [Commented] (SOLR-4234) Add support for binary files in ZooKeeper.
[ https://issues.apache.org/jira/browse/SOLR-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555197#comment-13555197 ] Eric Pugh commented on SOLR-4234: - This isn't worth delaying a release! I am somewhat struggling to figure out why the test is failing. Going to try on a new code checkout. Add support for binary files in ZooKeeper. -- Key: SOLR-4234 URL: https://issues.apache.org/jira/browse/SOLR-4234 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0 Reporter: Eric Pugh Assignee: Mark Miller Fix For: 4.2, 5.0 Attachments: binary_upload_download.patch, fix_show_file_handler_with_binaries.patch, SOLR4234_binary_files.patch, solr.png I was attempting to get the ShowFileHandler to show a .png file, and it was failing. But in non-ZK mode it worked just fine! It took a while, but it seems that we upload to ZooKeeper as text, and download as text as well. I've attached a unit test that demonstrates the problem, and a fix. You have to have a binary file in the conf directory to make the test work; I put solr.png in the collection1/conf/velocity directory.
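The failure mode described above -- binary data round-tripped through text -- can be reproduced in a few lines of plain Java. This is only an illustrative sketch, not the actual Solr ZooKeeper code: decoding arbitrary bytes as UTF-8 replaces invalid sequences with U+FFFD, so re-encoding does not give back the original bytes, while plain ASCII config files survive (which is why the bug stayed hidden).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryRoundTrip {
    // Returns true if decoding the bytes to a String and re-encoding preserves them.
    public static boolean survivesTextRoundTrip(byte[] original) {
        String asText = new String(original, StandardCharsets.UTF_8);
        byte[] back = asText.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(original, back);
    }

    public static void main(String[] args) {
        // PNG magic header: 0x89 is not valid UTF-8, so a text round trip mangles it.
        byte[] pngHeader = {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
        System.out.println("png header survives: " + survivesTextRoundTrip(pngHeader));
        // ASCII content round-trips fine, so text-only conf files never showed the bug.
        byte[] ascii = "schema.xml".getBytes(StandardCharsets.UTF_8);
        System.out.println("ascii survives: " + survivesTextRoundTrip(ascii));
    }
}
```

The fix in the attached patch is, in effect, to keep the raw bytes end to end instead of going through a String.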
[jira] [Commented] (LUCENE-4688) Reuse TermsEnum in BlockTreeTermsReader
[ https://issues.apache.org/jira/browse/LUCENE-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555201#comment-13555201 ] Michael McCandless commented on LUCENE-4688: I think it's interesting/powerful to enable across-segment reuse: none of our other reuse APIs (DocsEnum, DPEnum) can do that. But I'm not sure we should do it: to take full advantage of it requires API changes (like the MTQ.getTermsEnum change) ... we'd have to do something similar to Weight/Scorer to share the D/PEnum across segments. The patch itself is spooky: this BlockTree code is hairy, and I'm not sure that the reset() isn't going to cause subtle corner-case bugs. (Separately: we need to simplify this code: it's unapproachable now). The benchmark gain is impressive, but we are talking about 10 seconds over 2M docs, right? So 5 microseconds (.005 msec) per document? In a more realistic scenario (indexing more normal docs) surely this is a minor part of the time ... The app can always reuse itself per-segment today ... I think reuse is rather expert, so it's OK to offer that as the way to reuse? Reuse TermsEnum in BlockTreeTermsReader --- Key: LUCENE-4688 URL: https://issues.apache.org/jira/browse/LUCENE-4688 Project: Lucene - Core Issue Type: Improvement Components: core/codecs Affects Versions: 4.0, 4.1 Reporter: Simon Willnauer Fix For: 4.2, 5.0 Attachments: LUCENE-4688.patch Opening a TermsEnum comes with a significant cost at this point if done frequently, as with primary key lookups or when many segments are present. Currently we don't reuse it at all and create a lot of objects even if the enum is just used for a single seekExact (ie. TermQuery). Stressing the Terms#iterator(reuse) call shows significant gains with reuse...
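The reuse idiom behind Terms#iterator(reuse) can be sketched in plain Java: hand the previous enum back to the factory so the implementation can reset it instead of allocating a new one. The class and method names below are illustrative only, not the real Lucene API:

```java
import java.util.List;

public class ReuseSketch {
    // Stand-in for a TermsEnum-like cursor over a segment's terms.
    static class TermCursor {
        List<String> terms;
        int pos = -1;
        TermCursor reset(List<String> newTerms) {
            this.terms = newTerms;
            this.pos = -1;
            return this;
        }
        String next() {
            pos++;
            return pos < terms.size() ? terms.get(pos) : null;
        }
    }

    // iterator(reuse): recycle the caller's cursor when one is offered,
    // mirroring how BlockTreeTermsReader could avoid per-call allocations.
    static TermCursor iterator(List<String> segmentTerms, TermCursor reuse) {
        if (reuse != null) {
            return reuse.reset(segmentTerms); // no new objects on the hot path
        }
        return new TermCursor().reset(segmentTerms);
    }
}
```

McCandless's caveat applies directly to the reset() call here: recycling state across segments is exactly where subtle corner-case bugs can hide.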
[jira] [Created] (SOLR-4310) If groups.ngroups is specified, the docList's numFound should be the number of groups
Amit Nithian created SOLR-4310: -- Summary: If groups.ngroups is specified, the docList's numFound should be the number of groups Key: SOLR-4310 URL: https://issues.apache.org/jira/browse/SOLR-4310 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.1 Reporter: Amit Nithian Priority: Minor Fix For: 4.1 If you group by a field, the response may look like this: {code}
<lst name="grouped">
  <lst name="series">
    <int name="matches">138</int>
    <int name="ngroups">1</int>
    <result name="doclist" numFound="138" start="0">
      <doc>
        <int name="id">267038365</int>
        <str name="name">Larry's Grand Ole Garage Country Dance - Pure Country</str>
      </doc>
    </result>
  </lst>
</lst>
{code} and if you specify group.main then the doclist becomes the result and you lose all context of the number of groups. If you want to keep your response format backwards compatible with clients (i.e. clients who don't know about the grouped format), setting group.main=true solves this BUT the numFound is the number of raw matches instead of the number of groups. This may have downstream consequences. I'd like to propose that if the user specifies ngroups=true then, when creating the returned DocSlice, set the numFound to be the number of groups instead of the number of raw matches, to keep the response consistent with what the user would expect.
[jira] [Updated] (SOLR-4310) If groups.ngroups is specified, the docList's numFound should be the number of groups
[ https://issues.apache.org/jira/browse/SOLR-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amit Nithian updated SOLR-4310: --- Attachment: SOLR-4310.patch Here's my patch to address the problem.
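The proposed behavior reduces to a one-line decision when building the doclist. This is a hypothetical helper, not the actual patch -- the method and parameter names are illustrative:

```java
public class GroupedNumFound {
    // Decide what numFound should report for the main doclist when grouping
    // is in effect (illustrative sketch of the SOLR-4310 proposal).
    static long numFoundFor(boolean ngroupsRequested, long rawMatches, long ngroups) {
        // With group.ngroups=true, report the group count so clients that
        // treat the doclist as a flat result see a consistent total.
        return ngroupsRequested ? ngroups : rawMatches;
    }
}
```

For the example response above, a client with group.ngroups=true would then see numFound=1 (the group count) rather than numFound=138 (the raw matches).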
[jira] [Commented] (SOLR-4308) We should remove log4j-over-slf4j.
[ https://issues.apache.org/jira/browse/SOLR-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555204#comment-13555204 ] Commit Tag Bot commented on SOLR-4308: -- [branch_4x commit] Mark Robert Miller http://svn.apache.org/viewvc?view=revision&revision=1434028 SOLR-4308: Remove license files and maven references
[jira] [Commented] (LUCENE-4687) Lazily initialize TermsEnum in BloomFilterPostingsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555206#comment-13555206 ] Michael McCandless commented on LUCENE-4687: +1 Lazily initialize TermsEnum in BloomFilterPostingsFormat Key: LUCENE-4687 URL: https://issues.apache.org/jira/browse/LUCENE-4687 Project: Lucene - Core Issue Type: Improvement Components: core/codecs Affects Versions: 4.0, 4.1 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.2, 5.0 Attachments: LUCENE-4687.patch, LUCENE-4687.patch BloomFilteringPostingsFormat initializes its delegate TermsEnum directly inside the Terms#iterator() call, which can be a pretty heavy operation if executed thousands of times. I suspect that bloom filter postings are mainly used for primary keys etc., which in turn is mostly a seekExact. Given that, most of the time we don't even need the delegate TermsEnum, since most of the segments won't contain the key and the bloom filter will likely return false from seekExact without consulting the delegate.
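The pattern LUCENE-4687 applies -- defer creating the expensive delegate until a lookup actually needs it -- can be sketched in a few lines of plain Java. This is not the real BloomFilteringPostingsFormat code; the class name and the Set standing in for the bloom filter are illustrative assumptions:

```java
import java.util.Set;

public class LazyTermsEnumSketch {
    int delegateCreations = 0;  // instrumentation so the sketch is observable
    private Object delegate;    // stands in for the real (expensive) TermsEnum

    private Object delegate() {
        if (delegate == null) { // first real use: pay the construction cost once
            delegateCreations++;
            delegate = new Object();
        }
        return delegate;
    }

    // seekExact: consult the bloom filter first; if it says "definitely
    // absent", the delegate is never built at all.
    boolean seekExact(String term, Set<String> bloomMayContain) {
        if (!bloomMayContain.contains(term)) {
            return false;       // fast path: no delegate needed
        }
        delegate();             // only now initialize the delegate
        return true;            // real code would ask the delegate to seek
    }
}
```

For primary-key lookups across many segments, most seekExact calls take the fast path, so most segments never construct a delegate at all -- which is exactly the saving the issue describes.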
[jira] [Updated] (SOLR-4308) We should remove log4j-over-slf4j.
[ https://issues.apache.org/jira/browse/SOLR-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4308: -- Fix Version/s: 4.2
[jira] [Updated] (SOLR-4308) We should remove log4j-over-slf4j.
[ https://issues.apache.org/jira/browse/SOLR-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-4308: -- Fix Version/s: (was: 4.2)
[jira] [Resolved] (SOLR-4308) We should remove log4j-over-slf4j.
[ https://issues.apache.org/jira/browse/SOLR-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4308. --- Resolution: Fixed
[jira] [Commented] (SOLR-4266) HttpSolrServer does not release connection properly on exception when no response parser is used
[ https://issues.apache.org/jira/browse/SOLR-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555337#comment-13555337 ] Commit Tag Bot commented on SOLR-4266: -- [trunk commit] Mark Robert Miller http://svn.apache.org/viewvc?view=revision&revision=1434109 SOLR-4266: HttpSolrServer does not release connection properly on exception when no response parser is used. HttpSolrServer does not release connection properly on exception when no response parser is used Key: SOLR-4266 URL: https://issues.apache.org/jira/browse/SOLR-4266 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.0 Reporter: Steve Molloy Assignee: Mark Miller Fix For: 4.1, 5.0 Attachments: patch-4266.txt When using HttpSolrServer for requests with no response parser, any unpredicted status code (401, 500...) will throw the exception properly, but will not close the connection. Since no handle for the connection is returned in case of exception, it should be closed. So the only case where it should not be closed is when the stream is actually returned, that is, when no response parser is used and the call is successful.
[jira] [Updated] (LUCENE-4600) Explore facets aggregation during documents collection
[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4600: --- Component/s: modules/facet Explore facets aggregation during documents collection -- Key: LUCENE-4600 URL: https://issues.apache.org/jira/browse/LUCENE-4600 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch, LUCENE-4600.patch Today the facet module simply gathers all hits (as a bitset, optionally with a float[] to hold scores as well, if you will aggregate them) during collection, and then at the end when you call getFacetsResults(), it makes a 2nd pass over all those hits doing the actual aggregation. We should investigate just aggregating as we collect instead, so we don't have to tie up transient RAM (fairly small for the bit set but possibly big for the float[]).
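The trade-off under discussion -- buffer all hits then aggregate, versus fold each hit in as it is collected -- can be sketched in plain Java. This is illustrative only (the real facet module aggregates counts per category, not a simple sum); the point is the transient-memory difference:

```java
import java.util.ArrayList;
import java.util.List;

public class CollectVsPostAggregate {
    // (a) two-pass: remember every hit's score, then aggregate at the end.
    //     Transient memory is O(hits) -- the float[]-like buffer the issue
    //     wants to avoid.
    static double twoPass(List<Double> scores) {
        List<Double> collected = new ArrayList<>(scores); // transient buffer
        double sum = 0;
        for (double s : collected) sum += s;              // second pass
        return sum;
    }

    // (b) aggregate-as-you-collect: fold each hit into the running result
    //     immediately. Transient memory is O(1).
    static double streaming(List<Double> scores) {
        double sum = 0;
        for (double s : scores) sum += s;                 // aggregate during "collect"
        return sum;
    }
}
```

Both produce the same result; the streaming form simply never materializes the per-hit buffer.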
[jira] [Commented] (SOLR-4310) If groups.ngroups is specified, the docList's numFound should be the number of groups
[ https://issues.apache.org/jira/browse/SOLR-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555349#comment-13555349 ] Tomás Fernández Löbbe commented on SOLR-4310: - +1 on the idea. I always thought this is how it should work in this case.
Re: [jira] [Commented] (SOLR-4112) Dataimporting with SolrCloud Fails
Hi Erick, There is a difference in *.fdt and *.fdx files between the 14G and 7G indexes. I have enclosed the sizes. I also did add fl=score to the queries, and lazy load is configured in solrconfig.xml, but the performance issues still exist. I am measuring QTime. I am seeing a huge CPU spike on the machine when searching at 100 QPS; here is the load average: load average: 86.90, 134.85, 79.98. Here is a comparison between v4.0 and v4.1 in a cloud setup: {code}
QPS   v4.0 (msec)   v4.1 (msec)
35    27            46.7
70    24            290 (some timeout errors)
128   26            Error
240   26.4          Did not try
{code} Please let us know how to solve this issue. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/jira-Created-SOLR-4112-Dataimporting-with-SolrCloud-Fails-tp4022365p4033952.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
Re: [jira] [Commented] (SOLR-4112) Dataimporting with SolrCloud Fails
Here are the *.fdt / *.fdx file details:

14G files:
-rw-rw 1 root root 7784904347 Jan 11 18:38 _12n7.fdt
-rw-rw 1 root root   82947778 Jan 11 18:38 _12n7.fdx

7G files:
-rw-rw 1 root root 3162548974 Jan 11 18:57 _43v.fdt
-rw-rw 1 root root    1052674 Jan 11 18:56 _43v.fdx
-rw-rw 1 root root  157718173 Jan 11 18:57 _44y.fdt
-rw-rw 1 root root      46891 Jan 11 18:57 _44y.fdx
-rw-rw 1 root root  749326180 Jan 11 18:56 _50u.fdt
-rw-rw 1 root root     225013 Jan 11 18:56 _50u.fdx
-rw-rw 1 root root   64541680 Jan 11 18:56 _545.fdt
-rw-rw 1 root root      19206 Jan 11 18:56 _545.fdx
-rw-rw 1 root root  115060275 Jan 11 18:56 _57i.fdt
-rw-rw 1 root root      35375 Jan 11 18:56 _57i.fdx
-rw-rw 1 root root   49801042 Jan 11 18:56 _5ct.fdt
-rw-rw 1 root root      15291 Jan 11 18:56 _5ct.fdx
-rw-rw 1 root root    2555043 Jan 11 18:57 _5fw.fdt
-rw-rw 1 root root        684 Jan 11 18:57 _5fw.fdx
-rw-rw 1 root root  125076704 Jan 11 18:57 _5gq.fdt
-rw-rw 1 root root      36049 Jan 11 18:57 _5gq.fdx
-rw-rw 1 root root   55056217 Jan 11 18:56 _5kx.fdt
-rw-rw 1 root root      16925 Jan 11 18:56 _5kx.fdx
-rw-rw 1 root root    5057535 Jan 11 18:56 _5lg.fdt
-rw-rw 1 root root       1387 Jan 11 18:56 _5lg.fdx
-rw-rw 1 root root    2465839 Jan 11 18:57 _5n5.fdt
-rw-rw 1 root root        671 Jan 11 18:57 _5n5.fdx
-rw-rw 1 root root    3121301 Jan 11 18:57 _5ne.fdt
-rw-rw 1 root root        801 Jan 11 18:57 _5ne.fdx
-rw-rw 1 root root    3640966 Jan 11 18:57 _5nz.fdt
-rw-rw 1 root root        925 Jan 11 18:57 _5nz.fdx
-rw-rw 1 root root    1407426 Jan 11 18:57 _5o2.fdt
-rw-rw 1 root root        368 Jan 11 18:57 _5o2.fdx
-rw-rw 1 root root    1291281 Jan 11 18:56 _5o6.fdt
-rw-rw 1 root root        342 Jan 11 18:56 _5o6.fdx
-rw-rw 1 root root  118825712 Jan 11 18:56 _5o9.fdt
-rw-rw 1 root root      36276 Jan 11 18:56 _5o9.fdx
-rw-rw 1 root root    1172483 Jan 11 18:56 _5oc.fdt
-rw-rw 1 root root        364 Jan 11 18:56 _5oc.fdx
-rw-rw 1 root root     960418 Jan 11 18:56 _5og.fdt
-rw-rw 1 root root        277 Jan 11 18:57 _5og.fdx
-rw-rw 1 root root     742292 Jan 11 18:56 _5om.fdt
-rw-rw 1 root root        215 Jan 11 18:56 _5om.fdx
-rw-rw 1 root root    1047978 Jan 11 18:57 _5or.fdt
-rw-rw 1 root root        287 Jan 11 18:57 _5or.fdx
-rw-rw 1 root root    2570733 Jan 11 18:56 _5ot.fdt
-rw-rw 1 root root        696 Jan 11 18:56 _5ot.fdx
-rw-rw 1 root root     672436 Jan 11 18:57 _5p0.fdt
-rw-rw 1 root root        199 Jan 11 18:57 _5p0.fdx
-rw-rw 1 root root    2938914 Jan 11 18:56 _5p2.fdt
-rw-rw 1 root root        794 Jan 11 18:56 _5p2.fdx
-rw-rw 1 root root     401757 Jan 11 18:57 _5p3.fdt
-rw-rw 1 root root        128 Jan 11 18:56 _5p3.fdx
-- View this message in context: http://lucene.472066.n3.nabble.com/jira-Created-SOLR-4112-Dataimporting-with-SolrCloud-Fails-tp4022365p4033954.html
Re: [jira] [Commented] (SOLR-4112) Dataimporting with SolrCloud Fails
One more update: we did try a master/slave setup with v4.0, but the query performance is as bad as Solr 4.1. CPU just spikes after a few minutes of running our test; the load average is more than 300: load average: 353.46, 242.38, 141.74. Please suggest how this can be improved. Thanks, -- View this message in context: http://lucene.472066.n3.nabble.com/jira-Created-SOLR-4112-Dataimporting-with-SolrCloud-Fails-tp4022365p4033958.html
[jira] [Commented] (LUCENE-4687) Lazily initialize TermsEnum in BloomFilterPostingsFormat
[ https://issues.apache.org/jira/browse/LUCENE-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555374#comment-13555374 ] Simon Willnauer commented on LUCENE-4687: - I will commit this tomorrow...
[jira] [Resolved] (SOLR-656) better error message when data/index is completely empty
[ https://issues.apache.org/jira/browse/SOLR-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-656. --- Resolution: Fixed Fix Version/s: (was: 4.2) (was: 5.0) 4.0 bq. My patch is unnecessary on 4x, all that remains for this issue is to decide whether the 3.6 behavior is considered a bug. given that doing an rm index/\* has never been supported, and the only reason this issue was opened was to try and provide a better error message, i don't think we need to worry about fixing this in a future 3.6.x release. bq. I am still curious whether adding isFSBased() to DirectoryFactory is something that holds interest. The patch is ready - all that you'd have to do is remove the part that wasn't necessary for this issue. I'm not sure that i grasp the value/use of a method like that -- i would suggest bringing up the usecases you have in mind on dev@lucene. better error message when data/index is completely empty -- Key: SOLR-656 URL: https://issues.apache.org/jira/browse/SOLR-656 Project: Solr Issue Type: Wish Reporter: Hoss Man Assignee: Hoss Man Fix For: 4.0 Attachments: SOLR-656.patch, SOLR-656-rmdir.patch, SOLR-656-rmdir.patch Solr's normal behavior is to create an index dir in the dataDir if one does not already exist, but if the index does exist it is used as is, warts and all ... if the index is corrupt in some way, and Solr can't create an IndexWriter or IndexReader, that error is propagated up to the user. I don't think this should change: Solr shouldn't attempt to do anything special if there is a low level problem with the index, but something that i've seen happen more than a few times is that people unwittingly rm index/* when they should run rm -r index, and as a result Solr+Lucene gives them an error instead of just giving them an empty index. When checking if an existing index dir exists, it would probably be worthwhile to add a little one line sanity test that it contains some files, and log a warning.
[jira] [Commented] (LUCENE-4518) Suggesters: highlighting (explicit markup of user-typed portions vs. generated portions in a suggestion)
[ https://issues.apache.org/jira/browse/LUCENE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555396#comment-13555396 ] Oliver Christ commented on LUCENE-4518: --- I’ve played around with Mike’s patches, but for the AnalyzingSuggester the results have been mixed. Since the transition symbols in the automaton are not closely aligned between the surface and the analyzed form, LookupResult.prefixLength (which attempts to represent the length of the surface string which corresponds to the lookup string) is off quite a bit, leading to very confusing highlighting in non-trivial cases. I think this is ultimately due to the way the FST is constructed, but that seems to be non-trivial to change. In addition, just returning the (surface) prefix length which corresponds to the lookup string is not sufficient for more complex suggesters, such as “infix suggesters” where the user-provided string is not a prefix of the full surface term (google.com: type in “sox rumor”). What the suggesters ultimately would have to return is a list of text chunks where each chunk has a flag whether it’s based on the lookup string or has been auto-completed. So at this point we are back at trying to identify the matched string portions by other means, which isn’t perfect either, but acceptable in most cases. :( Suggesters: highlighting (explicit markup of user-typed portions vs. generated portions in a suggestion) Key: LUCENE-4518 URL: https://issues.apache.org/jira/browse/LUCENE-4518 Project: Lucene - Core Issue Type: New Feature Reporter: Oliver Christ Assignee: Michael McCandless Attachments: LUCENE-4518.patch As a user, I would like the lookup result of the suggestion engine to contain information which allows me to distinguish the user-entered portion from the autocompleted portion of a suggestion. That information can then be used for e.g. highlighting.
*Notes:* It's trivial if the suggestion engine only applies simple prefix search, as then the user-typed prefix is always a true prefix of the completion. However, it's non-trivial as soon as you use an AnalyzingSuggester, where the completion may (in extreme cases) be quite different from the user-provided input. As soon as case/diacritics folding, script adaptation (kanji/hiragana) come into play, the completion is no longer guaranteed to be an extension of the query. Since the caller of the suggestion engine (UI) generally does not know the implementation details, the required information needs to be passed in the LookupResult. *Discussion on java-user:* I haven't found a simple solution for the highlighting yet, particularly when using AnalyzingSuggester (where it's non-trivial). Mike McCandless: Ahh I see ... it is challenging in that case. Hmm. Maybe open an issue for this as well, so we can discuss/iterate?
[jira] [Created] (SOLR-4311) Luke and Core admin ajax requests shouldn't be cached in admin
Chris Bleakley created SOLR-4311: Summary: Luke and Core admin ajax requests shouldn't be cached in admin Key: SOLR-4311 URL: https://issues.apache.org/jira/browse/SOLR-4311 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Reporter: Chris Bleakley Priority: Minor Although both the luke and core admin handlers set http caching to false in the response headers** I believe the Cache-Control settings are ignored during ajax requests in certain browsers. This can be a problem if you're refreshing admin to get the latest doc count. It can also be a problem when you compare the count of Num Docs on the main index page (/solr/#/CORE) vs the count on the core admin page (/solr/#/~cores/CORE). Consider that if you first visit the main index page, add and commit 100 docs, and then visit core admin, the doc count will be off by 100. If this is an issue, the ajax requests can explicitly set caching to false ( http://api.jquery.com/jQuery.ajax/#jQuery-ajax-settings ) ... for example, inserting 'cache: false,' after line 91 here: https://github.com/apache/lucene-solr/blob/branch_4x/solr/webapp/web/js/scripts/dashboard.js#L91 ** https://github.com/apache/lucene-solr/blob/branch_4x/solr/core/src/java/org/apache/solr/handler/admin/LukeRequestHandler.java#L167 ** https://github.com/apache/lucene-solr/blob/branch_4x/solr/core/src/java/org/apache/solr/handler/admin/CoreAdminHandler.java#L216 Tested using Chrome Version 24.0.1312.52
[jira] [Resolved] (SOLR-4266) HttpSolrServer does not release connection properly on exception when no response parser is used
[ https://issues.apache.org/jira/browse/SOLR-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-4266. --- Resolution: Fixed Thanks Steve!
[jira] [Commented] (SOLR-4266) HttpSolrServer does not release connection properly on exception when no response parser is used
[ https://issues.apache.org/jira/browse/SOLR-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13555406#comment-13555406 ] Commit Tag Bot commented on SOLR-4266: -- [branch_4x commit] Mark Robert Miller http://svn.apache.org/viewvc?view=revisionrevision=1434354 SOLR-4266: HttpSolrServer does not release connection properly on exception when no response parser is used.
[jira] [Created] (LUCENE-4689) Eclipse project name change for 4.1
Shawn Heisey created LUCENE-4689: Summary: Eclipse project name change for 4.1 Key: LUCENE-4689 URL: https://issues.apache.org/jira/browse/LUCENE-4689 Project: Lucene - Core Issue Type: Bug Components: general/build Affects Versions: 4.1 Reporter: Shawn Heisey Priority: Minor Fix For: 4.1 Attachments: LUCENE-4689.patch Just updating the eclipse project name from lucene_solr_branch_4x to lucene_solr_4_1 on the new branch.
[jira] [Updated] (LUCENE-4689) Eclipse project name change for 4.1
[ https://issues.apache.org/jira/browse/LUCENE-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated LUCENE-4689: - Attachment: LUCENE-4689.patch
[jira] [Commented] (SOLR-2592) Custom Hashing
[ https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13555459#comment-13555459 ] Commit Tag Bot commented on SOLR-2592: -- [trunk commit] Steven Rowe http://svn.apache.org/viewvc?view=revisionrevision=1434401 - Make complex SOLR-2592 changes entry not get converted to a single wrapped line in Changes.html. - 'Via' -> 'via' (merged lucene_solr_4_1 r1434389) Custom Hashing -- Key: SOLR-2592 URL: https://issues.apache.org/jira/browse/SOLR-2592 Project: Solr Issue Type: New Feature Components: SolrCloud Affects Versions: 4.0-ALPHA Reporter: Noble Paul Assignee: Yonik Seeley Fix For: 4.1, 5.0 Attachments: dbq_fix.patch, pluggable_sharding.patch, pluggable_sharding_V2.patch, SOLR-2592_collectionProperties.patch, SOLR-2592_collectionProperties.patch, SOLR-2592.patch, SOLR-2592_progress.patch, SOLR-2592_query_try1.patch, SOLR-2592_r1373086.patch, SOLR-2592_r1384367.patch, SOLR-2592_rev_2.patch, SOLR_2592_solr_4_0_0_BETA_ShardPartitioner.patch If the data in a cloud can be partitioned on some criteria (say range, hash, attribute value etc) it will be easy to narrow down the search to a smaller subset of shards and in effect can achieve more efficient search.
[jira] [Resolved] (LUCENE-4689) Eclipse project name change for 4.1
[ https://issues.apache.org/jira/browse/LUCENE-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved LUCENE-4689. Resolution: Fixed Assignee: Steve Rowe Committed. Thanks Shawn!
[jira] [Commented] (SOLR-2592) Custom Hashing
[ https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13555463#comment-13555463 ] Commit Tag Bot commented on SOLR-2592: -- [branch_4x commit] Steven Rowe http://svn.apache.org/viewvc?view=revisionrevision=1434402 - Make complex SOLR-2592 changes entry not get converted to a single wrapped line in Changes.html. - 'Via' - 'via' (merged lucene_solr_4_1 r1434389)
[jira] [Updated] (SOLR-4310) If groups.ngroups is specified, the docList's numFound should be the number of groups
[ https://issues.apache.org/jira/browse/SOLR-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated SOLR-4310: - Fix Version/s: (was: 4.1) 4.2 Amit, it's too close to the release to get this in for 4.1, so I'm pushing Fix Version to 4.2. If groups.ngroups is specified, the docList's numFound should be the number of groups - Key: SOLR-4310 URL: https://issues.apache.org/jira/browse/SOLR-4310 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.1 Reporter: Amit Nithian Priority: Minor Fix For: 4.2 Attachments: SOLR-4310.patch If you group by a field, the response may look like this:
{code}
<lst name="grouped">
  <lst name="series">
    <int name="matches">138</int>
    <int name="ngroups">1</int>
    <result name="doclist" numFound="138" start="0">
      <doc>
        <int name="id">267038365</int>
        <str name="name">Larry's Grand Ole Garage Country Dance - Pure Country</str>
      </doc>
    </result>
  </lst>
</lst>
{code}
and if you specify group.main then the doclist becomes the result and you lose all context of the number of groups. If you want to keep your response format backwards compatible with clients (i.e. clients who don't know about the grouped format), setting group.main=true solves this BUT the numFound is the number of raw matches instead of the number of groups. This may have downstream consequences. I'd like to propose that if the user specifies ngroups=true then when creating the returning DocSlice, set the numFound to be the number of groups instead of the number of raw matches to keep the response consistent with what the user would expect.
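The proposal above boils down to preferring the group count over the raw match count when building the main doc slice. A minimal sketch of that decision, with DocSliceInfo and GroupedResponse as illustrative stand-ins rather than Solr's actual DocSlice class:

```java
// Illustrative stand-in for the slice returned when group.main=true.
record DocSliceInfo(long numFound, long start) {}

class GroupedResponse {
    // Proposed behavior: if ngroups was computed (group.ngroups=true),
    // report it as numFound so grouping-unaware clients see a count
    // consistent with the number of returned groups; otherwise fall
    // back to the raw match count, which is today's behavior.
    static DocSliceInfo mainSlice(long rawMatches, Long ngroups, long start) {
        long numFound = (ngroups != null) ? ngroups : rawMatches;
        return new DocSliceInfo(numFound, start);
    }
}
```

With the example response above (138 raw matches, 1 group), the slice would report numFound=1 under the proposal instead of 138.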
[jira] [Commented] (SOLR-3118) We need a better error message when failing due to a slice that is part of collection is not available
[ https://issues.apache.org/jira/browse/SOLR-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13555476#comment-13555476 ] Steve Rowe commented on SOLR-3118: -- Mark, anything from here you want to get into 4.1? I'd like to make an RC shortly... We need a better error message when failing due to a slice that is part of collection is not available -- Key: SOLR-3118 URL: https://issues.apache.org/jira/browse/SOLR-3118 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA Reporter: Sami Siren Assignee: Mark Miller Priority: Minor Fix For: 4.1, 5.0 Attachments: SOLR-3118.patch When indexing to/searching from an incomplete collection (for example a slice does not have any shards registered/available) a cruel error without a proper explanation is shown to the user. These errors are from running example1.sh and creating a new collection with coreadminhandler: Slices with no shards: Indexing: {code} Error 500 No registered leader was found, collection:collection2 slice:shard4 java.lang.RuntimeException: No registered leader was found, collection:collection2 slice:shard4 at org.apache.solr.common.cloud.ZkStateReader.getLeaderProps(ZkStateReader.java:408) at org.apache.solr.common.cloud.ZkStateReader.getLeaderProps(ZkStateReader.java:393) at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:154) at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:210) at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115) at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:135) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at 
org.apache.solr.core.SolrCore.execute(SolrCore.java:1523) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:234) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) {code} Searching: {code} HTTP ERROR 503 Problem accessing /solr/coreX/select/. Reason: no servers hosting shard: Powered by Jetty:// {code} Surprisingly the error is different when searching from a collection after removing a core from an collection that was in OK condition: {code} HTTP ERROR 500 Problem accessing /solr/coreX/select/. 
Reason: null java.util.concurrent.RejectedExecutionException at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768) at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767) at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658) at java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:152) at org.apache.solr.handler.component.HttpShardHandler.submit(HttpShardHandler.java:173) at
Re: Query modifier
Hi Chris, Is this util class available in Lucene? I am trying to do something similar:- Eg. Input: (name:John AND name:Doe) Output: ((firstName:John OR lastName:John) AND (firstName:Doe OR lastName:Doe)) Can I have some info on the Surround parser methods available to walk the query tree? Thanks, Balaji
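The expansion Balaji describes — mapping each name:X term to (firstName:X OR lastName:X) while preserving the surrounding boolean structure — can be sketched as a recursive rewrite over the query tree. The sketch below uses a hand-rolled tree rather than Lucene's Query classes, so the names Q, Term, Bool, and Rewriter are all illustrative, not Lucene API:

```java
import java.util.List;

// Minimal query-tree model: a leaf term and a boolean node over clauses.
interface Q {}
record Term(String field, String text) implements Q {}
record Bool(String op, List<Q> clauses) implements Q {}

class Rewriter {
    // Recursively walk the tree; expand "name" leaves into a two-way
    // disjunction and rebuild boolean nodes with rewritten children.
    static Q rewrite(Q q) {
        if (q instanceof Term t && t.field().equals("name")) {
            return new Bool("OR", List.of(
                new Term("firstName", t.text()),
                new Term("lastName", t.text())));
        }
        if (q instanceof Bool b) {
            return new Bool(b.op(),
                b.clauses().stream().map(Rewriter::rewrite).toList());
        }
        return q;  // other fields pass through unchanged
    }
}
```

With Lucene's real classes the shape is the same: dispatch on the concrete Query type (BooleanQuery, TermQuery, ...), rebuild inner nodes from rewritten children, and expand the leaves you care about.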
[jira] [Commented] (SOLR-3118) We need a better error message when failing due to a slice that is part of collection is not available
[ https://issues.apache.org/jira/browse/SOLR-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1302#comment-1302 ] Mark Miller commented on SOLR-3118: --- It will have to wait till 4.2. We need a better error message when failing due to a slice that is part of collection is not available -- Key: SOLR-3118 URL: https://issues.apache.org/jira/browse/SOLR-3118 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA Reporter: Sami Siren Assignee: Mark Miller Priority: Minor Fix For: 4.2, 5.0 Attachments: SOLR-3118.patch