Re: Problems installing PyLucene on Ubuntu 12.04

2014-03-07 Thread Ritzschke, Uwe
Thanks for the quick reply. The tests work fine with that patch.

Uwe


On Thu, 6 Mar 2014, Ritzschke, Uwe wrote:

 Hello,

 I'm facing problems with installing Pylucene on an Ubuntu 12.04 Server 
 (32bit). Perhaps someone can give me some helpful advice?
 I've followed the official installation instructions [1]. It seems that 
 building and installing JCC works fine. Also, running make to build 
 Pylucene seems to succeed. But if I run make test, I get the errors 
 attached below.

It looks like there is a left-over 'import pdb; pdb.set_trace()' statement in 
the test_PythonDirectory.py test, at line 260.
Please remove it and re-run the tests.
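
For anyone who wants to check a test tree for similar left-overs before running make test, here is a minimal sketch. It is not part of PyLucene; the helper name find_pdb_calls and the sample text are made up for illustration, and the comment handling is deliberately crude (it cuts each line at the first '#').

```python
import re

# Matches a pdb.set_trace( call, with or without a preceding "import pdb;".
PDB_CALL = re.compile(r"\bpdb\.set_trace\s*\(")


def find_pdb_calls(source):
    """Return 1-based line numbers whose code part calls pdb.set_trace."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        code = line.split("#", 1)[0]  # crude: ignore anything after a '#'
        if PDB_CALL.search(code):
            hits.append(lineno)
    return hits


# Hypothetical sample standing in for the contents of a test file.
sample = "x = 1\nimport pdb; pdb.set_trace()\n# pdb.set_trace() in a comment\n"
print(find_pdb_calls(sample))
```

Reading each test file and passing its text to find_pdb_calls would list the lines to clean up.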

Thanks !

Andi..


Re: Suggestions about writing / extending QueryParsers

2014-03-07 Thread Tommaso Teofili
Thanks Tim and Upayavira for your replies.

I still need to decide what the final syntax should be. Generally speaking,
the ideal would be to extend the current Lucene syntax with a new expression
that triggers the creation of a more-like-this query, with something like
+title:foo +"text for similar docs"%2, where the phrase between quotes
generates a MoreLikeThisQuery on that text if it's followed by the % character
(and the number 2 may control the MLT configuration, e.g. min document freq =
min term freq = 2), similar to what is done for proximity search (I'm not sure
about using %, it's just a syntax example).
I guess I'd then need to extend the classic query parser, as per Tim's
suggestions, and I'd assume that if this goes into the classic qp it should
be a no-brainer on the Solr side.
Does that sound correct / feasible?
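
To make the proposed syntax concrete, here is a rough sketch of how such a clause could be recognized. It is plain Python with a regular expression, not the Lucene classic query parser grammar, and it assumes the phrase is written in double quotes and followed by % and an optional integer. The names MLT_CLAUSE and extract_mlt_clauses are hypothetical.

```python
import re

# Hypothetical syntax: optional +/- operator, a double-quoted phrase, then
# '%' and an optional integer, e.g. +"text for similar docs"%2
MLT_CLAUSE = re.compile(r'([+-]?)"([^"]+)"%(\d*)')


def extract_mlt_clauses(query, default_freq=1):
    """Return (remaining_query, [(operator, like_text, min_freq), ...])."""
    clauses = []

    def collect(match):
        freq = int(match.group(3)) if match.group(3) else default_freq
        clauses.append((match.group(1), match.group(2), freq))
        return ""  # drop the clause from the text left for the normal parser

    remaining = MLT_CLAUSE.sub(collect, query)
    return " ".join(remaining.split()), clauses


rest, mlt = extract_mlt_clauses('+title:foo +"text for similar docs"%2')
print(rest)
print(mlt)
```

A real implementation would add the rule to the parser grammar rather than pre-process the string, since a regular expression does not handle escaping or nesting.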

Regards,
Tommaso

2014-03-06 15:08 GMT+01:00 Upayavira u...@odoko.co.uk:

  Tommaso,

 Do say more about what you're thinking of. I'm currently getting my dev
 environment up to look into enhancing the MoreLikeThisHandler to be able
 to handle function query boosts. This should be eminently possible from my
 initial research. However, if you're thinking of something more powerful,
 perhaps we can work together.

 Upayavira


 On Thu, Mar 6, 2014, at 11:23 AM, Tommaso Teofili wrote:

 Hi all,

 I'm thinking about writing/extending a QueryParser for MLT queries. I've
 never really looked into that code much, and now that I'm doing so, I'm
 wondering if anyone has suggestions on how to start with such a topic.
 Should I write a new grammar for that? Or can I just extend an existing
 grammar / class?

 Thanks in advance,
 Tommaso




[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #606: POMs out of sync

2014-03-07 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/606/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch

Error Message:
some core start times did not change on reload

Stack Trace:
java.lang.AssertionError: some core start times did not change on reload
at __randomizedtesting.SeedInfo.seed([F401181A3936ADA2:75E796024E69CD9E]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCollectionsAPI(CollectionsAPIDistributedZkTest.java:835)
at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:202)




Build Log:
[...truncated 52622 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:494: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:176: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/extra-targets.xml:77:
 Java returned: 1

Total time: 141 minutes 34 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: JDK 8 : Third Release Candidate - Build 132 is available on java.net

2014-03-07 Thread Rory O'Donnell Oracle, Dublin Ireland

Thanks Uwe!
On 06/03/2014 23:59, Uwe Schindler wrote:


Hi Rory, hi Lucene committers,

Thanks for the info!

I updated our Jenkins build server to use JDK 8 b132 and JDK 7u60 b07. 
In addition, the MacOSX virtual machine now also runs JDK 8 b132 
builds (after I sorted out how to **not** make JDK8 the default Java 
on OSX).


Next to operating system upgrades I also updated to the latest versions of 
IBM J9 v6.0 and 7.1 (releases of January 29th).


Uwe

-

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: u...@thetaphi.de

*From:*Rory O'Donnell Oracle, Dublin Ireland 
[mailto:rory.odonn...@oracle.com]

*Sent:* Thursday, March 06, 2014 6:48 PM
*To:* Uwe Schindler; Dawid Weiss
*Cc:* dev@lucene.apache.org; Dalibor Topic; Cecilia Borg; Balchandra 
Vaidya
*Subject:* JDK 8 : Third Release Candidate - Build 132 is available on 
java.net


Hi Uwe, Dawid,

JDK 8 Third Release Candidate, Build 132, is now available for 
download (http://jdk8.java.net/download.html) and test.

Please log all show stopper issues as soon as possible.

Thanks for your support, Rory

--
Rgds,Rory O'Donnell
Quality Engineering Manager
Oracle EMEA , Dublin, Ireland





[jira] [Commented] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs

2014-03-07 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923694#comment-13923694
 ] 

Adrien Grand commented on LUCENE-5493:
--

This is a very nice cleanup, and the ability to use any Sort object including 
expressions makes it very flexible. +1 to commit

 Rename Sorter, NumericDocValuesSorter, and fix javadocs
 ---

 Key: LUCENE-5493
 URL: https://issues.apache.org/jira/browse/LUCENE-5493
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Robert Muir
 Attachments: LUCENE-5493-poc.patch, LUCENE-5493.patch


 It's not clear to users that these are for this super-expert thing of 
 pre-sorting the index. From the names and documentation they think they 
 should use them instead of Sort/SortField.
 These need to be renamed or, even better, the API fixed so they aren't public 
 classes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)




Re: Fwd: Am I allowed to generate, enhance and republish a JavaDoc of an Apache project?

2014-03-07 Thread Shawn Heisey
On 3/6/2014 8:42 PM, Alexandre Rafalovitch wrote:
 I asked this on Apache legal list but got no reply. So, I thought I'll
 try again for the group it will affect directly (project not mentioned
 below is Solr).
 
 Any opinion on legality, usefulness or possibly underlying causes of
 the original problem would be appreciated.
 
 Regards,
Alex.
 -- Forwarded message --
 Date: Thu, Feb 27, 2014 at 4:41 PM
 Subject: Am I allowed to generate, enhance and republish a JavaDoc of
 an Apache project?
 To: legal-disc...@apache.org
 
 
 Hello,
 
 For one (of many) of the Apache projects that I use, I am very
 frustrated that Google cannot find the officially-hosted Javadocs.

I'm not going to try to comment about the legal issues, but I will tell
you that I can very often find javadocs for a very specific class by
searching for it along with a specific recent version number.  So I will
google for 'HttpSolrServer 4.7.0' and I have what I need.  Finding
related things is normally pretty easy, because there are clickable
links for related classes buried in any given javadoc page.

When searching for recent docs for SolrQuery, if I leave out the version
number, I only get 4.2.1 and 3.6.0 near the top of the results.  If I
add a version number, the top results are actually kinda useless.  A
search for 'SolrQuery 4.6.1 API' did the trick.  It's simply too common
a phrase, especially when broken apart into Solr and Query.

There are very likely things that we can do to improve our search engine
results.  I'm not well versed in SEO myself.

Thanks,
Shawn





Re: Fwd: Am I allowed to generate, enhance and republish a JavaDoc of an Apache project?

2014-03-07 Thread Alexandre Rafalovitch
Thanks Shawn, these are neat tricks.

I did find a couple of similar tricks with the versions. But I keep
forgetting them and sometimes even that does not help.

Additionally, the way Javadocs are built, the cross-links do not work
too well. The classes are split between build modules, and if you want
to go up and down an inheritance hierarchy that spans multiple
modules (or the Solr/Lucene divide), you are not even told those classes
exist. So you have to already know a class exists in order to look for it.

I am not saying it is terrible, just that perhaps it can be made
better. And I want to experiment with making it better. So I need the
freedom to experiment faster than the official release policy allows.

Regards,
   Alex.
P.s. I am not an SEO expert either. Once I learn to be one with this
and/or other projects, I would be more than happy to contribute my
skills back to the official documentation.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)





[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing!

2014-03-07 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9703/
Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 43612 lines...]
-documentation-lint:
 [echo] checking for broken html...
[jtidy] Checking for broken html (such as invalid tags)...
   [delete] Deleting directory 
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/jtidy_tmp
 [echo] Checking for broken links...
 [exec] 
 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [echo] Checking for missing docs...
 [exec] 
 [exec] 
build/docs/demo/org/apache/lucene/demo/xmlparser/FormBasedXmlQueryDemo.html
 [exec]   missing Methods: 
doPost(javax.servlet.http.HttpServletRequest,%20javax.servlet.http.HttpServletResponse)
 [exec] 
 [exec] Missing javadocs were found!

BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:465: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:57: The following 
error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:208: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:241: The 
following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:2321:
 exec returned: 1

Total time: 54 minutes 47 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops 
-XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Created] (SOLR-5826) Request caching SolrServer

2014-03-07 Thread Tommaso Teofili (JIRA)
Tommaso Teofili created SOLR-5826:
-

 Summary: Request caching SolrServer
 Key: SOLR-5826
 URL: https://issues.apache.org/jira/browse/SOLR-5826
 Project: Solr
  Issue Type: New Feature
  Components: clients - java
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 5.0


As stated in http://markmail.org/thread/a477kyxsp5xrusdu there are scenarios 
where an application communicating with Solr needs to not lose requests 
(especially update/indexing requests) that may fail because the Solr instance / 
cluster is not reachable for some time.
For such scenarios it may be helpful to have a wrapping SolrServer which can cache 
(in a FIFO queue, so that they get executed in order) requests when the Solr 
endpoint is not reachable and execute them as soon as it's reachable again.
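
The idea can be sketched as follows. This is a language-neutral illustration in Python, not the SolrJ API and not the attached patch. RequestCachingClient, FakeEndpoint and the use of ConnectionError to signal an unreachable endpoint are all assumptions made for the example.

```python
from collections import deque


class RequestCachingClient:
    """Wraps a send function and queues requests while the endpoint is down."""

    def __init__(self, send):
        self._send = send        # callable; raises ConnectionError when unreachable
        self._pending = deque()  # FIFO: the oldest request is replayed first

    def request(self, payload):
        """Queue the request, then try to drain the queue in order."""
        self._pending.append(payload)
        return self.flush()

    def flush(self):
        """Replay queued requests in order; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # still unreachable, keep everything queued
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self):
        return list(self._pending)


class FakeEndpoint:
    """Hypothetical stand-in for a Solr endpoint that can be switched on."""

    def __init__(self):
        self.up = False
        self.received = []

    def send(self, payload):
        if not self.up:
            raise ConnectionError("endpoint unreachable")
        self.received.append(payload)


endpoint = FakeEndpoint()
client = RequestCachingClient(endpoint.send)
client.request("add doc 1")
client.request("add doc 2")
endpoint.up = True
client.request("commit")
print(endpoint.received)
```

A request is removed from the queue only after it has been sent successfully, so a failure in the middle of a replay does not drop or reorder anything. This sketch only retries when a new request arrives; a background consumer would be needed to replay as soon as the endpoint comes back.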



--
This message was sent by Atlassian JIRA
(v6.2#6252)




Group-ignored tests and @Before/After class hooks.

2014-03-07 Thread Dawid Weiss
Robert pointed out this:

  [junit4] Suite: org.apache.solr.cloud.BasicZkTest
  [junit4] IGNOR/A 0.00s J2 | BasicZkTest.testBasic
  [junit4] Assumption #1: 'slow' test group is disabled (@Slow)
  [junit4] Completed on J2 in 42.45s, 1 test, 1 skipped

 Bug? Like it must be running @BeforeClass etc even though no tests are 
 enabled...

Indeed, this is currently the case. The problem is that, given the way JUnit
works (or rather, the way the various tooling environments expect it to work),
one has a choice of:

1) ignoring/ filtering certain tests or classes; then they will not
show up in IDEs at all,

2) ignoring/ filtering certain tests *at evaluation time*; this
unfortunately means @BeforeClass and @AfterClass will run (and so will
static class initializers). This has the benefit that all ignored
methods are reported properly.
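
The difference between the two options can be illustrated with a toy runner. This is a sketch only, not how JUnit or randomizedtesting is implemented; run_suite and its arguments are invented for the example.

```python
def run_suite(tests, disabled_groups, filter_up_front):
    """Toy runner for one test class; returns a log of what happened.

    tests is a list of (test_name, group) pairs.
    """
    log = []
    if filter_up_front:
        # Option 1: drop disabled tests before running. They vanish from reports.
        tests = [(name, group) for name, group in tests
                 if group not in disabled_groups]
        if not tests:
            return log  # nothing left, so the class-level hooks never run
    log.append("beforeClass")
    for name, group in tests:
        if group in disabled_groups:
            # Option 2: decide at evaluation time. The test is reported as ignored.
            log.append("ignored " + name)
        else:
            log.append("ran " + name)
    log.append("afterClass")
    return log


tests = [("testBasic", "slow")]
print(run_suite(tests, {"slow"}, filter_up_front=True))
print(run_suite(tests, {"slow"}, filter_up_front=False))
```

With evaluation-time filtering, the class hooks run even though the only test is ignored, which matches the 42.45s suite time in the report above.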

I'll see what I can do about it but it's not a trivial bug/ change.

Dawid




[jira] [Created] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler

2014-03-07 Thread Upayavira (JIRA)
Upayavira created SOLR-5827:
---

 Summary: Add boosting functionality to MoreLikeThisHandler
 Key: SOLR-5827
 URL: https://issues.apache.org/jira/browse/SOLR-5827
 Project: Solr
  Issue Type: Improvement
  Components: MoreLikeThis
Reporter: Upayavira
 Fix For: 4.8


The MoreLikeThisHandler facilitates the creation of a very simple yet powerful 
recommendation engine. 

It is possible to constrain the result set using filter queries. However, it 
isn't possible to influence the scoring using function queries. Adding function 
query boosting would allow for including such things as recency in the 
relevancy calculations.

Unfortunately, the boost= parameter is already in use, meaning we cannot 
replicate the edismax boost/bf for additive/multiplicative boostings.

My patch only touches the MoreLikeThisHandler, so the only really contentious 
thing is to decide the parameters to configure it.

I have a prototype working, and will upload a patch shortly. 




--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Updated] (SOLR-5826) Request caching SolrServer

2014-03-07 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated SOLR-5826:
--

Attachment: SOLR-5826.patch

Attached a first draft patch which introduces a RequestCachingSolrServer.
It still needs to be improved to switch from active to passive waiting when 
consuming cached requests.
The test case needs to be adjusted, as it only works for me from the IDE (strangely, 
it fails from ant due to file permissions on the index).

 Request caching SolrServer
 --

 Key: SOLR-5826
 URL: https://issues.apache.org/jira/browse/SOLR-5826
 Project: Solr
  Issue Type: New Feature
  Components: clients - java
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 5.0

 Attachments: SOLR-5826.patch


 As stated in http://markmail.org/thread/a477kyxsp5xrusdu there are scenarios 
 where an application communicating with Solr needs to not lose requests 
 (especially update/indexing requests) that may fail because the Solr instance / 
 cluster is not reachable for some time.
 For such scenarios it may be helpful to have a wrapping SolrServer which can 
 cache (in a FIFO queue, so that they get executed in order) requests when the 
 Solr endpoint is not reachable and execute them as soon as it's reachable 
 again.



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Commented] (LUCENE-3178) Native MMapDir

2014-03-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923769#comment-13923769
 ] 

Michael McCandless commented on LUCENE-3178:


bq. ... and suddenly I got good results, this is idiopathic :S

Lovely :)

It is odd, because we do relatively few IO ops, since we read a big byte[] 
blob, and then do all decode from that (128 packed ints) in RAM.

I do think it'd be interesting to pair up a NativeMMapDir with a custom 
postings format that instead uses IndexInput.readLong (via Unsafe.getLong) to 
pull longs from disk; this should save some cost we now have in packed ints to 
reconstitute longs from byte[] in Java.  But, we'd need to fix the byte order 
in the index to match the CPU used at search time.  Or, maybe we could use a 
DirectByteBuffer and set the byte order (but this may mean byte swapping for 
every access, which maybe is not so bad).
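
The byte-order concern can be shown with a small sketch using Python's struct module as a stand-in for Unsafe.getLong and ByteBuffer.order. The values and variable names are made up for the example.

```python
import struct
import sys

values = [1, 258, 2**40 + 7]

# Index written in a fixed (big-endian) order, as a portable file format would be.
big_endian_blob = struct.pack(">3q", *values)

# Reading with an explicit order is correct on any CPU, at the cost of a
# possible byte swap on little-endian hardware.
decoded_fixed = list(struct.unpack(">3q", big_endian_blob))

# Reading with the CPU's native order is only correct if the writer used that
# same order, which is the "fix the byte order in the index" option.
native_prefix = ">" if sys.byteorder == "big" else "<"
native_blob = struct.pack(native_prefix + "3q", *values)
decoded_native = list(struct.unpack("=3q", native_blob))

# What goes wrong otherwise: big-endian bytes interpreted as little-endian.
misread = struct.unpack("<q", big_endian_blob[:8])[0]

print(decoded_fixed, decoded_native, misread)
```

The misread value shows why a raw native read of an index written with the other byte order cannot be used as is.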

 Native MMapDir
 --

 Key: LUCENE-3178
 URL: https://issues.apache.org/jira/browse/LUCENE-3178
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/store
Reporter: Michael McCandless
  Labels: gsoc2014
 Attachments: LUCENE-3178-Native-MMap-implementation.patch, 
 LUCENE-3178-Native-MMap-implementation.patch, 
 LUCENE-3178-Native-MMap-implementation.patch


 Spinoff from LUCENE-2793.
 Just like we will create native Dir impl (UnixDirectory) to pass the right OS 
 level IO flags depending on the IOContext, we could in theory do something 
 similar with MMapDir.
 The problem is MMap is apparently quite hairy... and to pass the flags the 
 native code would need to invoke mmap (I think?), unlike UnixDir where the 
 code only has to open the file handle.



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Commented] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs

2014-03-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923772#comment-13923772
 ] 

Michael McCandless commented on LUCENE-5493:


bq. What is meant by impact sorted postings?

It's when you sort your documents by descending impact, where impact is your 
own measure and the one you intend to sort by at search time.  
AnalyzingInfixSuggester uses this to sort the suggestions by their weight.  
This way, if you are looking for 5 suggestions, you can stop searching after 
collecting 5 hits, which is an enormous speedup when the query would 
otherwise have matched many documents.

See e.g. http://nlp.stanford.edu/IR-book/html/htmledition/impact-ordering-1.html
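
The early-termination effect can be sketched like this. It is a toy illustration, not the AnalyzingInfixSuggester code; top_k_impact_sorted and the sample suggestions are invented for the example.

```python
def top_k_impact_sorted(docs, matches, k):
    """Collect the first k matching docs from an impact-sorted list.

    docs must already be sorted by descending weight (the index-time sort).
    Returns (hits, number_of_docs_examined).
    """
    hits = []
    examined = 0
    for doc in docs:
        examined += 1
        if matches(doc):
            hits.append(doc)
            if len(hits) == k:
                break  # every later doc has an equal or lower weight
    return hits, examined


suggestions = [
    {"text": "solr", "weight": 60},
    {"text": "lunar", "weight": 10},
    {"text": "lucene", "weight": 90},
    {"text": "luke", "weight": 40},
    {"text": "lucid", "weight": 70},
]

# Index-time step: sort once by descending weight.
index = sorted(suggestions, key=lambda d: d["weight"], reverse=True)

hits, examined = top_k_impact_sorted(
    index, lambda d: d["text"].startswith("lu"), k=2)
print([h["text"] for h in hits], examined)
```

Without the index-time sort, all five entries would have to be scored and sorted before the best two could be returned.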

 Rename Sorter, NumericDocValuesSorter, and fix javadocs
 ---

 Key: LUCENE-5493
 URL: https://issues.apache.org/jira/browse/LUCENE-5493
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Robert Muir
 Attachments: LUCENE-5493-poc.patch, LUCENE-5493.patch


 It's not clear to users that these are for this super-expert thing of 
 pre-sorting the index. From the names and documentation they think they 
 should use them instead of Sort/SortField.
 These need to be renamed or, even better, the API fixed so they aren't public 
 classes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)




Re: Stalled unit tests

2014-03-07 Thread Michael McCandless
Unfortunately, some tests take a very long time, and the test infra
will print these HEARTBEAT messages notifying you that they are still
running.  They should eventually finish?

Mike McCandless

http://blog.mikemccandless.com


On Thu, Mar 6, 2014 at 5:09 PM, Terry Smith sheb...@gmail.com wrote:
 I'm sure that I'm just missing something obvious but I'm having trouble
 getting the unit tests to run to completion on my laptop and was hoping that
 someone would be kind enough to point me in the right direction.

 I've cloned the repository from GitHub
 (http://git.apache.org/lucene-solr.git) and checked out the latest commit on
 branch_4x.

 commit 6e06247cec1410f32592bfd307c1020b814def06

 Author: Robert Muir rm...@apache.org

 Date:   Thu Mar 6 19:54:07 2014 +


 disable slow solr tests in smoketester



 git-svn-id:
 https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1575025
 13f79535-47bb-0310-9956-ffa450edef68


 Executing ant clean test from the top level directory of the project shows
 the tests running, but they seem to get stuck in a loop with some stalled
 heartbeat messages. If I run the tests directly from lucene/ then they
 complete successfully after about 10 minutes.

 I'm using Java 6 under OS X (10.9.2).

 $ java -version

 java version "1.6.0_65"

 Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)

 Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)


 My terminal lists repeating stalled heartbeat messages like so:

 HEARTBEAT J2 PID(20104@onyx.local): 2014-03-06T16:53:35, stalled for 2111s
 at: HdfsLockFactoryTest.testBasic

 HEARTBEAT J0 PID(20106@onyx.local): 2014-03-06T16:53:47, stalled for 2108s
 at: TestSurroundQueryParser.testQueryParser

 HEARTBEAT J1 PID(20103@onyx.local): 2014-03-06T16:54:11, stalled for 2167s
 at: TestRecoveryHdfs.testBuffering

 HEARTBEAT J3 PID(20105@onyx.local): 2014-03-06T16:54:23, stalled for 2165s
 at: HdfsDirectoryTest.testEOF


 My machine does have 3 java processes chewing CPU, see attached jstack dumps
 for more information.

 Should I expect the tests to complete on my platform? Do I need to specify
 any special flags to give them more memory or to avoid any bad apples?

 Thanks in advance,

 --Terry







Re: Group-ignored tests and @Before/After class hooks.

2014-03-07 Thread Dawid Weiss
Ok, I've fixed it.

https://github.com/carrotsearch/randomizedtesting/issues/158

I'll include it in the next release.

Dawid




[jira] [Commented] (LUCENE-5487) Can we separate top scorer from sub scorer?

2014-03-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923781#comment-13923781
 ] 

ASF subversion and git services commented on LUCENE-5487:
-

Commit 1575234 from [~mikemccand] in branch 'dev/branches/lucene5487'
[ https://svn.apache.org/r1575234 ]

LUCENE-5487: rename TopScorer -> BulkScorer

 Can we separate top scorer from sub scorer?
 ---

 Key: LUCENE-5487
 URL: https://issues.apache.org/jira/browse/LUCENE-5487
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Reporter: Michael McCandless
Assignee: Michael McCandless
 Attachments: LUCENE-5487.patch, LUCENE-5487.patch


 This is just an exploratory patch ... still many nocommits, but I
 think it may be promising.
 I find the two booleans we pass to Weight.scorer confusing, because
 they really only apply to whoever will call score(Collector) (just
 IndexSearcher and BooleanScorer).
 The params are pointless for the vast majority of scorers, because
 very, very few query scorers really need to change how top-scoring is
 done, and those scorers can *only* score top-level (throw UOE
 from nextDoc/advance).  It seems like these two types of scorers
 should be separately typed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Commented] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs

2014-03-07 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923793#comment-13923793
 ] 

Uwe Schindler commented on LUCENE-5493:
---

Beautiful! I like the lines starting with -  :-)

 Rename Sorter, NumericDocValuesSorter, and fix javadocs
 ---

 Key: LUCENE-5493
 URL: https://issues.apache.org/jira/browse/LUCENE-5493
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Robert Muir
 Attachments: LUCENE-5493-poc.patch, LUCENE-5493.patch


 It's not clear to users that these are for this super-expert thing of 
 pre-sorting the index. From the names and documentation they think they 
 should use them instead of Sort/SortField.
 These need to be renamed or, even better, the API fixed so they aren't public 
 classes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)




[jira] [Updated] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler

2014-03-07 Thread Upayavira (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Upayavira updated SOLR-5827:


Attachment: SOLR-5827.patch

First pass. Supports additive boosting with the mlt.bf parameter. No support 
for multiplicative boost, pending a choice of parameter name!
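
For readers unfamiliar with the distinction, here is a small sketch of additive versus multiplicative boosting. It only illustrates the arithmetic and is not taken from the patch; the recency function, the scores and the document names are hypothetical.

```python
def additive_boost(score, boost_value):
    """bf-style: the function value is added to the relevancy score."""
    return score + boost_value


def multiplicative_boost(score, boost_value):
    """boost-style: the function value scales the relevancy score."""
    return score * boost_value


def recency(age_days):
    """Hypothetical recency function: 1.0 for a new doc, decaying with age."""
    return 1.0 / (1.0 + age_days)


# (name, similarity score, age in days); made-up numbers.
docs = [("old but similar", 2.0, 99), ("new, less similar", 1.5, 0)]

by_additive = sorted(
    docs, key=lambda d: additive_boost(d[1], recency(d[2])), reverse=True)
by_multiplicative = sorted(
    docs, key=lambda d: multiplicative_boost(d[1], recency(d[2])), reverse=True)

print([d[0] for d in by_additive])
print([d[0] for d in by_multiplicative])
```

With additive boosting the size of the effect depends on the scale of the base score, whereas a multiplicative boost changes the score by the same factor regardless of that scale.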

 Add boosting functionality to MoreLikeThisHandler
 -

 Key: SOLR-5827
 URL: https://issues.apache.org/jira/browse/SOLR-5827
 Project: Solr
  Issue Type: Improvement
  Components: MoreLikeThis
Reporter: Upayavira
 Fix For: 4.8

 Attachments: SOLR-5827.patch


 The MoreLikeThisHandler facilitates the creation of a very simple yet 
 powerful recommendation engine. 
 It is possible to constrain the result set using filter queries. However, it 
 isn't possible to influence the scoring using function queries. Adding 
 function query boosting would allow for including such things as recency in 
 the relevancy calculations.
 Unfortunately, the boost= parameter is already in use, meaning we cannot 
 replicate the edismax boost/bf for additive/multiplicative boostings.
 My patch only touches the MoreLikeThisHandler, so the only really contentious 
 thing is to decide the parameters to configure it.
 I have a prototype working, and will upload a patch shortly. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)




RE: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing!

2014-03-07 Thread Uwe Schindler
Hi,

I have no idea why this error appears. This class has been unchanged for months. 
Maybe this is something new that only appears with the latest JDK 7u60 build? But 
there are no javadocs changes in the whole series of 7u60 updates.
I see that the class mentioned here has no Javadocs at all, because it's a demo 
class. What's wrong?

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 






[jira] [Updated] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler

2014-03-07 Thread Upayavira (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Upayavira updated SOLR-5827:


Attachment: SOLR-5827.patch

Updated version with a minor tweak to get rid of compile error

 Add boosting functionality to MoreLikeThisHandler
 -

 Key: SOLR-5827
 URL: https://issues.apache.org/jira/browse/SOLR-5827
 Project: Solr
  Issue Type: Improvement
  Components: MoreLikeThis
Reporter: Upayavira
 Fix For: 4.8

 Attachments: SOLR-5827.patch, SOLR-5827.patch


 The MoreLikeThisHandler facilitates the creation of a very simple yet 
 powerful recommendation engine. 
 It is possible to constrain the result set using filter queries. However, it 
 isn't possible to influence the scoring using function queries. Adding 
 function query boosting would allow for including such things as recency in 
 the relevancy calculations.
 Unfortunately, the boost= parameter is already in use, meaning we cannot 
 replicate the edismax boost/bf for additive/multiplicative boostings.
 My patch only touches the MoreLikeThisHandler, so the only really contentious 
 thing is to decide the parameters to configure it.
 I have a prototype working, and will upload a patch shortly. 
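
To make the additive/multiplicative distinction concrete, here is a minimal sketch. It is not the attached patch: the recency function, the weight, and all class and method names are hypothetical and chosen only for illustration. An additive boost (in the style of edismax bf) adds the function value to the relevancy score, while a multiplicative boost (in the style of edismax boost) scales the score by it.

```java
final class BoostSketch {
    /** Hypothetical recency function in (0, 1]: 1.0 for a new document, decaying with age. */
    static double recency(double ageInDays) {
        return 1.0 / (1.0 + ageInDays / 365.0);
    }

    /** Additive boosting: the function value is added to the relevancy score. */
    static double additive(double score, double ageInDays, double weight) {
        return score + weight * recency(ageInDays);
    }

    /** Multiplicative boosting: the function value scales the relevancy score. */
    static double multiplicative(double score, double ageInDays) {
        return score * recency(ageInDays);
    }

    public static void main(String[] args) {
        double score = 2.0;
        System.out.println("additive, new doc:            " + additive(score, 0.0, 1.0));
        System.out.println("additive, one year old:       " + additive(score, 365.0, 1.0));
        System.out.println("multiplicative, new doc:      " + multiplicative(score, 0.0));
        System.out.println("multiplicative, one year old: " + multiplicative(score, 365.0));
    }
}
```

With an additive boost the influence of recency depends on how large the raw scores happen to be; with a multiplicative boost it is proportional to the score, which is why the two are usually configured through separate parameters.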






Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing!

2014-03-07 Thread Michael McCandless
I really don't like that %20 in there!  Maybe a minor change in the
recent JDK7 build caused it to escape space with %20 instead of + or
maybe where it wasn't escaping before ... I'll try to repro/fix.
Seems like we just need to make the linter unescape somewhere.

Mike McCandless

http://blog.mikemccandless.com





[jira] [Commented] (SOLR-5827) Add boosting functionality to MoreLikeThisHandler

2014-03-07 Thread Upayavira (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923808#comment-13923808
 ] 

Upayavira commented on SOLR-5827:
-

It is perhaps worth noting that these patches were made against the 4x branch.




[jira] [Assigned] (LUCENE-5492) IndexFileDeleter AssertionError in presence of *_upgraded.si files

2014-03-07 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reassigned LUCENE-5492:
--

Assignee: Michael McCandless

 IndexFileDeleter AssertionError in presence of *_upgraded.si files
 --

 Key: LUCENE-5492
 URL: https://issues.apache.org/jira/browse/LUCENE-5492
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Tim Smith
Assignee: Michael McCandless

 When calling IndexWriter.deleteUnusedFiles against an index that contains 3.x 
 segments, I am seeing the following exception:
 {code}
 java.lang.AssertionError: failAndDumpStackJunitStatment: RefCount is 0 pre-decrement for file _0_upgraded.si
     at org.apache.lucene.index.IndexFileDeleter$RefCount.DecRef(IndexFileDeleter.java:630)
     at org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:514)
     at org.apache.lucene.index.IndexFileDeleter.deleteCommits(IndexFileDeleter.java:286)
     at org.apache.lucene.index.IndexFileDeleter.revisitPolicy(IndexFileDeleter.java:393)
     at org.apache.lucene.index.IndexWriter.deleteUnusedFiles(IndexWriter.java:4617)
 {code}
 I believe this is caused by IndexFileDeleter not being aware of the Lucene3x 
 Segment Infos Format (notably the _upgraded.si files created to upgrade an 
 old index).
 This is new in 4.7 and did not occur in 4.6.1.
 I am still trying to track down a workaround/fix.
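
For readers unfamiliar with the failure mode, the following is a simplified sketch of per-file reference counting. It is not Lucene's IndexFileDeleter; the class and method names are hypothetical. It only illustrates why releasing a file that was never registered (as the reporter suspects happens with the _upgraded.si files) trips a "RefCount is 0 pre-decrement" check.

```java
import java.util.HashMap;
import java.util.Map;

final class RefCountSketch {
    private final Map<String, Integer> counts = new HashMap<>();

    /** Registers one more reference to the file, e.g. from a commit point. */
    void incRef(String file) {
        counts.merge(file, 1, Integer::sum);
    }

    /**
     * Releases one reference. Returns true when the count drops to zero,
     * meaning the file could now be deleted. Fails if the file was never
     * registered, which mirrors the assertion in the report.
     */
    boolean decRef(String file) {
        int count = counts.getOrDefault(file, 0);
        if (count <= 0) {
            throw new IllegalStateException("RefCount is 0 pre-decrement for file " + file);
        }
        if (count == 1) {
            counts.remove(file);
            return true;
        }
        counts.put(file, count - 1);
        return false;
    }

    public static void main(String[] args) {
        RefCountSketch deleter = new RefCountSketch();
        deleter.incRef("_0.si");
        System.out.println("deletable: " + deleter.decRef("_0.si"));
        try {
            // This file was never incRef'd, so releasing it is an error.
            deleter.decRef("_0_upgraded.si");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```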






[jira] [Commented] (LUCENE-5492) IndexFileDeleter AssertionError in presence of *_upgraded.si files

2014-03-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923813#comment-13923813
 ] 

Michael McCandless commented on LUCENE-5492:


Hmm, not good.  Can you describe what you are doing / boil it down to a test 
case?




RE: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9703 - Still Failing!

2014-03-07 Thread Uwe Schindler
Thanks Mike. Maybe this is the difference! The only strange thing is the fact 
that it only happens on this single file.

I can confirm: 7u60 b04 does not trigger this bug, but 7u60 b07 does. This is 
definitely not a bug in the JDK; it's just more correct behaviour (because 
whitespace must be escaped in URIs). Please note that + is not a valid 
replacement for whitespace in URI components. Only the form-url-encoding in 
the query string may use +, not the default encoding used in path names or 
fragments.
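
The difference between the two encodings can be seen with the JDK's own classes. The sketch below is only an illustration; the class name, method names, and the sample file name are made up. java.net.URI percent-encodes a space in a path or fragment as %20, java.net.URLEncoder produces the form encoding where a space becomes +, and URI.getFragment() returns the unescaped fragment, which is the kind of unescaping a link checker would need before comparing method signatures.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

final class SpaceEscapingSketch {
    /** Builds a relative URI reference; illegal characters such as spaces become %XX escapes. */
    static String asUriReference(String path, String fragment) {
        try {
            return new URI(null, null, path, fragment).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Form encoding (application/x-www-form-urlencoded): a space becomes '+'. */
    static String asFormValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Returns the fragment with percent-escapes decoded. */
    static String decodeFragment(String uriReference) {
        return URI.create(uriReference).getFragment();
    }

    public static void main(String[] args) {
        String ref = asUriReference("Demo.html", "doPost(A, B)");
        System.out.println(ref);
        System.out.println(asFormValue("A, B"));
        System.out.println(decodeFragment(ref));
    }
}
```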

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





[jira] [Commented] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs

2014-03-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923852#comment-13923852
 ] 

ASF subversion and git services commented on LUCENE-5493:
-

Commit 1575248 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1575248 ]

LUCENE-5493: cut over index sorting to use Sort api for specifying the order

 Rename Sorter, NumericDocValuesSorter, and fix javadocs
 ---

 Key: LUCENE-5493
 URL: https://issues.apache.org/jira/browse/LUCENE-5493
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Robert Muir
 Attachments: LUCENE-5493-poc.patch, LUCENE-5493.patch


 It's not clear to users that these are for the super-expert use case of 
 pre-sorting the index. From the names and documentation, they think they 
 should use them instead of Sort/SortField.
 These need to be renamed or, even better, the API fixed so they aren't public 
 classes.






[jira] [Resolved] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs

2014-03-07 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-5493.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.8




[jira] [Commented] (LUCENE-5493) Rename Sorter, NumericDocValuesSorter, and fix javadocs

2014-03-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923864#comment-13923864
 ] 

ASF subversion and git services commented on LUCENE-5493:
-

Commit 1575253 from [~rcmuir] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1575253 ]

LUCENE-5493: cut over index sorting to use Sort api for specifying the order




[jira] [Created] (LUCENE-5498) SortingAtomicReader should be package private

2014-03-07 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-5498:
---

 Summary: SortingAtomicReader should be package private
 Key: LUCENE-5498
 URL: https://issues.apache.org/jira/browse/LUCENE-5498
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


The intended purpose of this reader is to allow you to sort your entire index 
with IW.addIndexes(IR).

Perhaps we should supply some kind of tool to do this and hide the reader. 
It's scary to think of someone using this for searching (based on its name and 
docs, it's probably not clear that it would be ridiculously slow).






[jira] [Created] (SOLR-5828) Support for multiple wildcard highlight fields

2014-03-07 Thread Daniel Debray (JIRA)
Daniel Debray created SOLR-5828:
---

 Summary: Support for multiple wildcard highlight fields
 Key: SOLR-5828
 URL: https://issues.apache.org/jira/browse/SOLR-5828
 Project: Solr
  Issue Type: Improvement
  Components: highlighter
Affects Versions: 5.0
Reporter: Daniel Debray
Priority: Minor


Hey guys,

is there a reason why we don't support multiple wildcard queries for 
highlighting? Something like hl.fl=foo.*hl.fl=bar.* or hl.fl=foo.* bar.*.

If nothing speaks against it, I would like to provide a patch for this issue.
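
As a rough illustration of the requested behaviour, and not of Solr's highlighter code, the sketch below expands several wildcard patterns against a list of field names. Everything here is an assumption made for the example: the class and method names, the treatment of * as the only wildcard, and the acceptance of patterns either as repeated values or as one comma- or space-separated value.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class WildcardFieldsSketch {
    /** Converts a glob where '*' matches any run of characters into a regex; all else is literal. */
    static Pattern globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(".*");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the fields matched by any pattern, in pattern order and without duplicates. */
    static Set<String> expand(List<String> fields, String... patternValues) {
        Set<String> selected = new LinkedHashSet<>();
        for (String value : patternValues) {
            for (String glob : value.trim().split("[,\\s]+")) {
                if (glob.isEmpty()) {
                    continue;
                }
                Pattern pattern = globToRegex(glob);
                for (String field : fields) {
                    if (pattern.matcher(field).matches()) {
                        selected.add(field);
                    }
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("foo.title", "foo.body", "bar.text", "id");
        System.out.println(expand(fields, "foo.*", "bar.*"));
        System.out.println(expand(fields, "foo.* bar.*"));
    }
}
```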






[jira] [Created] (LUCENE-5499) EarlyTerminatingSortingCollector shouldnt require exact Sort match

2014-03-07 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-5499:
---

 Summary: EarlyTerminatingSortingCollector shouldnt require exact 
Sort match
 Key: LUCENE-5499
 URL: https://issues.apache.org/jira/browse/LUCENE-5499
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir


Today EarlyTerminatingSortingCollector requires that the Sort match exactly at 
query and at index time.

However, now that you can use any Sort (e.g. with multiple sortfields), this 
should be improved.

For example, early termination is fine in the following case:
* index-time: popularity desc, time desc
* query-time: popularity desc
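
One way to read the example above is that early termination stays valid whenever the query-time sort is a leading prefix of the index-time sort. The sketch below checks exactly that, under the assumption that each sort field is represented by a plain string; the class and method names are hypothetical and this is not the Lucene implementation.

```java
import java.util.Arrays;
import java.util.List;

final class SortPrefixSketch {
    /**
     * Returns true if the query-time sort fields form a non-empty leading
     * prefix of the index-time sort fields.
     */
    static boolean canTerminateEarly(List<String> indexSort, List<String> querySort) {
        return !querySort.isEmpty()
            && querySort.size() <= indexSort.size()
            && indexSort.subList(0, querySort.size()).equals(querySort);
    }

    public static void main(String[] args) {
        List<String> indexSort = Arrays.asList("popularity desc", "time desc");
        System.out.println(canTerminateEarly(indexSort, Arrays.asList("popularity desc")));
        System.out.println(canTerminateEarly(indexSort, Arrays.asList("time desc")));
    }
}
```

Sorting only by the second index-time field does not qualify, because documents are ordered by that field only within runs of equal popularity.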







[jira] [Commented] (LUCENE-5499) EarlyTerminatingSortingCollector shouldnt require exact Sort match

2014-03-07 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923872#comment-13923872
 ] 

Robert Muir commented on LUCENE-5499:
-

The basics are: right now we just encode Sort.toString() in the index. But a 
Sort is just a collection of SortFields. So if we encode it differently (e.g. 
each SortField.toString() separated by INFORMATION_SEPARATOR_ONE, escaping the 
former in case someone is crazy...) we can easily have logic like this.
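
A minimal sketch of such an encoding follows. It assumes each sort field has already been rendered to a string, uses U+001F (INFORMATION SEPARATOR ONE) as the separator and a backslash as the escape character, and assumes a sort always has at least one field. The class and method names and the choice of escape character are illustrative only and do not come from the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SortEncodingSketch {
    static final char SEPARATOR = '\u001F'; // INFORMATION SEPARATOR ONE
    static final char ESCAPE = '\\';

    /** Joins the per-field strings, escaping any separator or escape character inside a field. */
    static String encode(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(SEPARATOR);
            }
            for (char c : fields.get(i).toCharArray()) {
                if (c == SEPARATOR || c == ESCAPE) {
                    sb.append(ESCAPE);
                }
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Splits on unescaped separators and removes the escapes again. */
    static List<String> decode(String encoded) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;
        for (char c : encoded.toCharArray()) {
            if (escaped) {
                current.append(c);
                escaped = false;
            } else if (c == ESCAPE) {
                escaped = true;
            } else if (c == SEPARATOR) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        List<String> sort = Arrays.asList("popularity desc", "time desc");
        System.out.println(decode(encode(sort)).equals(sort));
    }
}
```

Once the individual fields can be recovered from the stored value, a query-time sort can be compared against the index-time sort field by field rather than as one opaque string.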




[jira] [Updated] (SOLR-5828) Support for multiple wildcard highlight fields

2014-03-07 Thread Daniel Debray (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Debray updated SOLR-5828:


Description: 
Hey guys,

is there a reason why we don't support multiple wildcard queries for 
highlighting? Something like hl.fl=foo.* hl.fl=bar.*  or hl.fl=foo.* bar.*.

If nothing speaks against it, I would like to provide a patch for this issue.

  was:
Hey guys,

is there a reason why we don't support multiple wildcard queries for 
highlighting? Something like hl.fl=foo.*hl.fl=bar.* or hl.fl=foo.* bar.*.

If nothing speaks against it, I would like to provide a patch for this issue.





[jira] [Commented] (LUCENE-5498) SortingAtomicReader should be package private

2014-03-07 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923878#comment-13923878
 ] 

Robert Muir commented on LUCENE-5498:
-

FWIW the other tools in lucene/misc seem to take a similar approach: e.g. 
PKIndexSplitter hides its FilterReader




[jira] [Closed] (SOLR-5828) Support for multiple wildcard highlight fields

2014-03-07 Thread Daniel Debray (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Debray closed SOLR-5828.
---

Resolution: Duplicate




[jira] [Commented] (SOLR-5828) Support for multiple wildcard highlight fields

2014-03-07 Thread Daniel Debray (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923881#comment-13923881
 ] 

Daniel Debray commented on SOLR-5828:
-

Duplicate of SOLR-5127.




[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1121: POMs out of sync

2014-03-07 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1121/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.testDistribSearch

Error Message:
There were too many update fails - we expect it can happen, but shouldn't easily

Stack Trace:
java.lang.AssertionError: There were too many update fails - we expect it can happen, but shouldn't easily
    at __randomizedtesting.SeedInfo.seed([C6C5F3B33C48A73B:47237DAB4B17C707]:0)
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.assertTrue(Assert.java:43)
    at org.junit.Assert.assertFalse(Assert.java:68)
    at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.doTest(ChaosMonkeyNothingIsSafeTest.java:212)

Build Log:
[...truncated 53142 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:488: The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:176: The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/extra-targets.xml:77: Java returned: 1

Total time: 140 minutes 36 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Updated] (SOLR-5773) CollapsingQParserPlugin should make elevated documents the group head

2014-03-07 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5773:
-

Attachment: SOLR-5773.patch

 CollapsingQParserPlugin should make elevated documents the group head
 -

 Key: SOLR-5773
 URL: https://issues.apache.org/jira/browse/SOLR-5773
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Affects Versions: 4.6.1
Reporter: David
Assignee: Joel Bernstein
  Labels: collapse, solr
 Fix For: 4.8

 Attachments: SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, 
 SOLR-5773.patch, SOLR-5773.patch

   Original Estimate: 8h
  Remaining Estimate: 8h

 Hi Joel,
 I sent you an email but I'm not sure if you received it or not. I ran into a 
 bit of trouble using the CollapsingQParserPlugin with elevated documents. To 
 explain it simply, I want to exclude grouped documents when one of the 
 members of the group is contained in the elevated document set. I'm not sure 
 this is currently possible because, as you explain above, elevated documents 
 are added to the request context after the original query is constructed.
 To better illustrate the problem: if I have two documents, docid=1 and 
 docid=2, and both have a groupid of 'a', and a grouped query scores docid 2 
 first in the results but I have elevated docid 1, then both documents are 
 shown in the results when I really only want the elevated document to be 
 shown.
 Is this something that would be difficult to implement? Any help is 
 appreciated.
 I think the solution would be to remove the documents from liveDocs that 
 share the same groupid in the getBoostDocs() function. Let me know if this 
 makes any sense. I'll continue working towards a solution in the meantime.
 {code}
 private IntOpenHashSet getBoostDocs(SolrIndexSearcher indexSearcher, 
     Set<String> boosted) throws IOException {
   IntOpenHashSet boostDocs = null;
   if(boosted != null) {
     SchemaField idField = indexSearcher.getSchema().getUniqueKeyField();
     String fieldName = idField.getName();
     HashSet<BytesRef> localBoosts = new HashSet<BytesRef>(boosted.size()*2);
     Iterator<String> boostedIt = boosted.iterator();
     while(boostedIt.hasNext()) {
       localBoosts.add(new BytesRef(boostedIt.next()));
     }
     boostDocs = new IntOpenHashSet(boosted.size()*2);
     List<AtomicReaderContext> leaves = 
         indexSearcher.getTopReaderContext().leaves();
     TermsEnum termsEnum = null;
     DocsEnum docsEnum = null;
     for(AtomicReaderContext leaf : leaves) {
       AtomicReader reader = leaf.reader();
       int docBase = leaf.docBase;
       Bits liveDocs = reader.getLiveDocs();
       Terms terms = reader.terms(fieldName);
       termsEnum = terms.iterator(termsEnum);
       Iterator<BytesRef> it = localBoosts.iterator();
       while(it.hasNext()) {
         BytesRef ref = it.next();
         if(termsEnum.seekExact(ref)) {
           docsEnum = termsEnum.docs(liveDocs, docsEnum);
           int doc = docsEnum.nextDoc();
           if(doc != -1) {
             //Found the document.
             boostDocs.add(doc+docBase);
             // HERE REMOVE ANY DOCUMENTS THAT SHARE THE GROUPID NOT ONLY THE DOCID
             it.remove();
           }
         }
       }
     }
   }
   return boostDocs;
 }
 {code}
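
 The requested behaviour can be sketched without any Solr classes. The
 following is only an illustration, assuming each hit carries a docid, a
 groupid and a score. The names Hit, collapse and elevated are hypothetical
 and are not part of CollapsingQParserPlugin; the real plugin works on
 per-segment doc ids and collectors, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ElevatedCollapse {
    // Hypothetical stand-in for a search hit; not a Solr class.
    static final class Hit {
        final int docId;
        final String groupId;
        final float score;

        Hit(int docId, String groupId, float score) {
            this.docId = docId;
            this.groupId = groupId;
            this.score = score;
        }
    }

    // Keeps one hit per group. An elevated hit always becomes the group head;
    // between hits of equal elevation status, the higher score wins.
    static List<Hit> collapse(List<Hit> hits, Set<Integer> elevated) {
        Map<String, Hit> heads = new LinkedHashMap<>();
        for (Hit hit : hits) {
            Hit head = heads.get(hit.groupId);
            if (head == null) {
                heads.put(hit.groupId, hit);
                continue;
            }
            boolean headElevated = elevated.contains(head.docId);
            boolean hitElevated = elevated.contains(hit.docId);
            if (hitElevated && !headElevated) {
                heads.put(hit.groupId, hit);   // elevation beats score
            } else if (hitElevated == headElevated && hit.score > head.score) {
                heads.put(hit.groupId, hit);   // same status: best score wins
            }
        }
        return new ArrayList<>(heads.values());
    }

    public static void main(String[] args) {
        // docid 2 scores higher, but docid 1 is elevated and shares groupid 'a'.
        List<Hit> hits = Arrays.asList(new Hit(2, "a", 9.0f), new Hit(1, "a", 3.0f));
        Set<Integer> elevated = new HashSet<>(Arrays.asList(1));
        for (Hit h : collapse(hits, elevated)) {
            System.out.println("docid=" + h.docId + " groupid=" + h.groupId);
        }
    }
}
```

 With the two documents from the example above, only docid 1 survives the
 collapse, which is the result the reporter is asking for.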



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5773) CollapsingQParserPlugin should make elevated documents the group head

2014-03-07 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923885#comment-13923885
 ] 

Joel Bernstein commented on SOLR-5773:
--

Tested this at scale and it seems to be functioning properly. 

David, let me know when you've had a chance to test out the patch.

Thanks,
Joel

 CollapsingQParserPlugin should make elevated documents the group head
 -

 Key: SOLR-5773
 URL: https://issues.apache.org/jira/browse/SOLR-5773
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Affects Versions: 4.6.1
Reporter: David
Assignee: Joel Bernstein
  Labels: collapse, solr
 Fix For: 4.8

 Attachments: SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, 
 SOLR-5773.patch, SOLR-5773.patch

   Original Estimate: 8h
  Remaining Estimate: 8h

 Hi Joel,
 I sent you an email but I'm not sure if you received it or not. I ran into a 
 bit of trouble using the CollapsingQParserPlugin with elevated documents. To 
 explain it simply, I want to exclude grouped documents when one of the 
 members of the group is contained in the elevated document set. I'm not sure 
 this is currently possible because, as you explain above, elevated documents 
 are added to the request context after the original query is constructed.
 To better illustrate the problem: suppose I have 2 documents, docid=1 and 
 docid=2, and both have a groupid of 'a'. If a grouped query scores docid 2 
 first in the results but I have elevated docid 1, then both documents are 
 shown in the results when I really only want the elevated document to be 
 shown.
 Is this something that would be difficult to implement? Any help is 
 appreciated.
 I think the solution would be to remove the documents from liveDocs that 
 share the same groupid in the getBoostDocs() function. Let me know if this 
 makes any sense. I'll continue working towards a solution in the meantime.
 {code}
 private IntOpenHashSet getBoostDocs(SolrIndexSearcher indexSearcher,
                                     Set<String> boosted) throws IOException {
   IntOpenHashSet boostDocs = null;
   if(boosted != null) {
     SchemaField idField = indexSearcher.getSchema().getUniqueKeyField();
     String fieldName = idField.getName();
     HashSet<BytesRef> localBoosts = new HashSet(boosted.size()*2);
     Iterator<String> boostedIt = boosted.iterator();
     while(boostedIt.hasNext()) {
       localBoosts.add(new BytesRef(boostedIt.next()));
     }
     boostDocs = new IntOpenHashSet(boosted.size()*2);
     List<AtomicReaderContext> leaves =
         indexSearcher.getTopReaderContext().leaves();
     TermsEnum termsEnum = null;
     DocsEnum docsEnum = null;
     for(AtomicReaderContext leaf : leaves) {
       AtomicReader reader = leaf.reader();
       int docBase = leaf.docBase;
       Bits liveDocs = reader.getLiveDocs();
       Terms terms = reader.terms(fieldName);
       termsEnum = terms.iterator(termsEnum);
       Iterator<BytesRef> it = localBoosts.iterator();
       while(it.hasNext()) {
         BytesRef ref = it.next();
         if(termsEnum.seekExact(ref)) {
           docsEnum = termsEnum.docs(liveDocs, docsEnum);
           int doc = docsEnum.nextDoc();
           if(doc != -1) {
             //Found the document.
             boostDocs.add(doc+docBase);
             // HERE REMOVE ANY DOCUMENTS THAT SHARE THE GROUPID NOT ONLY THE DOCID
             it.remove();
           }
         }
       }
     }
   }
   return boostDocs;
 }
 {code}






Re: Stalled unit tests

2014-03-07 Thread Terry Smith
Mike,

Fair enough. I'll let them run for more than 30 minutes and see what
happens.

How long does it take on your machine? I'm happy to sign up for the wiki and
add some extra information to
http://wiki.apache.org/lucene-java/HowToContribute for folks wanting to
tinker with Lucene.

Do the Lucene developers typically run a subset of the test suite to make
committing cheaper?

Thanks,

--Terry



On Fri, Mar 7, 2014 at 5:52 AM, Michael McCandless 
luc...@mikemccandless.com wrote:

 Unfortunately, some tests take a very long time, and the test infra
 will print these HEARTBEAT messages notifying you that they are still
 running.  They should eventually finish?

 Mike McCandless

 http://blog.mikemccandless.com


 On Thu, Mar 6, 2014 at 5:09 PM, Terry Smith sheb...@gmail.com wrote:
  I'm sure that I'm just missing something obvious but I'm having trouble
  getting the unit tests to run to completion on my laptop and was hoping
  that someone would be kind enough to point me in the right direction.
 
  I've cloned the repository from GitHub
  (http://git.apache.org/lucene-solr.git) and checked out the latest commit
  on branch_4x.
 
  commit 6e06247cec1410f32592bfd307c1020b814def06
 
  Author: Robert Muir rm...@apache.org
 
  Date:   Thu Mar 6 19:54:07 2014 +
 
 
  disable slow solr tests in smoketester
 
 
 
  git-svn-id:
  https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1575025
  13f79535-47bb-0310-9956-ffa450edef68
 
 
  Executing ant clean test from the top level directory of the project shows
  the tests running but they seem to get stuck in a loop with some stalled
  heartbeat messages. If I run the tests directly from lucene/ then they
  complete successfully after about 10 minutes.
 
  I'm using Java 6 under OS X (10.9.2).
 
  $ java -version
 
  java version "1.6.0_65"
 
  Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
 
  Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
 
 
  My terminal lists repeating stalled heartbeat messages like so:
 
  HEARTBEAT J2 PID(20104@onyx.local): 2014-03-06T16:53:35, stalled for 2111s
  at: HdfsLockFactoryTest.testBasic
 
  HEARTBEAT J0 PID(20106@onyx.local): 2014-03-06T16:53:47, stalled for 2108s
  at: TestSurroundQueryParser.testQueryParser
 
  HEARTBEAT J1 PID(20103@onyx.local): 2014-03-06T16:54:11, stalled for 2167s
  at: TestRecoveryHdfs.testBuffering
 
  HEARTBEAT J3 PID(20105@onyx.local): 2014-03-06T16:54:23, stalled for 2165s
  at: HdfsDirectoryTest.testEOF
 
 
  My machine does have 3 java processes chewing CPU, see attached jstack dumps
  for more information.
 
  Should I expect the tests to complete on my platform? Do I need to specify
  any special flags to give them more memory or to avoid any bad apples?
 
  Thanks in advance,
 
  --Terry
 
 
 
 




Re: Stalled unit tests

2014-03-07 Thread Dawid Weiss
 How long does it take on your machine?

It really depends... check out the limit on some heavy nightly tests
like this one:

@TimeoutSuite(millis = 80 * TimeUnits.HOUR)
@Ignore("takes ~ 45 minutes")

(Somebody should really inspect this inconsistency :).

Or this one:

@Ignore("Requires tons of heap to run (420G works)")
@TimeoutSuite(millis = 100 * TimeUnits.HOUR)

Wait... how many Gs? :)

And seriously the top parent class of all tests declares:

@TimeoutSuite(millis = 2 * TimeUnits.HOUR)

And this unfortunately means that a test class will time out after 2
hours of inactivity. To me, it's absurdly high but in the past tests
ran on very slow virtualized machines and were actually hitting these
limits.

Dawid




RE: Suggestions about writing / extending QueryParsers

2014-03-07 Thread Allison, Timothy B.
Tommaso,
  Ah, now I see.  If you want to add new operators, you'll have to modify the 
javacc files.  For the SpanQueryParser, I added a handful of new operators and 
chose to go with regexes instead of javacc...not sure that was the right 
decision, but given my lack of knowledge of javacc, it was expedient.  If you 
have time or already know javacc, it shouldn't be difficult.
  As for a no-brainer on the Solr side, yes, it shouldn't be a problem.  However, as 
of now the basic queryparser is a copy and paste job between Lucene and Solr, 
so you'll just have to redo your code in Solr...unless you do something 
smarter.
  If you'd be willing to wait for LUCENE-5205 to be brought into Lucene, I'd 
consider adding this functionality into the SpanQueryParser as a later step.
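
  To make the regex route a bit more concrete, here is a minimal sketch of 
recognizing a quoted phrase followed by % and an optional number. The syntax is 
only the example floated in this thread, and MltClauseSketch, MLT_CLAUSE and 
firstMltClause are hypothetical names; none of this is taken from 
SpanQueryParser or the classic parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MltClauseSketch {
    // A quoted phrase followed by '%' and an optional integer,
    // e.g. "text for similar docs"%2 (hypothetical syntax).
    private static final Pattern MLT_CLAUSE =
        Pattern.compile("\"([^\"]+)\"%(\\d+)?");

    // Returns {phrase, number} for the first such clause, or null if the
    // query contains none. The number is "" when it was omitted.
    static String[] firstMltClause(String query) {
        Matcher m = MLT_CLAUSE.matcher(query);
        if (!m.find()) {
            return null;
        }
        String number = m.group(2) == null ? "" : m.group(2);
        return new String[] { m.group(1), number };
    }

    public static void main(String[] args) {
        String[] clause = firstMltClause("+title:foo +\"text for similar docs\"%2");
        System.out.println(clause[0] + " / " + clause[1]);
    }
}
```

  A pre-pass like this could pull the MLT clause out before the rest of the 
string goes to the regular parser, though a real implementation would also have 
to deal with escaping and field prefixes.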

  Cheers,

 Tim

From: Tommaso Teofili [mailto:tommaso.teof...@gmail.com]
Sent: Friday, March 07, 2014 3:17 AM
To: dev@lucene.apache.org
Subject: Re: Suggestions about writing / extending QueryParsers

Thanks Tim and Upayavira for your replies.

I still need to decide what the final syntax could be, however generally 
speaking the ideal would be that I am able to extend the current Lucene syntax 
with a new expression which will trigger the creation of a more like this query 
with something like +title:foo +"text for similar docs"%2 where the phrase 
between quotes will generate a MoreLikeThisQuery on that text if it's followed 
by the % character (and the number 2 may control the MLT configuration, e.g. 
min document freq == min term freq = 2), similarly to what it's done for 
proximity search (not sure about using %, it's just a syntax example).
I guess then I'd need to extend the classic query parser, as per Tim's 
suggestions and I'd assume that if this goes into the classic qp it should be a 
no brainer on the Solr side.
Does it sound correct / feasible?

Regards,
Tommaso
2014-03-06 15:08 GMT+01:00 Upayavira u...@odoko.co.uk:
Tommaso,

Do say more about what you're thinking of. I'm currently getting my dev 
environment up to look into enhancing the MoreLikeThisHandler to be able to handle 
function query boosts. This should be eminently possible from my initial 
research. However, if you're thinking of something more powerful, perhaps we 
can work together.

Upayavira


On Thu, Mar 6, 2014, at 11:23 AM, Tommaso Teofili wrote:
Hi all,

I'm thinking about writing/extending a QueryParser for MLT queries; I've never 
really looked into that code too much, while I'm doing that now, I'm wondering 
if anyone has suggestions on how to start with such a topic.
Should I write a new grammar for that ? Or can I just extend an existing 
grammar / class?

Thanks in advance,
Tommaso



[jira] [Commented] (SOLR-5720) Add ExpandComponent to expand results collapsed by the CollapsingQParserPlugin

2014-03-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923900#comment-13923900
 ] 

ASF subversion and git services commented on SOLR-5720:
---

Commit 1575266 from [~joel.bernstein] in branch 'dev/trunk'
[ https://svn.apache.org/r1575266 ]

SOLR-5720: Updated CHANGES.txt

 Add ExpandComponent to expand results collapsed by the CollapsingQParserPlugin
 --

 Key: SOLR-5720
 URL: https://issues.apache.org/jira/browse/SOLR-5720
 Project: Solr
  Issue Type: New Feature
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.8, 5.0

 Attachments: SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, 
 SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, 
 SOLR-5720.patch, SOLR-5720.patch


 This ticket introduces a new search component called the ExpandComponent. The 
 expand component expands a single page of results collapsed by the 
 CollapsingQParserPlugin.
 Sample syntax:
 {code}
 q=*:*&fq={!collapse field=fieldA}&expand=true&expand.sort=fieldB+asc&expand.rows=10
 {code}
 In the above query the results are collapsed on fieldA with the 
 CollapsingQParserPlugin. The expand component expands the current page of 
 collapsed results.
 The initial implementation of the ExpandComponent takes three parameters:
 *expand=true* (Turns on the ExpandComponent)
 *expand.sort=fieldB+asc,fieldC+desc* (Sorts the documents based on a sort 
 spec. If none is specified the documents are sorted by relevance based on the 
 main query.)
 *expand.rows=10* (Sets the number of rows that groups are expanded to).
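
 As a rough illustration of how a client might assemble these parameters, the 
 sketch below builds the URL-encoded query string with only the JDK. It uses 
 the parameter names from this ticket; ExpandParamsSketch and its methods are 
 hypothetical helpers, not SolrJ API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class ExpandParamsSketch {
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Joins the parameters into a URL-encoded query string, in insertion order.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    // The sample request from the ticket: collapse on fieldA, expand each group.
    static Map<String, String> expandParams() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("fq", "{!collapse field=fieldA}");
        p.put("expand", "true");
        p.put("expand.sort", "fieldB asc");
        p.put("expand.rows", "10");
        return p;
    }

    public static void main(String[] args) {
        System.out.println(toQueryString(expandParams()));
    }
}
```

 The space in the sort spec is encoded as +, which matches the fieldB+asc form 
 used in the sample syntax.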






[jira] [Commented] (SOLR-5720) Add ExpandComponent to expand results collapsed by the CollapsingQParserPlugin

2014-03-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923904#comment-13923904
 ] 

ASF subversion and git services commented on SOLR-5720:
---

Commit 1575267 from [~joel.bernstein] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1575267 ]

SOLR-5720: Updated CHANGES.txt

 Add ExpandComponent to expand results collapsed by the CollapsingQParserPlugin
 --

 Key: SOLR-5720
 URL: https://issues.apache.org/jira/browse/SOLR-5720
 Project: Solr
  Issue Type: New Feature
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.8, 5.0

 Attachments: SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, 
 SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, SOLR-5720.patch, 
 SOLR-5720.patch, SOLR-5720.patch


 This ticket introduces a new search component called the ExpandComponent. The 
 expand component expands a single page of results collapsed by the 
 CollapsingQParserPlugin.
 Sample syntax:
 {code}
 q=*:*&fq={!collapse field=fieldA}&expand=true&expand.sort=fieldB+asc&expand.rows=10
 {code}
 In the above query the results are collapsed on fieldA with the 
 CollapsingQParserPlugin. The expand component expands the current page of 
 collapsed results.
 The initial implementation of the ExpandComponent takes three parameters:
 *expand=true* (Turns on the ExpandComponent)
 *expand.sort=fieldB+asc,fieldC+desc* (Sorts the documents based on a sort 
 spec. If none is specified the documents are sorted by relevance based on the 
 main query.)
 *expand.rows=10* (Sets the number of rows that groups are expanded to).






[jira] [Commented] (LUCENE-5492) IndexFileDeleter AssertionError in presence of *_upgraded.si files

2014-03-07 Thread Tim Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923910#comment-13923910
 ] 

Tim Smith commented on LUCENE-5492:
---

Here's what my test is doing:

1. unpacks lucene 3.x era index (has one segment in it)
2. opens IndexWriter on 3.x index
3. opens DirectoryReader using IndexWriter
4. Add 1 new document
5. commit IndexWriter
6. reopens DirectoryReader using IndexWriter
7. optimizes IndexWriter
8. commit optimized index
9. reopens DirectoryReader using IndexWriter

One thing of note is that I have a custom IndexDeletionPolicy.
This policy will hold onto named commit points. 
I hold onto the previous commit point at commit time, and then release it 
shortly after the commit is finished, once I have persisted my acceptance of 
the new commit point (calling deleteUnusedFiles() to purge it).




 IndexFileDeleter AssertionError in presence of *_upgraded.si files
 --

 Key: LUCENE-5492
 URL: https://issues.apache.org/jira/browse/LUCENE-5492
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.7
Reporter: Tim Smith
Assignee: Michael McCandless

 When calling IndexWriter.deleteUnusedFiles against an index that contains 3.x 
 segments, I am seeing the following exception:
 {code}
 java.lang.AssertionError: failAndDumpStackJunitStatment: RefCount is 0 
 pre-decrement for file _0_upgraded.si
 at 
 org.apache.lucene.index.IndexFileDeleter$RefCount.DecRef(IndexFileDeleter.java:630)
 at 
 org.apache.lucene.index.IndexFileDeleter.decRef(IndexFileDeleter.java:514)
 at 
 org.apache.lucene.index.IndexFileDeleter.deleteCommits(IndexFileDeleter.java:286)
 at 
 org.apache.lucene.index.IndexFileDeleter.revisitPolicy(IndexFileDeleter.java:393)
 at 
 org.apache.lucene.index.IndexWriter.deleteUnusedFiles(IndexWriter.java:4617)
 {code}
 I believe this is caused by IndexFileDeleter not being aware of the Lucene3x 
 Segment Infos Format (notably the _upgraded.si files created to upgrade an 
 old index)
 This is new in 4.7 and did not occur in 4.6.1
 Still trying to track down a workaround/fix
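
 For readers unfamiliar with the assertion, the following sketch shows the 
 general shape of per-file reference counting and why decrementing a file that 
 was never counted fails. It is a simplified stand-in, not the 
 IndexFileDeleter code; RefCountSketch and its methods are hypothetical, and it 
 throws IllegalStateException where Lucene uses an assert.

```java
import java.util.HashMap;
import java.util.Map;

final class RefCountSketch {
    private final Map<String, Integer> counts = new HashMap<>();

    // Called for every file referenced by a commit point.
    void incRef(String file) {
        Integer c = counts.get(file);
        counts.put(file, c == null ? 1 : c + 1);
    }

    // Returns true when the last reference is gone and the file may be deleted.
    boolean decRef(String file) {
        Integer c = counts.get(file);
        if (c == null || c == 0) {
            // The deleter never saw an incRef for this file.
            throw new IllegalStateException(
                "RefCount is 0 pre-decrement for file " + file);
        }
        if (c == 1) {
            counts.remove(file);
            return true;
        }
        counts.put(file, c - 1);
        return false;
    }

    public static void main(String[] args) {
        RefCountSketch refs = new RefCountSketch();
        refs.incRef("_0.si");
        System.out.println("delete _0.si: " + refs.decRef("_0.si"));
        try {
            refs.decRef("_0_upgraded.si"); // never counted, so this fails
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

 If a commit's file list includes a file the deleter did not count when the 
 commit was registered, releasing that commit would fail in this way. Whether 
 that is what happens with _upgraded.si files is the reporter's hypothesis, 
 not something this sketch demonstrates.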






[jira] [Commented] (SOLR-5825) Separate http request creation and execution in SolrJ

2014-03-07 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923913#comment-13923913
 ] 

Mark Miller commented on SOLR-5825:
---

+1

 Separate http request creation and execution in SolrJ
 -

 Key: SOLR-5825
 URL: https://issues.apache.org/jira/browse/SOLR-5825
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Steven Bower
 Attachments: SOLR-5825.patch


 In order to implement some custom behaviors I split the request() method in 
 HttpSolrServer into 2 distinct methods, createMethod() and executeMethod(). 
 This allows for customization of either or both of these phases instead of 
 having them in a single function.
 In my use case I extended HttpSolrServer to support client-side timeouts 
 (so_timeout, connectTimeout and request timeout), which I couldn't accomplish 
 without duplicating the code in request().
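
 The split described here is essentially a template method. A minimal sketch 
 of the idea follows, with plain strings standing in for HTTP requests and 
 responses; Client and TimeoutClient are hypothetical classes, and only the 
 method names createMethod() and executeMethod() come from the description 
 above. The real signatures in the patch may differ.

```java
final class RequestSplitSketch {
    // Hypothetical stand-in for an HTTP client; not the SolrJ class.
    static class Client {
        // request() only sequences the two phases, so subclasses can
        // override either one without copying the other.
        String request(String path) {
            String method = createMethod(path);
            return executeMethod(method);
        }

        protected String createMethod(String path) {
            return "GET " + path;
        }

        protected String executeMethod(String method) {
            return "executed: " + method;
        }
    }

    // Customizes only the execution phase, e.g. to apply a client-side timeout.
    static class TimeoutClient extends Client {
        private final int timeoutMillis;

        TimeoutClient(int timeoutMillis) {
            this.timeoutMillis = timeoutMillis;
        }

        @Override
        protected String executeMethod(String method) {
            return super.executeMethod(method) + " (timeout " + timeoutMillis + " ms)";
        }
    }

    public static void main(String[] args) {
        System.out.println(new TimeoutClient(500).request("/select"));
    }
}
```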






[jira] [Commented] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2014-03-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923915#comment-13923915
 ] 

Tim Allison commented on LUCENE-5205:
-

The root of this problem is that SpanNearQuery has no good way to handle 
stopwords in a way analogous to PhraseQuery.

In SpanQueryParser, this limitation should be well described in the javadocs to 
SpanQueryParser and in the test cases.  Let me know if it isn't.  You have the 
option of throwing an exception when a stopword is found to notify the user 
about stopwords, but that's exceedingly unsatisfactory.

Without digging into the internals of SpanNearQuery, we can still do better on 
this.  One proposal is to do what the basic highlighter does and risk false 
positives...behind the scenes modify "calculator for evaluating" to "calculator 
evaluating"~1.  This would then falsely match "calculator zebra evaluating".  
PhraseQuery can have false positives, too, but it guarantees that the false hit 
has to be a stop word.  This solution would not do that.  So, is this better 
than no matches at all?
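
The proposed rewrite can be sketched in a few lines. This is only an 
illustration of the idea, assuming whitespace tokenization and a slop equal to 
the number of stopwords dropped; StopwordSlopSketch and rewrite are hypothetical 
names and not part of the SpanQueryParser patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class StopwordSlopSketch {
    // Drops stopwords from a phrase and widens the slop by the number of
    // terms removed, so the remaining terms can still match across the gap.
    static String rewrite(String phrase, Set<String> stopwords) {
        List<String> kept = new ArrayList<>();
        int removed = 0;
        for (String term : phrase.trim().split("\\s+")) {
            if (stopwords.contains(term.toLowerCase(Locale.ROOT))) {
                removed++;
            } else {
                kept.add(term);
            }
        }
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(String.join(" ", kept)).append('"');
        if (removed > 0) {
            sb.append('~').append(removed);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Set<String> stopwords = new HashSet<>(Arrays.asList("for"));
        System.out.println(rewrite("calculator for evaluating", stopwords));
    }
}
```

The rewritten query accepts any single term in the gap, not just a stopword, 
which is exactly the false-positive risk described above.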


 [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to 
 classic QueryParser
 ---

 Key: LUCENE-5205
 URL: https://issues.apache.org/jira/browse/LUCENE-5205
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/queryparser
Reporter: Tim Allison
  Labels: patch
 Fix For: 4.7

 Attachments: LUCENE-5205-cleanup-tests.patch, 
 LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, 
 LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_smallTestMods.patch, 
 LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt


 This parser extends QueryParserBase and includes functionality from:
 * Classic QueryParser: most of its syntax
 * SurroundQueryParser: recursive parsing for near and not clauses.
 * ComplexPhraseQueryParser: can handle near queries that include multiterms 
 (wildcard, fuzzy, regex, prefix),
 * AnalyzingQueryParser: has an option to analyze multiterms.
 At a high level, there's a first pass BooleanQuery/field parser and then a 
 span query parser handles all terminal nodes and phrases.
 Same as classic syntax:
 * term: test 
 * fuzzy: roam~0.8, roam~2
 * wildcard: te?t, test*, t*st
 * regex: /\[mb\]oat/
 * phrase: "jakarta apache"
 * phrase with slop: "jakarta apache"~3
 * default "or" clause: jakarta apache
 * grouping "or" clause: (jakarta apache)
 * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta
 * multiple fields: title:lucene author:hatcher
  
 Main additions in SpanQueryParser syntax vs. classic syntax:
 * Can require in order for phrases with slop with the \~ operator: 
 jakarta apache\~3
 * Can specify not near: fever bieber!\~3,10 ::
 find fever but not if bieber appears within 3 words before or 10 
 words after it.
 * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta 
 apache\]~3 lucene\]\~4 :: 
 find jakarta within 3 words of apache, and that hit has to be within 
 four words before lucene
 * Can also use \[\] for single level phrasal queries instead of "" as in: 
 \[jakarta apache\]
 * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 
 :: find apache and then either lucene or solr within three words.
 * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2
 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ 
 /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two 
 words of ap*che and that hit has to be within ten words of something like 
 solr or that lucene regex.
 * Can require at least x number of hits at boolean level: apache AND (lucene 
 solr tika)~2
 * Can use negative only query: -jakarta :: Find all docs that don't contain 
 jakarta
 * Can use an edit distance > 2 for fuzzy query via SlowFuzzyQuery (beware of 
 potential performance issues!).
 Trivial additions:
 * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance =1, 
 prefix =2)
 * Can specify Optimal String Alignment (OSA) vs Levenshtein for distance 
 =2: (jakarta~1 (OSA) vs jakarta~1(Levenshtein)
 This parser can be very useful for concordance tasks (see also LUCENE-5317 
 and LUCENE-5318) and for analytical search.  
 Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery.
 Most of the documentation is in the javadoc for SpanQueryParser.
 Any and all feedback is welcome.  Thank you.






[jira] [Comment Edited] (LUCENE-5205) [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to classic QueryParser

2014-03-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923915#comment-13923915
 ] 

Tim Allison edited comment on LUCENE-5205 at 3/7/14 2:36 PM:
-

The root of this problem is that SpanNearQuery has no good way to handle 
stopwords in a way analogous to PhraseQuery.

In SpanQueryParser, this limitation should be well described in the javadocs to 
SpanQueryParser and in the test cases.  Let me know if it isn't.  You have the 
option of throwing an exception when a stopword is found to notify the user 
about stopwords, but that's exceedingly unsatisfactory.

Without digging into the internals of SpanNearQuery, we can still do better on 
this.  One proposal is to do what the basic highlighter does and risk false 
positives...behind the scenes modify "calculator for evaluating" to "calculator 
evaluating"~1.  This would then falsely match "calculator zebra evaluating".  
PhraseQuery can have false positives, too, but it guarantees that the false hit 
has to be a stop word.  This solution would not do that.  So, is this better 
than no matches at all?



was (Author: talli...@mitre.org):
The root of this problem is that SpanNearIQuery has no good way to handle 
stopwords in a way analagous to PhraseQuery.

In SpanQueryParser, this limitation should be well described in the javadocs to 
SpanQueryParser and in the test cases.  Let me know if it isn't.  You have the 
option of throwing an exception when a stopword is found to notify the user 
about stopwords, but that's exceedingly unsatisfactory.

Without digging into the internals of SpanNearQuery, we can still do better on 
this.  One proposal is to do what the basic highlighter does and risk false 
positives...behind the scenes modify calculator for evaluating to calculator 
evaluating~1.  This would then falsely match calculator zebra evaluating.  
PhraseQuery can have false positives, too, but it guarantees that the false hit 
has to be a stop word.  This solution would not do that.  So, is this better 
than no matches at all?


 [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to 
 classic QueryParser
 ---

 Key: LUCENE-5205
 URL: https://issues.apache.org/jira/browse/LUCENE-5205
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/queryparser
Reporter: Tim Allison
  Labels: patch
 Fix For: 4.7

 Attachments: LUCENE-5205-cleanup-tests.patch, 
 LUCENE-5205-date-pkg-prvt.patch, LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, 
 LUCENE-5205_dateTestReInitPkgPrvt.patch, LUCENE-5205_smallTestMods.patch, 
 LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt


 This parser extends QueryParserBase and includes functionality from:
 * Classic QueryParser: most of its syntax
 * SurroundQueryParser: recursive parsing for near and not clauses.
 * ComplexPhraseQueryParser: can handle near queries that include multiterms 
 (wildcard, fuzzy, regex, prefix),
 * AnalyzingQueryParser: has an option to analyze multiterms.
 At a high level, there's a first pass BooleanQuery/field parser and then a 
 span query parser handles all terminal nodes and phrases.
 Same as classic syntax:
 * term: test 
 * fuzzy: roam~0.8, roam~2
 * wildcard: te?t, test*, t*st
 * regex: /\[mb\]oat/
 * phrase: "jakarta apache"
 * phrase with slop: "jakarta apache"~3
 * default "or" clause: jakarta apache
 * grouping "or" clause: (jakarta apache)
 * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta
 * multiple fields: title:lucene author:hatcher
  
 Main additions in SpanQueryParser syntax vs. classic syntax:
 * Can require in order for phrases with slop with the \~ operator: 
 jakarta apache\~3
 * Can specify not near: fever bieber!\~3,10 ::
 find fever but not if bieber appears within 3 words before or 10 
 words after it.
 * Fully recursive phrasal queries with \[ and \]; as in: \[\[jakarta 
 apache\]~3 lucene\]\~4 :: 
 find jakarta within 3 words of apache, and that hit has to be within 
 four words before lucene
 * Can also use \[\] for single level phrasal queries instead of "" as in: 
 \[jakarta apache\]
 * Can use or grouping clauses in phrasal queries: apache (lucene solr)\~3 
 :: find apache and then either lucene or solr within three words.
 * Can use multiterms in phrasal queries: jakarta\~1 ap*che\~2
 * Did I mention full recursion: \[\[jakarta\~1 ap*che\]\~2 (solr~ 
 /l\[ou\]\+\[cs\]\[en\]\+/)]\~10 :: Find something like jakarta within two 
 words of ap*che and that hit has to be within ten words of something like 
 solr or that lucene regex.
 * Can require at least x number of hits at boolean level: apache AND (lucene 
 solr tika)~2
 * Can use negative only query: -jakarta :: Find all docs 

[jira] [Commented] (SOLR-5818) distrib search with custom comparator does not quite work correctly

2014-03-07 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923919#comment-13923919
 ] 

Mark Miller commented on SOLR-5818:
---

+1 - LGTM.

 distrib search with custom comparator does not quite work correctly
 ---

 Key: SOLR-5818
 URL: https://issues.apache.org/jira/browse/SOLR-5818
 Project: Solr
  Issue Type: Bug
Reporter: Ryan Ernst
 Attachments: SOLR-5818.patch


 In QueryComponent.doFieldSortValues, a scorer is never set on a custom 
 comparator.  We just need to add a fake scorer that can pass through the 
 score from the DocList.






[jira] [Commented] (SOLR-5730) make Lucene's SortingMergePolicy and EarlyTerminatingSortingCollector configurable in Solr

2014-03-07 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923920#comment-13923920
 ] 

Robert Muir commented on SOLR-5730:
---

Hello, one thing that might simplify some of the TODOs is that we changed the 
SortingMergePolicy API in LUCENE-5493 to just take a Sort.

This means you can have multiple fields, they don't have to be numeric 
docvalues, and so on. So I think this can simplify the configuration of this 
thing too, e.g. you could just take a standard sort spec string and parse it 
with QueryParsing.getSort or whatever (some refactoring might be needed here). 

It would be good though, to check that Sort.needsScores() == false, as that 
makes no sense at index-time... I'll open an issue to add this check to 
SortingMergePolicy itself in lucene.

The other difference is, EarlyTerminatingSortingCollector now also takes a 
Sort, except really you should just pass the Sort being used for the Query (it 
does the proper checking against the segments to see if the segment was sorted 
in a compatible way, and if so, will optimize with early termination). Today 
this just checks that they are exactly equal, but in the future it can be 
smarter (LUCENE-5499).

Hopefully this makes the integration easier.

 make Lucene's SortingMergePolicy and EarlyTerminatingSortingCollector 
 configurable in Solr
 --

 Key: SOLR-5730
 URL: https://issues.apache.org/jira/browse/SOLR-5730
 Project: Solr
  Issue Type: New Feature
Reporter: Christine Poerschke
Priority: Minor

 Example configuration:
 solrconfig.xml
 {noformat}
 <mergeSorter class="org.apache.solr.update.DefaultMergeSorterFactory"/>
 {noformat}
 schema.xml
 {noformat}
 <mergeSorterKey class="org.apache.solr.schema.SingleFieldSorterFactory">
   <str name="fieldName">timestamp</str>
   <bool name="ascending">false</bool>
 </mergeSorterKey>
 {noformat}






[jira] [Created] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score

2014-03-07 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-5500:
---

 Summary: SortingMergePolicy should error if the Sort refers to the 
score
 Key: LUCENE-5500
 URL: https://issues.apache.org/jira/browse/LUCENE-5500
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


It should throw an exception if Sort.needsScores() == true. This does not make 
sense at index-time.

I think there is no reason for this method to be package-private either (as its 
just useful sugar, it loops over each SortField and checks needsScores).






[jira] [Commented] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score

2014-03-07 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923924#comment-13923924
 ] 

Robert Muir commented on LUCENE-5500:
-

Note that you will get an exception today, but not until the actual merge. The 
idea here is to fail when you configure the merge policy on the IndexWriter.

 SortingMergePolicy should error if the Sort refers to the score
 ---

 Key: LUCENE-5500
 URL: https://issues.apache.org/jira/browse/LUCENE-5500
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir

 It should throw an exception if Sort.needsScores() == true. This does not 
 make sense at index-time.
 I think there is no reason for this method to be package-private either (as 
 it's just useful sugar: it loops over each SortField and checks needsScores).






[jira] [Updated] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score

2014-03-07 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5500:


Attachment: LUCENE-5500.patch

Simple patch. I added tests for both the merge policy and the FilterReader (as 
it's public today).

 SortingMergePolicy should error if the Sort refers to the score
 ---

 Key: LUCENE-5500
 URL: https://issues.apache.org/jira/browse/LUCENE-5500
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-5500.patch


 It should throw an exception if Sort.needsScores() == true. This does not 
 make sense at index-time.
 I think there is no reason for this method to be package-private either (as 
 it's just useful sugar: it loops over each SortField and checks needsScores).






[jira] [Commented] (SOLR-5823) Add utility function for internal code to know if it is currently the overseer

2014-03-07 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923934#comment-13923934
 ] 

Mark Miller commented on SOLR-5823:
---

Cool. Couple comments:


bq. Looks like the ZooKeeper folks are planning to introduce a Path class to 
help with parsing in 3.5 

Seems we should pull it out into its own method or static utility in the 
meantime?

I'd also add a warning about using it to the javadoc - most of this type of 
info is essentially free because it's in the cluster state, but this calls ZK 
and we try and do that sparingly.

I'm still not sure why this can't just be a thread in the Overseer class, 
though, avoiding this call altogether. That would already fail over as you 
need, right?


 Add utility function for internal code to know if it is currently the overseer
 --

 Key: SOLR-5823
 URL: https://issues.apache.org/jira/browse/SOLR-5823
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
 Attachments: SOLR-5823.patch


 It would be useful if there was some Overseer equivalent to 
 CloudDescriptor.isLeader() that plugins running in Solr could use to know "At 
 this moment, am I the leader?" 






[jira] [Commented] (SOLR-5477) Async execution of OverseerCollectionProcessor tasks

2014-03-07 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923942#comment-13923942
 ] 

Mark Miller commented on SOLR-5477:
---

bq. SolrJ calls you mean methods like 
CollectionAdminRequest.createCollection().

Right - it can def come in a second issue, but it seems like just at least 
adding the async param is pretty low hanging fruit. 

 Async execution of OverseerCollectionProcessor tasks
 

 Key: SOLR-5477
 URL: https://issues.apache.org/jira/browse/SOLR-5477
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Anshum Gupta
 Attachments: SOLR-5477-CoreAdminStatus.patch, 
 SOLR-5477-updated.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, 
 SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, 
 SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, 
 SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, 
 SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, 
 SOLR-5477.patch, SOLR-5477.patch


 Typical collection admin commands are long running and it is very common to 
 have the requests get timed out. It is more of a problem if the cluster is 
 very large. Add an option to run these commands asynchronously:
 add an extra param async=true for all collection commands
 the task is written to ZK and the caller is returned a task id. 
 A separate collection admin command will be added to poll the status of the 
 task
 command=status&id=7657668909
 if id is not passed all running async tasks should be listed
 A separate queue is created to store in-process tasks. After the tasks are 
 completed the queue entry is removed. OverseerCollectionProcessor will perform 
 these tasks in multiple threads
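
The submit-then-poll flow in this description can be sketched in plain Java. This is only an in-memory model under stated assumptions: AsyncTaskRegistry and its methods are hypothetical names, the maps stand in for the ZK-backed queue, and nothing here is Solr's actual API.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class AsyncTaskRegistry {
    private final AtomicLong nextId = new AtomicLong(1);
    // Stands in for the queue of in-process tasks (assumption: in-memory only).
    private final Map<Long, String> running = new ConcurrentHashMap<>();
    private final Set<Long> completed = ConcurrentHashMap.newKeySet();
    // Several worker threads, so long-running commands do not block each other.
    private final ExecutorService workers = Executors.newFixedThreadPool(4);

    /** Records the task, hands it to a worker, and returns its id immediately. */
    long submit(String command, Runnable work) {
        long id = nextId.getAndIncrement();
        running.put(id, command);
        workers.execute(() -> {
            try {
                work.run();
            } finally {
                // Mark completed before removing the queue entry, so a poller
                // never sees the task in neither place.
                completed.add(id);
                running.remove(id);
            }
        });
        return id;
    }

    /** Polls one task by id. */
    String status(long id) {
        if (running.containsKey(id)) {
            return "running";
        }
        return completed.contains(id) ? "completed" : "notfound";
    }

    /** Lists all running tasks, for the case where no id is passed. */
    Set<Long> runningIds() {
        return new TreeSet<>(running.keySet());
    }

    /** Stops accepting tasks and waits briefly for the workers to drain. */
    boolean shutdownAndWait() {
        workers.shutdown();
        try {
            return workers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The caller gets the id back before the work finishes, which is what avoids the request timeout; a failed task is simply reported as completed in this sketch.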






[jira] [Commented] (SOLR-5477) Async execution of OverseerCollectionProcessor tasks

2014-03-07 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923960#comment-13923960
 ] 

Anshum Gupta commented on SOLR-5477:


[~markrmil...@gmail.com] Sure, I'll add that and put up another patch. It's 
just that I wanted to get it into trunk sooner rather than later, considering 
that the patch touches a reasonable number of points in the code, which makes 
it tricky to forward port every time.

 Async execution of OverseerCollectionProcessor tasks
 

 Key: SOLR-5477
 URL: https://issues.apache.org/jira/browse/SOLR-5477
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Anshum Gupta
 Attachments: SOLR-5477-CoreAdminStatus.patch, 
 SOLR-5477-updated.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, 
 SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, 
 SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, 
 SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, 
 SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, SOLR-5477.patch, 
 SOLR-5477.patch, SOLR-5477.patch


 Typical collection admin commands are long running and it is very common to 
 have the requests get timed out. It is more of a problem if the cluster is 
 very large. Add an option to run these commands asynchronously:
 add an extra param async=true for all collection commands
 the task is written to ZK and the caller is returned a task id. 
 A separate collection admin command will be added to poll the status of the 
 task
 command=status&id=7657668909
 if id is not passed all running async tasks should be listed
 A separate queue is created to store in-process tasks. After the tasks are 
 completed the queue entry is removed. OverseerCollectionProcessor will perform 
 these tasks in multiple threads






[jira] [Commented] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score

2014-03-07 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923990#comment-13923990
 ] 

Adrien Grand commented on LUCENE-5500:
--

+1

 SortingMergePolicy should error if the Sort refers to the score
 ---

 Key: LUCENE-5500
 URL: https://issues.apache.org/jira/browse/LUCENE-5500
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-5500.patch


 It should throw an exception if Sort.needsScores() == true. This does not 
 make sense at index-time.
 I think there is no reason for this method to be package-private either (as 
 it's just useful sugar: it loops over each SortField and checks needsScores).






[jira] [Updated] (LUCENE-5491) Flexible StandardQueryParser fails on boost field

2014-03-07 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

André  updated LUCENE-5491:
---

Fix Version/s: 4.8

 Flexible StandardQueryParser fails on boost field
 -

 Key: LUCENE-5491
 URL: https://issues.apache.org/jira/browse/LUCENE-5491
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 4.6, 4.7
Reporter: André 
 Fix For: 4.8


 The following exception 
 {noformat}
 java.lang.IllegalArgumentException: field name should not be null!
   at 
 org.apache.lucene.queryparser.flexible.core.config.FieldConfig.<init>(FieldConfig.java:36)
   at 
 org.apache.lucene.queryparser.flexible.core.config.QueryConfigHandler.getFieldConfig(QueryConfigHandler.java:59)
   at 
 org.apache.lucene.queryparser.flexible.standard.processors.BoostQueryNodeProcessor.postProcessNode(BoostQueryNodeProcessor.java:54)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:99)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processChildren(QueryNodeProcessorImpl.java:125)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:97)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.process(QueryNodeProcessorImpl.java:90)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorPipeline.process(QueryNodeProcessorPipeline.java:90)
   at 
 org.apache.lucene.queryparser.flexible.core.QueryParserHelper.parse(QueryParserHelper.java:255)
   at 
 org.apache.lucene.queryparser.flexible.standard.StandardQueryParser.parse(StandardQueryParser.java:168)
 {noformat}
 is caused by boosting a tokenizable phrase field within a group.
 {code:java}
 public void testFail() throws Exception {
   test("(mimeType:\"text-html\")");
 }
 public void testOkay() throws Exception {
   test("mimeType:\"text-html\"");
 }
 static void test(String qs) throws Exception {
   Analyzer sa = new StandardAnalyzer(Version.LUCENE_46);
   StandardQueryParser qp = new StandardQueryParser(sa);
   qp.getFieldsBoost().put("mimeType", 1f);
   qp.parse(qs, "content");
 }
 {code}






[jira] [Updated] (LUCENE-5491) NPE in Flexible StandardQueryParser on boosting

2014-03-07 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

André  updated LUCENE-5491:
---

Summary: NPE in Flexible StandardQueryParser on boosting  (was: NPE in 
Flexible StandardQueryParser when boosting)

 NPE in Flexible StandardQueryParser on boosting
 ---

 Key: LUCENE-5491
 URL: https://issues.apache.org/jira/browse/LUCENE-5491
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 4.6, 4.7
Reporter: André 
 Fix For: 4.8


 The following exception 
 {noformat}
 java.lang.IllegalArgumentException: field name should not be null!
   at 
 org.apache.lucene.queryparser.flexible.core.config.FieldConfig.<init>(FieldConfig.java:36)
   at 
 org.apache.lucene.queryparser.flexible.core.config.QueryConfigHandler.getFieldConfig(QueryConfigHandler.java:59)
   at 
 org.apache.lucene.queryparser.flexible.standard.processors.BoostQueryNodeProcessor.postProcessNode(BoostQueryNodeProcessor.java:54)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:99)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processChildren(QueryNodeProcessorImpl.java:125)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:97)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.process(QueryNodeProcessorImpl.java:90)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorPipeline.process(QueryNodeProcessorPipeline.java:90)
   at 
 org.apache.lucene.queryparser.flexible.core.QueryParserHelper.parse(QueryParserHelper.java:255)
   at 
 org.apache.lucene.queryparser.flexible.standard.StandardQueryParser.parse(StandardQueryParser.java:168)
 {noformat}
 is caused by boosting a tokenizable phrase field within a group.
 {code:java}
 public void testFail() throws Exception {
   test("(mimeType:\"text-html\")");
 }
 public void testOkay() throws Exception {
   test("mimeType:\"text-html\"");
 }
 static void test(String qs) throws Exception {
   Analyzer sa = new StandardAnalyzer(Version.LUCENE_46);
   StandardQueryParser qp = new StandardQueryParser(sa);
   qp.getFieldsBoost().put("mimeType", 1f);
   qp.parse(qs, "content");
 }
 {code}






[jira] [Updated] (LUCENE-5491) NPE in Flexible StandardQueryParser when boosting

2014-03-07 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/LUCENE-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

André  updated LUCENE-5491:
---

Summary: NPE in Flexible StandardQueryParser when boosting  (was: Flexible 
StandardQueryParser fails on boost field)

 NPE in Flexible StandardQueryParser when boosting
 -

 Key: LUCENE-5491
 URL: https://issues.apache.org/jira/browse/LUCENE-5491
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 4.6, 4.7
Reporter: André 
 Fix For: 4.8


 The following exception 
 {noformat}
 java.lang.IllegalArgumentException: field name should not be null!
   at 
 org.apache.lucene.queryparser.flexible.core.config.FieldConfig.<init>(FieldConfig.java:36)
   at 
 org.apache.lucene.queryparser.flexible.core.config.QueryConfigHandler.getFieldConfig(QueryConfigHandler.java:59)
   at 
 org.apache.lucene.queryparser.flexible.standard.processors.BoostQueryNodeProcessor.postProcessNode(BoostQueryNodeProcessor.java:54)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:99)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processChildren(QueryNodeProcessorImpl.java:125)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:97)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.process(QueryNodeProcessorImpl.java:90)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorPipeline.process(QueryNodeProcessorPipeline.java:90)
   at 
 org.apache.lucene.queryparser.flexible.core.QueryParserHelper.parse(QueryParserHelper.java:255)
   at 
 org.apache.lucene.queryparser.flexible.standard.StandardQueryParser.parse(StandardQueryParser.java:168)
 {noformat}
 is caused by boosting a tokenizable phrase field within a group.
 {code:java}
 public void testFail() throws Exception {
   test("(mimeType:\"text-html\")");
 }
 public void testOkay() throws Exception {
   test("mimeType:\"text-html\"");
 }
 static void test(String qs) throws Exception {
   Analyzer sa = new StandardAnalyzer(Version.LUCENE_46);
   StandardQueryParser qp = new StandardQueryParser(sa);
   qp.getFieldsBoost().put("mimeType", 1f);
   qp.parse(qs, "content");
 }
 {code}






[jira] [Commented] (LUCENE-5491) NPE in Flexible StandardQueryParser on boosting

2014-03-07 Thread JIRA

[ 
https://issues.apache.org/jira/browse/LUCENE-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923999#comment-13923999
 ] 

André  commented on LUCENE-5491:


@[~adriano_crestani] I added a null check and it works fine.

 NPE in Flexible StandardQueryParser on boosting
 ---

 Key: LUCENE-5491
 URL: https://issues.apache.org/jira/browse/LUCENE-5491
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/queryparser
Affects Versions: 4.6, 4.7
Reporter: André 
 Fix For: 4.8


 The following exception 
 {noformat}
 java.lang.IllegalArgumentException: field name should not be null!
   at 
 org.apache.lucene.queryparser.flexible.core.config.FieldConfig.<init>(FieldConfig.java:36)
   at 
 org.apache.lucene.queryparser.flexible.core.config.QueryConfigHandler.getFieldConfig(QueryConfigHandler.java:59)
   at 
 org.apache.lucene.queryparser.flexible.standard.processors.BoostQueryNodeProcessor.postProcessNode(BoostQueryNodeProcessor.java:54)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:99)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processChildren(QueryNodeProcessorImpl.java:125)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.processIteration(QueryNodeProcessorImpl.java:97)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorImpl.process(QueryNodeProcessorImpl.java:90)
   at 
 org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessorPipeline.process(QueryNodeProcessorPipeline.java:90)
   at 
 org.apache.lucene.queryparser.flexible.core.QueryParserHelper.parse(QueryParserHelper.java:255)
   at 
 org.apache.lucene.queryparser.flexible.standard.StandardQueryParser.parse(StandardQueryParser.java:168)
 {noformat}
 is caused by boosting a tokenizable phrase field within a group.
 {code:java}
 public void testFail() throws Exception {
   test("(mimeType:\"text-html\")");
 }
 public void testOkay() throws Exception {
   test("mimeType:\"text-html\"");
 }
 static void test(String qs) throws Exception {
   Analyzer sa = new StandardAnalyzer(Version.LUCENE_46);
   StandardQueryParser qp = new StandardQueryParser(sa);
   qp.getFieldsBoost().put("mimeType", 1f);
   qp.parse(qs, "content");
 }
 {code}






[jira] [Commented] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score

2014-03-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923998#comment-13923998
 ] 

ASF subversion and git services commented on LUCENE-5500:
-

Commit 1575306 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1575306 ]

LUCENE-5500: SortingMergePolicy should error if the Sort refers to the score

 SortingMergePolicy should error if the Sort refers to the score
 ---

 Key: LUCENE-5500
 URL: https://issues.apache.org/jira/browse/LUCENE-5500
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-5500.patch


 It should throw an exception if Sort.needsScores() == true. This does not 
 make sense at index-time.
 I think there is no reason for this method to be package-private either (as 
 it's just useful sugar: it loops over each SortField and checks needsScores).






[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9600 - Still Failing!

2014-03-07 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9600/
Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseG1GC

All tests passed

Build Log:
[...truncated 42217 lines...]
  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] warning: [options] bootstrap class path not set in conjunction with 
-source 1.6
  [javadoc] Loading source files for package org.apache.lucene...
  [javadoc] Loading source files for package org.apache.lucene.analysis...
  [javadoc] Loading source files for package 
org.apache.lucene.analysis.tokenattributes...
  [javadoc] Loading source files for package org.apache.lucene.codecs...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.compressing...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.lucene3x...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.lucene40...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.lucene41...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.lucene42...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.lucene45...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.lucene46...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.perfield...
  [javadoc] Loading source files for package org.apache.lucene.document...
  [javadoc] Loading source files for package org.apache.lucene.index...
  [javadoc] Loading source files for package org.apache.lucene.search...
  [javadoc] Loading source files for package 
org.apache.lucene.search.payloads...
  [javadoc] Loading source files for package 
org.apache.lucene.search.similarities...
  [javadoc] Loading source files for package org.apache.lucene.search.spans...
  [javadoc] Loading source files for package org.apache.lucene.store...
  [javadoc] Loading source files for package org.apache.lucene.util...
  [javadoc] Loading source files for package org.apache.lucene.util.automaton...
  [javadoc] Loading source files for package org.apache.lucene.util.fst...
  [javadoc] Loading source files for package org.apache.lucene.util.mutable...
  [javadoc] Loading source files for package org.apache.lucene.util.packed...
  [javadoc] Constructing Javadoc information...
  [javadoc] Standard Doclet version 1.7.0_60-ea
  [javadoc] Building tree for all the packages and classes...
  [javadoc] Generating 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/package-summary.html...
  [javadoc] Copying file 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-2.png
 to directory 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files...
  [javadoc] Copying file 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-1.png
 to directory 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files...
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...
  [javadoc] Generating 
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/help-doc.html...
  [javadoc] 1 warning

[...truncated 27 lines...]
  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] warning: [options] bootstrap class path not set in conjunction with 
-source 1.6
  [javadoc] Loading source files for package org.apache.lucene.analysis.ar...
  [javadoc] Loading source files for package org.apache.lucene.analysis.bg...
  [javadoc] Loading source files for package org.apache.lucene.analysis.br...
  [javadoc] Loading source files for package org.apache.lucene.analysis.ca...
  [javadoc] Loading source files for package 
org.apache.lucene.analysis.charfilter...
  [javadoc] Loading source files for package org.apache.lucene.analysis.cjk...
  [javadoc] Loading source files for package org.apache.lucene.analysis.ckb...
  [javadoc] Loading source files for package org.apache.lucene.analysis.cn...
  [javadoc] Loading source files for package 
org.apache.lucene.analysis.commongrams...
  [javadoc] Loading source files for package 
org.apache.lucene.analysis.compound...
  [javadoc] Loading source files for package 
org.apache.lucene.analysis.compound.hyphenation...
  [javadoc] Loading source files for package org.apache.lucene.analysis.core...
  [javadoc] Loading source files for package org.apache.lucene.analysis.cz...
  [javadoc] Loading source files for package org.apache.lucene.analysis.da...
  [javadoc] Loading source files for package org.apache.lucene.analysis.de...
  [javadoc] Loading source files for package org.apache.lucene.analysis.el...
  [javadoc] Loading source files for package org.apache.lucene.analysis.en...
  [javadoc] Loading source files for package org.apache.lucene.analysis.es...
  [javadoc] Loading 

[jira] [Resolved] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score

2014-03-07 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-5500.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.8

 SortingMergePolicy should error if the Sort refers to the score
 ---

 Key: LUCENE-5500
 URL: https://issues.apache.org/jira/browse/LUCENE-5500
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Fix For: 4.8, 5.0

 Attachments: LUCENE-5500.patch


 It should throw an exception if Sort.needsScores() == true. This does not 
 make sense at index-time.
 I think there is no reason for this method to be package-private either (as 
 it's just useful sugar: it loops over each SortField and checks needsScores).






[jira] [Commented] (LUCENE-5500) SortingMergePolicy should error if the Sort refers to the score

2014-03-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924000#comment-13924000
 ] 

ASF subversion and git services commented on LUCENE-5500:
-

Commit 1575307 from [~rcmuir] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1575307 ]

LUCENE-5500: SortingMergePolicy should error if the Sort refers to the score

 SortingMergePolicy should error if the Sort refers to the score
 ---

 Key: LUCENE-5500
 URL: https://issues.apache.org/jira/browse/LUCENE-5500
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Fix For: 4.8, 5.0

 Attachments: LUCENE-5500.patch


 It should throw an exception if Sort.needsScores() == true. This does not 
 make sense at index-time.
 I think there is no reason for this method to be package-private either (as 
 it's just useful sugar: it loops over each SortField and checks needsScores).






[jira] [Created] (LUCENE-5501) Out-of-order collection testing

2014-03-07 Thread Adrien Grand (JIRA)
Adrien Grand created LUCENE-5501:


 Summary: Out-of-order collection testing
 Key: LUCENE-5501
 URL: https://issues.apache.org/jira/browse/LUCENE-5501
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand


Collectors have the ability to declare whether or not they support out-of-order 
collection, but since most scorers score in order this is not well tested.
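
One way such testing could work, as a rough sketch: a helper that deliberately shuffles doc ids whenever a collector claims to support out-of-order collection. DocCollector, feed and CountingCollector are hypothetical stand-ins for illustration, not Lucene's Collector API or the approach this issue ends up taking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class OutOfOrderCheck {
    /** Hypothetical stand-in for a collector; only the two methods needed here. */
    interface DocCollector {
        boolean acceptsDocsOutOfOrder();
        void collect(int doc);
    }

    /** Feeds doc ids in shuffled order when the collector claims to support it. */
    static void feed(DocCollector collector, List<Integer> docsInOrder, Random random) {
        List<Integer> docs = new ArrayList<>(docsInOrder);
        if (collector.acceptsDocsOutOfOrder()) {
            Collections.shuffle(docs, random);
        }
        for (int doc : docs) {
            collector.collect(doc);
        }
    }

    /** Counts hits and tracks the largest doc id; neither depends on order. */
    static final class CountingCollector implements DocCollector {
        int count = 0;
        int maxDoc = -1;

        @Override
        public boolean acceptsDocsOutOfOrder() {
            return true;
        }

        @Override
        public void collect(int doc) {
            count++;
            maxDoc = Math.max(maxDoc, doc);
        }
    }
}
```

A collector that wrongly declares out-of-order support would then produce different results from an in-order run, which is the kind of bug that in-order scorers never expose.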






[jira] [Created] (SOLR-5829) Add tag/exclude functionality to the ExpandComponent

2014-03-07 Thread Joel Bernstein (JIRA)
Joel Bernstein created SOLR-5829:


 Summary: Add tag/exclude functionality to the ExpandComponent
 Key: SOLR-5829
 URL: https://issues.apache.org/jira/browse/SOLR-5829
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other
Reporter: Joel Bernstein
 Fix For: 4.8


Adding tag/exclude functionality to the ExpandComponent would allow it to 
operate independently of the CollapsingQParserPlugin. For example:

q=*:*&fq={!tag=parent}type=parent&expand=true&expand.field=group_id&expand.exclude=parent

The query above searches all documents limiting the results to type=parent. So 
the main result would contain only parent documents.

The expand component then excludes the type=parent filter and expands the 
groups based on the group_id field. 

Using this approach the main search result will contain only documents with 
type=parent and the expanded results will display the child documents for the 
group.
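
The tag/exclude idea above can be modelled in a few lines of plain Java. This is a sketch under stated assumptions: documents are simple string maps, and TaggedFilter, search and expand are hypothetical names, not the ExpandComponent's real classes. It also assumes the expanded groups leave out documents already in the main result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class TagExcludeSketch {
    /** A filter with an optional tag, modelling fq={!tag=parent}... */
    static final class TaggedFilter {
        final String tag; // null when the filter is untagged
        final Predicate<Map<String, String>> accepts;

        TaggedFilter(String tag, Predicate<Map<String, String>> accepts) {
            this.tag = tag;
            this.accepts = accepts;
        }
    }

    /** Builds a toy document with the three fields used in this example. */
    static Map<String, String> doc(String id, String type, String groupId) {
        Map<String, String> d = new HashMap<>();
        d.put("id", id);
        d.put("type", type);
        d.put("group_id", groupId);
        return d;
    }

    /** Applies every filter except those carrying excludeTag (null excludes none). */
    static List<Map<String, String>> search(List<Map<String, String>> docs,
                                            List<TaggedFilter> filters,
                                            String excludeTag) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> d : docs) {
            boolean keep = true;
            for (TaggedFilter f : filters) {
                if (excludeTag != null && excludeTag.equals(f.tag)) {
                    continue; // this filter is excluded for the expand pass
                }
                if (!f.accepts.test(d)) {
                    keep = false;
                    break;
                }
            }
            if (keep) {
                hits.add(d);
            }
        }
        return hits;
    }

    /** Groups the widened result by field, keyed by the groups seen in the main result. */
    static Map<String, List<Map<String, String>>> expand(List<Map<String, String>> docs,
                                                         List<TaggedFilter> filters,
                                                         String excludeTag,
                                                         String field) {
        List<Map<String, String>> main = search(docs, filters, null);
        List<Map<String, String>> widened = search(docs, filters, excludeTag);
        Map<String, List<Map<String, String>>> groups = new LinkedHashMap<>();
        for (Map<String, String> m : main) {
            groups.put(m.get(field), new ArrayList<>());
        }
        for (Map<String, String> d : widened) {
            List<Map<String, String>> group = groups.get(d.get(field));
            if (group != null && !main.contains(d)) {
                group.add(d);
            }
        }
        return groups;
    }
}
```

With one filter tagged "parent", the main search returns only parent documents, while expand(docs, filters, "parent", "group_id") returns the child documents for each parent's group.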

 






[jira] [Updated] (SOLR-5829) Add tag/exclude functionality to the ExpandComponent

2014-03-07 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5829:
-

Description: 
Adding tag/exclude functionality to the ExpandComponent would allow it to 
operate independently of the CollapsingQParserPlugin. For example:

{code}
q=*:*&fq={!tag=parent}type=parent&expand=true&expand.field=group_id&expand.exclude=parent
{code}
The query above searches all documents limiting the results to type=parent. So 
the main result would contain only parent documents.

The expand component then excludes the type=parent filter and expands the 
groups based on the group_id field. 

Using this approach the main search result will contain only documents with 
type=parent and the expanded results will display the child documents for the 
group.

 

  was:
Adding tag/exclude functionality to the ExpandComponent would allow it to 
operate independently of the CollapsingQParserPlugin. For example:

q=*:*&fq={!tag=parent}type=parent&expand=true&expand.field=group_id&expand.exclude=parent

The query above searches all documents limiting the results to type=parent. So 
the main result would contain only parent documents.

The expand component then excludes the type=parent filter and expands the 
groups based on the group_id field. 

Using this approach the main search result will contain only documents with 
type=parent and the expanded results will display the child documents for the 
group.

 


 Add tag/exclude functionality to the ExpandComponent
 

 Key: SOLR-5829
 URL: https://issues.apache.org/jira/browse/SOLR-5829
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other
Reporter: Joel Bernstein
 Fix For: 4.8


 Adding tag/exclude functionality to the ExpandComponent would allow it to 
 operate independently of the CollapsingQParserPlugin. For example:
 {code}
 q=*:*&fq={!tag=parent}type=parent&expand=true&expand.field=group_id&expand.exclude=parent
 {code}
 The query above searches all documents limiting the results to type=parent. 
 So the main result would contain only parent documents.
 The expand component then excludes the type=parent filter and expands the 
 groups based on the group_id field. 
 Using this approach the main search result will contain only documents with 
 type=parent and the expanded results will display the child documents for the 
 group.
  






[jira] [Updated] (SOLR-5829) Add tag/exclude functionality to the ExpandComponent

2014-03-07 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5829:
-

Attachment: SOLR-5829.patch

Initial patch leverages the existing tag/exclude framework for tag/exclude 
faceting. Runs but needs tests.

 Add tag/exclude functionality to the ExpandComponent
 

 Key: SOLR-5829
 URL: https://issues.apache.org/jira/browse/SOLR-5829
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other
Reporter: Joel Bernstein
 Fix For: 4.8

 Attachments: SOLR-5829.patch


 Adding tag/exclude functionality to the ExpandComponent would allow it to 
 operate independently of the CollapsingQParserPlugin. For example:
 {code}
 q=*:*&fq={!tag=parent}type=parent&expand=true&expand.field=group_id&expand.exclude=parent
 {code}
 The query above searches all documents limiting the results to type=parent. 
 So the main result would contain only parent documents.
 The expand component then excludes the type=parent filter and expands the 
 groups based on the group_id field. 
 Using this approach the main search result will contain only documents with 
 type=parent and the expanded results will display the child documents for the 
 group.
  






[jira] [Updated] (LUCENE-5501) Out-of-order collection testing

2014-03-07 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5501:
-

Attachment: LUCENE-5501.patch

Here is a simple patch I've been playing with:
 - AssertingWeight.scoresDocsOutOfOrder randomly returns true in order to 
trigger the use of our top docs collectors that tie-break on doc id,
 - AssertingScorer randomly scores in random order when the collector says it 
supports it

It found a bug in the grouping collector whose acceptDocsOutOfOrder method 
returns true although the collect method has a comment that explicitly says 
that the comparison works because doc IDs come in order.
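
To make the kind of bug described above concrete, here is a minimal sketch of a "best hit" tracker with two collect variants. It is not Lucene's collector code; the class and method names are hypothetical. The first variant silently relies on doc IDs arriving in increasing order, while the second tie-breaks on doc ID explicitly and so gives the same answer for any arrival order.

```java
// Hypothetical top-1 tracker, for illustration only.
final class TopHit {
    int doc = -1;
    float score = Float.NEGATIVE_INFINITY;

    /**
     * Only correct when doc IDs arrive in increasing order:
     * on a score tie the earlier (smaller) doc is kept implicitly.
     */
    void collectInOrder(int d, float s) {
        if (s > score) {
            doc = d;
            score = s;
        }
    }

    /** Safe for any arrival order: ties are broken explicitly on the smaller doc ID. */
    void collectOutOfOrder(int d, float s) {
        if (s > score || (s == score && d < doc)) {
            doc = d;
            score = s;
        }
    }
}
```

Feeding the hits (doc 5, score 1.0) and then (doc 2, score 1.0) to the first variant keeps doc 5, whereas the second variant ends up with doc 2, which is what in-order delivery would have produced.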

 Out-of-order collection testing
 ---

 Key: LUCENE-5501
 URL: https://issues.apache.org/jira/browse/LUCENE-5501
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
 Attachments: LUCENE-5501.patch


 Collectors have the ability to declare whether or not they support 
 out-of-order collection, but since most scorers score in order this is not 
 well tested.






[jira] [Commented] (LUCENE-5501) Out-of-order collection testing

2014-03-07 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924011#comment-13924011
 ] 

Robert Muir commented on LUCENE-5501:
-

Are you sure? I think it's OK overall, but of course it could be better:

From a line coverage perspective, most of these are at or very close to 100%

Looking at test contributions against each, all the collectors in TopScore* 
look like they get beat up pretty well.
TopField* is not so great, but several tests try to explicitly iterate 
over all of them (TestSearchAfter, TestExpressionSorts, etc). More tests for 
these sorting ones might be good, but I don't think the situation is so bad?

https://builds.apache.org/job/Lucene-Solr-Clover-trunk/clover/org/apache/lucene/search/TopScoreDocCollector.html#TopScoreDocCollector
https://builds.apache.org/job/Lucene-Solr-Clover-trunk/clover/org/apache/lucene/search/TopFieldCollector.html#TopFieldCollector


 Out-of-order collection testing
 ---

 Key: LUCENE-5501
 URL: https://issues.apache.org/jira/browse/LUCENE-5501
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
 Attachments: LUCENE-5501.patch


 Collectors have the ability to declare whether or not they support 
 out-of-order collection, but since most scorers score in order this is not 
 well tested.






[jira] [Commented] (LUCENE-5501) Out-of-order collection testing

2014-03-07 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924032#comment-13924032
 ] 

Adrien Grand commented on LUCENE-5501:
--

I was not thinking about these collectors at all; I think they are very well 
tested indeed! I was thinking more about exotic collectors, like those 
used for grouping or joins, which won't get out-of-order testing unless 
they use a boolean query.

 Out-of-order collection testing
 ---

 Key: LUCENE-5501
 URL: https://issues.apache.org/jira/browse/LUCENE-5501
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
 Attachments: LUCENE-5501.patch


 Collectors have the ability to declare whether or not they support 
 out-of-order collection, but since most scorers score in order this is not 
 well tested.






[jira] [Commented] (LUCENE-5422) Postings lists deduplication

2014-03-07 Thread Vishmi Money (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924036#comment-13924036
 ] 

Vishmi Money commented on LUCENE-5422:
--

[~otis], thank you.

[~mikemccand], yes, I agree with you. As you said, if merging postings lists 
adds a cost, it will hurt performance despite the server space saved. 
So we have to think about how to achieve the desired performance while still 
saving server space.
I will keep this in mind as I look further into this.

 Postings lists deduplication
 

 Key: LUCENE-5422
 URL: https://issues.apache.org/jira/browse/LUCENE-5422
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/codecs, core/index
Reporter: Dmitry Kan
  Labels: gsoc2014

 The context:
 http://markmail.org/thread/tywtrjjcfdbzww6f
 Robert Muir and I have discussed what Robert eventually named postings
 lists deduplication at Berlin Buzzwords 2013 conference.
 The idea is to allow multiple terms to point to the same postings list to
 save space. This can be achieved by new index codec implementation, but this 
 jira is open to other ideas as well.
 The application / impact of this is positive for synonyms, exact / inexact
 terms, leading wildcard support via storing reversed term etc.
 For example, at the moment, when supporting exact (unstemmed) and inexact 
 (stemmed)
 searches, we store both unstemmed and stemmed variant of a word form and
 that leads to index bloating. That is why we had to remove the leading
 wildcard support via reversing a token on index and query time because of
 the same index size considerations.
 Comment from Mike McCandless:
 Neat idea!
 Would this idea allow a single term to point to (the union of) N other
 posting lists?  It seems like that's necessary e.g. to handle the
 exact/inexact case.
 And then, to produce the Docs/AndPositionsEnum you'd need to do the
 merge sort across those N posting lists?
 Such a thing might also be do-able as runtime only wrapper around the
 postings API (FieldsProducer), if you could at runtime do the reverse
 expansion (e.g. stem -> all of its surface forms).
 Comment from Robert Muir:
 I think the exact/inexact is trickier (detecting it would be the hard
 part), and you are right, another solution might work better.
 but for the reverse wildcard and synonyms situation, it seems we could even
 detect it on write if we created some hash of the previous terms postings.
 if the hash matches for the current term, we know it might be a duplicate
 and would have to actually do the costly check they are the same.
 maybe there are better ways to do it, but it might be a fun postingformat
 experiment to try.
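
 The hash-then-verify idea in the comment above can be sketched roughly as follows. This is not a postings format; it keeps whole postings lists as int arrays in memory and compares against every list seen so far rather than only the previous term's, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration of hash-based postings deduplication.
final class PostingsDeduplicator {
    private final Map<Integer, List<int[]>> byHash = new HashMap<Integer, List<int[]>>();

    /**
     * Returns a previously seen array with identical contents if there is one,
     * otherwise registers and returns the given array.
     */
    int[] intern(int[] postings) {
        int hash = Arrays.hashCode(postings);
        List<int[]> candidates = byHash.get(hash);
        if (candidates == null) {
            candidates = new ArrayList<int[]>();
            byHash.put(hash, candidates);
        }
        for (int[] existing : candidates) {
            // A hash match only means "might be a duplicate";
            // the costly element-by-element check decides.
            if (Arrays.equals(existing, postings)) {
                return existing;
            }
        }
        candidates.add(postings);
        return postings;
    }
}
```

 Two terms whose interned postings are the same array instance could then point at a single stored list.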






[jira] [Commented] (LUCENE-5476) Facet sampling

2014-03-07 Thread Rob Audenaerde (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924037#comment-13924037
 ] 

Rob Audenaerde commented on LUCENE-5476:


{quote}
...Given our test framework, randomness is not a big deal at all, since once we 
get a test failure, we can deterministically reproduce the failure (when there 
is no multi-threading)...
{quote}
Ok, this makes sense to me. 

{quote}
It looks like it hasn't changed? I mean besides the rename. So if I set 
sampleSize=100K, it's 100K whether there are 101K docs or 100M docs, right? Is 
that your intention?
{quote}
Correct, it is my intention. I actually prefer not to increase the 
{{sampleSize}} with more hits, as bigger samples are slower and 100K is a nice 
sample size anyway and more hits means more time. I adjust the sampleRatio so 
that the resulting set of documents is (close to) the {{sampleSize}}.
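
A minimal sketch of that adjustment, assuming the ratio is simply the fixed sample size divided by the number of hits and capped at 1 (the attached patch may compute it differently; the names here are hypothetical):

```java
// Hypothetical helper, not taken from the attached patch.
final class SampleRatio {

    /** Ratio of hits to keep so that roughly sampleSize documents are sampled. */
    static double ratioFor(int sampleSize, int totalHits) {
        if (totalHits <= sampleSize) {
            return 1.0; // fewer hits than the sample size: keep everything
        }
        return (double) sampleSize / totalHits;
    }
}
```

With sampleSize=100K this gives a ratio just under 1 for 101K hits and 0.001 for 100M hits, so the sampled set stays near 100K in both cases.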

{quote}
I find this assert just redundant – if we always expect 5, we shouldn't assert 
that we received 5. If we say that very infrequently we might get 5 and we're 
OK with it .. what's the point of asserting that at all?
{quote}
Agreed on the 5. Asserting seems redundant, but isn't that the point of 
unit tests? The trick is that the assertion should still hold if you change the 
implementation. 

I will add more next week. 

Btw, is there an easy way to retrieve the total facet counts for an ordinal? 
When correcting facet counts it would be a quick win to limit the number of 
estimated documents to the actual number of documents in the index that match 
that facet. (And maybe use the distribution as well, to make better estimates.)
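
The capping idea could look roughly like this, assuming the total count for the ordinal is available from somewhere (which is exactly the open question above). All names are hypothetical.

```java
// Hypothetical helper; totalCountForOrdinal is assumed to be known.
final class FacetCountCorrection {

    /**
     * Scales a sampled count back up by the sample ratio, then caps the
     * estimate at the number of documents that actually carry the facet.
     */
    static int correct(int sampledCount, double sampleRatio, int totalCountForOrdinal) {
        int estimate = (int) Math.round(sampledCount / sampleRatio);
        return Math.min(estimate, totalCountForOrdinal);
    }
}
```

For example, a sampled count of 30 at ratio 0.001 extrapolates to about 30,000, but if only 25,000 documents in the index match the facet, the corrected count is 25,000.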

 Facet sampling
 --

 Key: LUCENE-5476
 URL: https://issues.apache.org/jira/browse/LUCENE-5476
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Rob Audenaerde
 Attachments: LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, 
 LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, LUCENE-5476.patch, 
 SamplingComparison_SamplingFacetsCollector.java, SamplingFacetsCollector.java


 With LUCENE-5339 facet sampling disappeared. 
 When trying to display facet counts on large datasets (10M documents) 
 counting facets is rather expensive, as all the hits are collected and 
 processed. 
 Sampling greatly reduced this and thus provided a nice speedup. Could it be 
 brought back?






[JENKINS] Lucene-Solr-Tests-4.x-Java6 - Build # 2330 - Still Failing

2014-03-07 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java6/2330/

All tests passed

Build Log:
[...truncated 3742 lines...]
[javac] Compiling 20 source files to 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/build/misc/classes/java
[javac] 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/misc/src/java/org/apache/lucene/index/sorter/Sorter.java:238:
 cannot find symbol
[javac] symbol  : method compare(int,int)
[javac] location: class java.lang.Integer
[javac]   return Integer.compare(docID1, docID2); // docid order 
tiebreak
[javac] ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 1 error

BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/build.xml:471:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/build.xml:451:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/build.xml:39:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/extra-targets.xml:37:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/build.xml:534:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/common-build.xml:1998:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/module-build.xml:57:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/common-build.xml:471:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java6/lucene/common-build.xml:1736:
 Compile failed; see the compiler error output for details.

Total time: 12 minutes 23 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure
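
The compile error above arises because Integer.compare(int, int) was only added in Java 7, so it is not available on the Java 6 build. A Java 6-compatible stand-in might look like the sketch below; the class and method names are hypothetical, and this is not necessarily the fix that was committed.

```java
// Hypothetical Java 6-compatible replacement for Integer.compare.
final class Java6Compat {

    /** Same result contract as Integer.compare(int, int): -1, 0 or 1. */
    static int compareInts(int x, int y) {
        return x < y ? -1 : (x == y ? 0 : 1);
    }
}
```

Using explicit comparisons rather than x - y avoids overflow when the operands are far apart, for example Integer.MIN_VALUE and Integer.MAX_VALUE.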




Re: [JENKINS] Lucene-Solr-Tests-4.x-Java6 - Build # 2330 - Still Failing

2014-03-07 Thread Robert Muir
Sorry, I'll fix.





Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9594 - Still Failing!

2014-03-07 Thread Michael McCandless
I committed a fix.

Mike McCandless

http://blog.mikemccandless.com


On Fri, Mar 7, 2014 at 12:01 AM, Policeman Jenkins Server
jenk...@thetaphi.de wrote:
 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9594/
 Java: 64bit/jdk1.7.0_60-ea-b07 -XX:+UseCompressedOops -XX:+UseParallelGC

 All tests passed

 Build Log:
 [...truncated 42183 lines...]
   [javadoc] Generating Javadoc
   [javadoc] Javadoc execution
   [javadoc] warning: [options] bootstrap class path not set in conjunction 
 with -source 1.6
   [javadoc] Loading source files for package org.apache.lucene...
   [javadoc] Loading source files for package org.apache.lucene.analysis...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.tokenattributes...
   [javadoc] Loading source files for package org.apache.lucene.codecs...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.compressing...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene3x...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene40...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene41...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene42...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene45...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene46...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.perfield...
   [javadoc] Loading source files for package org.apache.lucene.document...
   [javadoc] Loading source files for package org.apache.lucene.index...
   [javadoc] Loading source files for package org.apache.lucene.search...
   [javadoc] Loading source files for package 
 org.apache.lucene.search.payloads...
   [javadoc] Loading source files for package 
 org.apache.lucene.search.similarities...
   [javadoc] Loading source files for package org.apache.lucene.search.spans...
   [javadoc] Loading source files for package org.apache.lucene.store...
   [javadoc] Loading source files for package org.apache.lucene.util...
   [javadoc] Loading source files for package 
 org.apache.lucene.util.automaton...
   [javadoc] Loading source files for package org.apache.lucene.util.fst...
   [javadoc] Loading source files for package org.apache.lucene.util.mutable...
   [javadoc] Loading source files for package org.apache.lucene.util.packed...
   [javadoc] Constructing Javadoc information...
   [javadoc] Standard Doclet version 1.7.0_60-ea
   [javadoc] Building tree for all the packages and classes...
   [javadoc] Generating 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/package-summary.html...
   [javadoc] Copying file 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-2.png
  to directory 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files...
   [javadoc] Copying file 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-1.png
  to directory 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files...
   [javadoc] Building index for all the packages and classes...
   [javadoc] Building index for all classes...
   [javadoc] Generating 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/help-doc.html...
   [javadoc] 1 warning

 [...truncated 27 lines...]
   [javadoc] Generating Javadoc
   [javadoc] Javadoc execution
   [javadoc] Loading source files for package org.apache.lucene.analysis.ar...
   [javadoc] warning: [options] bootstrap class path not set in conjunction 
 with -source 1.6
   [javadoc] Loading source files for package org.apache.lucene.analysis.bg...
   [javadoc] Loading source files for package org.apache.lucene.analysis.br...
   [javadoc] Loading source files for package org.apache.lucene.analysis.ca...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.charfilter...
   [javadoc] Loading source files for package org.apache.lucene.analysis.cjk...
   [javadoc] Loading source files for package org.apache.lucene.analysis.ckb...
   [javadoc] Loading source files for package org.apache.lucene.analysis.cn...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.commongrams...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.compound...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.compound.hyphenation...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.core...
   [javadoc] Loading source files for package org.apache.lucene.analysis.cz...
   [javadoc] Loading source files for package org.apache.lucene.analysis.da...
   [javadoc] Loading source files for package org.apache.lucene.analysis.de...
   

[jira] [Created] (SOLR-5830) Elevate file hardcoded to load from either conf or data directory

2014-03-07 Thread David Stuart (JIRA)
David Stuart created SOLR-5830:
--

 Summary: Elevate file hardcoded to load from either conf or data 
directory
 Key: SOLR-5830
 URL: https://issues.apache.org/jira/browse/SOLR-5830
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.6.3, 4.7, 4.8, 5.0
Reporter: David Stuart


When loading elevate.xml from the solrconfig, the QueryElevationComponent 
class is hard-coded to look in either the conf directory or the data directory. 
If an absolute path is defined, it errors out with file not found because it 
prepends the conf and data directories in its check.
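
One way the lookup could honour absolute paths is sketched below. This is not the QueryElevationComponent code and not the attached patch; the class, method and parameter names are hypothetical, and the fallback order (conf first, then data) is an assumption.

```java
import java.io.File;

// Hypothetical resolver, for illustration only.
final class ElevateFileResolver {

    /**
     * Uses an absolute configured path as-is; otherwise looks in the conf
     * directory first and falls back to the data directory.
     */
    static File resolve(String configured, File confDir, File dataDir) {
        File direct = new File(configured);
        if (direct.isAbsolute()) {
            return direct; // do not prepend conf or data
        }
        File inConf = new File(confDir, configured);
        if (inConf.exists()) {
            return inConf;
        }
        return new File(dataDir, configured);
    }
}
```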






[jira] [Commented] (SOLR-4654) Integrate Lucene's sorting and early query termination capabilities into Solr

2014-03-07 Thread Furkan KAMACI (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924081#comment-13924081
 ] 

Furkan KAMACI commented on SOLR-4654:
-

I volunteer to work on this issue as part of GSoC 2014.

 Integrate Lucene's sorting and early query termination capabilities into Solr
 -

 Key: SOLR-4654
 URL: https://issues.apache.org/jira/browse/SOLR-4654
 Project: Solr
  Issue Type: Improvement
Reporter: Adrien Grand
Priority: Trivial
  Labels: gsoc2014

 I think there would be some interesting work to do to integrate Lucene's 
 sorting and early query termination capabilities into Solr, in particular 
 (just ideas, maybe they're not all interesting/useful):
  - configuring a SortingMergePolicy,
  - figuring out when the sort order of queries matches the sort order of the 
 index segments,
  - giving the ability to get approximated results when the query is not 
 sorted but only boosted by the sort order of the index,
  - integration with TimeLimitingCollector: maybe it's better to collect only 
 half of all segments than to fully collect half of the segments,
  - approximation of the number of matches based on the ratio of collected 
 documents,
  - ...






[jira] [Commented] (LUCENE-5501) Out-of-order collection testing

2014-03-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924088#comment-13924088
 ] 

Michael McCandless commented on LUCENE-5501:


+1, I love this patch.  You shuffle the docIDs from the scorer before 
delivering to the collector, if the collector claims it can accept out-of-order 
hits.
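
The behaviour summarised above can be sketched as follows. The HitCollector interface is a hypothetical stand-in for Lucene's Collector (it only borrows the acceptsDocsOutOfOrder name), and the real patch works on scorers rather than on a list of doc IDs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical test helper, not the code from the patch.
final class ShufflingDelivery {

    /** Minimal stand-in for a collector. */
    interface HitCollector {
        boolean acceptsDocsOutOfOrder();
        void collect(int doc);
    }

    /** Delivers doc IDs shuffled, but only if the collector claims to support that. */
    static void deliver(List<Integer> docsInOrder, HitCollector collector, Random random) {
        List<Integer> docs = new ArrayList<Integer>(docsInOrder);
        if (collector.acceptsDocsOutOfOrder()) {
            Collections.shuffle(docs, random);
        }
        for (int doc : docs) {
            collector.collect(doc);
        }
    }
}
```

Passing a seeded Random keeps a failing order reproducible, in the same spirit as the randomized test framework.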

LUCENE-4950 is related: I tried to fix AssertingIndexSearcher to use 
AssertingCollector but hit strange exceptions with ConstantScoreQuery that I 
never explained.  AssertingCollector would verify that if the collector said it 
could not accept docs out of order, then the scorer does not in fact deliver 
docs out of order.

Also, LUCENE-5487 will increase how often out-of-order scoring is allowed, 
because BooleanScorer will now allow the sub-scorers to score out of order.

 Out-of-order collection testing
 ---

 Key: LUCENE-5501
 URL: https://issues.apache.org/jira/browse/LUCENE-5501
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
 Attachments: LUCENE-5501.patch


 Collectors have the ability to declare whether or not they support 
 out-of-order collection, but since most scorers score in order this is not 
 well tested.






Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_60-ea-b07) - Build # 9600 - Still Failing!

2014-03-07 Thread Michael McCandless
Same issue, I committed a fix earlier.

Mike McCandless

http://blog.mikemccandless.com


On Fri, Mar 7, 2014 at 11:11 AM, Policeman Jenkins Server
jenk...@thetaphi.de wrote:
 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9600/
 Java: 64bit/jdk1.7.0_60-ea-b07 -XX:-UseCompressedOops -XX:+UseG1GC

 All tests passed

 Build Log:
 [...truncated 42217 lines...]
   [javadoc] Generating Javadoc
   [javadoc] Javadoc execution
   [javadoc] warning: [options] bootstrap class path not set in conjunction 
 with -source 1.6
   [javadoc] Loading source files for package org.apache.lucene...
   [javadoc] Loading source files for package org.apache.lucene.analysis...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.tokenattributes...
   [javadoc] Loading source files for package org.apache.lucene.codecs...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.compressing...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene3x...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene40...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene41...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene42...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene45...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.lucene46...
   [javadoc] Loading source files for package 
 org.apache.lucene.codecs.perfield...
   [javadoc] Loading source files for package org.apache.lucene.document...
   [javadoc] Loading source files for package org.apache.lucene.index...
   [javadoc] Loading source files for package org.apache.lucene.search...
   [javadoc] Loading source files for package 
 org.apache.lucene.search.payloads...
   [javadoc] Loading source files for package 
 org.apache.lucene.search.similarities...
   [javadoc] Loading source files for package org.apache.lucene.search.spans...
   [javadoc] Loading source files for package org.apache.lucene.store...
   [javadoc] Loading source files for package org.apache.lucene.util...
   [javadoc] Loading source files for package 
 org.apache.lucene.util.automaton...
   [javadoc] Loading source files for package org.apache.lucene.util.fst...
   [javadoc] Loading source files for package org.apache.lucene.util.mutable...
   [javadoc] Loading source files for package org.apache.lucene.util.packed...
   [javadoc] Constructing Javadoc information...
   [javadoc] Standard Doclet version 1.7.0_60-ea
   [javadoc] Building tree for all the packages and classes...
   [javadoc] Generating 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/package-summary.html...
   [javadoc] Copying file 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-2.png
  to directory 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files...
   [javadoc] Copying file 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-1.png
  to directory 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/org/apache/lucene/search/doc-files...
   [javadoc] Building index for all the packages and classes...
   [javadoc] Building index for all classes...
   [javadoc] Generating 
 /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/build/docs/core/help-doc.html...
   [javadoc] 1 warning

 [...truncated 27 lines...]
   [javadoc] Generating Javadoc
   [javadoc] Javadoc execution
   [javadoc] warning: [options] bootstrap class path not set in conjunction 
 with -source 1.6
   [javadoc] Loading source files for package org.apache.lucene.analysis.ar...
   [javadoc] Loading source files for package org.apache.lucene.analysis.bg...
   [javadoc] Loading source files for package org.apache.lucene.analysis.br...
   [javadoc] Loading source files for package org.apache.lucene.analysis.ca...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.charfilter...
   [javadoc] Loading source files for package org.apache.lucene.analysis.cjk...
   [javadoc] Loading source files for package org.apache.lucene.analysis.ckb...
   [javadoc] Loading source files for package org.apache.lucene.analysis.cn...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.commongrams...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.compound...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.compound.hyphenation...
   [javadoc] Loading source files for package 
 org.apache.lucene.analysis.core...
   [javadoc] Loading source files for package org.apache.lucene.analysis.cz...
   [javadoc] Loading source files for package org.apache.lucene.analysis.da...
   [javadoc] Loading source files for package 

[jira] [Updated] (SOLR-5830) Elevate file hardcoded to load from either conf or data directory

2014-03-07 Thread David Stuart (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Stuart updated SOLR-5830:
---

Attachment: SOLR-5830.patch

First pass on a patch.

 Elevate file hardcoded to load from either conf or data directory
 -

 Key: SOLR-5830
 URL: https://issues.apache.org/jira/browse/SOLR-5830
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.6.3, 4.7, 4.8, 5.0
Reporter: David Stuart
 Attachments: SOLR-5830.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 When loading elevate.xml from the solrconfig, the QueryElevationComponent 
 class is hard-coded to look in either the conf directory or the data directory. If 
 an absolute path is defined, it errors out with file not found, because it 
 prepends the conf and data directories in its check.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5821) Search inconsistency on SolrCloud replicas

2014-03-07 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-5821.
--

Resolution: Invalid

First, please raise issues like this on the user's list before
raising a JIRA, to be sure you are really seeing a bug
rather than simply misunderstanding.

If your hypothesis is true, try specifying a secondary
known ordering. If scores are tied, then Solr/Lucene
will return the document in internal Lucene ID order,
and you're quite correct that the internal order may be
different in different shards.

Testing this should be as simple as specifying something
similar to 
sort=score desc, id asc
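
A minimal sketch of why the secondary sort key helps, in plain Java rather than Solr code. The Hit class and the rank method are hypothetical stand-ins; the only idea taken from the message above is breaking score ties on a unique id, so that two replicas holding the same documents in a different internal order still return them in the same order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class TieBreakSketch {

    /** Hypothetical stand-in for a search hit; not a Solr or Lucene class. */
    static final class Hit {
        final String id;
        final float score;

        Hit(String id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    /** Same intent as sort=score desc, id asc: score first, then the unique id. */
    static final Comparator<Hit> SCORE_DESC_ID_ASC = (a, b) -> {
        int byScore = Float.compare(b.score, a.score); // higher score first
        return byScore != 0 ? byScore : a.id.compareTo(b.id); // ties: ascending id
    };

    /** Returns the ids of the hits in ranked order, leaving the input untouched. */
    static List<String> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(SCORE_DESC_ID_ASC);
        List<String> ids = new ArrayList<>();
        for (Hit hit : sorted) {
            ids.add(hit.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Same documents, different internal order on each replica.
        List<Hit> replicaA = Arrays.asList(
                new Hit("doc2", 1.0f), new Hit("doc1", 1.0f), new Hit("doc3", 2.0f));
        List<Hit> replicaB = Arrays.asList(
                new Hit("doc1", 1.0f), new Hit("doc3", 2.0f), new Hit("doc2", 1.0f));
        System.out.println(rank(replicaA));
        System.out.println(rank(replicaB));
    }
}
```

Both calls print the same ranking because the comparator never leaves two distinct hits unordered. Without the id clause, the order of doc1 and doc2 would depend on the input order.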


 Search inconsistency on SolrCloud replicas
 --

 Key: SOLR-5821
 URL: https://issues.apache.org/jira/browse/SOLR-5821
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1
 Environment: SolrCloud:
 1 shard, 2 replicas
 Both instances/replicas have identical hardware/software:
 CPU(s): 4
 RAM: 8Gb
 HDD: 100Gb
 OS: CentOS 6.5
 ZooKeeper 3.4.5
 Tomcat 8.0.3
 Solr 4.6.1
 Servers are utilized to run Solr only.
Reporter: Maxim Novikov
Priority: Critical
  Labels: cloud, inconsistency, replica, search

 We use the following infrastructure:
 SolrCloud with 1 shard and 2 replicas. The index is built using 
 DataImportHandler (importing data from the database). The number of items in 
 the index can vary from 100 to 100,000,000.
 After indexing part of the data (not necessarily all the data, it is enough 
 to have a small number of items in the search index), we can observe that 
 Solr instances (replicas) return different results for the same search 
 queries. I believe it happens because some of the results have the same 
 scores, and Solr instances return those in a random order.
 PS This is a critical issue for us as we use a load balancer to scale Solr 
 through replicas, and as a result of this issue, we retrieve different results 
 for the same queries all the time. They are not necessarily completely 
 different, but even a couple of items that differ is a deal breaker.
 The expected behaviour would be to always get identical results for the same 
 search queries from all replicas. Otherwise, this cloud setup is simply 
 unreliable.






[jira] [Commented] (LUCENE-5499) EarlyTerminatingSortingCollector shouldnt require exact Sort match

2014-03-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924098#comment-13924098
 ] 

Michael McCandless commented on LUCENE-5499:


+1

The search-time sort just has to be congruent with the index-time one.

 EarlyTerminatingSortingCollector shouldnt require exact Sort match
 --

 Key: LUCENE-5499
 URL: https://issues.apache.org/jira/browse/LUCENE-5499
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir

 Today EarlyTerminatingSortingCollector requires that the Sort match exactly 
 at query and at index time.
 However, now that you can use any Sort (e.g. with multiple sortfields), this 
 should be improved.
 For example, early termination is fine in the following case:
 * index-time: popularity desc, time desc
 * query-time: popularity desc
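
One possible reading of the example above is that the query-time sort must be a leading prefix of the index-time sort. The sketch below checks only that condition, in plain Java. The string field specs and the isPrefix method are hypothetical and are not Lucene's Sort or SortField API; the issue itself does not spell out the exact rule.

```java
import java.util.Arrays;
import java.util.List;

final class SortPrefixSketch {

    /**
     * Returns true when querySort equals the leading fields of indexSort.
     * An empty query sort is rejected here; that choice is an assumption.
     */
    static boolean isPrefix(List<String> querySort, List<String> indexSort) {
        if (querySort.isEmpty() || querySort.size() > indexSort.size()) {
            return false;
        }
        return indexSort.subList(0, querySort.size()).equals(querySort);
    }

    public static void main(String[] args) {
        List<String> indexSort = Arrays.asList("popularity desc", "time desc");
        // The case from the issue: query-time sort is the first index-time field.
        System.out.println(isPrefix(Arrays.asList("popularity desc"), indexSort));
        // Sorting on the second field alone is not a prefix.
        System.out.println(isPrefix(Arrays.asList("time desc"), indexSort));
    }
}
```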






[jira] [Commented] (LUCENE-5498) SortingAtomicReader should be package private

2014-03-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924093#comment-13924093
 ] 

Michael McCandless commented on LUCENE-5498:


+1

 SortingAtomicReader should be package private
 -

 Key: LUCENE-5498
 URL: https://issues.apache.org/jira/browse/LUCENE-5498
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir

 The intended purpose of this reader is to allow you to sort your entire index 
 with IW.addIndexes(IR).
 Perhaps we should supply some kind of tool to do this and hide the reader. 
 It's scary to think of someone using this for searching (based on its name and 
 docs, it's probably not clear that it would be ridiculously slow).






Re: Stalled unit tests

2014-03-07 Thread Michael McCandless
I just ran ant test under Solr; it took 4 minutes 25 seconds.

But, in my ~/build.properties I have:

tests.disableHdfs=true
tests.slow=false

Which makes things substantially faster, and also [seems to] sidestep
the Solr tests that falsely fail.

Mike McCandless

http://blog.mikemccandless.com


On Fri, Mar 7, 2014 at 9:04 AM, Terry Smith sheb...@gmail.com wrote:
 Mike,

 Fair enough. I'll let them run for more than 30 minutes and see what
 happens.

 How long does it take on your machine? I'm happy to signup for the wiki and
 add some extra information to
 http://wiki.apache.org/lucene-java/HowToContribute for folks wanting to
 tinker with Lucene.

 Do the Lucene developers typically run a subset of the test suite to make
 committing cheaper?

 Thanks,

 --Terry



 On Fri, Mar 7, 2014 at 5:52 AM, Michael McCandless
 luc...@mikemccandless.com wrote:

 Unfortunately, some tests take a very long time, and the test infra
 will print these HEARTBEAT messages notifying you that they are still
 running.  They should eventually finish?

 Mike McCandless

 http://blog.mikemccandless.com


 On Thu, Mar 6, 2014 at 5:09 PM, Terry Smith sheb...@gmail.com wrote:
  I'm sure that I'm just missing something obvious but I'm having trouble
  getting the unit tests to run to completion on my laptop and was hoping
  that
  someone would be kind enough to point me in the right direction.
 
  I've cloned the repository from GitHub
  (http://git.apache.org/lucene-solr.git) and checked out the latest
  commit on
  branch_4x.
 
  commit 6e06247cec1410f32592bfd307c1020b814def06
 
  Author: Robert Muir rm...@apache.org
 
  Date:   Thu Mar 6 19:54:07 2014 +
 
 
  disable slow solr tests in smoketester
 
 
 
  git-svn-id:
  https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1575025
  13f79535-47bb-0310-9956-ffa450edef68
 
 
  Executing ant clean test from the top level directory of the project
  shows the tests running, but they seem to get stuck in a loop with some
  stalled heartbeat messages. If I run the tests directly from lucene/ then
  they complete successfully after about 10 minutes.
 
  I'm using Java 6 under OS X (10.9.2).
 
  $ java -version
 
  java version 1.6.0_65
 
  Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
 
  Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
 
 
  My terminal lists repeating stalled heartbeat messages like so:
 
  HEARTBEAT J2 PID(20104@onyx.local): 2014-03-06T16:53:35, stalled for
  2111s
  at: HdfsLockFactoryTest.testBasic
 
  HEARTBEAT J0 PID(20106@onyx.local): 2014-03-06T16:53:47, stalled for
  2108s
  at: TestSurroundQueryParser.testQueryParser
 
  HEARTBEAT J1 PID(20103@onyx.local): 2014-03-06T16:54:11, stalled for
  2167s
  at: TestRecoveryHdfs.testBuffering
 
  HEARTBEAT J3 PID(20105@onyx.local): 2014-03-06T16:54:23, stalled for
  2165s
  at: HdfsDirectoryTest.testEOF
 
 
  My machine does have 3 java processes chewing CPU, see attached jstack
  dumps
  for more information.
 
  Should I expect the tests to complete on my platform? Do I need to
  specify
  any special flags to give them more memory or to avoid any bad apples?
 
  Thanks in advance,
 
  --Terry
 
 
 
 



[jira] [Commented] (SOLR-5773) CollapsingQParserPlugin should make elevated documents the group head

2014-03-07 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924108#comment-13924108
 ] 

David commented on SOLR-5773:
-

What did you change in your latest patch?

 CollapsingQParserPlugin should make elevated documents the group head
 -

 Key: SOLR-5773
 URL: https://issues.apache.org/jira/browse/SOLR-5773
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Affects Versions: 4.6.1
Reporter: David
Assignee: Joel Bernstein
  Labels: collapse, solr
 Fix For: 4.8

 Attachments: SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, 
 SOLR-5773.patch, SOLR-5773.patch

   Original Estimate: 8h
  Remaining Estimate: 8h

 Hi Joel,
 I sent you an email but I'm not sure if you received it or not. I ran into a 
 bit of trouble using the CollapsingQParserPlugin with elevated documents. To 
 explain it simply, I want to exclude grouped documents when one of the 
 members of the group is contained in the elevated document set. I'm not sure 
 this is possible currently because, as you explain above, elevated documents 
 are added to the request context after the original query is constructed.
 To better illustrate the problem: say I have 2 documents, docid=1 and 
 docid=2, and both have a groupid of 'a'. If a grouped query scores docid 2 
 first in the results but I have elevated docid 1, then both documents are 
 shown in the results when I really only want the elevated document to be 
 shown.
 Is this something that would be difficult to implement? Any help is 
 appreciated.
 I think the solution would be to remove the documents from liveDocs that 
 share the same groupid in the getBoostDocs() function. Let me know if this 
 makes any sense. I'll continue working towards a solution in the meantime.
 {code}
 private IntOpenHashSet getBoostDocs(SolrIndexSearcher indexSearcher,
     Set<String> boosted) throws IOException {
   IntOpenHashSet boostDocs = null;
   if(boosted != null) {
     SchemaField idField = indexSearcher.getSchema().getUniqueKeyField();
     String fieldName = idField.getName();
     HashSet<BytesRef> localBoosts = new HashSet<BytesRef>(boosted.size()*2);
     Iterator<String> boostedIt = boosted.iterator();
     while(boostedIt.hasNext()) {
       localBoosts.add(new BytesRef(boostedIt.next()));
     }
     boostDocs = new IntOpenHashSet(boosted.size()*2);
     List<AtomicReaderContext> leaves =
         indexSearcher.getTopReaderContext().leaves();
     TermsEnum termsEnum = null;
     DocsEnum docsEnum = null;
     for(AtomicReaderContext leaf : leaves) {
       AtomicReader reader = leaf.reader();
       int docBase = leaf.docBase;
       Bits liveDocs = reader.getLiveDocs();
       Terms terms = reader.terms(fieldName);
       termsEnum = terms.iterator(termsEnum);
       Iterator<BytesRef> it = localBoosts.iterator();
       while(it.hasNext()) {
         BytesRef ref = it.next();
         if(termsEnum.seekExact(ref)) {
           docsEnum = termsEnum.docs(liveDocs, docsEnum);
           int doc = docsEnum.nextDoc();
           if(doc != -1) {
             //Found the document.
             boostDocs.add(doc+docBase);
             // HERE REMOVE ANY DOCUMENTS THAT SHARE THE GROUPID, NOT ONLY THE DOCID
             it.remove();
           }
         }
       }
     }
   }
   return boostDocs;
 }
 {code}
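
The idea in the marked comment can be sketched outside Solr as a plain set computation. This is not the SOLR-5773 patch: the class, the docsToSuppress method, and the docid-to-groupid map are hypothetical, and the real plugin works on Lucene doc ids and field values rather than an in-memory map.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class GroupSuppressSketch {

    /**
     * Returns the documents that share a group with an elevated document
     * but are not elevated themselves.
     */
    static Set<Integer> docsToSuppress(Map<Integer, String> groupOfDoc, Set<Integer> elevated) {
        // Collect the groups that already have an elevated member.
        Set<String> elevatedGroups = new HashSet<>();
        for (Integer doc : elevated) {
            String group = groupOfDoc.get(doc);
            if (group != null) {
                elevatedGroups.add(group);
            }
        }
        // Every other member of those groups should be dropped from the results.
        Set<Integer> suppress = new TreeSet<>();
        for (Map.Entry<Integer, String> entry : groupOfDoc.entrySet()) {
            if (elevatedGroups.contains(entry.getValue()) && !elevated.contains(entry.getKey())) {
                suppress.add(entry.getKey());
            }
        }
        return suppress;
    }

    public static void main(String[] args) {
        // The example from the description: docid 1 and 2 share groupid 'a', docid 1 is elevated.
        Map<Integer, String> groupOfDoc = new HashMap<>();
        groupOfDoc.put(1, "a");
        groupOfDoc.put(2, "a");
        groupOfDoc.put(3, "b");
        Set<Integer> elevated = new HashSet<>();
        elevated.add(1);
        System.out.println(docsToSuppress(groupOfDoc, elevated));
    }
}
```

With the example data only docid 2 is suppressed; docid 3 is untouched because its group has no elevated member.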






[jira] [Commented] (SOLR-5821) Search inconsistency on SolrCloud replicas

2014-03-07 Thread Maxim Novikov (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924107#comment-13924107
 ] 

Maxim Novikov commented on SOLR-5821:
-

Won't this additional ordering impact search performance? Consider 
100,000,000 records indexed from the database, and about 400 search 
requests per second per Solr instance.

 Search inconsistency on SolrCloud replicas
 --

 Key: SOLR-5821
 URL: https://issues.apache.org/jira/browse/SOLR-5821
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1
 Environment: SolrCloud:
 1 shard, 2 replicas
 Both instances/replicas have identical hardware/software:
 CPU(s): 4
 RAM: 8Gb
 HDD: 100Gb
 OS: CentOS 6.5
 ZooKeeper 3.4.5
 Tomcat 8.0.3
 Solr 4.6.1
 Servers are utilized to run Solr only.
Reporter: Maxim Novikov
Priority: Critical
  Labels: cloud, inconsistency, replica, search







[jira] [Commented] (SOLR-5773) CollapsingQParserPlugin should make elevated documents the group head

2014-03-07 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924106#comment-13924106
 ] 

David commented on SOLR-5773:
-

I've got it running in a sandbox environment. Seems to be functioning without 
error under load of up to 3000 requests per minute, though most of these 
queries don't have elevated documents in their result set. But I haven't seen 
any errors so far.

 CollapsingQParserPlugin should make elevated documents the group head
 -

 Key: SOLR-5773
 URL: https://issues.apache.org/jira/browse/SOLR-5773
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Affects Versions: 4.6.1
Reporter: David
Assignee: Joel Bernstein
  Labels: collapse, solr
 Fix For: 4.8

 Attachments: SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, 
 SOLR-5773.patch, SOLR-5773.patch

   Original Estimate: 8h
  Remaining Estimate: 8h







Re: Under what circumstances, termsEnum's next(), or seekExact(), or seekCeil() is more efficient?

2014-03-07 Thread Michael McCandless
On Wed, Mar 5, 2014 at 4:34 PM, hao yan hyan2...@gmail.com wrote:
 Hi, Michael

 1. We find that both are actually costly. I am not sure what the difference is
 between "first next() only once + seekExact() from then on" and "always
 seekExact()". I mean, are the first call of next() and the first call of
 seekExact() different? If what next() does is load a block of data and position
 at the beginning of the block, and seekExact() loads a block and positions at
 the target, then next() should be more efficient, right?

The first next() call is not that different from seekExact: it must
load the block containing the first term and read bytes from it.

After that, next() should be cheaper than seekExact.

 2. Is multiFields/multiTerms/multiTermsEnum efficient? We always have a fixed
 number (three) of segments. We want to search all three segments for
 each query. Therefore we borrowed most of the code of multixxx. Is there
 any way to optimize this?

They are relatively efficient?  I mean, they must merge-sort the
terms, and manage N segments that might have a term under the hood,
but it's the best we can do (unless you can forceMerge).

But it's better to operate per-segment if you care about performance.
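
To make the merge-sort cost concrete, here is a rough sketch of a k-way merge over per-segment sorted term lists using a priority queue. It is plain Java, not the MultiTermsEnum implementation: the Cursor class and mergeTerms method are hypothetical, and terms are compared as Java strings, which is a simplification of how Lucene orders term bytes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

final class TermMergeSketch {

    /** Position inside one segment's sorted term list. */
    private static final class Cursor {
        final Iterator<String> terms;
        String current;

        Cursor(Iterator<String> terms) {
            this.terms = terms;
            this.current = terms.next(); // callers only pass non-empty lists
        }

        /** Moves to the next term; returns false when the segment is exhausted. */
        boolean advance() {
            if (!terms.hasNext()) {
                return false;
            }
            current = terms.next();
            return true;
        }
    }

    /** Merges sorted per-segment term lists into one sorted list without duplicates. */
    static List<String> mergeTerms(List<List<String>> segments) {
        PriorityQueue<Cursor> queue =
                new PriorityQueue<>((a, b) -> a.current.compareTo(b.current));
        for (List<String> segment : segments) {
            if (!segment.isEmpty()) {
                queue.add(new Cursor(segment.iterator()));
            }
        }
        List<String> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor top = queue.poll();
            // The same term may live in several segments; emit it once.
            if (merged.isEmpty() || !merged.get(merged.size() - 1).equals(top.current)) {
                merged.add(top.current);
            }
            if (top.advance()) {
                queue.add(top); // re-insert at its new position
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("apple", "cherry"),
                Arrays.asList("banana", "cherry"),
                Arrays.asList("apple", "date"));
        System.out.println(mergeTerms(segments));
    }
}
```

Every step pays a queue operation across all segments, which is the overhead a per-segment loop avoids.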

Mike McCandless

http://blog.mikemccandless.com




[jira] [Commented] (SOLR-5773) CollapsingQParserPlugin should make elevated documents the group head

2014-03-07 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924123#comment-13924123
 ] 

David commented on SOLR-5773:
-

oh I see it looks like you just added another test

 CollapsingQParserPlugin should make elevated documents the group head
 -

 Key: SOLR-5773
 URL: https://issues.apache.org/jira/browse/SOLR-5773
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Affects Versions: 4.6.1
Reporter: David
Assignee: Joel Bernstein
  Labels: collapse, solr
 Fix For: 4.8

 Attachments: SOLR-5773.patch, SOLR-5773.patch, SOLR-5773.patch, 
 SOLR-5773.patch, SOLR-5773.patch

   Original Estimate: 8h
  Remaining Estimate: 8h







[jira] [Commented] (SOLR-5818) distrib search with custom comparator does not quite work correctly

2014-03-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924124#comment-13924124
 ] 

ASF subversion and git services commented on SOLR-5818:
---

Commit 1575344 from [~rjernst] in branch 'dev/trunk'
[ https://svn.apache.org/r1575344 ]

SOLR-5818: distrib search with custom comparator does not quite work correctly

 distrib search with custom comparator does not quite work correctly
 ---

 Key: SOLR-5818
 URL: https://issues.apache.org/jira/browse/SOLR-5818
 Project: Solr
  Issue Type: Bug
Reporter: Ryan Ernst
 Fix For: 4.8, 5.0

 Attachments: SOLR-5818.patch


 In QueryComponent.doFieldSortValues, a scorer is never set on a custom 
 comparator.  We just need to add a fake scorer that can pass through the 
 score from the DocList.






[jira] [Updated] (SOLR-5818) distrib search with custom comparator does not quite work correctly

2014-03-07 Thread Ryan Ernst (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Ernst updated SOLR-5818:
-

Fix Version/s: 5.0
   4.8

 distrib search with custom comparator does not quite work correctly
 ---

 Key: SOLR-5818
 URL: https://issues.apache.org/jira/browse/SOLR-5818
 Project: Solr
  Issue Type: Bug
Reporter: Ryan Ernst
Assignee: Ryan Ernst
 Fix For: 4.8, 5.0

 Attachments: SOLR-5818.patch


 In QueryComponent.doFieldSortValues, a scorer is never set on a custom 
 comparator.  We just need to add a fake scorer that can pass through the 
 score from the DocList.






[jira] [Assigned] (SOLR-5818) distrib search with custom comparator does not quite work correctly

2014-03-07 Thread Ryan Ernst (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Ernst reassigned SOLR-5818:


Assignee: Ryan Ernst

 distrib search with custom comparator does not quite work correctly
 ---

 Key: SOLR-5818
 URL: https://issues.apache.org/jira/browse/SOLR-5818
 Project: Solr
  Issue Type: Bug
Reporter: Ryan Ernst
Assignee: Ryan Ernst
 Fix For: 4.8, 5.0

 Attachments: SOLR-5818.patch


 In QueryComponent.doFieldSortValues, a scorer is never set on a custom 
 comparator.  We just need to add a fake scorer that can pass through the 
 score from the DocList.






[jira] [Commented] (SOLR-5821) Search inconsistency on SolrCloud replicas

2014-03-07 Thread Maxim Novikov (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924127#comment-13924127
 ] 

Maxim Novikov commented on SOLR-5821:
-

PS Regarding misunderstanding and stuff like that... This behavior is 
unexpected for me. As I wrote, I have a load balancer that redirects queries to 
the Solr replicas of the single shard, and when running the same query (even without 
specifying any additional parameters), I expect to retrieve the same results. 
You can say anything about how Solr is implemented internally, but from the 
perspective of a Solr user (a search user) I should not have to care about that at all. 
That was the point. If you disagree and think that this is sort of a feature, 
not a bug/issue, it is still good to keep this stuff in JIRA. Other 
people who face the same issue will be able to find it, read the Solr developers' 
responses, and judge for themselves whether this feature fits the search 
solution they want to get or not.

 Search inconsistency on SolrCloud replicas
 --

 Key: SOLR-5821
 URL: https://issues.apache.org/jira/browse/SOLR-5821
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1
 Environment: SolrCloud:
 1 shard, 2 replicas
 Both instances/replicas have identical hardware/software:
 CPU(s): 4
 RAM: 8Gb
 HDD: 100Gb
 OS: CentOS 6.5
 ZooKeeper 3.4.5
 Tomcat 8.0.3
 Solr 4.6.1
 Servers are utilized to run Solr only.
Reporter: Maxim Novikov
Priority: Critical
  Labels: cloud, inconsistency, replica, search







[jira] [Assigned] (LUCENE-5488) FilteredQuery.explain does not honor FilterStrategy

2014-03-07 Thread Michael Busch (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch reassigned LUCENE-5488:
-

Assignee: Michael Busch

 FilteredQuery.explain does not honor FilterStrategy
 ---

 Key: LUCENE-5488
 URL: https://issues.apache.org/jira/browse/LUCENE-5488
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Affects Versions: 4.6.1
Reporter: John Wang
Assignee: Michael Busch
 Attachments: LUCENE-5488.patch, LUCENE-5488.patch


 Some Filter implementations produce DocIdSets without the iterator() 
 implementation, such as o.a.l.facet.range.Range.getFilter(). This is done with 
 the intention that they be used in conjunction with FilteredQuery with 
 FilterStrategy set to QUERY_FIRST_FILTER_STRATEGY for performance reasons.
 However, this behavior is not honored by FilteredQuery.explain, where 
 docidset.iterator is called regardless, causing such valid usages of the above 
 filter types to fail.
 The fix is to check bits() first and fall back to iterator if bits is 
 null. In which case, the input Filter is indeed bad.
 See attached unit test, which fails without this patch.
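
The "check bits() first, fall back to iterator" order described above can be sketched as follows. This is not the attached patch: DocSet is a hypothetical stand-in for a doc id set, written in plain Java so that it runs without Lucene, and it assumes the iterator yields doc ids in ascending order.

```java
import java.util.PrimitiveIterator;
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

final class BitsFirstSketch {

    /** Hypothetical stand-in for a doc id set; not Lucene's DocIdSet. */
    interface DocSet {
        /** Random-access view, or null when the set does not support it. */
        IntPredicate bits();

        /** Ascending doc ids; may throw UnsupportedOperationException. */
        PrimitiveIterator.OfInt iterator();
    }

    /** A set that only supports random access, like the filters in the issue. */
    static DocSet bitsOnly(IntPredicate bits) {
        return new DocSet() {
            public IntPredicate bits() { return bits; }
            public PrimitiveIterator.OfInt iterator() {
                throw new UnsupportedOperationException("no iterator");
            }
        };
    }

    /** A set that only supports iteration over sorted doc ids. */
    static DocSet iteratorOnly(int... sortedDocs) {
        return new DocSet() {
            public IntPredicate bits() { return null; }
            public PrimitiveIterator.OfInt iterator() {
                return IntStream.of(sortedDocs).iterator();
            }
        };
    }

    /** Checks bits() first and only touches iterator() when bits() is null. */
    static boolean matches(DocSet set, int doc) {
        IntPredicate bits = set.bits();
        if (bits != null) {
            return bits.test(doc);
        }
        PrimitiveIterator.OfInt it = set.iterator();
        while (it.hasNext()) {
            int current = it.nextInt();
            if (current == doc) {
                return true;
            }
            if (current > doc) {
                return false; // passed the target, ids are ascending
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches(bitsOnly(d -> d % 2 == 0), 4));
        System.out.println(matches(iteratorOnly(1, 5, 9), 6));
    }
}
```

Calling iterator() unconditionally on the bits-only set would throw, which mirrors the failure the issue reports for explain.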






[jira] [Commented] (SOLR-5823) Add utility function for internal code to know if it is currently the overseer

2014-03-07 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924138#comment-13924138
 ] 

Hoss Man commented on SOLR-5823:


miller  i talked a bit about this on IRC this morning, a few summary points...

* the reasons i'm looking for a general "am i the leader" type method, that can 
be run as part of a scheduled executor -- instead of adding a new processing 
thread to the Overseer class -- are twofold:
** i want the logic to be usable even if we aren't in cloud mode
** i'm trying to think about how other people who write plugins/components 
would be able to do the same thing w/o needing to modify the overseer.

* Tim's patch goes the route of ensuring every node can ask "what is the name 
of the overseer node" and then implements "am i the overseer node" by comparing 
our name with the overseer name
** in the case of "who is the shard leader" and "am i the shard leader" that 
info is cached in the cluster state info, so calling those methods doesn't hit 
ZK every time
** we don't want to cache the overseer info in a similar way, because it's 
risky and 99% of the time, nodes don't care who is the overseer

Which brought me to the key question where miller & i realized we had gotten 
sidetracked...

* i don't really care about the "what is the name of the overseer node" case -- 
and most people shouldn't -- i'm really just looking for the "am i currently 
the overseer?" part of the equation
** this as a simple boolean should be a much easier question to answer 
efficiently, because of how the overseer election works -- if a node is the 
overseer, it's running the overseer processing threads
** part of my confusion was the terminology: the idea of "Leader" is used a lot 
in the overseer code, but that's not referring to "shard leader" in the solr 
context, it's referring to the ZK jargon of "leader election", in many cases (in 
the overseer classes) it refers to who is the (zk leader in charge of being 
the) overseer

At this point, miller got disconnected from IRC ... but digging in a bit and 
thinking about what he was telling me, it seems like we should be able to add 
an efficient ZkController.isOverseer() method (that doesn't have to hit Zk 
directly), by checking if the Overseer object is active or closed -- either 
with a new state boolean, or maybe just by checking the threads it manages for 
null
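
A rough sketch of that idea, assuming a state flag plus a thread reference on 
the overseer side. OverseerSketch and ControllerSketch are hypothetical 
stand-ins, not the real Overseer or ZkController classes, and isOverseer() is 
only the method proposed above, not an existing API:

```java
// Hypothetical stand-in for the Overseer: tracks whether its processing
// thread is running on this node.
final class OverseerSketch {
    private volatile Thread updaterThread; // null unless this node is the overseer
    private volatile boolean closed = true;

    // Called when this node wins the overseer election.
    synchronized void start(Runnable work) {
        Thread t = new Thread(work, "overseer-sketch");
        t.setDaemon(true);
        updaterThread = t;
        closed = false;
        t.start();
    }

    // Called when this node stops being the overseer.
    synchronized void close() {
        closed = true;
        Thread t = updaterThread;
        updaterThread = null;
        if (t != null) {
            t.interrupt();
        }
    }

    // Purely local check: no ZooKeeper round trip.
    boolean isActive() {
        return !closed && updaterThread != null;
    }
}

// Hypothetical stand-in for the controller that owns the overseer object.
final class ControllerSketch {
    private final OverseerSketch overseer = new OverseerSketch();

    OverseerSketch overseer() {
        return overseer;
    }

    boolean isOverseer() {
        return overseer.isActive();
    }
}
```

Either check alone (the flag or the null thread) would probably be enough; the 
sketch shows both only because both options are mentioned above.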


 Add utility function for internal code to know if it is currently the overseer
 --

 Key: SOLR-5823
 URL: https://issues.apache.org/jira/browse/SOLR-5823
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
 Attachments: SOLR-5823.patch


 It would be useful if there was some Overseer equivalent to 
 CloudDescriptor.isLeader() that plugins running in solr could use to know "At 
 this moment, am i the leader?" 






[jira] [Updated] (LUCENE-5496) Nuke fuzzyMinSim and replace with maxEdits for FuzzyQuery and its friends

2014-03-07 Thread Tim Allison (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated LUCENE-5496:


Attachment: LUCENE-5496-lucene_core_sandbox_v1.patch

This is a first pass at nuking minsims in Lucene core and sandbox in trunk.  
More work remains in queryparser and in Solr.  

I've @Ignored the test in TestSlowFuzzyQuery2 for now...

Will continue work if anyone has an interest.  If not, this will go on hold.





 Nuke fuzzyMinSim and replace with maxEdits for FuzzyQuery and its friends
 -

 Key: LUCENE-5496
 URL: https://issues.apache.org/jira/browse/LUCENE-5496
 Project: Lucene - Core
  Issue Type: Task
  Components: core/queryparser, core/search
Affects Versions: 4.8, 5.0
Reporter: Tim Allison
Priority: Minor
 Attachments: LUCENE-5496-lucene_core_sandbox_v1.patch, 
 LUCENE-5496_4x_deprecations.patch


 As we get closer to 5.0, I propose adding some deprecations in the 
 queryparsers realm of 4.x.
 Are we ready to get rid of all fuzzyMinSims in trunk?  






[jira] [Resolved] (SOLR-5818) distrib search with custom comparator does not quite work correctly

2014-03-07 Thread Ryan Ernst (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Ernst resolved SOLR-5818.
--

Resolution: Fixed

 distrib search with custom comparator does not quite work correctly
 ---

 Key: SOLR-5818
 URL: https://issues.apache.org/jira/browse/SOLR-5818
 Project: Solr
  Issue Type: Bug
Reporter: Ryan Ernst
Assignee: Ryan Ernst
 Fix For: 4.8, 5.0

 Attachments: SOLR-5818.patch


 In QueryComponent.doFieldSortValues, a scorer is never set on a custom 
 comparator.  We just need to add a fake scorer that can pass through the 
 score from the DocList.
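
 A minimal sketch of the pass-through idea, assuming the scores are already 
 available next to the doc ids. ScorerSketch, PassThroughScorer, 
 ScoreAwareComparator and FieldSortValuesSketch are hypothetical stand-ins, not 
 the Lucene/Solr types touched by the attached patch:

```java
// Hypothetical stand-in for a scorer.
abstract class ScorerSketch {
    abstract int docID();
    abstract float score();
}

// "Fake" scorer: it computes nothing and only reports what it was given.
final class PassThroughScorer extends ScorerSketch {
    int doc = -1;
    float score;

    @Override
    int docID() {
        return doc;
    }

    @Override
    float score() {
        return score;
    }
}

// Hypothetical stand-in for a custom comparator that may consult the score.
interface ScoreAwareComparator {
    void setScorer(ScorerSketch scorer);
    float sortValue(int doc);
}

final class FieldSortValuesSketch {
    private FieldSortValuesSketch() {}

    // docs[i] and scores[i] play the role of one DocList entry.
    static float[] collect(int[] docs, float[] scores, ScoreAwareComparator cmp) {
        PassThroughScorer fake = new PassThroughScorer();
        cmp.setScorer(fake); // the step the bug report says was missing
        float[] out = new float[docs.length];
        for (int i = 0; i < docs.length; i++) {
            // Position the fake scorer before asking for the sort value.
            fake.doc = docs[i];
            fake.score = scores[i];
            out[i] = cmp.sortValue(docs[i]);
        }
        return out;
    }
}
```

 A comparator that reads score() inside sortValue() then sees the score 
 recorded for that document instead of failing on a missing scorer.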





