[jira] [Created] (SOLR-6100) BlendedInfixSuggester and AnalyzingInfixSuggester are never closed on core shutdown (unremovable files on Windows)

2014-05-21 Thread Dawid Weiss (JIRA)
Dawid Weiss created SOLR-6100:
-

 Summary: BlendedInfixSuggester and AnalyzingInfixSuggester are 
never closed on core shutdown (unremovable files on Windows)
 Key: SOLR-6100
 URL: https://issues.apache.org/jira/browse/SOLR-6100
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.9, 5.0


In essence these classes are Closeable, but neither SolrSuggester nor Suggester 
closes them at core shutdown time.

I'm also not sure what the difference is between SolrSuggester and Suggester 
and whether both of them are needed. They seem awfully similar...

I've fixed the problem with the attached patch on LUCENE-5650, but I'd 
appreciate it if somebody with deeper knowledge of Solr could chime in and 
confirm the patch is all right.
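The fix boils down to tracking the Closeable suggesters and closing them when the core shuts down so their index files are released. A minimal sketch of that pattern, with hypothetical names (SuggesterRegistry is illustrative, not the actual Solr API):

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry that tracks Closeable suggesters and closes them
// when the owning core shuts down. On Windows an open index file cannot
// be deleted, which is why leaking these handles leaves unremovable files.
class SuggesterRegistry implements Closeable {
    private final List<Closeable> suggesters = new ArrayList<>();

    void register(Closeable suggester) {
        suggesters.add(suggester);
    }

    @Override
    public void close() throws IOException {
        // Close every registered suggester so its files are released.
        for (Closeable c : suggesters) {
            c.close();
        }
    }
}
```

In Solr this would hang off the core's shutdown path rather than a standalone registry, but the invariant is the same: every Closeable suggester must be closed exactly once at core close.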



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5650) createTempDir and associated functions no longer create java.io.tmpdir

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004436#comment-14004436
 ] 

ASF subversion and git services commented on LUCENE-5650:
-

Commit 1596497 from [~dawidweiss] in branch 'dev/branches/lucene5650'
[ https://svn.apache.org/r1596497 ]

SOLR-6100, LUCENE-5650: fix an uncloseable file leak in solr suggesters.

 createTempDir and associated functions no longer create java.io.tmpdir
 --

 Key: LUCENE-5650
 URL: https://issues.apache.org/jira/browse/LUCENE-5650
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/test
Reporter: Ryan Ernst
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5650.patch, LUCENE-5650.patch, LUCENE-5650.patch, 
 LUCENE-5650.patch, dih.patch


 The recent refactoring to all the create temp file/dir functions (which is 
 great!) has a minor regression from what existed before.  With the old 
 {{LuceneTestCase.TEMP_DIR}}, the directory was created if it did not exist.  
 So, if you set {{java.io.tmpdir}} to {{./temp}}, then it would create that 
 dir within the per jvm working dir.  However, {{getBaseTempDirForClass()}} 
 now asserts that the dir exists, is a directory, and is writable.
 Lucene uses {{.}} as {{java.io.tmpdir}}.  Then in the test security 
 manager, the per jvm cwd has read/write/execute permissions.  However, this 
 allows tests to write to their cwd, which I'm trying to protect against (by 
 setting cwd to read/execute in my test security manager).






[jira] [Commented] (SOLR-6100) BlendedInfixSuggester and AnalyzingInfixSuggester are never closed on core shutdown (unremovable files on Windows)

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004435#comment-14004435
 ] 

ASF subversion and git services commented on SOLR-6100:
---

Commit 1596497 from [~dawidweiss] in branch 'dev/branches/lucene5650'
[ https://svn.apache.org/r1596497 ]

SOLR-6100, LUCENE-5650: fix an uncloseable file leak in solr suggesters.

 BlendedInfixSuggester and AnalyzingInfixSuggester are never closed on core 
 shutdown (unremovable files on Windows)
 --

 Key: SOLR-6100
 URL: https://issues.apache.org/jira/browse/SOLR-6100
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: SOLR-6100.patch


 In essence these classes are Closeable, but neither SolrSuggester nor 
 Suggester closes them at core shutdown time.
 I'm also not sure what the difference is between SolrSuggester and Suggester 
 and whether both of them are needed. They seem awfully similar...
 I've fixed the problem with the attached patch on LUCENE-5650, but I'd 
 appreciate it if somebody with deeper knowledge of Solr could chime in and 
 confirm the patch is all right.






[jira] [Updated] (SOLR-6100) BlendedInfixSuggester and AnalyzingInfixSuggester are never closed on core shutdown (unremovable files on Windows)

2014-05-21 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated SOLR-6100:
--

Attachment: SOLR-6100.patch

 BlendedInfixSuggester and AnalyzingInfixSuggester are never closed on core 
 shutdown (unremovable files on Windows)
 --

 Key: SOLR-6100
 URL: https://issues.apache.org/jira/browse/SOLR-6100
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: SOLR-6100.patch


 In essence these classes are Closeable, but neither SolrSuggester nor 
 Suggester closes them at core shutdown time.
 I'm also not sure what the difference is between SolrSuggester and Suggester 
 and whether both of them are needed. They seem awfully similar...
 I've fixed the problem with the attached patch on LUCENE-5650, but I'd 
 appreciate it if somebody with deeper knowledge of Solr could chime in and 
 confirm the patch is all right.






[jira] [Updated] (SOLR-6098) SOLR console displaying JSON does not escape text properly

2014-05-21 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-6098:


Affects Version/s: 4.4

 SOLR console displaying JSON does not escape text properly
 --

 Key: SOLR-6098
 URL: https://issues.apache.org/jira/browse/SOLR-6098
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.4
Reporter: Kingston Duffie
Priority: Minor
 Fix For: 4.5


 In the SOLR admin web console, when displaying JSON response for Query, the 
 text is not being HTML escaped, so any text that happens to match HTML markup 
 is being processed as HTML. 
 For example, enter <strike>hello</strike> in the q textbox and the 
 responseHeader will contain:
 q: body:<strike>hello</strike> where the hello portion is shown using strikeout.  
 This seems benign, but can be extremely confusing when viewing results, 
 because if your fields happen to contain, for example, <f...@bar.com>, this 
 will be completely missing (because the browser treats this as an invalid 
 tag).
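The fix implied here is to HTML-escape response text before the console renders it. A minimal escaping helper of the kind needed, as an illustrative sketch rather than Solr's actual admin-UI code:

```java
// Minimal HTML-escaping helper: replaces the characters that let raw text
// be interpreted as markup. Illustrative, not the console's implementation.
class HtmlEscape {
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;   // must come first conceptually
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this applied, a query containing <strike>hello</strike> displays literally instead of striking out the word.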






[jira] [Resolved] (SOLR-6098) SOLR console displaying JSON does not escape text properly

2014-05-21 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) resolved SOLR-6098.
-

   Resolution: Duplicate
Fix Version/s: 4.5
 Assignee: Stefan Matheis (steffkes)

 SOLR console displaying JSON does not escape text properly
 --

 Key: SOLR-6098
 URL: https://issues.apache.org/jira/browse/SOLR-6098
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.4
Reporter: Kingston Duffie
Assignee: Stefan Matheis (steffkes)
Priority: Minor
 Fix For: 4.5


 In the SOLR admin web console, when displaying JSON response for Query, the 
 text is not being HTML escaped, so any text that happens to match HTML markup 
 is being processed as HTML. 
 For example, enter <strike>hello</strike> in the q textbox and the 
 responseHeader will contain:
 q: body:<strike>hello</strike> where the hello portion is shown using strikeout.  
 This seems benign, but can be extremely confusing when viewing results, 
 because if your fields happen to contain, for example, <f...@bar.com>, this 
 will be completely missing (because the browser treats this as an invalid 
 tag).






[jira] [Commented] (SOLR-5309) Investigate ShardSplitTest failures

2014-05-21 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004472#comment-14004472
 ] 

Shalin Shekhar Mangar commented on SOLR-5309:
-

I am looking at these failures again today. Yeah, it's been that busy around 
here :(

I implemented a RateLimitedDirectoryFactory for Solr with a very small limit 
and forced ShardSplitTest to use it always. This helped reproduce the issue for 
me. I have finally managed to track down the root cause. It always perplexed me 
that the difference between expected and actual doc counts was almost always 1.

Whenever we add/delete documents during shard splitting, we synchronously 
forward the request to the appropriate sub-shard. For add requests, a single 
sub-shard is selected, but for delete-by-id we weren't selecting a single 
sub-shard. Instead we were forwarding the delete-by-id to all sub-shards. This 
works out fine and doesn't cause any damage in practice because the id exists 
only on one shard. However, when one sub-shard (the right one) accepts the 
delete and the other rejects it (maybe because it became active in the 
meantime), the client (ShardSplitTest) gets an error back and assumes that 
the delete did not succeed, whereas it actually succeeded on the right sub-shard.

We always advise our users to retry update operations upon failure, and they 
would be fine if they followed this advice during shard splitting as well. 
ShardSplitTest unfortunately doesn't follow that advice and just counts 
successes/failures, ending up with an inconsistent state.

I'll start by fixing delete-by-id to route requests to the correct (single) 
sub-shard and enabling this test again.
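The routing fix described above can be sketched as picking the single sub-shard whose hash range covers the id's hash, instead of broadcasting. This is a simplified illustration (Solr actually hashes ids with murmur3 and stores ranges in cluster state; the names here are hypothetical):

```java
import java.util.List;

// Sketch of hash-range routing for delete-by-id: exactly one sub-shard
// owns any given id, so the delete is forwarded to that one only.
class SubShardRouter {
    static final class Range {
        final String name;
        final int min, max; // inclusive hash bounds
        Range(String name, int min, int max) { this.name = name; this.min = min; this.max = max; }
        boolean contains(int hash) { return hash >= min && hash <= max; }
    }

    static String route(String id, List<Range> subShards) {
        // Simplified stand-in for Solr's murmur3 hash of the id.
        int hash = id.hashCode();
        for (Range r : subShards) {
            if (r.contains(hash)) return r.name; // ranges partition the space, so one match
        }
        throw new IllegalStateException("no sub-shard covers hash " + hash);
    }
}
```

Because the ranges partition the hash space, the same id always routes to the same single sub-shard, so a rejection from an unrelated sub-shard can no longer be reported to the client.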

 Investigate ShardSplitTest failures
 ---

 Key: SOLR-5309
 URL: https://issues.apache.org/jira/browse/SOLR-5309
 Project: Solr
  Issue Type: Task
  Components: SolrCloud
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Blocker

 Investigate why ShardSplitTest is failing sporadically.
 Some recent failures:
 http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3328/
 http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7760/
 http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/861/






[jira] [Commented] (SOLR-5309) Investigate ShardSplitTest failures

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004490#comment-14004490
 ] 

ASF subversion and git services commented on SOLR-5309:
---

Commit 1596510 from sha...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1596510 ]

SOLR-5309: Fix DUP.processDelete to route delete-by-id to one sub-shard only. 
Enable ShardSplitTest again.

 Investigate ShardSplitTest failures
 ---

 Key: SOLR-5309
 URL: https://issues.apache.org/jira/browse/SOLR-5309
 Project: Solr
  Issue Type: Task
  Components: SolrCloud
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Blocker

 Investigate why ShardSplitTest is failing sporadically.
 Some recent failures:
 http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3328/
 http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7760/
 http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/861/






[jira] [Created] (SOLR-6101) Shard splitting doesn't work in legacyCloud=false mode

2014-05-21 Thread Shalin Shekhar Mangar (JIRA)
Shalin Shekhar Mangar created SOLR-6101:
---

 Summary: Shard splitting doesn't work in legacyCloud=false mode
 Key: SOLR-6101
 URL: https://issues.apache.org/jira/browse/SOLR-6101
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.8
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 4.9, 5.0


When we invoke splitshard Collection API against a cluster with 
legacyCloud=false, we get the following errors:

{code}
2014-05-15 21:07:58,986 
[Overseer-163819091268403216-ec2-x.compute-1.amazonaws.com:8986_solr-n_51]
 ERROR solr.cloud.OverseerCollectionProcessor  - Collection splitshard of 
splitshard failed:org.apache.solr.common.SolrException: Could not find 
coreNodeName
at 
org.apache.solr.cloud.OverseerCollectionProcessor.waitForCoreNodeName(OverseerCollectionProcessor.java:1504)
at 
org.apache.solr.cloud.OverseerCollectionProcessor.splitShard(OverseerCollectionProcessor.java:1255)
at 
org.apache.solr.cloud.OverseerCollectionProcessor.processMessage(OverseerCollectionProcessor.java:472)
at 
org.apache.solr.cloud.OverseerCollectionProcessor.run(OverseerCollectionProcessor.java:248)
at java.lang.Thread.run(Thread.java:745)

2014-05-15 21:07:59,003 
[Overseer-163819091268403216-ec2-xxx.compute-1.amazonaws.com:8986_solr-n_51]
 INFO  solr.cloud.OverseerCollectionProcessor  - Overseer Collection Processor: 
Message id:/overseer/collection-queue-work/qn-18 complete, 
response:{success={null={responseHeader={status=0,QTime=1}},null={responseHeader={status=0,QTime=1}}},split117278106116750={responseHeader={status=0,QTime=0},STATUS=failed,Response=Error
 CREATEing SolrCore '3M_shard1_1_replica1': non legacy mode coreNodeName 
missing 
shard=shard1_1&name=3M_shard1_1_replica1&action=CREATE&collection=3M&wt=javabin&qt=/admin/cores&async=split117278106116750&version=2},Operation
 splitshard caused exception:=org.apache.solr.common.SolrException: Could not 
find coreNodeName,exception={msg=Could not find coreNodeName,rspCode=500}}
{code}

The sub-shard replica (leader) creation fails due to:
{code}
{
responseHeader: {
status: 0,
QTime: 0
},
STATUS: failed,
Response: Error CREATEing SolrCore '3M_shard1_0_replica1': non legacy mode 
coreNodeName missing 
shard=shard1_0&name=3M_shard1_0_replica1&action=CREATE&collection=3M&wt=javabin&qt=/admin/cores&async=split117278099904930&version=2
}
{code}






[jira] [Commented] (LUCENE-5675) ID postings format

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004504#comment-14004504
 ] 

ASF subversion and git services commented on LUCENE-5675:
-

Commit 1596512 from [~mikemccand] in branch 'dev/branches/lucene5675'
[ https://svn.apache.org/r1596512 ]

LUCENE-5675: fix nocommits

 ID postings format
 

 Key: LUCENE-5675
 URL: https://issues.apache.org/jira/browse/LUCENE-5675
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Robert Muir

 Today the primary key lookup in lucene is not that great for systems like 
 solr and elasticsearch that have versioning in front of IndexWriter.
 To some extent BlockTree can sometimes help avoid seeks by telling you the 
 term does not exist for a segment. But this technique (based on FST prefix) 
 is fragile. The only other choice today is bloom filters, which use up huge 
 amounts of memory.
 I don't think we are using everything we know: particularly the version 
 semantics.
 Instead, if the FST for the terms index used an algebra that represents the 
 max version for any subtree, we might be able to answer that there is no term 
 T with version > V in that segment very efficiently.
 Also, ID fields don't need postings lists, and they don't need stats like 
 docfreq/totaltermfreq, etc.; this stuff is all implicit. 
 As far as API, I think for users to provide IDs with versions to such a PF, 
 a start would be to set a payload or whatever on the term field to get it 
 through IndexWriter to the codec. And a consumer of the codec can just cast 
 the Terms to a subclass that exposes the FST to do this version check 
 efficiently.






[jira] [Commented] (LUCENE-5650) createTempDir and associated functions no longer create java.io.tmpdir

2014-05-21 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004515#comment-14004515
 ] 

Dawid Weiss commented on LUCENE-5650:
-

All tests passed for me with the current state of the branch (including 
nightlies).

 createTempDir and associated functions no longer create java.io.tmpdir
 --

 Key: LUCENE-5650
 URL: https://issues.apache.org/jira/browse/LUCENE-5650
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/test
Reporter: Ryan Ernst
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5650.patch, LUCENE-5650.patch, LUCENE-5650.patch, 
 LUCENE-5650.patch, dih.patch


 The recent refactoring to all the create temp file/dir functions (which is 
 great!) has a minor regression from what existed before.  With the old 
 {{LuceneTestCase.TEMP_DIR}}, the directory was created if it did not exist.  
 So, if you set {{java.io.tmpdir}} to {{./temp}}, then it would create that 
 dir within the per jvm working dir.  However, {{getBaseTempDirForClass()}} 
 now asserts that the dir exists, is a directory, and is writable.
 Lucene uses {{.}} as {{java.io.tmpdir}}.  Then in the test security 
 manager, the per jvm cwd has read/write/execute permissions.  However, this 
 allows tests to write to their cwd, which I'm trying to protect against (by 
 setting cwd to read/execute in my test security manager).






[jira] [Commented] (LUCENE-5689) FieldInfo.setDocValuesGen should not be public.

2014-05-21 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004557#comment-14004557
 ] 

Shai Erera commented on LUCENE-5689:


ReaderAndUpdates already clones all FIs and then updates the dvGen of the ones 
that are updated now. So cloning again is silly ... perhaps we can get rid of 
it some day, but I agree, let's remove the public modifier first. And yes, if 
you modify the dvGen on an AtomicReader, you might hit weird exceptions like 
FNFE when the reader tries to look up the field's dv-gen'd file.

 FieldInfo.setDocValuesGen should not be public.
 ---

 Key: LUCENE-5689
 URL: https://issues.apache.org/jira/browse/LUCENE-5689
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-5689.patch


 It's currently public and users can modify it. We made this class mostly 
 immutable long ago: remember, it's returned by the AtomicReader API!






[jira] [Commented] (LUCENE-5679) Consolidate IndexWriter.deleteDocuments()

2014-05-21 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004560#comment-14004560
 ] 

Shai Erera commented on LUCENE-5679:


I'm not sure how critical it is, Uwe. Yes, it means users need to recompile 
their app's code, but that is minor? It's not like they need to change the 
code, only recompile it. I am still waiting for someone to say that he upgrades 
his search app to a newer Lucene version by simply dropping in the new jar; 
4.9 already includes changes to runtime behavior and some back-compat changes.

 Consolidate IndexWriter.deleteDocuments()
 -

 Key: LUCENE-5679
 URL: https://issues.apache.org/jira/browse/LUCENE-5679
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5679.patch


 Spinoff from here: http://markmail.org/message/7kjlaizqdh7kst4d. We should 
 consolidate the various IW.deleteDocuments().






[jira] [Commented] (LUCENE-5688) NumericDocValues fields with sparse data can be compressed better

2014-05-21 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004562#comment-14004562
 ] 

Adrien Grand commented on LUCENE-5688:
--

+1 to using binary search on an in-memory {{MonotonicBlockPackedReader}} to 
implement sparse doc values. 

 NumericDocValues fields with sparse data can be compressed better 
 --

 Key: LUCENE-5688
 URL: https://issues.apache.org/jira/browse/LUCENE-5688
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Varun Thacker
Priority: Minor
 Attachments: LUCENE-5688.patch


 I ran into this problem where I had a dynamic field in Solr and indexed data 
 into lots of fields. For each field only a few documents had actual values, 
 and for the remaining documents the default value (0) got indexed. Now when I 
 merge segments, the index size jumps up.
 For example, I have 10 segments, each with 1 DV field. When I merge segments 
 into 1, that segment will contain all 10 DV fields with lots of 0s. 
 This was the motivation behind trying to come up with a compression for a use 
 case like this.
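The binary-search idea from the comment above can be sketched as follows: store only the (docID, value) pairs for documents that actually have a value, with the docIDs sorted, and binary-search at lookup time. A plain int[] stands in here for the compressed MonotonicBlockPackedReader; the class name is illustrative:

```java
import java.util.Arrays;

// Sketch of sparse numeric doc values: only documents with a real value
// are stored; every other document resolves to the missing (default) value.
class SparseNumericValues {
    private final int[] docs;    // sorted docIDs that have a value
    private final long[] values; // values[i] belongs to docs[i]
    private final long missingValue;

    SparseNumericValues(int[] docs, long[] values, long missingValue) {
        this.docs = docs;
        this.values = values;
        this.missingValue = missingValue;
    }

    long get(int docID) {
        // O(log n) lookup in the sorted docID array.
        int idx = Arrays.binarySearch(docs, docID);
        return idx >= 0 ? values[idx] : missingValue;
    }
}
```

For the merge scenario in the description, each of the 10 fields would then cost space proportional to its few real values rather than to the total document count.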






[jira] [Created] (SOLR-6102) The 'addreplica' Collection API does not support property params

2014-05-21 Thread Shalin Shekhar Mangar (JIRA)
Shalin Shekhar Mangar created SOLR-6102:
---

 Summary: The 'addreplica' Collection API does not support property 
params
 Key: SOLR-6102
 URL: https://issues.apache.org/jira/browse/SOLR-6102
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.8.1, 4.8
Reporter: Shalin Shekhar Mangar
 Fix For: 4.9, 5.0


All Collection APIs except 'addreplica' support passing core properties in the 
property.XXX format. Such property params are passed directly to the core admin 
APIs invoked by these collection APIs.

Not supporting these params is a bug and we should fix it.






[jira] [Commented] (LUCENE-5689) FieldInfo.setDocValuesGen should not be public.

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004614#comment-14004614
 ] 

ASF subversion and git services commented on LUCENE-5689:
-

Commit 1596553 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1596553 ]

LUCENE-5689: FieldInfo.setDocValuesGen should not be public

 FieldInfo.setDocValuesGen should not be public.
 ---

 Key: LUCENE-5689
 URL: https://issues.apache.org/jira/browse/LUCENE-5689
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-5689.patch


 It's currently public and users can modify it. We made this class mostly 
 immutable long ago: remember, it's returned by the AtomicReader API!






[jira] [Commented] (LUCENE-5689) FieldInfo.setDocValuesGen should not be public.

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004617#comment-14004617
 ] 

ASF subversion and git services commented on LUCENE-5689:
-

Commit 1596555 from [~rcmuir] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1596555 ]

LUCENE-5689: FieldInfo.setDocValuesGen should not be public

 FieldInfo.setDocValuesGen should not be public.
 ---

 Key: LUCENE-5689
 URL: https://issues.apache.org/jira/browse/LUCENE-5689
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-5689.patch


 It's currently public and users can modify it. We made this class mostly 
 immutable long ago: remember, it's returned by the AtomicReader API!






[jira] [Resolved] (LUCENE-5689) FieldInfo.setDocValuesGen should not be public.

2014-05-21 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-5689.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.9

 FieldInfo.setDocValuesGen should not be public.
 ---

 Key: LUCENE-5689
 URL: https://issues.apache.org/jira/browse/LUCENE-5689
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5689.patch


 It's currently public and users can modify it. We made this class mostly 
 immutable long ago: remember, it's returned by the AtomicReader API!






[jira] [Commented] (LUCENE-5636) SegmentCommitInfo continues to list unneeded gen'd files

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004644#comment-14004644
 ] 

ASF subversion and git services commented on LUCENE-5636:
-

Commit 1596570 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1596570 ]

LUCENE-5618, LUCENE-5636: write each DocValues update in a separate file; stop 
referencing old fieldInfos files

 SegmentCommitInfo continues to list unneeded gen'd files
 

 Key: LUCENE-5636
 URL: https://issues.apache.org/jira/browse/LUCENE-5636
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5636.patch


 I thought I handled it in LUCENE-5246, but turns out I didn't handle it 
 fully. I'll upload a patch which improves the test to expose the bug. I know 
 where it is, but I'm not sure how to fix it without breaking index 
 back-compat. Can we do that on experimental features?
 The problem is that if you update different fields in different gens, the 
 FieldInfos files of older gens remain referenced (still!!). I open a new 
 issue since LUCENE-5246 is already resolved and released, so don't want to 
 mess up our JIRA...
 The severity of the bug is that unneeded files are still referenced in the 
 index. Everything still works correctly, it's just that .fnm files are still 
 there. But as I wrote, I'm still not sure how to solve it without requiring 
 apps that use dv updates to reindex.






[jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004643#comment-14004643
 ] 

ASF subversion and git services commented on LUCENE-5618:
-

Commit 1596570 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1596570 ]

LUCENE-5618, LUCENE-5636: write each DocValues update in a separate file; stop 
referencing old fieldInfos files

 DocValues updates send wrong fieldinfos to codec producers
 --

 Key: LUCENE-5618
 URL: https://issues.apache.org/jira/browse/LUCENE-5618
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Shai Erera
Priority: Blocker
 Fix For: 4.9

 Attachments: LUCENE-5618.patch, LUCENE-5618.patch, LUCENE-5618.patch, 
 LUCENE-5618.patch, LUCENE-5618.patch


 Spinoff from LUCENE-5616.
 See the example there, docvalues readers get a fieldinfos, but it doesn't 
 contain the correct ones, so they have invalid field numbers at read time.
 This should really be fixed. Maybe a simple solution is to not write 
 batches of fields in updates but just have only one field per gen? 
 This removes many-many relationships and would make things easy to understand.






[jira] [Commented] (SOLR-6091) Race condition in prioritizeOverseerNodes can trigger extra QUIT operations

2014-05-21 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004671#comment-14004671
 ] 

Noble Paul commented on SOLR-6091:
--

[~mewmewball] I implemented this and I see the race condition happening in my 
cluster.

 Race condition in prioritizeOverseerNodes can trigger extra QUIT operations
 ---

 Key: SOLR-6091
 URL: https://issues.apache.org/jira/browse/SOLR-6091
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7, 4.8
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 4.9, 5.0

 Attachments: SOLR-6091.patch


 When using the overseer roles feature, there is a possibility of more than 
 one thread executing the prioritizeOverseerNodes method and extra QUIT 
 commands being inserted into the overseer queue.
 At a minimum, the prioritizeOverseerNodes should be synchronized to avoid a 
 race condition.






[jira] [Created] (SOLR-6103) Add DateRangeField

2014-05-21 Thread David Smiley (JIRA)
David Smiley created SOLR-6103:
--

 Summary: Add DateRangeField
 Key: SOLR-6103
 URL: https://issues.apache.org/jira/browse/SOLR-6103
 Project: Solr
  Issue Type: New Feature
  Components: spatial
Reporter: David Smiley
Assignee: David Smiley


LUCENE-5648 introduced a date range index & search capability in the spatial 
module. This issue is for a corresponding Solr FieldType to be named 
DateRangeField. LUCENE-5648 includes a parseCalendar(String) method that 
parses a superset of Solr's strict date format. It also parses partial dates 
(e.g. 2014-10 has month specificity); the trailing 'Z' is optional, a leading 
+/- may be present (minus indicates the BC era), and * means all-time. 
The proposed field type would use it to parse a string and also both ends of 
a range query; furthermore, it will allow an arbitrary range query of the 
form {{calspec TO calspec}} such as:
{noformat}2000 TO 2014-05-21T10{noformat}
which parses as the year 2000 through 2014 May 21st 10am (GMT). 
I suggest this syntax because it is aligned with Lucene's range query syntax. 
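The proposed {{calspec TO calspec}} syntax can be sketched mechanically. This is an illustrative helper, not Solr's DateRangeField or the LUCENE-5648 parseCalendar code; the class and method names are assumptions.

```java
// Sketch: split "calspec TO calspec" into its two endpoints. Each endpoint is
// a partial date like "2000" or "2014-05-21T10"; "*" means all-time. A bare
// calspec is treated as a range implied by its own specificity.
public class CalspecRange {
    public static String[] splitRange(String input) {
        int idx = input.indexOf(" TO ");
        if (idx < 0) {
            // e.g. "2014-10" by itself covers the whole month
            return new String[] { input, input };
        }
        String start = input.substring(0, idx).trim();
        String end = input.substring(idx + 4).trim();
        return new String[] { start, end };
    }
}
```

Each endpoint would then be handed to a calendar parser that honors the partial-date specificity rules described above.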






[jira] [Commented] (SOLR-6103) Add DateRangeField

2014-05-21 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004709#comment-14004709
 ] 

David Smiley commented on SOLR-6103:


It just occurred to me that {noformat}* TO 2014{noformat} ought to be supported 
but it doesn't work -- I'll fix that in LUCENE-5648.

Perhaps the range syntax should include matching '[' and ']'?  It's only 
pertinent for indexing ranges; at query time you might as well use the normal 
range query syntax.  One aspect I haven't considered is exclusive boundaries, 
but I think it's generally a non-issue because of the rounding this field 
supports.

Note that LUCENE-5648 is still only v5/trunk for the moment.

 Add DateRangeField
 --

 Key: SOLR-6103
 URL: https://issues.apache.org/jira/browse/SOLR-6103
 Project: Solr
  Issue Type: New Feature
  Components: spatial
Reporter: David Smiley
Assignee: David Smiley

 LUCENE-5648 introduced a date range index & search capability in the spatial 
 module. This issue is for a corresponding Solr FieldType to be named 
 DateRangeField. LUCENE-5648 includes a parseCalendar(String) method that 
 parses a superset of Solr's strict date format.  It also parses partial dates 
 (e.g.: 2014-10  has month specificity), and the trailing 'Z' is optional, and 
 a leading +/- may be present (minus indicates BC era), and * means 
 all-time.  The proposed field type would use it to parse a string and also 
 both ends of a range query, but furthermore it will also allow an arbitrary 
 range query of the form {{calspec TO calspec}} such as:
 {noformat}2000 TO 2014-05-21T10{noformat}
 Which parses as the year 2000 thru 2014 May 21st 10am (GMT). 
 I suggest this syntax because it is aligned with Lucene's range query syntax. 
  






[jira] [Commented] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004714#comment-14004714
 ] 

ASF subversion and git services commented on LUCENE-5618:
-

Commit 1596582 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1596582 ]

LUCENE-5618, LUCENE-5636: write each DocValues update in a separate file; stop 
referencing old fieldInfos files

 DocValues updates send wrong fieldinfos to codec producers
 --

 Key: LUCENE-5618
 URL: https://issues.apache.org/jira/browse/LUCENE-5618
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Shai Erera
Priority: Blocker
 Fix For: 4.9

 Attachments: LUCENE-5618.patch, LUCENE-5618.patch, LUCENE-5618.patch, 
 LUCENE-5618.patch, LUCENE-5618.patch


 Spinoff from LUCENE-5616.
 See the example there, docvalues readers get a fieldinfos, but it doesn't 
 contain the correct ones, so they have invalid field numbers at read time.
 This should really be fixed. Maybe a simple solution is to not write 
 batches of fields in updates but just have only one field per gen? 
 This removes many-many relationships and would make things easy to understand.






[jira] [Commented] (LUCENE-5636) SegmentCommitInfo continues to list unneeded gen'd files

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004715#comment-14004715
 ] 

ASF subversion and git services commented on LUCENE-5636:
-

Commit 1596582 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1596582 ]

LUCENE-5618, LUCENE-5636: write each DocValues update in a separate file; stop 
referencing old fieldInfos files

 SegmentCommitInfo continues to list unneeded gen'd files
 

 Key: LUCENE-5636
 URL: https://issues.apache.org/jira/browse/LUCENE-5636
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5636.patch


 I thought I handled it in LUCENE-5246, but turns out I didn't handle it 
 fully. I'll upload a patch which improves the test to expose the bug. I know 
 where it is, but I'm not sure how to fix it without breaking index 
 back-compat. Can we do that on experimental features?
 The problem is that if you update different fields in different gens, the 
 FieldInfos files of older gens remain referenced (still!!). I open a new 
 issue since LUCENE-5246 is already resolved and released, so don't want to 
 mess up our JIRA...
 The severity of the bug is that unneeded files are still referenced in the 
 index. Everything still works correctly, it's just that .fnm files are still 
 there. But as I wrote, I'm still not sure how to solve it without requiring 
 apps that use dv updates to reindex.






[jira] [Resolved] (LUCENE-5618) DocValues updates send wrong fieldinfos to codec producers

2014-05-21 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-5618.


   Resolution: Fixed
Lucene Fields: New,Patch Available  (was: New)

Committed to trunk and 4x.

 DocValues updates send wrong fieldinfos to codec producers
 --

 Key: LUCENE-5618
 URL: https://issues.apache.org/jira/browse/LUCENE-5618
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Shai Erera
Priority: Blocker
 Fix For: 4.9

 Attachments: LUCENE-5618.patch, LUCENE-5618.patch, LUCENE-5618.patch, 
 LUCENE-5618.patch, LUCENE-5618.patch


 Spinoff from LUCENE-5616.
 See the example there, docvalues readers get a fieldinfos, but it doesn't 
 contain the correct ones, so they have invalid field numbers at read time.
 This should really be fixed. Maybe a simple solution is to not write 
 batches of fields in updates but just have only one field per gen? 
 This removes many-many relationships and would make things easy to understand.






[jira] [Resolved] (LUCENE-5636) SegmentCommitInfo continues to list unneeded gen'd files

2014-05-21 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-5636.


   Resolution: Fixed
Fix Version/s: 5.0
   4.9

Fixed in LUCENE-5618

 SegmentCommitInfo continues to list unneeded gen'd files
 

 Key: LUCENE-5636
 URL: https://issues.apache.org/jira/browse/LUCENE-5636
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5636.patch


 I thought I handled it in LUCENE-5246, but turns out I didn't handle it 
 fully. I'll upload a patch which improves the test to expose the bug. I know 
 where it is, but I'm not sure how to fix it without breaking index 
 back-compat. Can we do that on experimental features?
 The problem is that if you update different fields in different gens, the 
 FieldInfos files of older gens remain referenced (still!!). I open a new 
 issue since LUCENE-5246 is already resolved and released, so don't want to 
 mess up our JIRA...
 The severity of the bug is that unneeded files are still referenced in the 
 index. Everything still works correctly, it's just that .fnm files are still 
 there. But as I wrote, I'm still not sure how to solve it without requiring 
 apps that use dv updates to reindex.






[jira] [Created] (SOLR-6104) The 'addreplica' Collection API does not support async parameter

2014-05-21 Thread Shalin Shekhar Mangar (JIRA)
Shalin Shekhar Mangar created SOLR-6104:
---

 Summary: The 'addreplica' Collection API does not support async 
parameter
 Key: SOLR-6104
 URL: https://issues.apache.org/jira/browse/SOLR-6104
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.8.1, 4.8
Reporter: Shalin Shekhar Mangar
 Fix For: 4.9, 5.0


The 'addreplica' API does not support an 'async' parameter which was added by 
SOLR-5477.






[jira] [Resolved] (SOLR-6102) The 'addreplica' Collection API does not support property params

2014-05-21 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-6102.
-

Resolution: Invalid

Oops, looks like I was too quick in opening this issue. The 'addreplica' API 
does support setting core properties but it is not documented in the Solr 
reference guide.

 The 'addreplica' Collection API does not support property params
 

 Key: SOLR-6102
 URL: https://issues.apache.org/jira/browse/SOLR-6102
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.8, 4.8.1
Reporter: Shalin Shekhar Mangar
 Fix For: 4.9, 5.0


 All Collection APIs except 'addreplica' support passing core properties in 
 the property.XXX format. Such property params are passed directly to the core 
 admin APIs invoked by these collection APIs.
 Not supporting these params is a bug and we should fix it.






[jira] [Commented] (LUCENE-5683) Improve SegmentReader.getXXXDocValues

2014-05-21 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004742#comment-14004742
 ] 

Shai Erera commented on LUCENE-5683:


I implemented it, and many tests fail in CheckIndex with a ClassCastException. 
This is the current code:

{code}
FieldInfo fi = getDVField(field, DocValuesType.BINARY);
if (fi == null) {
  return null;
}

Map<String,Object> dvFields = docValuesLocal.get();
BinaryDocValues dvs = (BinaryDocValues) dvFields.get(field);
if (dvs == null) {
  // initialize
  ...
}
{code}

And I changed it so that the FieldInfo part is inside the {{if}} (lazy 
initialization). The reason for the ClassCastException is that if you previously 
asked for a NUMERIC field with the same name, it got into the map, and the code 
then happily tries to cast it to a NumericDocValues or BinaryDocValues and hits 
the exception.

So I'm not sure this optimization is right ... and I'm also not sure it's worth 
complicating the code with e.g. instanceof checks.
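The single-lookup-with-instanceof alternative mentioned above can be sketched like this. All names here are illustrative stand-ins, not the actual SegmentReader/FieldInfos code; the real version would consult getDVField on a cache miss.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: consult the per-thread cache first (one hash lookup in the common
// case), and use instanceof to avoid the ClassCastException when a NUMERIC
// entry shares the field name with a BINARY request.
public class DvCacheSketch {
    interface DocValues {}
    static class NumericDV implements DocValues {}
    static class BinaryDV implements DocValues {}

    private final Map<String, DocValues> dvFields = new HashMap<>();

    public BinaryDV getBinary(String field) {
        DocValues cached = dvFields.get(field);
        if (cached != null) {
            // type guard instead of a blind cast
            return (cached instanceof BinaryDV) ? (BinaryDV) cached : null;
        }
        // Cache miss: the real code would check FieldInfos (getDVField) here
        // before initializing; this sketch simply initializes.
        BinaryDV dvs = new BinaryDV();
        dvFields.put(field, dvs);
        return dvs;
    }

    public void putNumeric(String field) {
        dvFields.put(field, new NumericDV());
    }
}
```

Whether this guard is worth the extra branch is exactly the trade-off questioned above.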

 Improve SegmentReader.getXXXDocValues
 -

 Key: LUCENE-5683
 URL: https://issues.apache.org/jira/browse/LUCENE-5683
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Shai Erera
Assignee: Shai Erera

 Today we do two hash lookups, where in most cases a single one is enough. 
 E.g. SR.getNumericDocValues initializes the FieldInfo (first lookup in 
 FieldInfos), however if that field was already initialized, we can simply 
 check dvFields.get(). This can be improved in all getXXXDocValues as well as 
 getDocsWithField.






[jira] [Closed] (SOLR-5648) SolrCore#getStatistics() should nest open searchers' stats

2014-05-21 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan closed SOLR-5648.
-

Resolution: Invalid

bq. 1) I'm not sure i really understand what this adds – isn't every registered 
searcher (which should include every open searcher if there are more than one) 
already listed in the infoRegistry (so its stats are surfaced in /admin/mbeans 
and via JMX)?

you're right! that's much better.

 SolrCore#getStatistics() should nest open searchers' stats
 --

 Key: SOLR-5648
 URL: https://issues.apache.org/jira/browse/SOLR-5648
 Project: Solr
  Issue Type: Task
Reporter: Shikhar Bhushan
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: SOLR-5648.patch, oldestSearcherStaleness.gif, 
 openSearchers.gif


 {{SolrIndexSearcher}} leaks are a notable cause of garbage collection issues 
 in codebases with custom components.
 So it is useful to be able to access monitoring information about what 
 searchers are currently open, and in turn access their stats e.g. 
 {{openedAt}}.
 This can be nested via {{SolrCore#getStatistics()}} which has a 
 {{_searchers}} collection of all open searchers.






[jira] [Commented] (LUCENE-5675) ID postings format

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004774#comment-14004774
 ] 

ASF subversion and git services commented on LUCENE-5675:
-

Commit 1596599 from [~mikemccand] in branch 'dev/branches/lucene5675'
[ https://svn.apache.org/r1596599 ]

LUCENE-5675: go back to sending deleted docs to PostingsFormat on flush; move 
'skip deleted docs' into IDVPF

 ID postings format
 

 Key: LUCENE-5675
 URL: https://issues.apache.org/jira/browse/LUCENE-5675
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Robert Muir

 Today the primary key lookup in lucene is not that great for systems like 
 solr and elasticsearch that have versioning in front of IndexWriter.
 To some extent BlockTree can sometimes help avoid seeks by telling you the 
 term does not exist for a segment. But this technique (based on FST prefix) 
 is fragile. The only other choice today is bloom filters, which use up huge 
 amounts of memory.
 I don't think we are using everything we know: particularly the version 
 semantics.
 Instead, if the FST for the terms index used an algebra that represents the 
 max version for any subtree, we might be able to answer that there is no term 
 T with version > V in that segment very efficiently.
 Also ID fields don't need postings lists, and they don't need stats like 
 docfreq/totaltermfreq, etc.; this stuff is all implicit. 
 As far as API, I think for users to provide IDs with versions to such a PF, 
 a start would be to set a payload or whatever on the term field to get it 
 through IndexWriter to the codec. And a consumer of the codec can just cast 
 the Terms to a subclass that exposes the FST to do this version check 
 efficiently.
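The max-version-per-subtree idea can be illustrated with a toy trie. This is not Lucene's FST API; every name below is an assumption made purely for demonstration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch: annotate every subtree of the term index with the max version of
// any ID below it, so a lookup can cheaply answer "no term with version > V
// under this prefix" with a single comparison.
public class VersionedTrie {
    static class Node {
        long maxVersion = Long.MIN_VALUE;
        final Map<Character, Node> children = new HashMap<>();
    }

    private final Node root = new Node();

    public void add(String id, long version) {
        Node n = root;
        n.maxVersion = Math.max(n.maxVersion, version);
        for (char c : id.toCharArray()) {
            n = n.children.computeIfAbsent(c, k -> new Node());
            n.maxVersion = Math.max(n.maxVersion, version);
        }
    }

    /** True only if some ID under this prefix might carry a version > v. */
    public boolean mayContainNewer(String prefix, long v) {
        Node n = root;
        for (char c : prefix.toCharArray()) {
            n = n.children.get(c);
            if (n == null) {
                return false; // prefix absent: no such term at all
            }
        }
        return n.maxVersion > v; // prune the whole subtree in one comparison
    }
}
```

A versioned-ID lookup would consult mayContainNewer before descending, skipping segments (or subtrees) whose max version is already at or below the supplied version.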






[jira] [Commented] (LUCENE-5675) ID postings format

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004789#comment-14004789
 ] 

ASF subversion and git services commented on LUCENE-5675:
-

Commit 1596602 from [~mikemccand] in branch 'dev/branches/lucene5675'
[ https://svn.apache.org/r1596602 ]

LUCENE-5675: finish reverting 'do not send deleted docs to PostingsFormat on 
flush'

 ID postings format
 

 Key: LUCENE-5675
 URL: https://issues.apache.org/jira/browse/LUCENE-5675
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Robert Muir

 Today the primary key lookup in lucene is not that great for systems like 
 solr and elasticsearch that have versioning in front of IndexWriter.
 To some extent BlockTree can sometimes help avoid seeks by telling you the 
 term does not exist for a segment. But this technique (based on FST prefix) 
 is fragile. The only other choice today is bloom filters, which use up huge 
 amounts of memory.
 I don't think we are using everything we know: particularly the version 
 semantics.
 Instead, if the FST for the terms index used an algebra that represents the 
 max version for any subtree, we might be able to answer that there is no term 
 T with version > V in that segment very efficiently.
 Also ID fields don't need postings lists, and they don't need stats like 
 docfreq/totaltermfreq, etc.; this stuff is all implicit. 
 As far as API, I think for users to provide IDs with versions to such a PF, 
 a start would be to set a payload or whatever on the term field to get it 
 through IndexWriter to the codec. And a consumer of the codec can just cast 
 the Terms to a subclass that exposes the FST to do this version check 
 efficiently.






[jira] [Commented] (SOLR-6105) DebugComponent NPE when single-pass distributed search is used

2014-05-21 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004796#comment-14004796
 ] 

Shikhar Bhushan commented on SOLR-6105:
---

paging [~shalinmangar] in case you have any idea what might be going on

 DebugComponent NPE when single-pass distributed search is used
 --

 Key: SOLR-6105
 URL: https://issues.apache.org/jira/browse/SOLR-6105
 Project: Solr
  Issue Type: Bug
Reporter: Shikhar Bhushan
Priority: Minor

 I'm seeing NPEs in {{DebugComponent}} with debugQuery=true when just ID & 
 score are requested, which enables the single-pass distributed search 
 optimization from SOLR-1880.
 The NPE originates on this line in DebugComponent.finishStage():
 {noformat}
 int idx = sdoc.positionInResponse;
 {noformat}
 indicating an ID that is in the explain but missing in the resultIds.
 I'm afraid I haven't been able to reproduce this in 
 {{DistributedQueryComponentOptimizationTest}}, but wanted to open this ticket 
 in any case.






[jira] [Created] (SOLR-6105) DebugComponent NPE when single-pass distributed search is used

2014-05-21 Thread Shikhar Bhushan (JIRA)
Shikhar Bhushan created SOLR-6105:
-

 Summary: DebugComponent NPE when single-pass distributed search is 
used
 Key: SOLR-6105
 URL: https://issues.apache.org/jira/browse/SOLR-6105
 Project: Solr
  Issue Type: Bug
Reporter: Shikhar Bhushan
Priority: Minor


I'm seeing NPEs in {{DebugComponent}} with debugQuery=true when just ID & 
score are requested, which enables the single-pass distributed search 
optimization from SOLR-1880.

The NPE originates on this line in DebugComponent.finishStage():

{noformat}
int idx = sdoc.positionInResponse;
{noformat}

indicating an ID that is in the explain but missing in the resultIds.

I'm afraid I haven't been able to reproduce this in 
{{DistributedQueryComponentOptimizationTest}}, but wanted to open this ticket 
in any case.
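A defensive guard for the NPE described above might look like the sketch below. ShardDoc and positionInResponse echo names from the report, but the surrounding method is illustrative, not the actual DebugComponent.finishStage() code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: tolerate an id that appears in the explain section but is missing
// from resultIds, instead of dereferencing a null ShardDoc.
public class DebugGuard {
    static class ShardDoc {
        int positionInResponse;
    }

    /** Position of id in the response, or -1 if the id never made it into resultIds. */
    public static int positionOf(Map<String, ShardDoc> resultIds, String id) {
        ShardDoc sdoc = resultIds.get(id);
        if (sdoc == null) {
            // This is exactly the case that triggered the NPE at
            // 'sdoc.positionInResponse' in the report.
            return -1;
        }
        return sdoc.positionInResponse;
    }

    /** Tiny helper to build a one-entry resultIds map for demonstration. */
    public static Map<String, ShardDoc> singleton(String id, int pos) {
        ShardDoc d = new ShardDoc();
        d.positionInResponse = pos;
        Map<String, ShardDoc> m = new HashMap<>();
        m.put(id, d);
        return m;
    }
}
```

A guard like this would mask the symptom; the underlying question of why the id is absent from resultIds under the SOLR-1880 optimization would still need answering.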






[jira] [Created] (LUCENE-5693) don't write deleted documents on flush

2014-05-21 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-5693:
--

 Summary: don't write deleted documents on flush
 Key: LUCENE-5693
 URL: https://issues.apache.org/jira/browse/LUCENE-5693
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless


When we flush a new segment, sometimes some documents are born deleted, e.g. 
if the app did an IW.deleteDocuments that matched some not-yet-flushed documents.

We already compute the liveDocs on flush, but then we continue (wastefully) to 
send those known-deleted documents to all Codec parts.

I started to implement this on LUCENE-5675 but it was too controversial.

Also, I expect typically the number of deleted docs is 0, or small, so not 
writing born deleted docs won't be much of a win for most apps.  Still it 
seems silly to write them, consuming IO/CPU in the process, only to consume 
more IO/CPU later for merging to re-delete them.






[jira] [Commented] (SOLR-6105) DebugComponent NPE when single-pass distributed search is used

2014-05-21 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004799#comment-14004799
 ] 

Shikhar Bhushan commented on SOLR-6105:
---

also paging [~vzhovtiuk] - presumably you're using this feature in your app. 
does debugQuery=true work ok for you?

 DebugComponent NPE when single-pass distributed search is used
 --

 Key: SOLR-6105
 URL: https://issues.apache.org/jira/browse/SOLR-6105
 Project: Solr
  Issue Type: Bug
Reporter: Shikhar Bhushan
Priority: Minor

 I'm seeing NPEs in {{DebugComponent}} with debugQuery=true when just ID & 
 score are requested, which enables the single-pass distributed search 
 optimization from SOLR-1880.
 The NPE originates on this line in DebugComponent.finishStage():
 {noformat}
 int idx = sdoc.positionInResponse;
 {noformat}
 indicating an ID that is in the explain but missing in the resultIds.
 I'm afraid I haven't been able to reproduce this in 
 {{DistributedQueryComponentOptimizationTest}}, but wanted to open this ticket 
 in any case.






[jira] [Commented] (LUCENE-4236) clean up booleanquery conjunction optimizations a bit

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004826#comment-14004826
 ] 

ASF subversion and git services commented on LUCENE-4236:
-

Commit 1596606 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1596606 ]

LUCENE-4236: add a new test for crazy corner cases of coord() handling

 clean up booleanquery conjunction optimizations a bit
 -

 Key: LUCENE-4236
 URL: https://issues.apache.org/jira/browse/LUCENE-4236
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch, 
 LUCENE-4236.patch, LUCENE-4236.patch


 After LUCENE-3505, I want to do a slight cleanup:
 * compute the term conjunctions optimization in scorer(), so it's applied even 
 if we have optional and prohibited clauses that don't exist in the segment 
 (e.g. return null)
 * use the term conjunctions optimization when optional.size() == 
 minShouldMatch, as that means they are all mandatory, too.
 * don't return booleanscorer1 when optional.size() == minShouldMatch, because 
 it means we have required clauses and in general BS2 should do a much better 
 job (e.g. use advance).
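The equivalence in the second bullet can be sketched concretely. This is illustrative, not the actual BooleanQuery/BooleanWeight code; the clause lists are plain strings and the method name is an assumption.

```java
// Sketch: when minShouldMatch equals the number of optional clauses, every
// optional clause must match, so they are all effectively required and can be
// folded into the required list (enabling the term-conjunction optimization).
public class MsmRewrite {
    public static String[] effectiveRequired(String[] required, String[] optional,
                                             int minShouldMatch) {
        if (optional.length == 0 || optional.length != minShouldMatch) {
            return required; // optional clauses stay optional
        }
        // all optional clauses are mandatory: fold them in
        String[] all = new String[required.length + optional.length];
        System.arraycopy(required, 0, all, 0, required.length);
        System.arraycopy(optional, 0, all, required.length, optional.length);
        return all;
    }
}
```

After this fold, the scorer sees only required clauses, which is the situation where a pure conjunction (advance-based) scorer applies.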






[jira] [Commented] (LUCENE-5693) don't write deleted documents on flush

2014-05-21 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004821#comment-14004821
 ] 

Robert Muir commented on LUCENE-5693:
-

This only makes sense for postings though.

How can we avoid writing deleted documents in:
* stored fields and term vectors (which we arent flushing)
* docvalues (we would need to remap ordinals)

By writing them some places and not writing them other places, we open the 
possibility of extremely confusing corner cases and bugs.

 don't write deleted documents on flush
 --

 Key: LUCENE-5693
 URL: https://issues.apache.org/jira/browse/LUCENE-5693
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless

 When we flush a new segment, sometimes some documents are born deleted, 
 e.g. if the app did an IW.deleteDocuments that matched some not-yet-flushed 
 documents.
 We already compute the liveDocs on flush, but then we continue (wastefully) 
 to send those known-deleted documents to all Codec parts.
 I started to implement this on LUCENE-5675 but it was too controversial.
 Also, I expect typically the number of deleted docs is 0, or small, so not 
 writing born deleted docs won't be much of a win for most apps.  Still it 
 seems silly to write them, consuming IO/CPU in the process, only to consume 
 more IO/CPU later for merging to re-delete them.






[jira] [Commented] (LUCENE-4236) clean up booleanquery conjunction optimizations a bit

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004828#comment-14004828
 ] 

ASF subversion and git services commented on LUCENE-4236:
-

Commit 1596607 from [~rcmuir] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1596607 ]

LUCENE-4236: add a new test for crazy corner cases of coord() handling

 clean up booleanquery conjunction optimizations a bit
 -

 Key: LUCENE-4236
 URL: https://issues.apache.org/jira/browse/LUCENE-4236
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch, 
 LUCENE-4236.patch, LUCENE-4236.patch


 After LUCENE-3505, I want to do a slight cleanup:
 * compute the term conjunctions optimization in scorer(), so it's applied even 
 if we have optional and prohibited clauses that don't exist in the segment 
 (e.g. return null)
 * use the term conjunctions optimization when optional.size() == 
 minShouldMatch, as that means they are all mandatory, too.
 * don't return booleanscorer1 when optional.size() == minShouldMatch, because 
 it means we have required clauses and in general BS2 should do a much better 
 job (e.g. use advance).






[jira] [Commented] (LUCENE-5693) don't write deleted documents on flush

2014-05-21 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004835#comment-14004835
 ] 

Shai Erera commented on LUCENE-5693:


Today we apply the deletes (update the bitset) when a Reader is being 
requested. At that point, we have a SegmentReader at hand and we can resolve 
the delete-by-Term/Query to the actual doc IDs ... how would we do that while 
the segment is flushed? How do we know which documents were associated with 
{{Term t}} when it was sent as a delete?

When I worked on LUCENE-5189 (NumericDocValues updates), I had the same thought 
-- why flush the original numeric value when the document has already been 
updated? But I had the same issue: which documents were affected by the update 
Term?

 don't write deleted documents on flush
 --

 Key: LUCENE-5693
 URL: https://issues.apache.org/jira/browse/LUCENE-5693
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless

 When we flush a new segment, sometimes some documents are born deleted, 
 e.g. if the app did an IW.deleteDocuments that matched some not-yet-flushed 
 documents.
 We already compute the liveDocs on flush, but then we continue (wastefully) 
 to send those known-deleted documents to all Codec parts.
 I started to implement this on LUCENE-5675 but it was too controversial.
 Also, I expect typically the number of deleted docs is 0, or small, so not 
 writing born deleted docs won't be much of a win for most apps.  Still it 
 seems silly to write them, consuming IO/CPU in the process, only to consume 
 more IO/CPU later for merging to re-delete them.






[jira] [Updated] (LUCENE-4236) clean up booleanquery conjunction optimizations a bit

2014-05-21 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4236:


Attachment: LUCENE-4236.patch

Here's the patch. I think it's ready.

I committed the new test already to trunk/4.x.

 clean up booleanquery conjunction optimizations a bit
 -

 Key: LUCENE-4236
 URL: https://issues.apache.org/jira/browse/LUCENE-4236
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch, 
 LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch


 After LUCENE-3505, I want to do a slight cleanup:
 * compute the term conjunctions optimization in scorer(), so it's applied even 
 if we have optional and prohibited clauses that don't exist in the segment 
 (e.g. return null)
 * use the term conjunctions optimization when optional.size() == 
 minShouldMatch, as that means they are all mandatory, too.
 * don't return booleanscorer1 when optional.size() == minShouldMatch, because 
 it means we have required clauses and in general BS2 should do a much better 
 job (e.g. use advance).






[jira] [Commented] (SOLR-5285) Solr response format should support child Docs

2014-05-21 Thread Arcadius Ahouansou (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004850#comment-14004850
 ] 

Arcadius Ahouansou commented on SOLR-5285:
--

Thanks [~varunthacker]  and all for the great work.
 [~hossman] Any chance this will get into 4.9?

Thanks.

 Solr response format should support child Docs
 --

 Key: SOLR-5285
 URL: https://issues.apache.org/jira/browse/SOLR-5285
 Project: Solr
  Issue Type: New Feature
Reporter: Varun Thacker
 Fix For: 4.9, 5.0

 Attachments: SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 SOLR-5285.patch, SOLR-5285.patch, javabin_backcompat_child_docs.bin


 Solr has added support for taking childDocs as input (only XML till now). 
 It's currently used for BlockJoinQuery. 
 I feel that if a user indexes a document with child docs, even if he isn't 
 using the BJQ features and is just searching, and a search results in a hit 
 on the parentDoc, its childDocs should be returned in the response format.
 [~hossman_luc...@fucit.org] on IRC suggested that the DocTransformers would 
 be the place to add childDocs to the response.
 Now, given a docId, one needs to find out all the childDoc ids. A couple of 
 approaches which I could think of are: 
 1. Maintain the relation between a parentDoc and its childDocs at 
 indexing time, in maybe a separate index?
 2. Somehow emulate what happens in ToParentBlockJoinQuery.nextDoc() - given a 
 parentDoc it finds out all the childDocs, but this requires a childScorer.
 Am I missing something obvious on how to find the relation between a 
 parentDoc and its childDocs? Because none of the above solutions 
 look right.
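For context, block indexing already encodes this relation positionally: a parent's children occupy the contiguous docID range between the previous parent and the parent itself. A self-contained sketch of that recovery (BlockChildren is invented for illustration; it is not Solr or Lucene code):

```java
import java.util.*;

// Illustrative sketch: with Lucene block indexing, child docs are written
// immediately before their parent, so the children of a parent are exactly
// the docIDs between the previous parent and this parent.
public class BlockChildren {
    static List<Integer> childrenOf(int parentDoc, SortedSet<Integer> parentDocs) {
        SortedSet<Integer> before = parentDocs.headSet(parentDoc);
        int prevParent = before.isEmpty() ? -1 : before.last();
        List<Integer> children = new ArrayList<>();
        for (int d = prevParent + 1; d < parentDoc; d++) {
            children.add(d); // every doc between the two parents is a child
        }
        return children;
    }
}
```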






[jira] [Updated] (SOLR-6086) Replica active during Warming

2014-05-21 Thread ludovic Boutros (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ludovic Boutros updated SOLR-6086:
--

Attachment: SOLR-6086.patch

I checked the differences in the logs and in the code.

The problem occurs when:
- a node is restarted 
- Peer Sync failed (no /get handler for instance; should it become mandatory?)
- the node is already synced (nothing to replicate)

or:

- a node is restarted and it is the leader (I do not know if it only happens 
with a lone leader...)
- the node is already synced (nothing to replicate)

For the first case,

I think this is a side effect of the modification in SOLR-4965. 

If Peer Sync is successful, an explicit commit is called in the code, with a 
comment which says:

{code:title=RecoveryStrategy.java|borderStyle=solid}
// force open a new searcher
core.getUpdateHandler().commit(new CommitUpdateCommand(req, false));
{code}

This is not the case if Peer Sync failed.
Just adding this line is enough to correct this issue.

Here is a patch with a test which reproduces the problem and the correction (to 
be applied to branch 4x).

I am working on the second case.
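To make the failure mode concrete, here is a tiny self-contained model (not Solr code; ReplicaWarmup and its fields are invented for illustration) of why a recovery path that skips the force-open-searcher commit leaves an ACTIVE replica that cannot actually answer queries:

```java
// Illustrative model, NOT Solr's RecoveryStrategy: a replica flips to ACTIVE
// at the end of recovery regardless, but it can only answer queries once a
// searcher is registered. Skipping the "force open a new searcher" commit on
// the failed-PeerSync path leaves it ACTIVE but unqueryable.
public class ReplicaWarmup {
    boolean active = false;
    boolean searcherRegistered = false;

    void finishRecovery(boolean forceOpenSearcher) {
        if (forceOpenSearcher) {
            // analogous to commit(new CommitUpdateCommand(req, false)) opening a searcher
            searcherRegistered = true;
        }
        active = true; // cluster state reports ACTIVE either way
    }

    boolean canServeQueries() {
        return active && searcherRegistered;
    }
}
```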

 Replica active during Warming
 -

 Key: SOLR-6086
 URL: https://issues.apache.org/jira/browse/SOLR-6086
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1, 4.8.1
Reporter: ludovic Boutros
 Attachments: SOLR-6086.patch


 At least with Solr 4.6.1, replicas are considered active during the warming 
 process.
 This means that if you restart a replica or create a new one, queries will 
 be sent to this replica and will hang until the end of the warming 
 process (if cold searchers are not used).
 You cannot add or restart a node silently anymore.
 I think that the fact that the replica is active is not a bad thing.
 But the HttpShardHandler and the CloudSolrServer class should take the 
 warming process into account.
 Currently, I have developed a new, very simple component which checks that a 
 searcher is registered.
 I am also developing custom HttpShardHandler and CloudSolrServer classes 
 which will check the warming process in addition to the ACTIVE status in the 
 cluster state.
 This seems to be more a workaround than a solution, but that's all I can do in 
 this version.






[jira] [Updated] (SOLR-6086) Replica active during Warming

2014-05-21 Thread ludovic Boutros (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ludovic Boutros updated SOLR-6086:
--

Affects Version/s: 4.8.1

 Replica active during Warming
 -

 Key: SOLR-6086
 URL: https://issues.apache.org/jira/browse/SOLR-6086
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1, 4.8.1
Reporter: ludovic Boutros
 Attachments: SOLR-6086.patch


 At least with Solr 4.6.1, replicas are considered active during the warming 
 process.
 This means that if you restart a replica or create a new one, queries will 
 be sent to this replica and will hang until the end of the warming 
 process (if cold searchers are not used).
 You cannot add or restart a node silently anymore.
 I think that the fact that the replica is active is not a bad thing.
 But the HttpShardHandler and the CloudSolrServer class should take the 
 warming process into account.
 Currently, I have developed a new, very simple component which checks that a 
 searcher is registered.
 I am also developing custom HttpShardHandler and CloudSolrServer classes 
 which will check the warming process in addition to the ACTIVE status in the 
 cluster state.
 This seems to be more a workaround than a solution, but that's all I can do in 
 this version.






[jira] [Comment Edited] (SOLR-6086) Replica active during Warming

2014-05-21 Thread ludovic Boutros (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004856#comment-14004856
 ] 

ludovic Boutros edited comment on SOLR-6086 at 5/21/14 4:23 PM:


I checked the differences in the logs and in the code.

The problem occurs when:
- a node is restarted 
- Peer Sync failed (no /get handler for instance; should it become mandatory?)
- the node is already synced (nothing to replicate)

or:

- a node is restarted and it is the leader (I do not know if it only happens 
with a lone leader...)
- the node is already synced (nothing to replicate)

For the first case,

I think this is a side effect of the modification in SOLR-4965. 

If Peer Sync is successful, an explicit commit is called in the code, with a 
comment which says:

{code:title=RecoveryStrategy.java|borderStyle=solid}
// force open a new searcher
core.getUpdateHandler().commit(new CommitUpdateCommand(req, false));
{code}

This is not the case if Peer Sync failed.
Just adding this line is enough to correct this issue.

Here is a patch with a test which reproduces the problem and the correction (to 
be applied to the branch 4x).

I am working on the second case.


was (Author: lboutros):
I checked the differences in the logs and in the code.

The problem occurs when:
- a node is restarted 
- Peer Sync failed (no /get handler for instance; should it become mandatory?)
- the node is already synced (nothing to replicate)

or:

- a node is restarted and it is the leader (I do not know if it only happens 
with a lone leader...)
- the node is already synced (nothing to replicate)

For the first case,

I think this is a side effect of the modification in SOLR-4965. 

If Peer Sync is successful, an explicit commit is called in the code, with a 
comment which says:

{code:title=RecoveryStrategy.java|borderStyle=solid}
// force open a new searcher
core.getUpdateHandler().commit(new CommitUpdateCommand(req, false));
{code}

This is not the case if Peer Sync failed.
Just adding this line is enough to correct this issue.

Here is a patch with a test which reproduces the problem and the correction (to 
be applied to branch 4x).

I am working on the second case.

 Replica active during Warming
 -

 Key: SOLR-6086
 URL: https://issues.apache.org/jira/browse/SOLR-6086
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1, 4.8.1
Reporter: ludovic Boutros
 Attachments: SOLR-6086.patch


 At least with Solr 4.6.1, replicas are considered active during the warming 
 process.
 This means that if you restart a replica or create a new one, queries will 
 be sent to this replica and will hang until the end of the warming 
 process (if cold searchers are not used).
 You cannot add or restart a node silently anymore.
 I think that the fact that the replica is active is not a bad thing.
 But the HttpShardHandler and the CloudSolrServer class should take the 
 warming process into account.
 Currently, I have developed a new, very simple component which checks that a 
 searcher is registered.
 I am also developing custom HttpShardHandler and CloudSolrServer classes 
 which will check the warming process in addition to the ACTIVE status in the 
 cluster state.
 This seems to be more a workaround than a solution, but that's all I can do in 
 this version.






[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1146: POMs out of sync

2014-05-21 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1146/

1 tests failed.
FAILED:  org.apache.solr.cloud.HttpPartitionTest.testDistribSearch

Error Message:
No registered leader was found after waiting for 6ms , collection: 
c8n_1x3_lf slice: shard1

Stack Trace:
org.apache.solr.common.SolrException: No registered leader was found after 
waiting for 6ms , collection: c8n_1x3_lf slice: shard1
at 
__randomizedtesting.SeedInfo.seed([7A06522654ACE583:FBE0DC3E23F385BF]:0)
at 
org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:567)
at 
org.apache.solr.cloud.HttpPartitionTest.testRf3WithLeaderFailover(HttpPartitionTest.java:348)
at 
org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:148)




Build Log:
[...truncated 54769 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:490: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:182: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/extra-targets.xml:77:
 Java returned: 1

Total time: 191 minutes 17 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Commented] (LUCENE-4236) clean up booleanquery conjunction optimizations a bit

2014-05-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004889#comment-14004889
 ] 

Michael McCandless commented on LUCENE-4236:


+1, this is a great cleanup: more understandable than what we have today.

Maybe we should leave FilterScorer package-private until there's a need for it 
to be public?

 clean up booleanquery conjunction optimizations a bit
 -

 Key: LUCENE-4236
 URL: https://issues.apache.org/jira/browse/LUCENE-4236
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch, 
 LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch


 After LUCENE-3505, I want to do a slight cleanup:
 * compute the term conjunctions optimization in scorer(), so it's applied even 
 if we have optional and prohibited clauses that don't exist in the segment 
 (e.g. return null)
 * use the term conjunctions optimization when optional.size() == 
 minShouldMatch, as that means they are all mandatory, too.
 * don't return booleanscorer1 when optional.size() == minShouldMatch, because 
 it means we have required clauses and in general BS2 should do a much better 
 job (e.g. use advance).






[jira] [Commented] (LUCENE-4236) clean up booleanquery conjunction optimizations a bit

2014-05-21 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004890#comment-14004890
 ] 

Robert Muir commented on LUCENE-4236:
-

Good idea. If we have a need somewhere else we can open it up.

 clean up booleanquery conjunction optimizations a bit
 -

 Key: LUCENE-4236
 URL: https://issues.apache.org/jira/browse/LUCENE-4236
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch, 
 LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch


 After LUCENE-3505, I want to do a slight cleanup:
 * compute the term conjunctions optimization in scorer(), so it's applied even 
 if we have optional and prohibited clauses that don't exist in the segment 
 (e.g. return null)
 * use the term conjunctions optimization when optional.size() == 
 minShouldMatch, as that means they are all mandatory, too.
 * don't return booleanscorer1 when optional.size() == minShouldMatch, because 
 it means we have required clauses and in general BS2 should do a much better 
 job (e.g. use advance).






[jira] [Commented] (LUCENE-5693) don't write deleted documents on flush

2014-05-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004902#comment-14004902
 ] 

Michael McCandless commented on LUCENE-5693:


bq. how would we do that while the segment is flushed?

We do it in FreqProxTermsWriter.applyDeletes; since we know the terms to be 
deleted, and we have the BytesRefHash, it's easy.

 don't write deleted documents on flush
 --

 Key: LUCENE-5693
 URL: https://issues.apache.org/jira/browse/LUCENE-5693
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless

 When we flush a new segment, sometimes some documents are born deleted, 
 e.g. if the app did an IW.deleteDocuments that matched some not-yet-flushed 
 documents.
 We already compute the liveDocs on flush, but then we continue (wastefully) 
 to send those known-deleted documents to all Codec parts.
 I started to implement this on LUCENE-5675 but it was too controversial.
 Also, I expect typically the number of deleted docs is 0, or small, so not 
 writing born deleted docs won't be much of a win for most apps.  Still it 
 seems silly to write them, consuming IO/CPU in the process, only to consume 
 more IO/CPU later for merging to re-delete them.






[jira] [Commented] (LUCENE-5693) don't write deleted documents on flush

2014-05-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004906#comment-14004906
 ] 

Michael McCandless commented on LUCENE-5693:


bq. This only makes sense for postings though.

Right, postings is much easier than doc values.  But postings are also the most 
costly to merge.

bq. By writing them some places and not writing them other places, we open the 
possibility of extremely confusing corner cases and bugs.

I disagree: I think we'd just discover places that rely on deleted-docs 
behavior, i.e. test bugs.  When I did this on LUCENE-5675 there were only a few 
places that relied on deleted docs.

 don't write deleted documents on flush
 --

 Key: LUCENE-5693
 URL: https://issues.apache.org/jira/browse/LUCENE-5693
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless

 When we flush a new segment, sometimes some documents are born deleted, 
 e.g. if the app did an IW.deleteDocuments that matched some not-yet-flushed 
 documents.
 We already compute the liveDocs on flush, but then we continue (wastefully) 
 to send those known-deleted documents to all Codec parts.
 I started to implement this on LUCENE-5675 but it was too controversial.
 Also, I expect typically the number of deleted docs is 0, or small, so not 
 writing born deleted docs won't be much of a win for most apps.  Still it 
 seems silly to write them, consuming IO/CPU in the process, only to consume 
 more IO/CPU later for merging to re-delete them.






[jira] [Updated] (SOLR-6101) Shard splitting doesn't work in legacyCloud=false mode

2014-05-21 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-6101:


Attachment: SOLR-6101.patch

Changes:
# ShardSplitTest switches to using legacyCloud=false randomly
# Shard splitting uses the addReplica API to create replicas instead of using the 
core admin create API directly. I had to introduce a wait loop for the sub-shard 
to be created by the overseer before we can call addReplica.
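The wait loop mentioned in point 2 can be sketched generically like this (WaitLoop is a hypothetical helper, not Solr's actual API; in the patch the condition would be "the overseer has published the sub-shard in cluster state"):

```java
import java.util.function.BooleanSupplier;

// Hedged sketch of a bounded wait loop: poll a condition until it holds or a
// timeout elapses, then proceed (or fail). Names are illustrative only.
public class WaitLoop {
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true; // e.g. sub-shard now visible in cluster state
            }
            Thread.sleep(pollMs); // back off before re-checking
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }
}
```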

 Shard splitting doesn't work in legacyCloud=false mode
 --

 Key: SOLR-6101
 URL: https://issues.apache.org/jira/browse/SOLR-6101
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.8
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 4.9, 5.0

 Attachments: SOLR-6101.patch


 When we invoke splitshard Collection API against a cluster with 
 legacyCloud=false, we get the following errors:
 {code}
 2014-05-15 21:07:58,986 
 [Overseer-163819091268403216-ec2-x.compute-1.amazonaws.com:8986_solr-n_51]
  ERROR solr.cloud.OverseerCollectionProcessor  - Collection splitshard of 
 splitshard failed:org.apache.solr.common.SolrException: Could not find 
 coreNodeName
 at 
 org.apache.solr.cloud.OverseerCollectionProcessor.waitForCoreNodeName(OverseerCollectionProcessor.java:1504)
 at 
 org.apache.solr.cloud.OverseerCollectionProcessor.splitShard(OverseerCollectionProcessor.java:1255)
 at 
 org.apache.solr.cloud.OverseerCollectionProcessor.processMessage(OverseerCollectionProcessor.java:472)
 at 
 org.apache.solr.cloud.OverseerCollectionProcessor.run(OverseerCollectionProcessor.java:248)
 at java.lang.Thread.run(Thread.java:745)
 2014-05-15 21:07:59,003 
 [Overseer-163819091268403216-ec2-xxx.compute-1.amazonaws.com:8986_solr-n_51]
  INFO  solr.cloud.OverseerCollectionProcessor  - Overseer Collection 
 Processor: Message id:/overseer/collection-queue-work/qn-18 complete, 
 response:{success={null={responseHeader={status=0,QTime=1}},null={responseHeader={status=0,QTime=1}}},split117278106116750={responseHeader={status=0,QTime=0},STATUS=failed,Response=Error
  CREATEing SolrCore '3M_shard1_1_replica1': non legacy mode coreNodeName 
 missing 
 shard=shard1_1&name=3M_shard1_1_replica1&action=CREATE&collection=3M&wt=javabin&qt=/admin/cores&async=split117278106116750&version=2},Operation
  splitshard caused exception:=org.apache.solr.common.SolrException: Could not 
 find coreNodeName,exception={msg=Could not find coreNodeName,rspCode=500}}
 {code}
 The sub-shard replica (leader) creation fails due to:
 {code}
 {
 responseHeader: {
 status: 0,
 QTime: 0
 },
 STATUS: failed,
 Response: Error CREATEing SolrCore '3M_shard1_0_replica1': non legacy mode 
 coreNodeName missing 
 shard=shard1_0&name=3M_shard1_0_replica1&action=CREATE&collection=3M&wt=javabin&qt=/admin/cores&async=split117278099904930&version=2
 }
 {code}






[jira] [Commented] (LUCENE-5650) createTempDir and associated functions no longer create java.io.tmpdir

2014-05-21 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004911#comment-14004911
 ] 

Ryan Ernst commented on LUCENE-5650:


+1, everything looks good to me (and tests pass for me as well).

 createTempDir and associated functions no longer create java.io.tmpdir
 --

 Key: LUCENE-5650
 URL: https://issues.apache.org/jira/browse/LUCENE-5650
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/test
Reporter: Ryan Ernst
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5650.patch, LUCENE-5650.patch, LUCENE-5650.patch, 
 LUCENE-5650.patch, dih.patch


 The recent refactoring to all the create temp file/dir functions (which is 
 great!) has a minor regression from what existed before.  With the old 
 {{LuceneTestCase.TEMP_DIR}}, the directory was created if it did not exist.  
 So, if you set {{java.io.tmpdir}} to {{./temp}}, then it would create that 
 dir within the per jvm working dir.  However, {{getBaseTempDirForClass()}} 
 now does asserts that check the dir exists, is a dir, and is writeable.
 Lucene uses {{.}} as {{java.io.tmpdir}}.  Then in the test security 
 manager, the per jvm cwd has read/write/execute permissions.  However, this 
 allows tests to write to their cwd, which I'm trying to protect against (by 
 setting cwd to read/execute in my test security manager).
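The regression boils down to create-if-missing versus assert-it-exists. A tiny illustrative sketch of the two policies (TempDirPolicy is invented for illustration, not the test-framework API):

```java
import java.io.File;

// Sketch of the behavioral difference described above (illustrative only):
// the old TEMP_DIR path created the base temp dir on demand, while the new
// getBaseTempDirForClass() path only validates it.
public class TempDirPolicy {
    static File oldStyle(File base) {
        if (!base.exists()) {
            base.mkdirs(); // old behavior: create the dir if missing
        }
        return base;
    }

    static File newStyle(File base) {
        // new behavior: assert exists / is a dir / is writable, never create
        if (!base.exists() || !base.isDirectory() || !base.canWrite()) {
            throw new AssertionError("temp dir must exist, be a directory, and be writable: " + base);
        }
        return base;
    }
}
```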






[jira] [Updated] (LUCENE-5693) don't write deleted documents on flush

2014-05-21 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5693:
---

Attachment: LUCENE-5693.patch

Patch, decoupled from LUCENE-5675.  Tests pass.

The trickiest one was the new TestFieldCacheVsDocValues: it heavily
relies on being able to read deleted docs from postings, which I think
is invalid.

I also had to fix CheckIndex to not verify term vectors for deleted
docs; I think that's fair.

The core fix is easy: FreqProxFields (passed to the PostingsWriter at
flush) just skips the deleted docs.
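Conceptually the fix is just a filter over the in-memory postings at flush time. A self-contained sketch (FlushSkipDeleted and its shapes are invented for illustration; the real logic lives in FreqProxFields):

```java
import java.util.*;

// Illustrative sketch, NOT the actual FreqProxFields code: when flushing a
// segment, postings entries for docs already marked deleted ("born deleted")
// are simply skipped, so the codec never writes them.
public class FlushSkipDeleted {
    static Map<String, List<Integer>> flush(Map<String, List<Integer>> postings, BitSet liveDocs) {
        Map<String, List<Integer>> written = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : postings.entrySet()) {
            List<Integer> live = new ArrayList<>();
            for (int doc : e.getValue()) {
                if (liveDocs.get(doc)) {
                    live.add(doc); // keep only live docs
                }
            }
            if (!live.isEmpty()) {
                written.put(e.getKey(), live); // terms with no live docs vanish
            }
        }
        return written;
    }
}
```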

Also, this uncovered a bug in ToParentBJQ.explain's handling of
deleted docs.


 don't write deleted documents on flush
 --

 Key: LUCENE-5693
 URL: https://issues.apache.org/jira/browse/LUCENE-5693
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
 Attachments: LUCENE-5693.patch


 When we flush a new segment, sometimes some documents are born deleted, 
 e.g. if the app did an IW.deleteDocuments that matched some not-yet-flushed 
 documents.
 We already compute the liveDocs on flush, but then we continue (wastefully) 
 to send those known-deleted documents to all Codec parts.
 I started to implement this on LUCENE-5675 but it was too controversial.
 Also, I expect typically the number of deleted docs is 0, or small, so not 
 writing born deleted docs won't be much of a win for most apps.  Still it 
 seems silly to write them, consuming IO/CPU in the process, only to consume 
 more IO/CPU later for merging to re-delete them.






[jira] [Commented] (SOLR-5495) Recovery strategy for leader partitioned from replica case.

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004999#comment-14004999
 ] 

ASF subversion and git services commented on SOLR-5495:
---

Commit 1596636 from [~thelabdude] in branch 'dev/trunk'
[ https://svn.apache.org/r1596636 ]

SOLR-5495: Print cluster state in assertion failure messages if a leader cannot 
be found to determine root cause of HttpPartitionTest failures in Jenkins.

 Recovery strategy for leader partitioned from replica case.
 ---

 Key: SOLR-5495
 URL: https://issues.apache.org/jira/browse/SOLR-5495
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Timothy Potter
 Attachments: SOLR-5495.patch, SOLR-5495.patch, SOLR-5495.patch


 We need to work out a strategy for the case of:
 Leader and replicas can still talk to ZooKeeper, Leader cannot talk to 
 replica.
 We punted on this in initial design, but I'd like to get something in.






[jira] [Commented] (SOLR-5495) Recovery strategy for leader partitioned from replica case.

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005009#comment-14005009
 ] 

ASF subversion and git services commented on SOLR-5495:
---

Commit 1596637 from [~thelabdude] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1596637 ]

SOLR-5495: Print cluster state in assertion failure messages if a leader cannot 
be found to determine root cause of HttpPartitionTest failures in Jenkins

 Recovery strategy for leader partitioned from replica case.
 ---

 Key: SOLR-5495
 URL: https://issues.apache.org/jira/browse/SOLR-5495
 Project: Solr
  Issue Type: Bug
Reporter: Mark Miller
Assignee: Timothy Potter
 Attachments: SOLR-5495.patch, SOLR-5495.patch, SOLR-5495.patch


 We need to work out a strategy for the case of:
 Leader and replicas can still talk to ZooKeeper, Leader cannot talk to 
 replica.
 We punted on this in initial design, but I'd like to get something in.






[jira] [Commented] (LUCENE-4236) clean up booleanquery conjunction optimizations a bit

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005037#comment-14005037
 ] 

ASF subversion and git services commented on LUCENE-4236:
-

Commit 1596640 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1596640 ]

LUCENE-4236: cleanup/optimize BooleanScorer in-order creation

 clean up booleanquery conjunction optimizations a bit
 -

 Key: LUCENE-4236
 URL: https://issues.apache.org/jira/browse/LUCENE-4236
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch, 
 LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch


 After LUCENE-3505, I want to do a slight cleanup:
 * compute the term conjunctions optimization in scorer(), so it's applied even 
 if we have optional and prohibited clauses that don't exist in the segment 
 (e.g. return null)
 * use the term conjunctions optimization when optional.size() == 
 minShouldMatch, as that means they are all mandatory, too.
 * don't return booleanscorer1 when optional.size() == minShouldMatch, because 
 it means we have required clauses and in general BS2 should do a much better 
 job (e.g. use advance).






[jira] [Updated] (SOLR-5285) Solr response format should support child Docs

2014-05-21 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-5285:
---

Attachment: SOLR-5285.patch

Hey Varun,

I didn't get very far digging into your patch, because I started by looking at 
your new randomized test in SolrExampleTests and encountered some problems...

1) The first time I tried running your new randomized test, I got an NPE -- it 
didn't reproduce reliably though, because your test called new Random() 
instead of leveraging the test framework (ant precommit will warn you about 
stuff like this).

2) Side note: there's no need to randomize which response parser is used when 
you add test methods to SolrExampleTests -- every method there gets picked up 
automatically by the subclasses which ensure they are all run with every 
writer/parser.

3) When I started looking into fixing the use of random() in your test, I 
realized that the assertions in the test weren't very strong.  What I was 
referring to in my earlier comment was having a test that attempted to use the 
transformer on a result set that included docs with children and docs w/o 
children, and asserting that every child returned really was a descendant of the 
specified doc by comparing with what we _know_ for a fact we indexed -- your 
test wasn't really doing any of that.

In the attached patch, I've overhauled 
{{SolrExampleTests.testChildDoctransformer()}} along the lines of what I was 
describing, but this has exposed a ClassCastException in the transformer.  I 
haven't had a chance to dig into what's happening, but for some odd reason it 
only seems to manifest itself when the XML Response Writer is used...

{noformat}
hossman@frisbee:~/lucene/dev/solr/solrj$ ant test 
-Dtests.method=testChildDoctransformer -Dtests.seed=720251997BEC4F70 
-Dtests.slow=true -Dtests.locale=sk -Dtests.timezone=Pacific/Fiji 
-Dtests.file.encoding=UTF-8

...

   [junit4]   2> 11768 T20 C1 oasc.SolrException.log ERROR 
null:java.lang.ClassCastException: org.apache.lucene.document.Field cannot be 
cast to java.lang.String
   [junit4]   2> at 
org.apache.solr.response.transform.ChildDocTransformer.transform(ChildDocTransformerFactory.java:142)
   [junit4]   2> at 
org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:254)
   [junit4]   2> at 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:172)
   [junit4]   2> at 
org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:111)
   [junit4]   2> at 
org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:40)
   [junit4]   2> at 
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:760)
   [junit4]   2> at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:428)
   [junit4]   2> at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
   [junit4]   2> at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
   [junit4]   2> at 
org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:136)
   [junit4]   2> at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
   [junit4]   2> at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
   [junit4]   2> at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
   [junit4]   2> at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
   [junit4]   2> at 
org.eclipse.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
   [junit4]   2> at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1077)
   [junit4]   2> at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
   [junit4]   2> at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
   [junit4]   2> at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
   [junit4]   2> at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   [junit4]   2> at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   [junit4]   2> at 
org.eclipse.jetty.server.Server.handle(Server.java:368)
   [junit4]   2> at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
   [junit4]   2> at 
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
   [junit4]   2> at 

[jira] [Commented] (LUCENE-4236) clean up booleanquery conjunction optimizations a bit

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005070#comment-14005070
 ] 

ASF subversion and git services commented on LUCENE-4236:
-

Commit 1596646 from [~rcmuir] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1596646 ]

LUCENE-4236: cleanup/optimize BooleanScorer in-order creation

 clean up booleanquery conjunction optimizations a bit
 -

 Key: LUCENE-4236
 URL: https://issues.apache.org/jira/browse/LUCENE-4236
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch, 
 LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch


 After LUCENE-3505, I want to do a slight cleanup:
 * compute the term conjunctions optimization in scorer(), so its applied even 
 if we have optional and prohibited clauses that dont exist in the segment 
 (e.g. return null)
 * use the term conjunctions optimization when optional.size() == 
 minShouldMatch, as that means they are all mandatory, too.
 * don't return booleanscorer1 when optional.size() == minShouldMatch, because 
 it means we have required clauses and in general BS2 should do a much better 
 job (e.g. use advance).
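The three bullets boil down to two predicates over the clause counts. A minimal, self-contained sketch of that selection logic (class and method names invented here, not Lucene's actual BooleanWeight API):

```java
// Sketch only: models the clause bookkeeping described in the bullets above.
public class ScorerChoice {

    // Bullet 2: when optional.size() == minShouldMatch, every optional clause
    // must match, so the term-conjunction optimization applies to all of them.
    static boolean allOptionalEffectivelyRequired(int optionalCount, int minShouldMatch) {
        return optionalCount == minShouldMatch;
    }

    // Bullet 3: prefer the doc-at-a-time scorer (BS2, which can use advance())
    // whenever there are required clauses, including the implied ones above.
    static boolean preferDocAtATime(int requiredCount, int optionalCount, int minShouldMatch) {
        return requiredCount > 0
            || allOptionalEffectivelyRequired(optionalCount, minShouldMatch);
    }
}
```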



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4236) clean up booleanquery conjunction optimizations a bit

2014-05-21 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-4236.
-

Resolution: Fixed

 clean up booleanquery conjunction optimizations a bit
 -

 Key: LUCENE-4236
 URL: https://issues.apache.org/jira/browse/LUCENE-4236
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch, 
 LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch


 After LUCENE-3505, I want to do a slight cleanup:
 * compute the term conjunctions optimization in scorer(), so its applied even 
 if we have optional and prohibited clauses that dont exist in the segment 
 (e.g. return null)
 * use the term conjunctions optimization when optional.size() == 
 minShouldMatch, as that means they are all mandatory, too.
 * don't return booleanscorer1 when optional.size() == minShouldMatch, because 
 it means we have required clauses and in general BS2 should do a much better 
 job (e.g. use advance).






[jira] [Updated] (SOLR-6088) Add query re-ranking with the ReRankingQParserPlugin

2014-05-21 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-6088:
-

Attachment: SOLR-6088.patch

New patch with all tests and precommit passing.



 Add query re-ranking with the ReRankingQParserPlugin
 

 Key: SOLR-6088
 URL: https://issues.apache.org/jira/browse/SOLR-6088
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Joel Bernstein
 Attachments: SOLR-6088.patch, SOLR-6088.patch, SOLR-6088.patch


 This ticket introduces the ReRankingQParserPlugin which adds query 
 Reranking/Rescoring for Solr. It leverages the new RankQuery framework to 
 plug-in the new Lucene QueryRescorer.
 See ticket LUCENE-5489 for details on the use case.
 Sample syntax:
 {code}
 q={!rerank mainQuery=$qq reRankQuery=$rqq reRankDocs=200}
 {code}
 In the example above the mainQuery is executed and 200 docs are collected and 
 re-ranked based on the results of the reRankQuery. 
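As an illustration of the described flow (not Solr's actual implementation), re-ranking amounts to re-sorting only the head of the main result list with a second scoring function, leaving the tail in main-query order:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Sketch only: integer ids stand in for docs collected by the main query.
public class ReRankSketch {
    static List<Integer> rerank(List<Integer> mainResults, int reRankDocs,
                                ToDoubleFunction<Integer> reRankScore) {
        int k = Math.min(reRankDocs, mainResults.size());
        // Re-score only the top reRankDocs hits...
        List<Integer> head = new ArrayList<>(mainResults.subList(0, k));
        head.sort(Comparator.comparingDouble(reRankScore::applyAsDouble).reversed());
        // ...and keep the remainder in main-query order.
        List<Integer> out = new ArrayList<>(head);
        out.addAll(mainResults.subList(k, mainResults.size()));
        return out;
    }
}
```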






[jira] [Commented] (LUCENE-4236) clean up booleanquery conjunction optimizations a bit

2014-05-21 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005094#comment-14005094
 ] 

Mikhail Khludnev commented on LUCENE-4236:
--

[~rcmuir] great job!

btw, i wonder if Solr is allowed to search with BooleanScorer (term-at-time)?

 clean up booleanquery conjunction optimizations a bit
 -

 Key: LUCENE-4236
 URL: https://issues.apache.org/jira/browse/LUCENE-4236
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch, 
 LUCENE-4236.patch, LUCENE-4236.patch, LUCENE-4236.patch


 After LUCENE-3505, I want to do a slight cleanup:
 * compute the term conjunctions optimization in scorer(), so its applied even 
 if we have optional and prohibited clauses that dont exist in the segment 
 (e.g. return null)
 * use the term conjunctions optimization when optional.size() == 
 minShouldMatch, as that means they are all mandatory, too.
 * don't return booleanscorer1 when optional.size() == minShouldMatch, because 
 it means we have required clauses and in general BS2 should do a much better 
 job (e.g. use advance).






[jira] [Commented] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005099#comment-14005099
 ] 

ASF subversion and git services commented on SOLR-5468:
---

Commit 1596652 from [~thelabdude] in branch 'dev/trunk'
[ https://svn.apache.org/r1596652 ]

SOLR-5468: Improve reporting of cluster state when assertions fail; to help 
diagnose cause of Jenkins failures.

 Option to enforce a majority quorum approach to accepting updates in SolrCloud
 --

 Key: SOLR-5468
 URL: https://issues.apache.org/jira/browse/SOLR-5468
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Affects Versions: 4.5
 Environment: All
Reporter: Timothy Potter
Assignee: Timothy Potter
Priority: Minor
 Attachments: SOLR-5468.patch, SOLR-5468.patch, SOLR-5468.patch


 I've been thinking about how SolrCloud deals with write-availability using 
 in-sync replica sets, in which writes will continue to be accepted so long as 
 there is at least one healthy node per shard.
 For a little background (and to verify my understanding of the process is 
 correct), SolrCloud only considers active/healthy replicas when acknowledging 
 a write. Specifically, when a shard leader accepts an update request, it 
 forwards the request to all active/healthy replicas and only considers the 
 write successful if all active/healthy replicas ack the write. Any down / 
 gone replicas are not considered and will sync up with the leader when they 
 come back online using peer sync or snapshot replication. For instance, if a 
 shard has 3 nodes, A, B, C with A being the current leader, then writes to 
 the shard will continue to succeed even if B & C are down.
 The issue is that if a shard leader continues to accept updates even if it 
 loses all of its replicas, then we have acknowledged updates on only 1 node. 
 If that node, call it A, then fails and one of the previous replicas, call it 
 B, comes back online before A does, then any writes that A accepted while the 
 other replicas were offline are at risk to being lost. 
 SolrCloud does provide a safe-guard mechanism for this problem with the 
 leaderVoteWait setting, which puts any replicas that come back online before 
 node A into a temporary wait state. If A comes back online within the wait 
 period, then all is well as it will become the leader again and no writes 
 will be lost. As a side note, sys admins definitely need to be made more 
 aware of this situation as when I first encountered it in my cluster, I had 
 no idea what it meant.
 My question is whether we want to consider an approach where SolrCloud will 
 not accept writes unless there is a majority of replicas available to accept 
 the write? For my example, under this approach, we wouldn't accept writes if 
 both B & C failed, but would if only C did, leaving A & B online. Admittedly, 
 this lowers the write-availability of the system, so may be something that 
 should be tunable?
 From Mark M: Yeah, this is kind of like one of many little features that we 
 have just not gotten to yet. I’ve always planned for a param that lets you 
 say how many replicas an update must be verified on before responding 
 success. Seems to make sense to fail that type of request early if you notice 
 there are not enough replicas up to satisfy the param to begin with.
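The proposal in the last two paragraphs reduces to a simple majority test per shard. A sketch of that rule (names invented; the eventual patch exposes this differently, as a tunable param):

```java
// Sketch only: the majority-quorum acceptance rule proposed above.
public class QuorumCheck {
    // Accept the update only if a majority of the shard's replicas
    // (leader included) are up to acknowledge it.
    static boolean acceptWrite(int replicasUp, int replicationFactor) {
        return replicasUp >= replicationFactor / 2 + 1;
    }
}
```

For the example in the description with replicationFactor=3: with only C down (A & B up) writes are accepted, but with both B & C down (only A up) they are rejected.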






[jira] [Commented] (LUCENE-5650) createTempDir and associated functions no longer create java.io.tmpdir

2014-05-21 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005092#comment-14005092
 ] 

Dawid Weiss commented on LUCENE-5650:
-

Please commit it to trunk, Ryan! I'll be at work in ~9 hours, so should something 
pop up in the jenkins runs I'll take care of it.

 createTempDir and associated functions no longer create java.io.tmpdir
 --

 Key: LUCENE-5650
 URL: https://issues.apache.org/jira/browse/LUCENE-5650
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/test
Reporter: Ryan Ernst
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5650.patch, LUCENE-5650.patch, LUCENE-5650.patch, 
 LUCENE-5650.patch, dih.patch


 The recent refactoring to all the create temp file/dir functions (which is 
 great!) has a minor regression from what existed before.  With the old 
 {{LuceneTestCase.TEMP_DIR}}, the directory was created if it did not exist.  
 So, if you set {{java.io.tmpdir}} to {{./temp}}, then it would create that 
 dir within the per jvm working dir.  However, {{getBaseTempDirForClass()}} 
 now does asserts that check the dir exists, is a dir, and is writeable.
 Lucene uses {{.}} as {{java.io.tmpdir}}.  Then in the test security 
 manager, the per jvm cwd has read/write/execute permissions.  However, this 
 allows tests to write to their cwd, which I'm trying to protect against (by 
 setting cwd to read/execute in my test security manager).
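The pre-refactoring behavior being described can be sketched as follows (method name invented, not the actual LuceneTestCase API):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch only: the old LuceneTestCase.TEMP_DIR behavior -- create the
// java.io.tmpdir directory if missing instead of asserting that it exists.
public class TempDirSketch {
    static Path ensureTempBase() {
        Path base = Paths.get(System.getProperty("java.io.tmpdir"));
        try {
            Files.createDirectories(base); // no-op when the dir already exists
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!Files.isWritable(base)) {
            throw new IllegalStateException("temp dir not writable: " + base);
        }
        return base;
    }
}
```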






[jira] [Commented] (SOLR-5309) Investigate ShardSplitTest failures

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005143#comment-14005143
 ] 

ASF subversion and git services commented on SOLR-5309:
---

Commit 1596661 from sha...@apache.org in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1596661 ]

SOLR-5309: Fix DUP.processDelete to route delete-by-id to one sub-shard only. 
Enable ShardSplitTest again.

 Investigate ShardSplitTest failures
 ---

 Key: SOLR-5309
 URL: https://issues.apache.org/jira/browse/SOLR-5309
 Project: Solr
  Issue Type: Task
  Components: SolrCloud
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Blocker

 Investigate why ShardSplitTest is failing sporadically.
 Some recent failures:
 http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/3328/
 http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7760/
 http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/861/






[jira] [Updated] (SOLR-5285) Solr response format should support child Docs

2014-05-21 Thread Varun Thacker (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-5285:


Attachment: SOLR-5285.patch

Fixed the class cast exception.

This passes for me now -
{noformat} 
ant test -Dtests.method=testChildDoctransformer -Dtests.seed=720251997BEC4F70 
-Dtests.slow=true -Dtests.locale=sk -Dtests.timezone=Pacific/Fiji 
-Dtests.file.encoding=UTF-8
{noformat}

Also ran it over 20 times and it is passing.

 Solr response format should support child Docs
 --

 Key: SOLR-5285
 URL: https://issues.apache.org/jira/browse/SOLR-5285
 Project: Solr
  Issue Type: New Feature
Reporter: Varun Thacker
 Fix For: 4.9, 5.0

 Attachments: SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 javabin_backcompat_child_docs.bin


 Solr has added support for taking childDocs as input ( only XML till now ). 
 It's currently used for BlockJoinQuery. 
 I feel that if a user indexes a document with child docs, even if he isn't 
 using the BJQ features and is just searching which results in a hit on the 
 parentDoc, it's childDocs should be returned in the response format.
 [~hossman_luc...@fucit.org] on IRC suggested that the DocTransformers would 
 be the place to add childDocs to the response.
 Now given a docId one needs to find out all the childDoc id's. A couple of 
 approaches which I could think of are 
 1. Maintain the relation between a parentDoc and it's childDocs during 
 indexing time in maybe a separate index?
 2. Somehow emulate what happens in ToParentBlockJoinQuery.nextDoc() - Given a 
 parentDoc it finds out all the childDocs but this requires a childScorer.
 Am I missing something obvious on how to find the relation between a 
 parentDoc and it's childDocs because none of the above solutions for this 
 look right.
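For approach 2 it may help to note that block-indexed children are contiguous: a parent's children are exactly the docs between the previous parent and the parent itself, which is the invariant ToParentBlockJoinQuery exploits. A toy sketch of that invariant (names invented, doc ids standing in for index positions):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Toy model only: not Lucene's bitset-based implementation.
public class BlockJoinChildren {
    // parentDocs holds the sorted doc ids of all parent documents.
    static List<Integer> childrenOf(int parentDoc, SortedSet<Integer> parentDocs) {
        SortedSet<Integer> before = parentDocs.headSet(parentDoc);
        int prevParent = before.isEmpty() ? -1 : before.last();
        List<Integer> kids = new ArrayList<>();
        for (int d = prevParent + 1; d < parentDoc; d++) {
            kids.add(d); // every doc between the two parents is a child
        }
        return kids;
    }
}
```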






[jira] [Updated] (LUCENE-5648) Index/search multi-valued time durations

2014-05-21 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-5648:
-

Attachment: LUCENE-5648.patch

Updated patch:
* Support ranges like 2014 TO 2014-03 which is semantically the same thing as 
2014-01 TO 2014-03.  This means you can now do [* TO whatever].
* Parses calendar ranges.  This means you can round-trip toString() and 
parseShape() whether it's a single Calendar value 2014-05 or a range [* TO 
2013].

 Index/search multi-valued time durations
 

 Key: LUCENE-5648
 URL: https://issues.apache.org/jira/browse/LUCENE-5648
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Attachments: LUCENE-5648.patch, LUCENE-5648.patch, LUCENE-5648.patch, 
 LUCENE-5648.patch


 If you need to index a date/time duration, then the way to do that is to have 
 a pair of date fields; one for the start and one for the end -- pretty 
 straight-forward. But if you need to index a variable number of durations per 
 document, then the options aren't pretty, ranging from denormalization, to 
 joins, to using Lucene spatial with 2D as described 
 [here|http://wiki.apache.org/solr/SpatialForTimeDurations].  Ideally it would 
 be easier to index durations, and work in a more optimal way.
 This issue implements the aforementioned feature using Lucene-spatial with a 
 new single-dimensional SpatialPrefixTree implementation. Unlike the other two 
 SPT implementations, it's not based on floating point numbers. It will have a 
 Date based customization that indexes levels at meaningful quantities like 
 seconds, minutes, hours, etc.  The point of that alignment is to make it 
 faster to query across meaningful ranges (i.e. [2000 TO 2014]) and to enable 
 a follow-on issue to facet on the data in a really fast way.
 I'll expect to have a working patch up this week.
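To make the level alignment concrete, here is a toy rendering of the cells a timestamp would occupy at the first few levels (year, month, day, hour); the real SpatialPrefixTree encoding is of course different:

```java
import java.util.Calendar;

// Toy illustration only: one string cell per level, coarsest first.
public class DateLevels {
    // Render the prefix of cal down to the given level (0=year .. 3=hour).
    // A coarse range query like [2000 TO 2014] can then match a handful of
    // coarse cells instead of millions of fine-grained ones.
    static String cell(Calendar cal, int level) {
        StringBuilder sb = new StringBuilder();
        sb.append(cal.get(Calendar.YEAR));
        if (level >= 1) sb.append('-').append(cal.get(Calendar.MONTH) + 1);
        if (level >= 2) sb.append('-').append(cal.get(Calendar.DAY_OF_MONTH));
        if (level >= 3) sb.append('T').append(cal.get(Calendar.HOUR_OF_DAY));
        return sb.toString();
    }
}
```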






[jira] [Updated] (SOLR-6103) Add DateRangeField

2014-05-21 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-6103:
---

Attachment: SOLR-6103.patch

I updated LUCENE-5648 with the date range parsing, and now I added the 
DateRangeField here with tests.  Examples of how to index ranges are below. 
Note that they aren't necessarily explicit ranges; a range can be implied by 
specifying a date instance at a desired granularity.  It includes the 
same syntax Solr supports, though it doesn't do DateMath.
{noformat}
[* TO *]
2014-05-21T12:00:00.000Z
[2000 TO 2014-05-21]
{noformat}

By default, at search time the predicate is intersects, which means it'll match 
any overlap with an indexed date range.  It can be specified with op as a 
local-param.
{noformat}
q=dateRange:2014-05-21
q={!field f=dateRange op=Contains v=[1999 TO 2001]}
{noformat}
I opted for this new op local-param instead of using Lucene-spatial's awkward 
SpatialArgsParser format which looks like Intersects(foo).

 Add DateRangeField
 --

 Key: SOLR-6103
 URL: https://issues.apache.org/jira/browse/SOLR-6103
 Project: Solr
  Issue Type: New Feature
  Components: spatial
Reporter: David Smiley
Assignee: David Smiley
 Attachments: SOLR-6103.patch


 LUCENE-5648 introduced a date range index & search capability in the spatial 
 module. This issue is for a corresponding Solr FieldType to be named 
 DateRangeField. LUCENE-5648 includes a parseCalendar(String) method that 
 parses a superset of Solr's strict date format.  It also parses partial dates 
 (e.g.: 2014-10  has month specificity), and the trailing 'Z' is optional, and 
 a leading +/- may be present (minus indicates BC era), and * means 
 all-time.  The proposed field type would use it to parse a string and also 
 both ends of a range query, but furthermore it will also allow an arbitrary 
 range query of the form {{calspec TO calspec}} such as:
 {noformat}2000 TO 2014-05-21T10{noformat}
 Which parses as the year 2000 thru 2014 May 21st 10am (GMT). 
 I suggest this syntax because it is aligned with Lucene's range query syntax. 
  






[jira] [Commented] (SOLR-6091) Race condition in prioritizeOverseerNodes can trigger extra QUIT operations

2014-05-21 Thread Jessica Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005264#comment-14005264
 ] 

Jessica Cheng commented on SOLR-6091:
-

[~noble.paul] Do you mean you still see a race condition with this implementation 
(the wrong overseer quitting), or do you mean that you have caught the race 
condition in your cluster?

 Race condition in prioritizeOverseerNodes can trigger extra QUIT operations
 ---

 Key: SOLR-6091
 URL: https://issues.apache.org/jira/browse/SOLR-6091
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7, 4.8
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 4.9, 5.0

 Attachments: SOLR-6091.patch


 When using the overseer roles feature, there is a possibility of more than 
 one thread executing the prioritizeOverseerNodes method and extra QUIT 
 commands being inserted into the overseer queue.
 At a minimum, the prioritizeOverseerNodes should be synchronized to avoid a 
 race condition.






[jira] [Updated] (SOLR-5285) Solr response format should support child Docs

2014-05-21 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-5285:
---

Attachment: SOLR-5285.patch

Hey Varun,

I'd started looking ~ChildDocTransformerFactory.java:142 before i saw your new 
patch -- comparing the old code with the new code it still seems like this is 
more brittle than it needs to be (particularly in cases where the uniqueKey 
field type isn't a string -- ie: a TrieIntField)

I've attached an update that eliminates (most) of that brittle casting code to 
rely on the FieldType methods instead ... i still want to review the rest of 
the patch in more depth, but i wanted to go ahead and attach this update ASAP 
so you could take a look (and because i'm not sure how much more patch 
reviewing time i'll get in before i leave town tomorrow)





 Solr response format should support child Docs
 --

 Key: SOLR-5285
 URL: https://issues.apache.org/jira/browse/SOLR-5285
 Project: Solr
  Issue Type: New Feature
Reporter: Varun Thacker
 Fix For: 4.9, 5.0

 Attachments: SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 SOLR-5285.patch, javabin_backcompat_child_docs.bin


 Solr has added support for taking childDocs as input ( only XML till now ). 
 It's currently used for BlockJoinQuery. 
 I feel that if a user indexes a document with child docs, even if he isn't 
 using the BJQ features and is just searching which results in a hit on the 
 parentDoc, it's childDocs should be returned in the response format.
 [~hossman_luc...@fucit.org] on IRC suggested that the DocTransformers would 
 be the place to add childDocs to the response.
 Now given a docId one needs to find out all the childDoc id's. A couple of 
 approaches which I could think of are 
 1. Maintain the relation between a parentDoc and it's childDocs during 
 indexing time in maybe a separate index?
 2. Somehow emulate what happens in ToParentBlockJoinQuery.nextDoc() - Given a 
 parentDoc it finds out all the childDocs but this requires a childScorer.
 Am I missing something obvious on how to find the relation between a 
 parentDoc and it's childDocs because none of the above solutions for this 
 look right.






[jira] [Commented] (SOLR-5468) Option to enforce a majority quorum approach to accepting updates in SolrCloud

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005319#comment-14005319
 ] 

ASF subversion and git services commented on SOLR-5468:
---

Commit 1596703 from [~thelabdude] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1596703 ]

SOLR-5468: report replication factor that was achieved for an update request if 
requested by the client application; port from trunk

 Option to enforce a majority quorum approach to accepting updates in SolrCloud
 --

 Key: SOLR-5468
 URL: https://issues.apache.org/jira/browse/SOLR-5468
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Affects Versions: 4.5
 Environment: All
Reporter: Timothy Potter
Assignee: Timothy Potter
Priority: Minor
 Attachments: SOLR-5468.patch, SOLR-5468.patch, SOLR-5468.patch


 I've been thinking about how SolrCloud deals with write-availability using 
 in-sync replica sets, in which writes will continue to be accepted so long as 
 there is at least one healthy node per shard.
 For a little background (and to verify my understanding of the process is 
 correct), SolrCloud only considers active/healthy replicas when acknowledging 
 a write. Specifically, when a shard leader accepts an update request, it 
 forwards the request to all active/healthy replicas and only considers the 
 write successful if all active/healthy replicas ack the write. Any down / 
 gone replicas are not considered and will sync up with the leader when they 
 come back online using peer sync or snapshot replication. For instance, if a 
 shard has 3 nodes, A, B, C with A being the current leader, then writes to 
 the shard will continue to succeed even if B & C are down.
 The issue is that if a shard leader continues to accept updates even if it 
 loses all of its replicas, then we have acknowledged updates on only 1 node. 
 If that node, call it A, then fails and one of the previous replicas, call it 
 B, comes back online before A does, then any writes that A accepted while the 
 other replicas were offline are at risk to being lost. 
 SolrCloud does provide a safe-guard mechanism for this problem with the 
 leaderVoteWait setting, which puts any replicas that come back online before 
 node A into a temporary wait state. If A comes back online within the wait 
 period, then all is well as it will become the leader again and no writes 
 will be lost. As a side note, sys admins definitely need to be made more 
 aware of this situation as when I first encountered it in my cluster, I had 
 no idea what it meant.
 My question is whether we want to consider an approach where SolrCloud will 
 not accept writes unless there is a majority of replicas available to accept 
 the write? For my example, under this approach, we wouldn't accept writes if 
 both B & C failed, but would if only C did, leaving A & B online. Admittedly, 
 this lowers the write-availability of the system, so may be something that 
 should be tunable?
 From Mark M: Yeah, this is kind of like one of many little features that we 
 have just not gotten to yet. I’ve always planned for a param that lets you 
 say how many replicas an update must be verified on before responding 
 success. Seems to make sense to fail that type of request early if you notice 
 there are not enough replicas up to satisfy the param to begin with.






[jira] [Commented] (LUCENE-5675) ID postings format

2014-05-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005348#comment-14005348
 ] 

ASF subversion and git services commented on LUCENE-5675:
-

Commit 1596708 from [~mikemccand] in branch 'dev/branches/lucene5675'
[ https://svn.apache.org/r1596708 ]

LUCENE-5675: working on ant precommit

 ID postings format
 

 Key: LUCENE-5675
 URL: https://issues.apache.org/jira/browse/LUCENE-5675
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Robert Muir

 Today the primary key lookup in lucene is not that great for systems like 
 solr and elasticsearch that have versioning in front of IndexWriter.
 To some extend BlockTree can sometimes help avoid seeks by telling you the 
 term does not exist for a segment. But this technique (based on FST prefix) 
 is fragile. The only other choice today is bloom filters, which use up huge 
 amounts of memory.
 I don't think we are using everything we know: particularly the version 
 semantics.
 Instead, if the FST for the terms index used an algebra that represents the 
 max version for any subtree, we might be able to answer that there is no term 
 T with version > V in that segment very efficiently.
 Also ID fields dont need postings lists, they dont need stats like 
 docfreq/totaltermfreq, etc this stuff is all implicit. 
 As far as API, i think for users to provide IDs with versions to such a PF, 
 a start would to set a payload or whatever on the term field to get it thru 
 indexwriter to the codec. And a consumer of the codec can just cast the 
 Terms to a subclass that exposes the FST to do this version check efficiently.
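The version-pruning idea can be modeled with a plain map plus one extra long per segment; the FST algebra described above generalizes this from whole segments down to subtrees of the terms index. Names below are invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: one "segment" of IDs with a per-segment max version.
public class VersionedIdLookup {
    private final Map<String, Long> idToVersion = new HashMap<>();
    private long maxVersion = Long.MIN_VALUE;

    void add(String id, long version) {
        idToVersion.merge(id, version, Math::max);
        maxVersion = Math.max(maxVersion, version);
    }

    // Returns the stored version of id if it is newer than floorVersion,
    // else -1.  When maxVersion <= floorVersion the whole segment is
    // skipped without any term lookup -- the point of the proposal above.
    long lookupNewerThan(String id, long floorVersion) {
        if (maxVersion <= floorVersion) {
            return -1; // prune: no term in this segment can be newer
        }
        Long v = idToVersion.get(id);
        return (v != null && v > floorVersion) ? v : -1;
    }
}
```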






[jira] [Commented] (SOLR-5285) Solr response format should support child Docs

2014-05-21 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005349#comment-14005349
 ] 

Hoss Man commented on SOLR-5285:


{quote}
bq.  Why is the tag name in the JSON format childDocs but in the XML format 
it's childDoc (no plural) ? ... seems like those should be consistent.

I guess because in JSON the input is a JSON array hence childDocs, while in 
XML we use multiple childDoc tags to represent nested documents.
{quote}

That makes sense -- but now has me thinking back to the proposed usage in your 
earliest comment on this issue: why create a new {{childDoc}} element in the 
XML at all? why not just re-use {{doc}} (nested inside the existing 
{{doc}}) ... that seems like the most straightforward solution, and from 
what i can tell, that would probably simplify the changes to 
XMLResponseParser.java as well wouldn't it?

speaking of which -- I don't understand the need for changing the method sig 
for {{XMLResponseParser.readDocument}} ... why can't the method construct the 
SolrDocument objects itself?

bq. Added a non mandatory parameter called numChildDocs which makes it 
configurable. Although I'm not sure if the name is correct.

hmmm, yeah ... for consistency with the top-level query we could use something 
like {{rows}}, but the risk of confusion there seems like it outweighs the 
consistency factor.

how about {{limit}}?

bq. Added a non mandatory parameter called childFilter ...

looks good ... in general {{ChildDocTransformerFactory}} looks pretty good to me now 
-- although I just noticed a typo in the SolrException msg if {{parentFilter}} is 
null ... it refers to which -- but that doesn't apply here.

bq. 2. Created a new binary file for backcompatibility and forwardcompatibility.

I might be missing something, but I don't think 
{{testBackCompatForSolrDocumentWithChildDocs}} is actually asserting anything 
related to the child docs -- because it uses {{assertSolrDocumentEquals}}, but 
that method hasn't been updated to know about child docs, has it?



To sum up:

* In general, i think the current patch looks great
* remaining concerns about implementation:
** {{testBackCompatForSolrDocumentWithChildDocs}} doesn't seem valid to me w/o 
changes to {{assertSolrDocumentEquals}}
** err msg typo in {{ChildDocTransformerFactory}} needs to be fixed
** method sig change in {{XMLResponseParser.readDocument}} seems unnecessary
* remaining questions about the API:
** better name for {{numChildDocs}} ? ... how about {{limit}} ?
** why use {{childDoc}} in XML instead of {{doc}} ?

 Solr response format should support child Docs
 --

 Key: SOLR-5285
 URL: https://issues.apache.org/jira/browse/SOLR-5285
 Project: Solr
  Issue Type: New Feature
Reporter: Varun Thacker
 Fix For: 4.9, 5.0

 Attachments: SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, 
 SOLR-5285.patch, javabin_backcompat_child_docs.bin


 Solr has added support for taking childDocs as input (only XML till now). 
 It's currently used for BlockJoinQuery. 
 I feel that if a user indexes a document with child docs, even if he isn't 
 using the BJQ features and is just searching, which results in a hit on the 
 parentDoc, its childDocs should be returned in the response format.
 [~hossman_luc...@fucit.org] on IRC suggested that the DocTransformers would 
 be the place to add childDocs to the response.
 Now given a docId one needs to find out all the childDoc IDs. A couple of 
 approaches which I could think of are: 
 1. Maintain the relation between a parentDoc and its childDocs during 
 indexing time in maybe a separate index?
 2. Somehow emulate what happens in ToParentBlockJoinQuery.nextDoc() -- given a 
 parentDoc it finds out all the childDocs, but this requires a childScorer.
 Am I missing something obvious on how to find the relation between a 
 parentDoc and its childDocs? Because none of the above solutions for this 
 look right.






[jira] [Created] (SOLR-6106) Sometimes all the cores on a SolrCloud node cannot find their config when initializing the ManagedResourceStorage storageIO impl

2014-05-21 Thread Timothy Potter (JIRA)
Timothy Potter created SOLR-6106:


 Summary: Sometimes all the cores on a SolrCloud node cannot find 
their config when initializing the ManagedResourceStorage storageIO impl
 Key: SOLR-6106
 URL: https://issues.apache.org/jira/browse/SOLR-6106
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Timothy Potter
Assignee: Timothy Potter
Priority: Minor


One of my many nodes had problems initializing all cores due to the 
following error. It was resolved by restarting the node (hence the minor 
classification).

2014-05-21 20:39:17,898 [coreLoadExecutor-4-thread-27] ERROR 
solr.core.CoreContainer  - Unable to create core: small46_shard1_replica1
org.apache.solr.common.SolrException: Could not find config name for 
collection:small46
at org.apache.solr.core.SolrCore.init(SolrCore.java:858)
at org.apache.solr.core.SolrCore.init(SolrCore.java:641)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:556)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:261)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:253)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.solr.common.SolrException: Could not find config name for 
collection:small46
at 
org.apache.solr.rest.ManagedResourceStorage.newStorageIO(ManagedResourceStorage.java:99)
at org.apache.solr.core.SolrCore.initRestManager(SolrCore.java:2339)
at org.apache.solr.core.SolrCore.init(SolrCore.java:845)
... 10 more






[jira] [Commented] (LUCENE-5693) don't write deleted documents on flush

2014-05-21 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005450#comment-14005450
 ] 

Robert Muir commented on LUCENE-5693:
-

{quote}
I disagree: I think we discover places that are relying on deleted docs 
behavior, i.e. test bugs. When I did this on LUCENE-5675 there were only a few 
places that relied on deleted docs.
{quote}

That's not the complexity I'm concerned about. I'm talking about bugs in Lucene 
itself, because shit like the following happens:
* various codec APIs unable to cope with writing 0-doc segments because all the 
docs were deleted
* various codec APIs with corner-case bugs because stuff like 'maxdoc' in the 
segmentinfo they are fed is inconsistent with what they saw.
* various index/search APIs unable to cope with docid X appearing in codec API Y 
but not codec API Z where it's expected to exist.
* slow O(n) passes thru IndexWriter APIs to recalculate and reshuffle ordinals 
and stuff like that.
* corner-case bugs like incorrect statistics.
* additional complexity inside IndexWriter/codecs to handle this, when just 
merging away would be better.

So if we want to rename the issue to "as a special case, don't write deleted 
postings on flush" and remove the TODO about changing this for things like DV, 
then I'm fine.

But otherwise, if this is intended to be a precedent of how things should work, 
then I strongly feel we should not do this. The additional complexity and 
corner cases are simply not worth it.

 don't write deleted documents on flush
 --

 Key: LUCENE-5693
 URL: https://issues.apache.org/jira/browse/LUCENE-5693
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
 Attachments: LUCENE-5693.patch


 When we flush a new segment, sometimes some documents are born deleted, 
 e.g. if the app did a IW.deleteDocuments that matched some not-yet-flushed 
 documents.
 We already compute the liveDocs on flush, but then we continue (wastefully) 
 to send those known-deleted documents to all Codec parts.
 I started to implement this on LUCENE-5675 but it was too controversial.
 Also, I expect typically the number of deleted docs is 0, or small, so not 
 writing born deleted docs won't be much of a win for most apps.  Still it 
 seems silly to write them, consuming IO/CPU in the process, only to consume 
 more IO/CPU later for merging to re-delete them.
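The proposal can be illustrated with a toy sketch (hypothetical names, nothing like IndexWriter's real flush path): consult the liveDocs already computed at flush and simply skip the born-deleted documents instead of writing them and merging them away later.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy illustration only: buffered docs are strings, liveDocs is a bitset
// with one bit per buffered doc id; only live docs reach the "codec".
class FlushSketch {
    static List<String> flushLiveOnly(List<String> bufferedDocs, BitSet liveDocs) {
        List<String> written = new ArrayList<>();
        for (int docId = 0; docId < bufferedDocs.size(); docId++) {
            if (liveDocs.get(docId)) {  // born-deleted docs are dropped here
                written.add(bufferedDocs.get(docId));
            }
        }
        return written;
    }
}
```

The complexity Robert describes comes from everything downstream of this filter: the surviving docs need renumbering, and every codec and statistics consumer has to agree on the new doc ids.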






[jira] [Commented] (LUCENE-5648) Index/search multi-valued time durations

2014-05-21 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14005576#comment-14005576
 ] 

David Smiley commented on LUCENE-5648:
--

I was putting some thought into the different ways of indexing durations, 
listing pros & cons.  The approach here should work very well but it has two 
main downsides of note:
* Overlapping or adjacent ranges are effectively coalesced, which impacts the 
semantics of Contains & Within.  To be clear, it's a non-issue if the multiple 
durations for a given field on a document don't touch.  But if you wanted to 
index, say, \[2000 TO 2014] and \[2006 TO 2007], then it's as if the 2nd range 
doesn't even exist.  The document won't match for IsWithin a query of 
\[2006-2008].
* The worst-case number of terms generated for a range at index time is pretty 
high.  If you wanted to index Long.MIN_VALUE+1 TO Long.MAX_VALUE-1 (which spans 
hundreds of millions of years), we're talking about 14k terms(*).  But it's 
certainly not commonly that bad unless you were indexing random milliseconds at 
random millennia. Indexing two adjacent month-long durations in the same year is 
only 7 terms.  At search time, lots of hypothetical terms in a duration isn't an 
issue for RPT's algorithms in the common case of a sparsely populated term 
space.

Interestingly, using a 2D prefix-tree for single-dimensional durations 
expressed as points doesn't have these shortcomings.  But that approach is 
slower to search than this approach (more possible terms in a search area; it's 
half of the square of the number of terms in this 1D tree), and is not amenable 
to terms-enumeration style interval faceting that I'll be doing next.

(*) The number of terms currently being generated would be cut by ~40-50% once 
LUCENE-4942 gets done.

 Index/search multi-valued time durations
 

 Key: LUCENE-5648
 URL: https://issues.apache.org/jira/browse/LUCENE-5648
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Attachments: LUCENE-5648.patch, LUCENE-5648.patch, LUCENE-5648.patch, 
 LUCENE-5648.patch


 If you need to index a date/time duration, then the way to do that is to have 
 a pair of date fields; one for the start and one for the end -- pretty 
 straight-forward. But if you need to index a variable number of durations per 
 document, then the options aren't pretty, ranging from denormalization, to 
 joins, to using Lucene spatial with 2D as described 
 [here|http://wiki.apache.org/solr/SpatialForTimeDurations].  Ideally it would 
 be easier to index durations, and work in a more optimal way.
 This issue implements the aforementioned feature using Lucene-spatial with a 
 new single-dimensional SpatialPrefixTree implementation. Unlike the other two 
 SPT implementations, it's not based on floating point numbers. It will have a 
 Date based customization that indexes levels at meaningful quantities like 
 seconds, minutes, hours, etc.  The point of that alignment is to make it 
 faster to query across meaningful ranges (i.e. [2000 TO 2014]) and to enable 
 a follow-on issue to facet on the data in a really fast way.
 I'll expect to have a working patch up this week.






[JENKINS] Lucene-Solr-4.x-Linux (32bit/ibm-j9-jdk7) - Build # 10236 - Failure!

2014-05-21 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10236/
Java: 32bit/ibm-j9-jdk7 
-Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;}

1 tests failed.
REGRESSION:  org.apache.lucene.document.TestLazyDocument.testLazy

Error Message:
read past EOF: SlicedIndexInput(SlicedIndexInput(_0.tis in 
RAMInputStream(name=_0.cfs)) in RAMInputStream(name=_0.cfs) 
slice=2021238:3239819)

Stack Trace:
java.io.EOFException: read past EOF: SlicedIndexInput(SlicedIndexInput(_0.tis 
in RAMInputStream(name=_0.cfs)) in RAMInputStream(name=_0.cfs) 
slice=2021238:3239819)
at 
__randomizedtesting.SeedInfo.seed([B526C0B1365A2211:84FFCDE38606F6BA]:0)
at 
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:265)
at 
org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:51)
at org.apache.lucene.store.DataInput.readVInt(DataInput.java:120)
at 
org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:218)
at 
org.apache.lucene.store.MockIndexInputWrapper.readVInt(MockIndexInputWrapper.java:161)
at org.apache.lucene.codecs.lucene3x.TermBuffer.read(TermBuffer.java:61)
at 
org.apache.lucene.codecs.lucene3x.SegmentTermEnum.next(SegmentTermEnum.java:142)
at 
org.apache.lucene.codecs.lucene3x.SegmentTermEnum.scanTo(SegmentTermEnum.java:175)
at 
org.apache.lucene.codecs.lucene3x.TermInfosReader.seekEnum(TermInfosReader.java:282)
at 
org.apache.lucene.codecs.lucene3x.TermInfosReader.get(TermInfosReader.java:207)
at 
org.apache.lucene.codecs.lucene3x.TermInfosReader.terms(TermInfosReader.java:352)
at 
org.apache.lucene.codecs.lucene3x.Lucene3xFields$PreTermsEnum.reset(Lucene3xFields.java:687)
at 
org.apache.lucene.codecs.lucene3x.Lucene3xFields$PreTerms.iterator(Lucene3xFields.java:180)
at org.apache.lucene.index.TermContext.build(TermContext.java:94)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:165)
at 
org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:684)
at 
org.apache.lucene.search.AssertingIndexSearcher.createNormalizedWeight(AssertingIndexSearcher.java:59)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
at 
org.apache.lucene.document.TestLazyDocument.testLazy(TestLazyDocument.java:84)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
at java.lang.reflect.Method.invoke(Method.java:619)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 

Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/ibm-j9-jdk7) - Build # 10236 - Failure!

2014-05-21 Thread Robert Muir
This won't reproduce on an Oracle JVM: I think it's a J9 bug? Can we
update our J9s in Jenkins? Looks like there are new ones available
with lots of fixes.

On Wed, May 21, 2014 at 11:38 PM, Policeman Jenkins Server
jenk...@thetaphi.de wrote:
 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10236/
 Java: 32bit/ibm-j9-jdk7 
 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;}


[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 1592 - Failure!

2014-05-21 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1592/
Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 11216 lines...]
   [junit4] JVM J0: stderr was not empty, see: 
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20140522_042326_570.syserr
   [junit4]  JVM J0: stderr (verbatim) 
   [junit4] java(215,0x134ae9000) malloc: *** error for object 0x134bd8320: 
pointer being freed was not allocated
   [junit4] *** set a breakpoint in malloc_error_break to debug
   [junit4]  JVM J0: EOF 

[...truncated 1 lines...]
   [junit4] ERROR: JVM J0 ended with an exception, command line: 
/Library/Java/JavaVirtualMachines/jdk1.7.0_55.jdk/Contents/Home/jre/bin/java 
-XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps 
-Dtests.prefix=tests -Dtests.seed=274640F91B914DCD -Xmx512M -Dtests.iters= 
-Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random 
-Dtests.postingsformat=random -Dtests.docvaluesformat=random 
-Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random 
-Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 
-Dtests.cleanthreads=perClass 
-Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties
 -Dtests.nightly=false -Dtests.weekly=false -Dtests.monster=false 
-Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 
-DtempDir=. -Djava.io.tmpdir=. 
-Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp
 
-Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db
 -Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
-Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy
 -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -Djdk.map.althashing.threshold=0 
-Dtests.leaveTemporary=false -Dtests.filterstacks=true -Dtests.disableHdfs=true 
-Dfile.encoding=UTF-8 -classpath