[JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+121) - Build # 16959 - Failure!

2016-06-09 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/16959/
Java: 64bit/jdk-9-ea+121 -XX:+UseCompressedOops -XX:+UseParallelGC

1 tests failed.
FAILED:  org.apache.solr.handler.TestReqParamsAPI.test

Error Message:
Could not get expected value  'CY val' for path 'response/params/y/c' full 
output: {   "responseHeader":{ "status":0, "QTime":0},   "response":{   
  "znodeVersion":0, "params":{"x":{ "a":"A val", "b":"B 
val", "":{"v":0},  from server:  http://127.0.0.1:34305/collection1

Stack Trace:
java.lang.AssertionError: Could not get expected value  'CY val' for path 
'response/params/y/c' full output: {
  "responseHeader":{
"status":0,
"QTime":0},
  "response":{
"znodeVersion":0,
"params":{"x":{
"a":"A val",
"b":"B val",
"":{"v":0},  from server:  http://127.0.0.1:34305/collection1
at 
__randomizedtesting.SeedInfo.seed([8B83457C87460750:3D77AA629BA6AA8]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.core.TestSolrConfigHandler.testForResponseElement(TestSolrConfigHandler.java:481)
at 
org.apache.solr.handler.TestReqParamsAPI.testReqParams(TestReqParamsAPI.java:160)
at 
org.apache.solr.handler.TestReqParamsAPI.test(TestReqParamsAPI.java:62)
at 
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native 
Method)
at 
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
at 
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:531)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:985)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:960)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (SOLR-9174) After Solr 5.5, mm parameter doesn't work properly

2016-06-09 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323914#comment-15323914
 ] 

Steve Rowe commented on SOLR-9174:
--

David, when I run your query against 5.4.1, it does indeed match 3 documents, 
but I see two problems:

1. The parsedquery includes no minShouldMatch at all:

{noformat}
"parsedquery":"(+(DisjunctionMaxQuery((text:ipod)) 
(+DisjunctionMaxQuery((text:power)) 
+DisjunctionMaxQuery((text:nonexistentword)/no_coord",
{noformat}

2. The mm spec "2>-1" means: if there are two or fewer optional clauses, then 
all clauses are required; if there are more than 2 optional clauses, then all 
but one is required.  In your query, there are only two optional clauses, so 
both are required in the case that minShouldMatch applies.

By contrast, in 5.5.0, the parsedquery includes a minShouldMatch of 2:

{noformat}
"parsedquery":"(+((DisjunctionMaxQuery((text:ipod)) 
(+DisjunctionMaxQuery((text:power)) 
+DisjunctionMaxQuery((text:nonexistentword~2))/no_coord",
{noformat}

If I change the mm spec to "1>-1" (if more than one optional clause, then all 
but one is required), then the parsedquery includes a minShouldMatch of 1, and 
I get 3 hits.

So I don't think you were "having it both ways"; rather, your explicit mm spec 
was being ignored when you included explicit operators in queries against Solr 
< 5.5.0.
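
To make the knobs concrete, here's a minimal SolrJ sketch of this kind of request (the collection URL and query text are placeholders, not David's actual setup); it only shows where defType, q.op and mm get set and how to inspect the parsedquery:

{code}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MmSpecExample {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/collection1")) {
      SolrQuery q = new SolrQuery("ipod +power +nonexistentword"); // placeholder query
      q.set("defType", "edismax");
      q.set("q.op", "AND");
      // "2>-1": with 2 or fewer optional clauses all are required,
      // with more than 2 optional clauses all but one are required
      q.set("mm", "2>-1");
      q.set("debugQuery", "true"); // exposes the parsedquery discussed above
      QueryResponse rsp = client.query(q);
      System.out.println("numFound:    " + rsp.getResults().getNumFound());
      System.out.println("parsedquery: " + rsp.getDebugMap().get("parsedquery"));
    }
  }
}
{code}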


> After Solr 5.5, mm parameter doesn't work properly
> --
>
> Key: SOLR-9174
> URL: https://issues.apache.org/jira/browse/SOLR-9174
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers, search
>Affects Versions: 5.5, 6.0, 6.0.1
>Reporter: Issei Nishigata
>
> “mm" parameter does not work properly, when I set "q.op=AND” after Solr 5.5.
> In Solr 5.4, mm parameter works expectedly with the following setting.
> [schema]
> {code:xml}
> 
>   
>  maxGramSize="2"/>
>   
> 
> {code}
> [request]
> {quote}
> http://localhost:8983/solr/collection1/select?defType=edismax&q.op=AND&mm=2&q=solar
> {quote}
> After Solr 5.5, the result will not be the same as Solr 5.4.
> [Solr 5.4]
> {code:xml}
> 
> ...
>   
> 2
> solar
> edismax
> AND
>   
> ...
> 
>   
> 0
> 
>   solr
> 
>   
> 
> 
>   solar
>   solar
>   
>   (+DisjunctionMaxQuerytext:so text:ol text:la text:ar)~2/no_coord
>   
>   +(((text:so text:ol text:la 
> text:ar)~2))
>   ...
> 
> {code}
> [Solr 6.0.1]
> {code:xml}
> 
> ...
>   
> 2
> solar
> edismax
> AND
>   
> ...
> 
>   
> solar
> solar
> 
> (+DisjunctionMaxQuery(((+text:so +text:ol +text:la +text:ar/no_coord
> 
> +((+text:so +text:ol +text:la 
> +text:ar))
> ...
> {code}
> As shown above, the parsedquery also differs between Solr 5.4 and Solr 6.0.1 
> (after Solr 5.5).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7326) TestSimpleTextPostingsFormat.testInvertedWrite() failure: An SPI class of type PostingsFormat with name 'SimpleText' does not exist

2016-06-09 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-7326:
---
Description: 
My Jenkins found a master seed that reproduces on branch_6x but not on 
branch_6_1:

{noformat}
Checking out Revision 963206969eddca6ec743f5f0901e0abdfeacd3cc 
(refs/remotes/origin/master)
[...]
  2> NOTE: reproduce with: ant test  -Dtestcase=TestSimpleTextPostingsFormat 
-Dtests.method=testInvertedWrite -Dtests.seed=7555A4CF724BDB74 
-Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/lucene-data/enwiki.random.lines.txt 
-Dtests.locale=es -Dtests.timezone=Australia/Brisbane -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
[23:09:22.113] ERROR   0.23s J3 | 
TestSimpleTextPostingsFormat.testInvertedWrite <<<
   > Throwable #1: java.lang.IllegalArgumentException: An SPI class of type 
org.apache.lucene.codecs.PostingsFormat with name 'SimpleText' does not exist.  
You need to add the corresponding JAR file supporting this SPI to your 
classpath.  The current classpath supports the following names: [MockRandom, 
RAMOnly, LuceneFixedGap, LuceneVarGapFixedInterval, 
LuceneVarGapDocFreqInterval, TestBloomFilteredLucenePostings, Asserting, 
BlockTreeOrds, BloomFilter, Direct, FSTOrd50, FST50, Memory, AutoPrefix, 
Lucene50]
   >at 
__randomizedtesting.SeedInfo.seed([7555A4CF724BDB74:6C21E9040C1050A8]:0)
   >at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:116)
   >at 
org.apache.lucene.codecs.PostingsFormat.forName(PostingsFormat.java:112)
   >at 
org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:258)
   >at 
org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:341)
   >at 
org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:106)
   >at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:67)
   >at 
org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:61)
   >at 
org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:53)
   >at 
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:675)
   >at 
org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:76)
   >at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
   >at 
org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:383)
   >at 
org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:313)
   >at 
org.apache.lucene.index.BasePostingsFormatTestCase.testInvertedWrite(BasePostingsFormatTestCase.java:519)
[...]
  2> NOTE: test params are: codec=Asserting(Lucene62): {}, docValues:{}, 
maxPointsInLeafNode=1657, maxMBSortInHeap=5.09627287608914, 
sim=RandomSimilarity(queryNorm=true,coord=crazy): {f_DOCS_AND_FREQS=DFR 
I(ne)Z(0.3), field=DFR I(ne)L3(800.0), 
f_DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS=DFI(Saturated), 
f_DOCS_AND_FREQS_AND_POSITIONS=IB SPL-L1, titleTokenized=IB LL-L3(800.0), 
body=DFR I(F)L1, f_DOCS=DFR I(n)L2}, locale=es, timezone=Australia/Brisbane
  2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 1.8.0_77 
(64-bit)/cpus=16,threads=1,free=257107688,total=486539264
{noformat}

  was:
My Jenkins found a master seed that reproduces on branch_6x but not on 
branch_6_1:

{noformat}
Checking out Revision 963206969eddca6ec743f5f0901e0abdfeacd3cc 
(refs/remotes/origin/master)
[...]
  2> NOTE: reproduce with: ant test  -Dtestcase=TestSimpleTextPostingsFormat 
-Dtests.method=testInvertedWrite -Dtests.seed=7555A4CF724BDB74 
-Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/lucene-data/enwiki.random.lines.txt 
-Dtests.locale=es -Dtests.timezone=Australia/Brisbane -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
[23:09:22.113] ERROR   0.23s J3 | 
TestSimpleTextPostingsFormat.testInvertedWrite <<<
   > Throwable #1: java.lang.IllegalArgumentException: An SPI class of type 
org.apache.lucene.codecs.PostingsFormat with name 'SimpleText' does not exist.  
You need to add the corresponding JAR file supporting this SPI to your 
classpath.  The current 
classpath supports the following names: [MockRandom, RAMOnly, LuceneFixedGap, 
LuceneVarGapFixedInterval, 
LuceneVarGapDocFreqInterval, TestBloomFilteredLucenePostings, Asserting, 
BlockTreeOrds, BloomFilter, Direct, FSTOrd50, FST50, Memory, AutoPrefix, 
Lucene50]
   >at 
__randomizedtesting.SeedInfo.seed([7555A4CF724BDB74:6C21E9040C1050A8]:0)
   >at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:116)
   >at 
org.apache.lucene.codecs.PostingsFormat.forName(PostingsFormat.java:112)
   >at 
org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:258)
   >at 

[jira] [Created] (LUCENE-7326) TestSimpleTextPostingsFormat.testInvertedWrite() failure: An SPI class of type PostingsFormat with name 'SimpleText' does not exist

2016-06-09 Thread Steve Rowe (JIRA)
Steve Rowe created LUCENE-7326:
--

 Summary: TestSimpleTextPostingsFormat.testInvertedWrite() failure: 
An SPI class of type PostingsFormat with name 'SimpleText' does not exist
 Key: LUCENE-7326
 URL: https://issues.apache.org/jira/browse/LUCENE-7326
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Steve Rowe


My Jenkins found a master seed that reproduces on branch_6x but not on 
branch_6_1:

{noformat}
Checking out Revision 963206969eddca6ec743f5f0901e0abdfeacd3cc 
(refs/remotes/origin/master)
[...]
  2> NOTE: reproduce with: ant test  -Dtestcase=TestSimpleTextPostingsFormat 
-Dtests.method=testInvertedWrite -Dtests.seed=7555A4CF724BDB74 
-Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/lucene-data/enwiki.random.lines.txt 
-Dtests.locale=es -Dtests.timezone=Australia/Brisbane -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
[23:09:22.113] ERROR   0.23s J3 | 
TestSimpleTextPostingsFormat.testInvertedWrite <<<
   > Throwable #1: java.lang.IllegalArgumentException: An SPI class of type 
org.apache.lucene.codecs.PostingsFormat with name 'SimpleText' does not exist.  
You need to add the corresponding JAR file supporting this SPI to your 
classpath.  The current 
classpath supports the following names: [MockRandom, RAMOnly, LuceneFixedGap, 
LuceneVarGapFixedInterval, 
LuceneVarGapDocFreqInterval, TestBloomFilteredLucenePostings, Asserting, 
BlockTreeOrds, BloomFilter, Direct, FSTOrd50, FST50, Memory, AutoPrefix, 
Lucene50]
   >at 
__randomizedtesting.SeedInfo.seed([7555A4CF724BDB74:6C21E9040C1050A8]:0)
   >at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:116)
   >at 
org.apache.lucene.codecs.PostingsFormat.forName(PostingsFormat.java:112)
   >at 
org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:258)
   >at 
org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:341)
   >at 
org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:106)
   >at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:67)
   >at 
org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:61)
   >at 
org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:53)
   >at 
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:675)
   >at 
org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:76)
   >at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
   >at 
org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:383)
   >at 
org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:313)
   >at 
org.apache.lucene.index.BasePostingsFormatTestCase.testInvertedWrite(BasePostingsFormatTestCase.java:519)
[...]
  2> NOTE: test params are: codec=Asserting(Lucene62): {}, docValues:{}, 
maxPointsInLeafNode=1657, maxMBSortInHeap=5.09627287608914, 
sim=RandomSimilarity(queryNorm=true,coord=crazy): {f_DOCS_AND_FREQS=DFR 
I(ne)Z(0.3), field=DFR I(ne)L3(800.0), 
f_DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS=DFI(Saturated), 
f_DOCS_AND_FREQS_AND_POSITIONS=IB SPL-L1, titleTokenized=IB LL-L3(800.0), 
body=DFR I(F)L1, f_DOCS=DFR I(n)L2}, locale=es, timezone=Australia/Brisbane
  2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 1.8.0_77 
(64-bit)/cpus=16,threads=1,free=257107688,total=486539264
{noformat}
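
For context (an editorial aside, not part of the original report): the failing call is Lucene's SPI-based codec lookup. {{PostingsFormat.forName()}} goes through {{NamedSPILoader}}, which only knows about implementations registered in META-INF/services/org.apache.lucene.codecs.PostingsFormat files visible on the classpath, and 'SimpleText' is provided by the lucene-codecs module. A minimal sketch of that lookup:

{code}
import org.apache.lucene.codecs.PostingsFormat;

public class PostingsFormatSpiLookup {
  public static void main(String[] args) {
    // names are discovered from META-INF/services/org.apache.lucene.codecs.PostingsFormat
    System.out.println("available: " + PostingsFormat.availablePostingsFormats());

    // this is the call that throws IllegalArgumentException in the failure above
    // when the JAR providing 'SimpleText' (lucene-codecs) is not on the classpath
    PostingsFormat pf = PostingsFormat.forName("SimpleText");
    System.out.println("resolved:  " + pf.getName());
  }
}
{code}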



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues

2016-06-09 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323805#comment-15323805
 ] 

Hoss Man commented on SOLR-5944:


bq. I had a question come up today that I wanted to ask for posterity. What 
gets returned between the time we update one of these and a commit occurs? The 
old value or the new one? I'd assumed the old one but wanted to be sure.

In theory, it's exactly identical to the existing behavior of an atomic update: 
searching returns only the committed values, while an RTG will return the "new" 
(uncommitted) value.
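
(For illustration only, a minimal SolrJ sketch of that visibility difference; the field name, collection URL and update style here are made up, and the behavior is the same as for an atomic update:)

{code}
import java.util.Collections;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class RtgVsSearchSketch {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/collection1")) {
      // index an initial value and commit it
      SolrInputDocument initial = new SolrInputDocument();
      initial.addField("id", "1");
      initial.addField("popularity_i_dvo", 1); // hypothetical docValues field
      client.add(initial);
      client.commit();

      // update the value but do NOT commit
      SolrInputDocument update = new SolrInputDocument();
      update.addField("id", "1");
      update.addField("popularity_i_dvo", Collections.singletonMap("set", 42));
      client.add(update);

      // real-time get consults the update log, so it already sees 42
      System.out.println("RTG:    " + client.getById("1").getFieldValue("popularity_i_dvo"));

      // a normal search still returns the committed value, 1
      System.out.println("search: " + client.query(new SolrQuery("id:1"))
          .getResults().get(0).getFieldValue("popularity_i_dvo"));
    }
  }
}
{code}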


> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, 
> hoss.62D328FA1DEA57FD.fail.txt, hoss.62D328FA1DEA57FD.fail2.txt, 
> hoss.62D328FA1DEA57FD.fail3.txt, hoss.D768DD9443A98DC.fail.txt, 
> hoss.D768DD9443A98DC.pass.txt
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be 
> really nice to have Solr support this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues

2016-06-09 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323803#comment-15323803
 ] 

Erick Erickson commented on SOLR-5944:
--

I had a question come up today that I wanted to ask for posterity. What gets 
returned between the time we update one of these and a commit occurs? The old 
value or the new one? I'd assumed the old one but wanted to be sure.

And I see this explicitly applies _only_ to fields with indexed=false, which 
answers another of the questions that came up.


> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, 
> hoss.62D328FA1DEA57FD.fail.txt, hoss.62D328FA1DEA57FD.fail2.txt, 
> hoss.62D328FA1DEA57FD.fail3.txt, hoss.D768DD9443A98DC.fail.txt, 
> hoss.D768DD9443A98DC.pass.txt
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be 
> really nice to have Solr support this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7287) New lemma-tizer plugin for ukrainian language.

2016-06-09 Thread Andriy Rysin (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323800#comment-15323800
 ] 

Andriy Rysin commented on LUCENE-7287:
--

Ok, I've imported lucene-solr and the Ukrainian analyzer project from 
[~mr_gambal] into Eclipse and looked through the code.
Unfortunately we can't use the whole morfologik package as is - it's very 
specific to Polish. We could still probably use part of morfologik for compact 
dictionary representation. The whole Ukrainian dictionary in this format with 
POS tags is ~1.6MB compared to 98MB in CSV, and we could probably make it 
smaller if we strip the tags.
There are several things I'd like to note:
1) this dictionary covers inflections (not related words), so this stemming 
will produce lemmas rather than root words (this is probably ok and in some 
cases even better?)
2) as this is dictionary-based stemming it won't stem unknown words (but the 
dictionary contains ~200K lemmas so it should give good coverage)
3) as Ukrainian has a high level of inflection (nouns produce up to 7 forms, 
direct verbs up to 20, reverse verbs up to 30 forms) with many rules and 
exceptions, developing quality rule-based stemming will not be trivial
4) I was planning to work on the Ukrainian analyzer in a separate project, but 
if it's better for the review process I can fork lucene-solr and work inside 
the fork
5) I am thinking of creating org.apache.lucene.analysis.uk classes based on 
[~mr_gambal]'s work and the csv file we have, and once it's working, try a more 
compact representation

The question: once we have it working, shall we include the dictionary in the 
lucene project or make it an external dependency (like with 
morfologik-polish.jar)? The first is simpler, but the second will allow easy 
updates to the dictionary (which I can see being actively developed for another 
year or two) and will also keep the binary blob out of the project. I am 
leaning towards the second but open to discussion.
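
To illustrate roughly what I mean by the dictionary-based approach for the org.apache.lucene.analysis.uk classes, here is a simplified sketch (the class name and the single-lemma Map are placeholders; the real thing would load the morfologik/CSV dictionary and could return multiple lemmas per form):

{code}
import java.io.IOException;
import java.util.Map;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.KeywordAttribute;

/** Sketch: replaces each token with its lemma if the dictionary knows the word form. */
public final class UkrainianDictionaryLemmatizerFilter extends TokenFilter {

  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
  private final KeywordAttribute keywordAtt = addAttribute(KeywordAttribute.class);
  private final Map<String, String> formToLemma; // loaded from the dictionary

  public UkrainianDictionaryLemmatizerFilter(TokenStream input, Map<String, String> formToLemma) {
    super(input);
    this.formToLemma = formToLemma;
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false;
    }
    if (!keywordAtt.isKeyword()) {
      String lemma = formToLemma.get(termAtt.toString());
      if (lemma != null) { // unknown words pass through unchanged
        termAtt.setEmpty().append(lemma);
      }
    }
    return true;
  }
}
{code}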



> New lemma-tizer plugin for ukrainian language.
> --
>
> Key: LUCENE-7287
> URL: https://issues.apache.org/jira/browse/LUCENE-7287
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/analysis
>Reporter: Dmytro Hambal
>Priority: Minor
>  Labels: analysis, language, plugin
>
> Hi all,
> I wonder whether you are interested in supporting a plugin which provides a 
> mapping between ukrainian word forms and their lemmas. Some tests and docs go 
> out-of-the-box =) .
> https://github.com/mrgambal/elasticsearch-ukrainian-lemmatizer
> It's really simple but still works and generates some value for its users.
> More: https://github.com/elastic/elasticsearch/issues/18303



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues

2016-06-09 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323784#comment-15323784
 ] 

Hoss Man commented on SOLR-5944:



bq. When update2 (say a partial update) arrives before update1 (say a full 
update, on which update2 depends), *then the call for indexing update2 is a 
blocking call* (which finishes either after update1 is indexed, or timeout is 
reached).

Ahhh... now it makes sense to me. The part I wasn't getting before was that 
update2 blocks on the replica until it sees the update1 it is dependent on.

I feel like there is probably a way we could write a more sophisticated "grey 
box" type test for this leveraging callbacks in the DebugFilter, but I'm having 
trouble working out what that would really look like.

I think the heuristic approach you're taking here is generally fine for now (as 
a way to _try_ to run the updates in a given order even though we know there 
are no guarantees) but I have a few suggestions to improve things:

* lots more comments in the test code to make it clear that we use multiple 
threads because each update may block if it depends on another update
* replace the comments on the sleep calls to make it clear that while we can't 
guarantee/trust what order the updates are executed in since multiple threads 
are involved, we're trying to bias the thread scheduling to run them in the 
order submitted
** (the wording right now seems definitive and makes the code look clearly 
suspicious)
* create {{atLeast(3)}} updates instead of just a fixed set of "3" so we 
increase our odds of finding potential bugs when more than one update is out of 
order.
* loop over multiple (random) permutations of orderings of the updates
** don't worry about whether a given ordering is actually correct; that's a 
valid random ordering for the purposes of the test
** a simple comment saying we know it's possible but it doesn't affect any 
assumptions/assertions in the test is fine
* for each random permutation, execute it (and check the results) multiple times
** this will help increase the odds that the thread scheduling actually winds 
up running our updates in the order we were hoping for.
* essentially this should be a micro "stress test" of updates in arbitrary 
order

Something like...

{code}
final String ID = "0";
final int numUpdates = atLeast(3);
final int numPermutationTotest = atLeast(5);
for (int p = 0; p < numPermutationTotest; p++) {
  del("*:*);
  commit();
  index("id",ID, ...); // goes to all replicas
  commit();
  long version = assertExpectedValuesViaRTG(LEADER, ID, ...);
  List updates = makeListOfSequentialSimulatedUpdates(ID, 
version, numUpdates);
  for (UpdateRequest req : updates) {
assertEquals(0, REPLICA_1.requets(req).getStatus());
  }
  Collections.shuffle(updates, random());
  // this method is where you'd comment the hell out of why we use threads for 
this,
  // and can be re-used in the other place where a threadpool is used...
  assertSendUpdatesInThreadsWithDelay(REPLICA_0, updates, 100ms);
  for (SolrClient client : NONLEADERS) [
// assert value on replica matches original value + numUpdates
  }
}
{code}
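
(Rough sketch of what that {{assertSendUpdatesInThreadsWithDelay}} helper might look like -- the name and signature above are made up, this is just the "submit each update from its own thread, staggered, to bias the ordering" idea:)

{code}
private void assertSendUpdatesInThreadsWithDelay(final SolrClient client,
                                                 List<UpdateRequest> updates,
                                                 long delayMillis) throws Exception {
  // one thread per update, because an out-of-order update may block on the replica
  ExecutorService pool = Executors.newFixedThreadPool(updates.size());
  List<Future<?>> futures = new ArrayList<>();
  try {
    for (final UpdateRequest req : updates) {
      futures.add(pool.submit(() -> {
        // HttpSolrClient throws on a non-2xx response, so this doubles as the "status 0" check
        client.request(req);
        return null;
      }));
      Thread.sleep(delayMillis); // bias (but don't guarantee) execution in submission order
    }
    for (Future<?> f : futures) {
      f.get(); // propagate any failures from the worker threads
    }
  } finally {
    pool.shutdownNow();
  }
}
{code}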



As a related matter -- if we are expecting a replica to "block & eventually 
time out" when it sees an out of order update, then there should be a white box 
test asserting the expected failure situation as well -- something like...

{code}
final String ID = "0";
del("*:*);
commit();
index("id",ID, ...);
UpdateRequest req = simulatedUpdateRequest(version + 1, ID, ...);
Timer timer = new Timer();
timer.start();
SolrServerException e = expectThrows(() -> { REPLICA_0.request(req); });
timer.stop();
assert( /* elapsed time of timer is at least the X that we expect it to block 
for */ )
assert(e.getgetHttpStatusMesg().contains("something we expect it to say if the 
update was out of order"))
assertEquls(/* whatever we expect in this case */, e.getHttpStatusCode());
{code}


> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> 

[JENKINS] Lucene-Solr-master-Solaris (64bit/jdk1.8.0) - Build # 638 - Still Failing!

2016-06-09 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-master-Solaris/638/
Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseG1GC

No tests ran.

Build Log:
[...truncated 11409 lines...]
ERROR: Connection was broken: java.io.IOException: Unexpected termination of 
the channel
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
at 
java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2325)
at 
java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794)
at 
java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
at 
hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:48)
at 
hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

Build step 'Invoke Ant' marked build as failure
ERROR: Step ‘Archive the artifacts’ failed: no workspace for 
Lucene-Solr-master-Solaris #638
ERROR: Step ‘Scan for compiler warnings’ failed: no workspace for 
Lucene-Solr-master-Solaris #638
ERROR: Step ‘Publish JUnit test result report’ failed: no workspace for 
Lucene-Solr-master-Solaris #638
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[JENKINS] Lucene-Solr-6.x-MacOSX (64bit/jdk1.8.0) - Build # 197 - Failure!

2016-06-09 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-MacOSX/197/
Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseG1GC

1 tests failed.
FAILED:  
org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication

Error Message:
[/Users/jenkins/workspace/Lucene-Solr-6.x-MacOSX/solr/build/solr-core/test/J0/temp/solr.handler.TestReplicationHandler_828895B9C1874636-001/solr-instance-002/./collection1/data,
 
/Users/jenkins/workspace/Lucene-Solr-6.x-MacOSX/solr/build/solr-core/test/J0/temp/solr.handler.TestReplicationHandler_828895B9C1874636-001/solr-instance-002/./collection1/data/index.20160610113428377,
 
/Users/jenkins/workspace/Lucene-Solr-6.x-MacOSX/solr/build/solr-core/test/J0/temp/solr.handler.TestReplicationHandler_828895B9C1874636-001/solr-instance-002/./collection1/data/index.20160610113428637]
 expected:<2> but was:<3>

Stack Trace:
java.lang.AssertionError: 
[/Users/jenkins/workspace/Lucene-Solr-6.x-MacOSX/solr/build/solr-core/test/J0/temp/solr.handler.TestReplicationHandler_828895B9C1874636-001/solr-instance-002/./collection1/data,
 
/Users/jenkins/workspace/Lucene-Solr-6.x-MacOSX/solr/build/solr-core/test/J0/temp/solr.handler.TestReplicationHandler_828895B9C1874636-001/solr-instance-002/./collection1/data/index.20160610113428377,
 
/Users/jenkins/workspace/Lucene-Solr-6.x-MacOSX/solr/build/solr-core/test/J0/temp/solr.handler.TestReplicationHandler_828895B9C1874636-001/solr-instance-002/./collection1/data/index.20160610113428637]
 expected:<2> but was:<3>
at 
__randomizedtesting.SeedInfo.seed([828895B9C1874636:75FB7BE1076FE9D0]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.solr.handler.TestReplicationHandler.checkForSingleIndex(TestReplicationHandler.java:902)
at 
org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication(TestReplicationHandler.java:1334)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 

[JENKINS] Lucene-Solr-6.x-Linux (32bit/jdk1.8.0_92) - Build # 865 - Failure!

2016-06-09 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/865/
Java: 32bit/jdk1.8.0_92 -client -XX:+UseG1GC

1 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.schema.TestManagedSchemaAPI

Error Message:
ObjectTracker found 4 object(s) that were not released!!! [TransactionLog, 
MDCAwareThreadPoolExecutor, MockDirectoryWrapper, MockDirectoryWrapper]

Stack Trace:
java.lang.AssertionError: ObjectTracker found 4 object(s) that were not 
released!!! [TransactionLog, MDCAwareThreadPoolExecutor, MockDirectoryWrapper, 
MockDirectoryWrapper]
at __randomizedtesting.SeedInfo.seed([344274A1430716F1]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertNull(Assert.java:551)
at 
org.apache.solr.SolrTestCaseJ4.teardownTestCases(SolrTestCaseJ4.java:257)
at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:834)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 11090 lines...]
   [junit4] Suite: org.apache.solr.schema.TestManagedSchemaAPI
   [junit4]   2> Creating dataDir: 
/home/jenkins/workspace/Lucene-Solr-6.x-Linux/solr/build/solr-core/test/J1/temp/solr.schema.TestManagedSchemaAPI_344274A1430716F1-001/init-core-data-001
   [junit4]   2> 671495 INFO  
(SUITE-TestManagedSchemaAPI-seed#[344274A1430716F1]-worker) [] 
o.a.s.SolrTestCaseJ4 Randomized ssl (true) and clientAuth (true) via: 
@org.apache.solr.util.RandomizeSSL(reason=, ssl=NaN, value=NaN, clientAuth=NaN)
   [junit4]   2> 671497 INFO  
(SUITE-TestManagedSchemaAPI-seed#[344274A1430716F1]-worker) [] 
o.a.s.c.ZkTestServer STARTING ZK TEST SERVER
   [junit4]   2> 671497 INFO  (Thread-1068) [] o.a.s.c.ZkTestServer client 
port:0.0.0.0/0.0.0.0:0
   [junit4]   2> 671497 INFO  (Thread-1068) [] o.a.s.c.ZkTestServer 
Starting server
   [junit4]   2> 671597 INFO  
(SUITE-TestManagedSchemaAPI-seed#[344274A1430716F1]-worker) [] 
o.a.s.c.ZkTestServer start zk server on port:34033
   [junit4]   2> 671597 INFO  
(SUITE-TestManagedSchemaAPI-seed#[344274A1430716F1]-worker) [] 
o.a.s.c.c.SolrZkClient Using default ZkCredentialsProvider
   [junit4]   2> 671598 INFO  
(SUITE-TestManagedSchemaAPI-seed#[344274A1430716F1]-worker) [] 
o.a.s.c.c.ConnectionManager Waiting for client to connect to ZooKeeper
   [junit4]   2> 671600 INFO  (zkCallback-453-thread-1) [] 
o.a.s.c.c.ConnectionManager Watcher 
org.apache.solr.common.cloud.ConnectionManager@b1a1b5 name:ZooKeeperConnection 
Watcher:127.0.0.1:34033 got event WatchedEvent state:SyncConnected type:None 
path:null path:null type:None
   [junit4]   2> 671600 INFO  
(SUITE-TestManagedSchemaAPI-seed#[344274A1430716F1]-worker) [] 
o.a.s.c.c.ConnectionManager Client is connected to ZooKeeper
   [junit4]   2> 671601 INFO  

[jira] [Comment Edited] (SOLR-9200) Add Delegation Token Support to Solr

2016-06-09 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323649#comment-15323649
 ] 

Gregory Chanan edited comment on SOLR-9200 at 6/10/16 12:25 AM:


I started working on this.  One issue I immediately hit was HADOOP-12767 -- it 
appears that upgrading the httpclient version requires inserting a null check 
on the delegation token checking path.

Also note that HADOOP-12767 was fixed in hadoop 2.8 but the latest stable 
release is 2.7.2.


was (Author: gchanan):
I started working on this.  One issue I immediately hit was HADOOP-12767 -- it 
appears upgrading the httpclient version causes a null check to need to be 
inserted on the path of delegation token checking.

> Add Delegation Token Support to Solr
> 
>
> Key: SOLR-9200
> URL: https://issues.apache.org/jira/browse/SOLR-9200
> Project: Solr
>  Issue Type: New Feature
>  Components: security
>Reporter: Gregory Chanan
>Assignee: Gregory Chanan
>
> SOLR-7468 added support for kerberos authentication via the hadoop 
> authentication filter.  Hadoop also has support for an authentication filter 
> that supports delegation tokens, which allow authenticated users the ability 
> to grab/renew/delete a token that can be used to bypass the normal 
> authentication path for a time.  This is useful in a variety of use cases:
> 1) distributed clients (e.g. MapReduce) where each client may not have access 
> to the user's kerberos credentials.  Instead, the job runner can grab a 
> delegation token and use that during task execution.
> 2) If the load on the kerberos server is too high, delegation tokens can 
> avoid hitting the kerberos server after the first request
> 3) If requests/permissions need to be delegated to another user: the more 
> privileged user can request a delegation token that can be passed to the less 
> privileged user.
> Note to self:
> In 
> https://issues.apache.org/jira/browse/SOLR-7468?focusedCommentId=14579636=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14579636
>  I made the following comment which I need to investigate further, since I 
> don't know if anything changed in this area:
> {quote}3) I'm a little concerned with the "NoContext" code in KerberosPlugin 
> moving forward (I understand this is more a generic auth question than 
> kerberos specific). For example, in the latest version of the filter we are 
> using at Cloudera, we play around with the ServletContext in order to pass 
> information around 
> (https://github.com/cloudera/lucene-solr/blob/cdh5-4.10.3_5.4.2/solr/core/src/java/org/apache/solr/servlet/SolrHadoopAuthenticationFilter.java#L106).
>  Is there any way we can get the actual ServletContext in a plugin?{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9200) Add Delegation Token Support to Solr

2016-06-09 Thread Gregory Chanan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323649#comment-15323649
 ] 

Gregory Chanan commented on SOLR-9200:
--

I started working on this.  One issue I immediately hit was HADOOP-12767 -- it 
appears that upgrading the httpclient version requires inserting a null check 
on the delegation token checking path.

> Add Delegation Token Support to Solr
> 
>
> Key: SOLR-9200
> URL: https://issues.apache.org/jira/browse/SOLR-9200
> Project: Solr
>  Issue Type: New Feature
>  Components: security
>Reporter: Gregory Chanan
>Assignee: Gregory Chanan
>
> SOLR-7468 added support for kerberos authentication via the hadoop 
> authentication filter.  Hadoop also has support for an authentication filter 
> that supports delegation tokens, which allow authenticated users the ability 
> to grab/renew/delete a token that can be used to bypass the normal 
> authentication path for a time.  This is useful in a variety of use cases:
> 1) distributed clients (e.g. MapReduce) where each client may not have access 
> to the user's kerberos credentials.  Instead, the job runner can grab a 
> delegation token and use that during task execution.
> 2) If the load on the kerberos server is too high, delegation tokens can 
> avoid hitting the kerberos server after the first request
> 3) If requests/permissions need to be delegated to another user: the more 
> privileged user can request a delegation token that can be passed to the less 
> privileged user.
> Note to self:
> In 
> https://issues.apache.org/jira/browse/SOLR-7468?focusedCommentId=14579636=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14579636
>  I made the following comment which I need to investigate further, since I 
> don't know if anything changed in this area:
> {quote}3) I'm a little concerned with the "NoContext" code in KerberosPlugin 
> moving forward (I understand this is more a generic auth question than 
> kerberos specific). For example, in the latest version of the filter we are 
> using at Cloudera, we play around with the ServletContext in order to pass 
> information around 
> (https://github.com/cloudera/lucene-solr/blob/cdh5-4.10.3_5.4.2/solr/core/src/java/org/apache/solr/servlet/SolrHadoopAuthenticationFilter.java#L106).
>  Is there any way we can get the actual ServletContext in a plugin?{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-NightlyTests-6.1 - Build # 1 - Failure

2016-06-09 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-6.1/1/

2 tests failed.
FAILED:  org.apache.solr.hadoop.MorphlineBasicMiniMRTest.mrRun

Error Message:
Failed on local exception: java.io.IOException: Broken pipe; Host Details : 
local host is: "lucene1-us-west/10.41.0.5"; destination host is: 
"lucene1-us-west.apache.org":33597; 

Stack Trace:
java.io.IOException: Failed on local exception: java.io.IOException: Broken 
pipe; Host Details : local host is: "lucene1-us-west/10.41.0.5"; destination 
host is: "lucene1-us-west.apache.org":33597; 
at 
__randomizedtesting.SeedInfo.seed([67FCD500885A0325:69AE610E89CC312A]:0)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy111.getClusterMetrics(Unknown Source)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy112.getClusterMetrics(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:461)
at 
org.apache.hadoop.mapred.ResourceMgrDelegate.getClusterMetrics(ResourceMgrDelegate.java:151)
at 
org.apache.hadoop.mapred.YARNRunner.getClusterMetrics(YARNRunner.java:179)
at 
org.apache.hadoop.mapreduce.Cluster.getClusterStatus(Cluster.java:246)
at org.apache.hadoop.mapred.JobClient$3.run(JobClient.java:719)
at org.apache.hadoop.mapred.JobClient$3.run(JobClient.java:717)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.hadoop.mapred.JobClient.getClusterStatus(JobClient.java:717)
at 
org.apache.solr.hadoop.MapReduceIndexerTool.run(MapReduceIndexerTool.java:645)
at 
org.apache.solr.hadoop.MapReduceIndexerTool.run(MapReduceIndexerTool.java:608)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at 
org.apache.solr.hadoop.MorphlineBasicMiniMRTest.mrRun(MorphlineBasicMiniMRTest.java:364)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
  

[JENKINS] Lucene-Solr-6.1-Linux (64bit/jdk1.8.0_92) - Build # 7 - Failure!

2016-06-09 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-6.1-Linux/7/
Java: 64bit/jdk1.8.0_92 -XX:+UseCompressedOops -XX:+UseParallelGC

1 tests failed.
FAILED:  org.apache.solr.handler.TestReqParamsAPI.test

Error Message:
Could not get expected value  'CY val' for path 'response/params/y/c' full 
output: {   "responseHeader":{ "status":0, "QTime":0},   "response":{   
  "znodeVersion":0, "params":{"x":{ "a":"A val", "b":"B 
val", "":{"v":0},  from server:  https://127.0.0.1:38739/collection1

Stack Trace:
java.lang.AssertionError: Could not get expected value  'CY val' for path 
'response/params/y/c' full output: {
  "responseHeader":{
"status":0,
"QTime":0},
  "response":{
"znodeVersion":0,
"params":{"x":{
"a":"A val",
"b":"B val",
"":{"v":0},  from server:  https://127.0.0.1:38739/collection1
at 
__randomizedtesting.SeedInfo.seed([C9C144B8420A7A25:41957B62ECF617DD]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.core.TestSolrConfigHandler.testForResponseElement(TestSolrConfigHandler.java:481)
at 
org.apache.solr.handler.TestReqParamsAPI.testReqParams(TestReqParamsAPI.java:160)
at 
org.apache.solr.handler.TestReqParamsAPI.test(TestReqParamsAPI.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323589#comment-15323589
 ] 

ASF GitHub Bot commented on SOLR-8744:
--

Github user dragonsinth commented on the issue:

https://github.com/apache/lucene-solr/pull/42
  
FYI: both of these SHAs passed the test suite


> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and just 
> tries to schedule that task.
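
To make the proposed design above concrete, here is a minimal Java sketch of a
lazily built lock tree. It is purely illustrative and is not the attached
SmileyLockTree.java: the class name, the path-based API and the 100 ms details are
assumptions; only the locking rule (a node is acquirable when it and all of its
descendants are free) is taken from the description above.

{code:java}
// Illustrative sketch only (not SmileyLockTree.java): a lazily built lock tree keyed
// by collection/shard/replica path. The access level is implied by the path depth:
// CLUSTER = [], COLLECTION = ["C1"], SHARD = ["C1","S1"], REPLICA = ["C1","S1","R1"].
import java.util.HashMap;
import java.util.Map;

public class LockTreeSketch {
  private final Map<String, LockTreeSketch> children = new HashMap<>();
  private boolean locked;

  /** Walks down the path, creating nodes lazily as tasks for new collections/shards arrive. */
  private LockTreeSketch descend(String[] path, int depth) {
    if (depth == path.length) return this;
    return children.computeIfAbsent(path[depth], k -> new LockTreeSketch()).descend(path, depth + 1);
  }

  /** A node is acquirable only if it and its whole subtree are currently unlocked. */
  private boolean subtreeFree() {
    if (locked) return false;
    for (LockTreeSketch child : children.values()) {
      if (!child.subtreeFree()) return false;
    }
    return true;
  }

  /** Try to acquire e.g. tryLock("C1", "S1"); a full version would also check ancestors. */
  public synchronized boolean tryLock(String... path) {
    LockTreeSketch target = descend(path, 0);
    if (!target.subtreeFree()) return false;   // some shard/replica below is still busy
    target.locked = true;
    return true;
  }

  public synchronized void unlock(String... path) {
    descend(path, 0).locked = false;
  }
}
{code}

With a tree like this, two SPLITSHARD tasks locking ("C1", "S1") and ("C1", "S2")
can run in parallel, while a collection-level task on ("C1") stays blocked until
both shard locks are released.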



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr issue #42: SOLR-8744 blockedTasks

2016-06-09 Thread dragonsinth
Github user dragonsinth commented on the issue:

https://github.com/apache/lucene-solr/pull/42
  
FYI: both of these SHAs passed the test suite


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7325) GeoPointInBBoxQuery no longer includes boundaries?

2016-06-09 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-7325:
---
Attachment: LUCENE-7325.patch

The attached patch fixes a bug in {{GeoPointPrefixTermsEnum}} and removes 
{{GeoPointTestUtil}} from {{TestGeoPointQuery}}. On my machine the random 
testing passed a couple hundred beast iterations, but Mike's simple test still 
fails. So are we sure the random tests are exercising these boundary conditions 
as often as we think? Or did I miss something?

Anyway, the failure is related to LUCENE-7166. Specifically, [This 
change|https://github.com/apache/lucene-solr/commit/f8ea8b855e43fc0a2fa434ab8c366de810047c8f#diff-abb82bc6b100f659fad4b75e44c018cdL63]
 which Robert's comment explains quite nicely. The fix (mentioned in the 
comment) is to move to the stable 64 bit encoding in LUCENE-7186. But 
GeoPointField will need bwc with the 62 bit encoding. If we want this for 6.1 
(which I think we will need?) it should probably be labeled as a blocker.

> GeoPointInBBoxQuery no longer includes boundaries?
> --
>
> Key: LUCENE-7325
> URL: https://issues.apache.org/jira/browse/LUCENE-7325
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.1
>Reporter: Michael McCandless
>Priority: Blocker
> Attachments: LUCENE-7325.patch, LUCENE-7325.patch
>
>
> {{GeoPointInBBoxQuery}} is supposed to include boundaries, and it does in 5.x 
> and 6.0, but in 6.1 something broke.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7374) Backup/Restore should provide a param for specifying the directory implementation it should use

2016-06-09 Thread Hrishikesh Gadre (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323481#comment-15323481
 ] 

Hrishikesh Gadre commented on SOLR-7374:


[~markrmil...@gmail.com] Let me know if you need anything from my side. If you 
could post the test params (for failure), I can take a look.

> Backup/Restore should provide a param for specifying the directory 
> implementation it should use
> ---
>
> Key: SOLR-7374
> URL: https://issues.apache.org/jira/browse/SOLR-7374
> Project: Solr
>  Issue Type: Bug
>Reporter: Varun Thacker
>Assignee: Mark Miller
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7374.patch, SOLR-7374.patch, SOLR-7374.patch
>
>
> Currently when we create a backup we use SimpleFSDirectory to write the 
> backup indexes. Similarly during a restore we open the index using 
> FSDirectory.open . 
> We should provide a param called {{directoryImpl}} or {{type}} which will be 
> used to specify the Directory implementation to backup the index. 
> Likewise during a restore you would need to specify the directory impl which 
> was used during backup so that the index can be opened correctly.
> This param will address the problem that currently if a user is running Solr 
> on HDFS there is no way to use the backup/restore functionality as the 
> directory is hardcoded.
> With this one could be running Solr on a local FS but backup the index on 
> HDFS etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323478#comment-15323478
 ] 

Scott Blum commented on SOLR-8744:
--

See second commit in https://github.com/apache/lucene-solr/pull/42

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and just 
> tries to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323476#comment-15323476
 ] 

ASF GitHub Bot commented on SOLR-8744:
--

GitHub user dragonsinth opened a pull request:

https://github.com/apache/lucene-solr/pull/42

SOLR-8744 blockedTasks

@noblepaul 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fullstorydev/lucene-solr SOLR-8744

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/42.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #42


commit 668d426633fe5551b24ab38036e14c15e7ed4cdf
Author: Scott Blum 
Date:   2016-06-09T21:28:07Z

WIP: SOLR-8744 blockedTasks

commit 4b2512abcc5e55c653c923eba9762988cf1faae8
Author: Scott Blum 
Date:   2016-06-09T22:11:26Z

Simplifications




> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and just 
> tries to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request #42: SOLR-8744 blockedTasks

2016-06-09 Thread dragonsinth
GitHub user dragonsinth opened a pull request:

https://github.com/apache/lucene-solr/pull/42

SOLR-8744 blockedTasks

@noblepaul 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fullstorydev/lucene-solr SOLR-8744

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/42.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #42


commit 668d426633fe5551b24ab38036e14c15e7ed4cdf
Author: Scott Blum 
Date:   2016-06-09T21:28:07Z

WIP: SOLR-8744 blockedTasks

commit 4b2512abcc5e55c653c923eba9762988cf1faae8
Author: Scott Blum 
Date:   2016-06-09T22:11:26Z

Simplifications




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9185) Solr's "Lucene"/standard query parser should not split on whitespace before sending terms to analysis

2016-06-09 Thread Mary Jo Sminkey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323464#comment-15323464
 ] 

Mary Jo Sminkey commented on SOLR-9185:
---

This has been an issue for a LONG time, and the available solutions are not always 
usable, let alone ideal. This would be my #1 thing to be fixed in Solr. 

> Solr's "Lucene"/standard query parser should not split on whitespace before 
> sending terms to analysis
> -
>
> Key: SOLR-9185
> URL: https://issues.apache.org/jira/browse/SOLR-9185
> Project: Solr
>  Issue Type: Bug
>Reporter: Steve Rowe
>Assignee: Steve Rowe
>
> Copied from LUCENE-2605:
> The queryparser parses input on whitespace, and sends each whitespace 
> separated term to its own independent token stream.
> This breaks the following at query-time, because they can't see across 
> whitespace boundaries:
> n-gram analysis
> shingles
> synonyms (especially multi-word for whitespace-separated languages)
> languages where a 'word' can contain whitespace (e.g. vietnamese)
> Its also rather unexpected, as users think their 
> charfilters/tokenizers/tokenfilters will do the same thing at index and 
> querytime, but
> in many cases they can't. Instead, preferably the queryparser would parse 
> around only real 'operators'.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323447#comment-15323447
 ] 

Scott Blum commented on SOLR-8744:
--

One other comment:

{code}
// We are breaking out if we already have reached the no:of 
parallel tasks running
// By doing so we may end up discarding the old list of blocked 
tasks . But we have
// no means to know if they would still be blocked after some of 
the items ahead
// were cleared.
if (runningTasks.size() >= MAX_PARALLEL_TASKS) break;
{code}

In this case, can't we just shove all the remaining tasks into blockedTasks?  
It's a slight redefinition to mean either "tasks that are blocked because of 
locks" or "tasks blocked because too many are running".

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and just 
> tries to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-6.x-Windows (32bit/jdk1.8.0_92) - Build # 239 - Still Failing!

2016-06-09 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Windows/239/
Java: 32bit/jdk1.8.0_92 -client -XX:+UseConcMarkSweepGC

2 tests failed.
FAILED:  org.apache.lucene.store.TestMmapDirectory.testPendingDeletions

Error Message:
access denied ("java.io.FilePermission" 
"C:\Users\jenkins\workspace\Lucene-Solr-6.x-Windows\lucene\build\core\test\J0\temp\lucene.store.TestMmapDirectory_96FC6F2D45B76809-001\tempDir-007\con"
 "write")

Stack Trace:
java.security.AccessControlException: access denied ("java.io.FilePermission" 
"C:\Users\jenkins\workspace\Lucene-Solr-6.x-Windows\lucene\build\core\test\J0\temp\lucene.store.TestMmapDirectory_96FC6F2D45B76809-001\tempDir-007\con"
 "write")
at 
__randomizedtesting.SeedInfo.seed([96FC6F2D45B76809:DE018553DD164AB]:0)
at 
java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
at 
java.security.AccessController.checkPermission(AccessController.java:884)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
at java.lang.SecurityManager.checkWrite(SecurityManager.java:979)
at sun.nio.fs.WindowsChannelFactory.open(WindowsChannelFactory.java:295)
at 
sun.nio.fs.WindowsChannelFactory.newFileChannel(WindowsChannelFactory.java:162)
at 
sun.nio.fs.WindowsFileSystemProvider.newByteChannel(WindowsFileSystemProvider.java:225)
at 
java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
at 
org.apache.lucene.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:197)
at 
org.apache.lucene.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:197)
at 
org.apache.lucene.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:129)
at 
org.apache.lucene.mockfile.HandleTrackingFS.newOutputStream(HandleTrackingFS.java:129)
at 
org.apache.lucene.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:197)
at 
org.apache.lucene.mockfile.FilterFileSystemProvider.newOutputStream(FilterFileSystemProvider.java:197)
at java.nio.file.Files.newOutputStream(Files.java:216)
at 
org.apache.lucene.store.FSDirectory$FSIndexOutput.(FSDirectory.java:408)
at 
org.apache.lucene.store.FSDirectory$FSIndexOutput.(FSDirectory.java:404)
at 
org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
at 
org.apache.lucene.store.BaseDirectoryTestCase.testPendingDeletions(BaseDirectoryTestCase.java:1215)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 

[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Scott Blum (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Blum updated SOLR-8744:
-
Attachment: SOLR-8744.patch

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SmileyLockTree.java, SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and just 
> tries to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323436#comment-15323436
 ] 

Scott Blum commented on SOLR-8744:
--

Actually, I may have a fix.  You need a Thread.sleep() in the final loop or you 
can burn through 500 iterations on a fast machine.  Here's a patch.
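
The shape of that fix, as a stand-alone sketch rather than the actual patch: the
500-iteration bound comes from the comment above, while the AtomicBoolean condition,
the 100 ms pause and the class name are illustrative assumptions.

{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

// Without the sleep, a fast machine spins through all 500 iterations before the
// condition ever has a chance to flip; with it, the loop waits up to ~50 seconds.
public class WaitLoopSketch {
  static boolean waitForCondition(AtomicBoolean done) throws InterruptedException {
    for (int i = 0; i < 500; i++) {
      if (done.get()) return true;   // condition reached, stop polling
      Thread.sleep(100);             // the missing back-off between checks
    }
    return false;                    // timed out
  }
}
{code}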

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and just 
> tries to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323393#comment-15323393
 ] 

Scott Blum commented on SOLR-8744:
--

I got one test failure patching this into master:

MultiThreadedOCPTest fails reliably:

java.lang.AssertionError
at 
__randomizedtesting.SeedInfo.seed([7F3C69CF3DAFFDD5:F76856159353902D]:0)
at org.junit.Assert.fail(Assert.java:92)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertTrue(Assert.java:54)
at 
org.apache.solr.cloud.MultiThreadedOCPTest.testFillWorkQueue(MultiThreadedOCPTest.java:106)


> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>  Cluster
> /   \
>/ \ 
>   C1  C2
>  / \ /   \ 
> /   \   / \  
>S1   S2  S1 S2
> R1, R2  R3.R4  R1,R2   R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks . Every task would do a 
> {{notify()}} on the monitor of  the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed, if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and just 
> tries to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7341) xjoin - join data from external sources

2016-06-09 Thread Adam Gamble (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323349#comment-15323349
 ] 

Adam Gamble commented on SOLR-7341:
---

I would love this feature as well. Is there a chance this will get merged? Or 
is it dead in the water?

> xjoin - join data from external sources
> ---
>
> Key: SOLR-7341
> URL: https://issues.apache.org/jira/browse/SOLR-7341
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Tom Winch
>Priority: Minor
> Fix For: 4.10.3, 5.3.2, 6.0
>
> Attachments: SOLR-7341.patch-4.10.3, SOLR-7341.patch-4_10, 
> SOLR-7341.patch-5.3.2, SOLR-7341.patch-5_3, SOLR-7341.patch-master, 
> SOLR-7341.patch-trunk, SOLR-7341.patch-trunk
>
>
> h2. XJoin
> The "xjoin" SOLR contrib allows external results to be joined with SOLR 
> results in a query and the SOLR result set to be filtered by the results of 
> an external query. Values from the external results are made available in the 
> SOLR results and may also be used to boost the scores of corresponding 
> documents during the search. The contrib consists of the Java classes 
> XJoinSearchComponent, XJoinValueSourceParser and XJoinQParserPlugin (and 
> associated classes), which must be configured in solrconfig.xml, and the 
> interfaces XJoinResultsFactory and XJoinResults, which are implemented by the 
> user to provide the link between SOLR and the external results source (but 
> see below for details of how to use the in-built SimpleXJoinResultsFactory 
> implementation). External results and SOLR documents are matched via a single 
> configurable attribute (the "join field").
> To include the XJoin contrib classes, add the following config to 
> solrconfig.xml:
> {code:xml}
> <config>
>   ..
>   <lib dir="..." regex=".*\.jar" />
>   <lib dir="..." regex="solr-xjoin-\d.*\.jar" />
>   ..
> </config>
> {code}
> Note that any JARs containing implementations of the XJoinResultsFactory must 
> also be included.
> h2. Java classes and interfaces
> h3. XJoinResultsFactory
> The user implementation of this interface is responsible for connecting to an 
> external source to perform a query (or otherwise collect results). Parameters 
> with prefix "<component name>.external." are passed from the SOLR query URL 
> to parameterise the search. The interface has the following methods:
> * void init(NamedList args) - this is called during SOLR initialisation, and 
> passed parameters from the search component configuration (see below)
> * XJoinResults getResults(SolrParams params) - this is called during a SOLR 
> search to generate external results, and is passed parameters from the SOLR 
> query URL (as above)
> For example, the implementation might perform queries of an external source 
> based on the 'q' SOLR query URL parameter (in full, <component name>.external.q).
> h3. XJoinResults
> A user implementation of this interface is returned by the getResults() 
> method of the XJoinResultsFactory implementation. It has methods:
> * Object getResult(String joinId) - this should return a particular result 
> given the value of the join attribute
> * Iterable getJoinIds() - this should return an ordered (ascending) 
> list of the join attribute values for all results of the external search
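
As a stand-alone illustration of those two interfaces (defined inline below with
plain Maps in place of Solr's NamedList/SolrParams, since the real ones live in the
contrib patch), a toy factory that "queries" a fixed map instead of a real external
service might look like this; all names ending in "Sketch" and the MapBackedResultsFactory
are assumptions for illustration only.

{code:java}
import java.util.Map;
import java.util.TreeMap;

// Sketch of the user-facing contract described above; types are simplified on purpose.
interface XJoinResultsFactorySketch {
  void init(Map<String, String> args);                          // from solrconfig.xml
  XJoinResultsSketch getResults(Map<String, String> params);    // from the query URL
}

interface XJoinResultsSketch {
  Object getResult(String joinId);        // one external result, keyed by join-field value
  Iterable<String> getJoinIds();          // join-field values in ascending order
}

/** Toy implementation backed by a fixed map instead of a real external query. */
class MapBackedResultsFactory implements XJoinResultsFactorySketch {
  private final TreeMap<String, String> externalData = new TreeMap<>();

  @Override public void init(Map<String, String> args) {
    externalData.put("1", "one");
    externalData.put("2", "two");
    externalData.put("3", "three");
  }

  @Override public XJoinResultsSketch getResults(Map<String, String> params) {
    return new XJoinResultsSketch() {
      @Override public Object getResult(String joinId) { return externalData.get(joinId); }
      @Override public Iterable<String> getJoinIds() { return externalData.keySet(); }  // ascending (TreeMap)
    };
  }
}
{code}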
> h3. XJoinSearchComponent
> This is the central Java class of the contrib. It is a SOLR search component, 
> configured in solrconfig.xml and included in one or more SOLR request 
> handlers. There is one XJoin search component per external source, and each 
> has two main responsibilities:
> * Before the SOLR search, it connects to the external source and retrieves 
> results, storing them in the SOLR request context
> * After the SOLR search, it matches SOLR document in the results set and 
> external results via the join field, adding attributes from the external 
> results to documents in the SOLR results set
> It takes the following initialisation parameters:
> * factoryClass - this specifies the user-supplied class implementing 
> XJoinResultsFactory, used to generate external results
> * joinField - this specifies the attribute on which to join between SOLR 
> documents and external results
> * external - this parameter set is passed to configure the 
> XJoinResultsFactory implementation
> For example, in solrconfig.xml:
> {code:xml}
> <searchComponent name="..." class="org.apache.solr.search.xjoin.XJoinSearchComponent">
>   <str name="factoryClass">test.TestXJoinResultsFactory</str>
>   <str name="joinField">id</str>
>   <lst name="external">
>     <str name="values">1,2,3</str>
>   </lst>
> </searchComponent>
> {code}
> Here, the search component instantiates a new TestXJoinResultsFactory during 
> initialisation, and passes it the "values" parameter (1, 2, 3) to configure 
> it. To properly use the XJoinSearchComponent in a request handler, it must be 
> included at the start and end of the component list, and may be configured 
> with the following query parameters:
> * results - a comma-separated list of attributes from 

[jira] [Commented] (SOLR-9034) Atomic updates not work with CopyField

2016-06-09 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323293#comment-15323293
 ] 

Scott Blum commented on SOLR-9034:
--

Should probably backport to 5.5.2, 5.6

> Atomic updates not work with CopyField
> --
>
> Key: SOLR-9034
> URL: https://issues.apache.org/jira/browse/SOLR-9034
> Project: Solr
>  Issue Type: Bug
>  Components: Server
>Affects Versions: 5.5
>Reporter: Karthik Ramachandran
>Assignee: Yonik Seeley
>  Labels: atomicupdate
> Fix For: 6.0.1, 6.1
>
> Attachments: SOLR-9034.patch, SOLR-9034.patch, SOLR-9034.patch
>
>
> Atomic updates does not work when CopyField has docValues enabled.  Below is 
> the sample schema
> {code:xml|title:schema.xml}
> <field name="..." type="..." indexed="true" stored="true" />
> <field name="..." type="..." indexed="true" stored="true" />
> <field name="..." type="..." indexed="true" stored="true" />
> <field name="..." type="..." docValues="true" indexed="true" stored="false" useDocValuesAsStored="false" />
> <field name="..." type="..." docValues="true" indexed="true" stored="false" useDocValuesAsStored="false" />
> <field name="..." type="..." docValues="true" indexed="true" stored="false" useDocValuesAsStored="false" />
> {code}
> Below is the exception
> {noformat}
> Caused by: java.lang.IllegalArgumentException: DocValuesField
>  "copy_single_i_dvn" appears more than once in this document 
> (only one value is allowed per field)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9181) ZkStateReaderTest failure

2016-06-09 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323283#comment-15323283
 ] 

Scott Blum commented on SOLR-9181:
--

BTW, I have to admit I'm stalling on reviewing the update because there's no 
easy way to compare two patch files except to patch each one in turn and diff 
the result...

> ZkStateReaderTest failure
> -
>
> Key: SOLR-9181
> URL: https://issues.apache.org/jira/browse/SOLR-9181
> Project: Solr
>  Issue Type: Bug
>Reporter: Alan Woodward
>Assignee: Alan Woodward
> Fix For: 6.1
>
> Attachments: SOLR-9181.patch, SOLR-9181.patch, SOLR-9181.patch, 
> SOLR-9181.patch
>
>
> https://builds.apache.org/job/Lucene-Solr-Tests-6.x/243/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-9199) ZkController#publishAndWaitForDownStates logic is inefficient

2016-06-09 Thread Gregory Chanan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregory Chanan resolved SOLR-9199.
--
   Resolution: Fixed
Fix Version/s: 6.2
   master (7.0)

> ZkController#publishAndWaitForDownStates logic is inefficient
> -
>
> Key: SOLR-9199
> URL: https://issues.apache.org/jira/browse/SOLR-9199
> Project: Solr
>  Issue Type: Bug
>Reporter: Hrishikesh Gadre
>Assignee: Hrishikesh Gadre
> Fix For: master (7.0), 6.2
>
> Attachments: SOLR-9199.patch
>
>
> The following logic introduced as part of SOLR-8720 is inefficient. 
> https://github.com/apache/lucene-solr/blob/6c0331b8309603eaaf14b6677afba5ffe99f16a3/solr/core/src/java/org/apache/solr/cloud/ZkController.java#L687-L712
> Specifically,
> * foundStates flag is set to TRUE before the for loop.
> * In the for loop we check if any replica on this node is not in the DOWN 
> state. If yes, then foundStates = FALSE
> * If foundStates == TRUE then we break out of the loop and return.
> The problem here is that once foundStates is set to FALSE, it is never reset 
> to TRUE. Hence we end up spending the whole 60 secs iterating over the 
> collections even though all the replicas are marked as DOWN in later 
> iterations.
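
A toy reconstruction of that pattern (not the actual ZkController code; the method
shape, the Supplier of per-replica states and the 100 ms pause are assumptions)
shows both the bug and the fix, which is simply resetting the flag on every pass:

{code:java}
import java.util.List;
import java.util.function.Supplier;

// The bug described above: foundStates was initialised to true only once, before the
// outer loop, so a single "not yet DOWN" replica poisons every later iteration and the
// caller waits out the full timeout. Resetting the flag per pass restores early exit.
public class PublishAndWaitSketch {
  static boolean waitForDownStates(Supplier<List<Boolean>> replicaIsDown, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      boolean foundStates = true;           // reset every iteration, not once up front
      for (boolean down : replicaIsDown.get()) {
        if (!down) foundStates = false;     // some replica on this node is not DOWN yet
      }
      if (foundStates) return true;         // everything DOWN: stop waiting immediately
      Thread.sleep(100);
    }
    return false;                           // gave up after the whole timeout (e.g. 60 secs)
  }
}
{code}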



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-9191) OverseerTaskQueue.peekTopN() fatally flawed

2016-06-09 Thread Scott Blum (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Blum updated SOLR-9191:
-
Fix Version/s: 6.2
   master (7.0)

> OverseerTaskQueue.peekTopN() fatally flawed
> ---
>
> Key: SOLR-9191
> URL: https://issues.apache.org/jira/browse/SOLR-9191
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
>Reporter: Scott Blum
>Assignee: Scott Blum
>Priority: Blocker
> Fix For: 5.6, 6.1, 5.5.2, master (7.0), 6.0.2, 6.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760, to optimize its obvious use case as 
> a FIFO.  But in doing so, we broke the assumptions in 
> OverseerTaskQueue.peekTopN()..
> OverseerTaskQueue.peekTopN() involves filtering out items you're already 
> working on, it's trying to peek for new items in the queue beyond what you 
> already know about.  But DistributedQueue (being designed as a FIFO) doesn't 
> know about the filtering; as long as it has any items in-memory it just keeps 
> returning those over and over without ever pulling new data from ZK.  This is 
> true even if the watcher has fired and marked the state as dirty.  So 
> OverseerTaskQueue gets into a state where it can never read new items in ZK 
> because DQ keeps returning the same items that it has marked as in-progress.
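
To illustrate the mismatch in stand-alone form (the names below are illustrative, not
the actual DistributedQueue/OverseerTaskQueue code): a filtered peek over a cached
FIFO has to fall back to re-reading the source once the exclusion set has eaten the
cache, otherwise it keeps re-serving the same in-progress items forever.

{code:java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Toy model: the cache stands in for the in-memory FIFO view, the Supplier stands in
// for a fresh read from ZK. Without the refresh branch, peekTopN() can only ever
// return already-excluded cached items, which is the failure described above.
public class FilteredPeekSketch {
  private Deque<String> cache = new ArrayDeque<>();
  private final Supplier<List<String>> fetchFromSource;

  FilteredPeekSketch(Supplier<List<String>> fetchFromSource) {
    this.fetchFromSource = fetchFromSource;
  }

  public List<String> peekTopN(int n, Set<String> excludeSet) {
    List<String> result = scan(n, excludeSet);
    if (result.size() < n) {                            // cache exhausted by the filter:
      cache = new ArrayDeque<>(fetchFromSource.get());  // refresh instead of re-serving
      result = scan(n, excludeSet);
    }
    return result;
  }

  private List<String> scan(int n, Set<String> excludeSet) {
    List<String> out = new ArrayList<>();
    for (String id : cache) {
      if (!excludeSet.contains(id)) {
        out.add(id);
        if (out.size() == n) break;
      }
    }
    return out;
  }
}
{code}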



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9199) ZkController#publishAndWaitForDownStates logic is inefficient

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323274#comment-15323274
 ] 

ASF subversion and git services commented on SOLR-9199:
---

Commit 360d9c40da4cc1f86f080b4f2c7410da6fbc2195 in lucene-solr's branch 
refs/heads/branch_6x from [~gchanan]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=360d9c4 ]

SOLR-9199: ZkController#publishAndWaitForDownStates logic is inefficient


> ZkController#publishAndWaitForDownStates logic is inefficient
> -
>
> Key: SOLR-9199
> URL: https://issues.apache.org/jira/browse/SOLR-9199
> Project: Solr
>  Issue Type: Bug
>Reporter: Hrishikesh Gadre
>Assignee: Hrishikesh Gadre
> Fix For: master (7.0), 6.2
>
> Attachments: SOLR-9199.patch
>
>
> The following logic introduced as part of SOLR-8720 is inefficient. 
> https://github.com/apache/lucene-solr/blob/6c0331b8309603eaaf14b6677afba5ffe99f16a3/solr/core/src/java/org/apache/solr/cloud/ZkController.java#L687-L712
> Specifically,
> * foundStates flag is set to TRUE before the for loop.
> * In the for loop we check if any replica on this node is not in the DOWN 
> state. If yes, then foundStates = FALSE
> * If foundStates == TRUE then we break out of the loop and return.
> The problem here is that once foundStates is set to FALSE, it is never reset 
> to TRUE. Hence we end up spending the whole 60 secs iterating over the 
> collections even though all the replicas are marked as DOWN in later 
> iterations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-9191) OverseerTaskQueue.peekTopN() fatally flawed

2016-06-09 Thread Scott Blum (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Blum resolved SOLR-9191.
--
Resolution: Fixed

> OverseerTaskQueue.peekTopN() fatally flawed
> ---
>
> Key: SOLR-9191
> URL: https://issues.apache.org/jira/browse/SOLR-9191
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
>Reporter: Scott Blum
>Assignee: Scott Blum
>Priority: Blocker
> Fix For: 5.6, 6.1, 5.5.2, master (7.0), 6.0.2, 6.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760, to optimize its obvious use case as 
> a FIFO.  But in doing so, we broke the assumptions in 
> OverseerTaskQueue.peekTopN()..
> OverseerTaskQueue.peekTopN() involves filtering out items you're already 
> working on, it's trying to peek for new items in the queue beyond what you 
> already know about.  But DistributedQueue (being designed as a FIFO) doesn't 
> know about the filtering; as long as it has any items in-memory it just keeps 
> returning those over and over without ever pulling new data from ZK.  This is 
> true even if the watcher has fired and marked the state as dirty.  So 
> OverseerTaskQueue gets into a state where it can never read new items in ZK 
> because DQ keeps returning the same items that it has marked as in-progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9199) ZkController#publishAndWaitForDownStates logic is inefficient

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323270#comment-15323270
 ] 

ASF subversion and git services commented on SOLR-9199:
---

Commit d55cc8f293aec4ccc882b1a92ed450c9ec3877dc in lucene-solr's branch 
refs/heads/master from [~gchanan]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d55cc8f ]

SOLR-9199: ZkController#publishAndWaitForDownStates logic is inefficient


> ZkController#publishAndWaitForDownStates logic is inefficient
> -
>
> Key: SOLR-9199
> URL: https://issues.apache.org/jira/browse/SOLR-9199
> Project: Solr
>  Issue Type: Bug
>Reporter: Hrishikesh Gadre
>Assignee: Hrishikesh Gadre
> Attachments: SOLR-9199.patch
>
>
> The following logic introduced as part of SOLR-8720 is inefficient. 
> https://github.com/apache/lucene-solr/blob/6c0331b8309603eaaf14b6677afba5ffe99f16a3/solr/core/src/java/org/apache/solr/cloud/ZkController.java#L687-L712
> Specifically,
> * foundStates flag is set to TRUE before the for loop.
> * In the for loop we check if any replica on this node is not in the DOWN 
> state. If yes, then foundStates = FALSE
> * If foundStates == TRUE then we break out of the loop and return.
> The problem here is that once foundStates is set to FALSE, it is never reset 
> to TRUE. Hence we end up spending the whole 60 secs iterating over the 
> collections even though all the replicas are marked as DOWN in later 
> iterations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9191) OverseerTaskQueue.peekTopN() fatally flawed

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323263#comment-15323263
 ] 

ASF subversion and git services commented on SOLR-9191:
---

Commit 80d6d26cc7b5d6bf3eca434cded5179d717bb378 in lucene-solr's branch 
refs/heads/branch_6_1 from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=80d6d26 ]

SOLR-9191: OverseerTaskQueue.peekTopN() fatally flawed


> OverseerTaskQueue.peekTopN() fatally flawed
> ---
>
> Key: SOLR-9191
> URL: https://issues.apache.org/jira/browse/SOLR-9191
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
>Reporter: Scott Blum
>Assignee: Scott Blum
>Priority: Blocker
> Fix For: 5.6, 6.1, 5.5.2, 6.0.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760, to optimize its obvious use case as 
> a FIFO.  But in doing so, we broke the assumptions in 
> OverseerTaskQueue.peekTopN()..
> OverseerTaskQueue.peekTopN() involves filtering out items you're already 
> working on, it's trying to peek for new items in the queue beyond what you 
> already know about.  But DistributedQueue (being designed as a FIFO) doesn't 
> know about the filtering; as long as it has any items in-memory it just keeps 
> returning those over and over without ever pulling new data from ZK.  This is 
> true even if the watcher has fired and marked the state as dirty.  So 
> OverseerTaskQueue gets into a state where it can never read new items in ZK 
> because DQ keeps returning the same items that it has marked as in-progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-9191) OverseerTaskQueue.peekTopN() fatally flawed

2016-06-09 Thread Scott Blum (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Blum updated SOLR-9191:
-
Fix Version/s: (was: 6.2)

> OverseerTaskQueue.peekTopN() fatally flawed
> ---
>
> Key: SOLR-9191
> URL: https://issues.apache.org/jira/browse/SOLR-9191
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
>Reporter: Scott Blum
>Assignee: Scott Blum
>Priority: Blocker
> Fix For: 5.6, 6.1, 5.5.2, 6.0.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760, to optimize its obvious use case as 
> a FIFO.  But in doing so, we broke the assumptions in 
> OverseerTaskQueue.peekTopN()..
> OverseerTaskQueue.peekTopN() involves filtering out items you're already 
> working on, it's trying to peek for new items in the queue beyond what you 
> already know about.  But DistributedQueue (being designed as a FIFO) doesn't 
> know about the filtering; as long as it has any items in-memory it just keeps 
> returning those over and over without ever pulling new data from ZK.  This is 
> true even if the watcher has fired and marked the state as dirty.  So 
> OverseerTaskQueue gets into a state where it can never read new items in ZK 
> because DQ keeps returning the same items that it has marked as in-progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9191) OverseerTaskQueue.peekTopN() fatally flawed

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323244#comment-15323244
 ] 

ASF subversion and git services commented on SOLR-9191:
---

Commit c6b04886244e90f4e0f83d1f3fa330e6ccf1a062 in lucene-solr's branch 
refs/heads/branch_6_0 from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c6b0488 ]

SOLR-9191: OverseerTaskQueue.peekTopN() fatally flawed


> OverseerTaskQueue.peekTopN() fatally flawed
> ---
>
> Key: SOLR-9191
> URL: https://issues.apache.org/jira/browse/SOLR-9191
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
>Reporter: Scott Blum
>Assignee: Scott Blum
>Priority: Blocker
> Fix For: 5.6, 6.1, 5.5.2, 6.0.2, 6.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760, to optimize its obvious use case as 
> a FIFO.  But in doing so, we broke the assumptions in 
> OverseerTaskQueue.peekTopN()..
> OverseerTaskQueue.peekTopN() involves filtering out items you're already 
> working on, it's trying to peek for new items in the queue beyond what you 
> already know about.  But DistributedQueue (being designed as a FIFO) doesn't 
> know about the filtering; as long as it has any items in-memory it just keeps 
> returning those over and over without ever pulling new data from ZK.  This is 
> true even if the watcher has fired and marked the state as dirty.  So 
> OverseerTaskQueue gets into a state where it can never read new items in ZK 
> because DQ keeps returning the same items that it has marked as in-progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9191) OverseerTaskQueue.peekTopN() fatally flawed

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323222#comment-15323222
 ] 

ASF subversion and git services commented on SOLR-9191:
---

Commit 9f5fae7ed82cd565d991ed92f9af4ca23eb7bac2 in lucene-solr's branch 
refs/heads/branch_5_5 from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9f5fae7 ]

SOLR-9191: OverseerTaskQueue.peekTopN() fatally flawed


> OverseerTaskQueue.peekTopN() fatally flawed
> ---
>
> Key: SOLR-9191
> URL: https://issues.apache.org/jira/browse/SOLR-9191
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
>Reporter: Scott Blum
>Assignee: Scott Blum
>Priority: Blocker
> Fix For: 5.6, 6.1, 5.5.2, 6.0.2, 6.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760 to optimize its obvious use case as 
> a FIFO.  But in doing so, we broke the assumptions in 
> OverseerTaskQueue.peekTopN().
> OverseerTaskQueue.peekTopN() involves filtering out items you're already 
> working on; it's trying to peek for new items in the queue beyond what you 
> already know about.  But DistributedQueue (being designed as a FIFO) doesn't 
> know about the filtering; as long as it has any items in memory, it just keeps 
> returning those over and over without ever pulling new data from ZK.  This is 
> true even if the watcher has fired and marked the state as dirty.  So 
> OverseerTaskQueue gets into a state where it can never read new items from ZK 
> because DQ keeps returning the same items that it has marked as in-progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-master-Windows (32bit/jdk1.8.0_92) - Build # 5901 - Failure!

2016-06-09 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/5901/
Java: 32bit/jdk1.8.0_92 -server -XX:+UseG1GC

1 tests failed.
FAILED:  org.apache.lucene.replicator.http.HttpReplicatorTest.testBasic

Error Message:
Could not remove the following files (in the order of attempts):
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\lucene\build\replicator\test\J0\temp\lucene.replicator.http.HttpReplicatorTest_D3A1294FC765E960-001\httpReplicatorTest-001\2:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\lucene\build\replicator\test\J0\temp\lucene.replicator.http.HttpReplicatorTest_D3A1294FC765E960-001\httpReplicatorTest-001\2
 

Stack Trace:
java.io.IOException: Could not remove the following files (in the order of 
attempts):
   
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\lucene\build\replicator\test\J0\temp\lucene.replicator.http.HttpReplicatorTest_D3A1294FC765E960-001\httpReplicatorTest-001\2:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\lucene\build\replicator\test\J0\temp\lucene.replicator.http.HttpReplicatorTest_D3A1294FC765E960-001\httpReplicatorTest-001\2

at 
__randomizedtesting.SeedInfo.seed([D3A1294FC765E960:785B345A18B96F4E]:0)
at org.apache.lucene.util.IOUtils.rm(IOUtils.java:323)
at 
org.apache.lucene.replicator.PerSessionDirectoryFactory.cleanupSession(PerSessionDirectoryFactory.java:58)
at 
org.apache.lucene.replicator.ReplicationClient.doUpdate(ReplicationClient.java:259)
at 
org.apache.lucene.replicator.ReplicationClient.updateNow(ReplicationClient.java:401)
at 
org.apache.lucene.replicator.http.HttpReplicatorTest.testBasic(HttpReplicatorTest.java:121)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)

[jira] [Commented] (SOLR-9191) OverseerTaskQueue.peekTopN() fatally flawed

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323211#comment-15323211
 ] 

ASF subversion and git services commented on SOLR-9191:
---

Commit 5955712ab1e3a37537929c3050b42aed243d3b4b in lucene-solr's branch 
refs/heads/branch_5x from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5955712 ]

SOLR-9191: OverseerTaskQueue.peekTopN() fatally flawed


> OverseerTaskQueue.peekTopN() fatally flawed
> ---
>
> Key: SOLR-9191
> URL: https://issues.apache.org/jira/browse/SOLR-9191
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
>Reporter: Scott Blum
>Assignee: Scott Blum
>Priority: Blocker
> Fix For: 5.6, 6.1, 5.5.2, 6.0.2, 6.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760 to optimize its obvious use case as 
> a FIFO.  But in doing so, we broke the assumptions in 
> OverseerTaskQueue.peekTopN().
> OverseerTaskQueue.peekTopN() involves filtering out items you're already 
> working on; it's trying to peek for new items in the queue beyond what you 
> already know about.  But DistributedQueue (being designed as a FIFO) doesn't 
> know about the filtering; as long as it has any items in memory, it just keeps 
> returning those over and over without ever pulling new data from ZK.  This is 
> true even if the watcher has fired and marked the state as dirty.  So 
> OverseerTaskQueue gets into a state where it can never read new items from ZK 
> because DQ keeps returning the same items that it has marked as in-progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323207#comment-15323207
 ] 

Scott Blum commented on SOLR-8744:
--

BTW, I landed SOLR-9191 in master and 6x, so you should be good to go on that 
front.

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design:
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>                Cluster
>               /       \
>             C1         C2
>            /  \       /  \
>          S1    S2   S1    S2
>       R1,R2  R3,R4 R1,R2  R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks.  Every task would do a 
> {{notify()}} on the monitor of the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed; if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and just 
> tries to schedule that task.
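To make the proposal above concrete, here is a minimal, hypothetical Java sketch of the lazily-built lock tree; LockLevel, LockTreeNode and tryLock are invented names, unlocking and the scheduler-thread notification are omitted, and this is not the SOLR-8744 patch itself.

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the lock-tree idea described above; not the real patch.
public class LockTreeSketch {

  enum LockLevel { CLUSTER, COLLECTION, SHARD, REPLICA }

  static class LockTreeNode {
    final String name;
    boolean locked;
    final Map<String, LockTreeNode> children = new HashMap<>();

    LockTreeNode(String name) { this.name = name; }

    LockTreeNode child(String childName) {            // lazy creation, as tasks show up
      return children.computeIfAbsent(childName, LockTreeNode::new);
    }

    /** A node can be locked only if it and its whole subtree are currently free. */
    boolean subtreeFree() {
      if (locked) return false;
      for (LockTreeNode c : children.values()) {
        if (!c.subtreeFree()) return false;
      }
      return true;
    }
  }

  /** Walk from the cluster root down to the node matching the requested level. */
  static boolean tryLock(LockTreeNode cluster, LockLevel level, String... path) {
    LockTreeNode node = cluster;
    for (int i = 0; i < level.ordinal(); i++) {        // COLLECTION=1 hop, SHARD=2, ...
      if (node.locked) return false;                   // a locked ancestor blocks us
      node = node.child(path[i]);
    }
    if (!node.subtreeFree()) return false;             // node plus descendants must be free
    node.locked = true;
    return true;
  }

  public static void main(String[] args) {
    LockTreeNode cluster = new LockTreeNode("Cluster");
    // A collection-level operation on C1 succeeds while C1's subtree is free...
    System.out.println(tryLock(cluster, LockLevel.COLLECTION, "C1"));   // true
    // ...a shard-level operation under C1 must now wait, while C2 is unaffected.
    System.out.println(tryLock(cluster, LockLevel.SHARD, "C1", "S1"));  // false
    System.out.println(tryLock(cluster, LockLevel.SHARD, "C2", "S1"));  // true
  }
}
{code}

The rule it illustrates is the one stated in the description: a lock is granted only when the target node, its subtree, and the ancestors on its path are free, so operations under different collections or shards can proceed in parallel.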



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323205#comment-15323205
 ] 

Scott Blum commented on SOLR-8744:
--

Mostly LG.  One completely minor comment:

{code}
+  if (blockedTasks.size() < MAX_BLOCKED_TASKS) {
+blockedTasks.put(head.getId(), head);
+  }
{code}

Maybe unnecessary?  It doesn't actually matter if blockedTasks.size() gets to 
1100 items; the check at the top of the loop will keep it from growing 
endlessly.  If you drop these tasks on the floor, you'll just end up 
re-fetching the bytes from ZK later.

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design:
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>                Cluster
>               /       \
>             C1         C2
>            /  \       /  \
>          S1    S2   S1    S2
>       R1,R2  R3,R4 R1,R2  R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> start moving down depending upon the operation. After it reaches the right 
> node, it checks if all the children are free from a lock.  If it fails to 
> acquire a lock, it remains in the work queue. A scheduler thread waits for 
> notification from the current set of tasks.  Every task would do a 
> {{notify()}} on the monitor of the scheduler thread. The thread would start 
> from the head of the queue and check all tasks to see if that task is able to 
> acquire the right lock. If yes, it is executed; if not, the task is left in 
> the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and just 
> tries to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9191) OverseerTaskQueue.peekTopN() fatally flawed

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323149#comment-15323149
 ] 

ASF subversion and git services commented on SOLR-9191:
---

Commit cde57ab64a6f4082b2dfab515397a242600a1df7 in lucene-solr's branch 
refs/heads/branch_6x from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cde57ab ]

SOLR-9191: OverseerTaskQueue.peekTopN() fatally flawed


> OverseerTaskQueue.peekTopN() fatally flawed
> ---
>
> Key: SOLR-9191
> URL: https://issues.apache.org/jira/browse/SOLR-9191
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
>Reporter: Scott Blum
>Assignee: Scott Blum
>Priority: Blocker
> Fix For: 5.6, 6.1, 5.5.2, 6.0.2, 6.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760 to optimize its obvious use case as 
> a FIFO.  But in doing so, we broke the assumptions in 
> OverseerTaskQueue.peekTopN().
> OverseerTaskQueue.peekTopN() involves filtering out items you're already 
> working on; it's trying to peek for new items in the queue beyond what you 
> already know about.  But DistributedQueue (being designed as a FIFO) doesn't 
> know about the filtering; as long as it has any items in memory, it just keeps 
> returning those over and over without ever pulling new data from ZK.  This is 
> true even if the watcher has fired and marked the state as dirty.  So 
> OverseerTaskQueue gets into a state where it can never read new items from ZK 
> because DQ keeps returning the same items that it has marked as in-progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-5944) Support updates of numeric DocValues

2016-06-09 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323101#comment-15323101
 ] 

Ishan Chattopadhyaya edited comment on SOLR-5944 at 6/9/16 6:59 PM:


When update2 (say a partial update) arrives before update1 (say a full update, 
on which update2 depends), then the call for indexing update2 is a blocking 
call (which finishes either after update1 is indexed, or timeout is reached).

The intention was to:
# shuffle the updates (so that the 3 updates are in one of the 6 possible 
permutations, one of those permutations being in-order)
# send them out in sequence of the shuffle
# have them arrive at Solr in the intended order (as intended in steps 1 and 
2). However, since an out of order update waits for the dependent update and 
blocks the call until such a dependent update arrives (or timeout is reached), 
the intention is to have these calls non-blocking.

So, I wanted to send updates out sequentially (deliberately re-ordered, through 
a shuffle), but asynchronously (so as to keep those calls non-blocking).

bq. ...My impression, based on the entirety of that method, was that the intent 
of the test was to bypass the normal distributed update logic and send 
carefully crafted "simulated" updates direct to each replica, such that one 
replica got the (simulated from leader) updates "in order" and another replica 
got the (simulated from leader) updates "out of order"
That is exactly my intention.

bq. if the point was for replica2 to get the (simulated from leader) updates 
"out of order" then why shuffle them - why not explicitly put them in the 
"wrong" order?
There could possibly be 6 permutations in terms of the mutual ordering of the 3 
updates, so I used shuffle instead of choosing a particular "wrong" ordering. 
Of course, one of those 6 permutations is the "right" order, so that case is 
not consistent with the name of the test; I can make a fix to exclude that case.

bq. if the goal was to send them asynchronously, and try to get them to happen as 
concurrently as possible (as you indicated above in your answer to my question) 
then what was the point of the "shuffle" ?
I think I was trying: (a) asynchronously (so that out of order update doesn't 
block out the next update request that sends a dependent order), (b) intention 
was not really to test for race conditions (i.e. not really "as concurrently as 
possible", but maybe I don't understand the phrase correctly), but just to be 
concurrent enough so that a dependent update arrives before an out of order 
update times out. 

bq.   why is there now a sleep call with an explicit comment "...so that they 
arrive in the intended order" ... if there is an "intended" order why would you 
want them to be async?

The point of this was to avoid situations where the shuffled list (and intended 
order for that testcase) was, say, "update1, update3, update2", but it actually 
arrived at the Solr server in the order "update1, update2, update3" due to 
parallel threads sending the updates at nearly the same time.

bq. if there is an "intended" order why would you want them to be async?
So that the calls are non-blocking. The first out of order partial update 
request will block the call until timeout/dependent update is indexed.

Do you think this makes sense? I am open to revise this entire logic if you 
suggest.


was (Author: ichattopadhyaya):
When update2 (say a partial update) arrives before update1 (say a full update, 
on which update2 depends), then the call for indexing update2 is a blocking 
call (which finishes either after update1 is indexed, or timeout is reached).

The intention was to:
# shuffle the updates (so that the 3 updates are in one of the 6 possible 
permutations, one of those permutations being in-order)
# send them out in sequence of the shuffle
# have them arrive at Solr in the intended order (as intended in steps 1 and 
2). However, since an out of order update waits for the dependent update and 
blocks the call until such a dependent update arrives (or timeout is reached), 
the intention is to have these calls non-blocking.

So, I wanted to send updates out sequentially (deliberately re-ordered, through 
a shuffle), but asynchronously (so as to keep those calls non-blocking).

bq. ...My impression, based on the entirety of that method, was that the intent 
of the test was to bypass the normal distributed update logic and send 
carefully crafted "simulated" updates direct to each replica, such that one 
replica got the (simulated from leader) updates "in order" and another replica 
got the (simulated from leader) updates "out of order"
That is exactly my intention.

bq. if the point was for replica2 to get the (simulated from leader) updates 
"out of order" then why shuffle them - why not explicitly put them in the 
"wrong" order?
There could be possibly 6 permutations in terms of the 

[jira] [Comment Edited] (SOLR-5944) Support updates of numeric DocValues

2016-06-09 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323101#comment-15323101
 ] 

Ishan Chattopadhyaya edited comment on SOLR-5944 at 6/9/16 6:56 PM:


When update2 (say a partial update) arrives before update1 (say a full update, 
on which update2 depends), then the call for indexing update2 is a blocking 
call (which finishes either after update1 is indexed, or timeout is reached).

The intention was to:
# shuffle the updates (so that the 3 updates are in one of the 6 possible 
permutations, one of those permutations being in-order)
# send them out in sequence of the shuffle
# have them arrive at Solr in the intended order (as intended in steps 1 and 
2). However, since an out of order update waits for the dependent update and 
blocks the call until such a dependent update arrives (or timeout is reached), 
the intention is to have these calls non-blocking.

So, I wanted to send updates out sequentially (deliberately re-ordered, through 
a shuffle), but asynchronously (so as to keep those calls non-blocking).

bq. ...My impression, based on the entirety of that method, was that the intent 
of the test was to bypass the normal distributed update logic and send 
carefully crafted "simulated" updates direct to each replica, such that one 
replica got the (simulated from leader) updates "in order" and another replica 
got the (simulated from leader) updates "out of order"
That is exactly my intention.

bq. if the point was for replica2 to get the (simulated from leader) updates 
"out of order" then why shuffle them - why not explicitly put them in the 
"wrong" order?
There could possibly be 6 permutations in terms of the mutual ordering of the 3 
updates, so I used shuffle instead of choosing a particular "wrong" ordering. 
Of course, one of those 6 permutations is the "right" order, so that case is 
not consistent with the name of the test; I can make a fix to exclude that case.

bq. if the goal was to send them asynchronously, and try to get them to happen as 
concurrently as possible (as you indicated above in your answer to my question) 
then what was the point of the "shuffle" ?
I think I was trying: (a) asynchronously (so that out of order update doesn't 
block out the next update request that sends a dependent order), (b) intention 
was not really to test for race conditions (i.e. not really "as concurrently as 
possible", but maybe I don't understand the phrase correctly), but just to be 
concurrent enough so that a dependent update arrives before an out of order 
update times out. 

bq.   Thread.sleep(10);
The point of this was to avoid situations where the shuffled list (and intended 
order for that testcase) was, say, "update1, update3, update2", but it actually 
arrived at the Solr server in the order "update1, update2, update3" due to 
parallel threads sending the updates at nearly the same time.

Do you think this makes sense? I am open to revise this entire logic if you 
suggest.


was (Author: ichattopadhyaya):
When update2 (say a partial update) arrives before update1 (say a full update, 
on which update2 depends), then the call for indexing update2 is a blocking 
call (which finishes either after update1 is indexed, or timeout is reached).

The intention was to:
# shuffle the updates (so that the 3 updates are in one of the 6 possible 
permutations, one of those permutations being in-order)
# send them out in sequence of the shuffle
# have them arrive at Solr in the intended order (as intended in steps 1 and 
2). However, since an out of order update waits for the dependent update and 
blocks the call until such a dependent update arrives (or timeout is reached), 
the intention is to have these calls non-blocking.

So, I wanted to send updates out sequentially (deliberately re-ordered, through 
a shuffle), but asynchronously (so as to keep those calls non-blocking).

bq. ...My impression, based on the entirety of that method, was that the intent 
of the test was to bypass the normal distributed update logic and send 
carefully crafted "simulated" updates direct to each replica, such that one 
replica got the (simulated from leader) updates "in order" and another replica 
got the (simulated from leader) updates "out of order"
That is exactly my intention.

bq. if the point was for replica2 to get the (simulated from leader) updates 
"out of order" then why shuffle them - why not explicitly put them in the 
"wrong" order?
There could possibly be 6 permutations in terms of the mutual ordering of the 3 
updates, so I used shuffle instead of choosing a particular "wrong" ordering. 
Of course, one of those 6 permutations is the "right" order, so that case is 
not consistent with the name of the test; I can make a fix to exclude that case.

bq. if the goal was to send them asynchronously, and try to get them to happen as 
concurrently as possible (as you 

[JENKINS] Lucene-Solr-master-Linux (32bit/jdk1.8.0_92) - Build # 16956 - Failure!

2016-06-09 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/16956/
Java: 32bit/jdk1.8.0_92 -server -XX:+UseG1GC

1 tests failed.
FAILED:  
org.apache.solr.cloud.overseer.ZkStateReaderTest.testStateFormatUpdateWithTimeDelay

Error Message:
Could not find collection : c1

Stack Trace:
org.apache.solr.common.SolrException: Could not find collection : c1
at 
__randomizedtesting.SeedInfo.seed([20A07D4792A44933:5F3ECAC2FBC664B9]:0)
at 
org.apache.solr.common.cloud.ClusterState.getCollection(ClusterState.java:192)
at 
org.apache.solr.cloud.overseer.ZkStateReaderTest.testStateFormatUpdate(ZkStateReaderTest.java:129)
at 
org.apache.solr.cloud.overseer.ZkStateReaderTest.testStateFormatUpdateWithTimeDelay(ZkStateReaderTest.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 10540 lines...]
   [junit4] Suite: org.apache.solr.cloud.overseer.ZkStateReaderTest
   [junit4]   2> Creating dataDir: 

[jira] [Comment Edited] (SOLR-5944) Support updates of numeric DocValues

2016-06-09 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323101#comment-15323101
 ] 

Ishan Chattopadhyaya edited comment on SOLR-5944 at 6/9/16 6:54 PM:


When update2 (say a partial update) arrives before update1 (say a full update, 
on which update2 depends), then the call for indexing update2 is a blocking 
call (which finishes either after update1 is indexed, or timeout is reached).

The intention was to:
# shuffle the updates (so that the 3 updates are in one of the 6 possible 
permutations, one of those permutations being in-order)
# send them out in sequence of the shuffle
# have them arrive at Solr in the intended order (as intended in steps 1 and 
2). However, since an out of order update waits for the dependent update and 
blocks the call until such a dependent update arrives (or timeout is reached), 
the intention is to have these calls non-blocking.

So, I wanted to send updates out sequentially (deliberately re-ordered, through 
a shuffle), but asynchronously (so as to keep those calls non-blocking).

bq. ...My impression, based on the entirety of that method, was that the intent 
of the test was to bypass the normal distributed update logic and send 
carefully crafted "simulated" updates direct to each replica, such that one 
replica got the (simulated from leader) updates "in order" and another replica 
got the (simulated from leader) updates "out of order"
That is exactly my intention.

bq. if the point was for replica2 to get the (simulated from leader) updates 
"out of order" then why shuffle them - why not explicitly put them in the 
"wrong" order?
There could possibly be 6 permutations in terms of the mutual ordering of the 3 
updates, so I used shuffle instead of choosing a particular "wrong" ordering. 
Of course, one of those 6 permutations is the "right" order, so that case is 
not consistent with the name of the test; I can make a fix to exclude that case.

bq. if the goal was to send them asynchronously, and try to get them to happen as 
concurrently as possible (as you indicated above in your answer to my question) 
then what was the point of the "shuffle" ?
I think I was trying: (a) asynchronously (so that out of order update doesn't 
block out the next update request that sends a dependent order), (b) intention 
was not really to test for race conditions (i.e. not really "as concurrently as 
possible", but maybe I don't understand the phrase correctly), but just to be 
concurrent enough so that a dependent update arrives before an out of order 
update times out. 

bq.   Thread.sleep(10);
The point of this was to avoid situations where the shuffled list (and intended 
order for that testcase) was, say, "update1, update3, update2", but it actually 
went to the Solr server in the order "update1, update2, update3" due to 
parallel threads sending the updates at nearly the same time.

Do you think this makes sense? I am open to revise this entire logic if you 
suggest.


was (Author: ichattopadhyaya):
When update2 (say a partial update) arrives before update1 (say a full update, 
on which update2 depends), then the call for indexing update2 is a blocking 
call (which finishes either after update1 is indexed, or timeout is reached).

The intention was to:
# shuffle the updates (so that the 3 updates are in one of the 6 possible 
permutations, one of those permutations being in-order)
# send them out in sequence of the shuffle
# have them arrive at Solr in the intended order (as intended in steps 1 and 
2). However, since an out of order update waits for the dependent update and 
blocks the call until such a dependent update arrives (or timeout is reached), 
the intention is to have these calls non-blocking.

So, I wanted to send updates out sequentially (deliberately re-ordered, through 
a shuffle), but asynchronously (so as to keep those calls non-blocking).

bq. ...My impression, based on the entirety of that method, was that the intent 
of the test was to bypass the normal distributed update logic and send 
carefully crafted "simulated" updates direct to each replica, such that one 
replica got the (simulated from leader) updates "in order" and another replica 
got the (simulated from leader) updates "out of order"
That is exactly my intention.

bq. if the point was for replica2 to get the (simulated from leader) updates 
"out of order" then why shuffle them - why not explicitly put them in the 
"wrong" order?
There could possibly be 6 permutations in terms of the mutual ordering of the 3 
updates, so I used shuffle instead of choosing a particular "wrong" ordering. 
Of course, one of those 6 permutations is the "right" order, so that case is 
not consistent with the name of the test; I can make a fix to exclude that case.

bq. if the goal was to send them asynchronously, and try to get them to happen as 
concurrently as possible (as you indicated 

[jira] [Commented] (LUCENE-7323) Compound file writing should verify checksum of its sub-files

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323113#comment-15323113
 ] 

ASF subversion and git services commented on LUCENE-7323:
-

Commit ae0adfc34dea21df86ab7ebf034f3dbd6714c541 in lucene-solr's branch 
refs/heads/branch_6x from Mike McCandless
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ae0adfc ]

LUCENE-7323: compound file writing now verifies checksum and segment ID for the 
incoming sub-files, to catch hardware issues or filesystem bugs earlier


> Compound file writing should verify checksum of its sub-files
> -
>
> Key: LUCENE-7323
> URL: https://issues.apache.org/jira/browse/LUCENE-7323
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: master (7.0), 6.2
>
> Attachments: LUCENE-7323.patch, LUCENE-7323.patch
>
>
> For larger segments, there is a non-trivial window, from when IW
> writes sub-files, to when it then builds the CFS, during which the
> files can become corrupted (from external process, bad filesystem,
> hardware, etc.)
> Today we quietly build the CFS even if the sub-files are corrupted,
> but we can easily detect it, letting users catch corruption earlier
> (write time instead of read time).
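For readers skimming the thread, here is a minimal sketch of what verifying the sub-files at CFS-build time can look like, assuming a plain Directory-to-Directory copy; verifyThenCopy is an invented helper and not the actual compound-file writer changed by this issue (the real change also verifies each sub-file's segment ID).

{code}
import java.io.IOException;

import org.apache.lucene.codecs.CodecUtil;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;

// Hedged sketch of the idea in this change: validate each sub-file's footer
// checksum before it is folded into the compound file. Invented helper, not
// the real compound-file format code.
public final class VerifySubFilesSketch {

  static void verifyThenCopy(Directory src, Directory dst, String... files) throws IOException {
    for (String file : files) {
      // checksumEntireFile reads the whole file and validates the codec footer,
      // so corruption introduced after the sub-file was written is caught here,
      // at write (CFS-build) time, instead of much later at read time.
      try (IndexInput in = src.openInput(file, IOContext.READONCE)) {
        CodecUtil.checksumEntireFile(in);
      }
      dst.copyFrom(src, file, file, IOContext.DEFAULT);
    }
  }

  private VerifySubFilesSketch() {}
}
{code}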



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (LUCENE-7323) Compound file writing should verify checksum of its sub-files

2016-06-09 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless closed LUCENE-7323.
--
Resolution: Fixed

> Compound file writing should verify checksum of its sub-files
> -
>
> Key: LUCENE-7323
> URL: https://issues.apache.org/jira/browse/LUCENE-7323
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: master (7.0), 6.2
>
> Attachments: LUCENE-7323.patch, LUCENE-7323.patch
>
>
> For larger segments, there is a non-trivial window, from when IW
> writes sub-files, to when it then builds the CFS, during which the
> files can become corrupted (from external process, bad filesystem,
> hardware, etc.)
> Today we quietly build the CFS even if the sub-files are corrupted,
> but we can easily detect it, letting users catch corruption earlier
> (write time instead of read time).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues

2016-06-09 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323101#comment-15323101
 ] 

Ishan Chattopadhyaya commented on SOLR-5944:


When update2 (say a partial update) arrives before update1 (say a full update, 
on which update2 depends), then the call for indexing update2 is a blocking 
call (which finishes either after update1 is indexed, or timeout is reached).

The intention was to:
# shuffle the updates (so that the 3 updates are in one of the 6 possible 
permutations, one of those permutations being in-order)
# send them out in sequence of the shuffle
# have them arrive at Solr in the intended order (as intended in steps 1 and 
2). However, since an out of order update waits for the dependent update and 
blocks the call until such a dependent update arrives (or timeout is reached), 
the intention is to have these calls non-blocking.

So, I wanted to send updates out sequentially (deliberately re-ordered, through 
a shuffle), but asynchronously (so as to keep those calls non-blocking).

bq. ...My impression, based on the entirety of that method, was that the intent 
of the test was to bypass the normal distributed update logic and send 
carefully crafted "simulated" updates direct to each replica, such that one 
replica got the (simulated from leader) updates "in order" and another replica 
got the (simulated from leader) updates "out of order"
That is exactly my intention.

bq. if the point was for replica2 to get the (simulated from leader) updates 
"out of order" then why shuffle them - why not explicitly put them in the 
"wrong" order?
There could possibly be 6 permutations in terms of the mutual ordering of the 3 
updates, so I used shuffle instead of choosing a particular "wrong" ordering. 
Of course, one of those 6 permutations is the "right" order, so that case is 
not consistent with the name of the test; I can make a fix to exclude that case.

bq. if the goal was to send them asynchronously, and try to get them to happen as 
concurrently as possible (as you indicated above in your answer to my question) 
then what was the point of the "shuffle" ?
I think I was trying: (a) asynchronously (so that out of order update doesn't 
block out the next update request that sends a dependent order), (b) intention 
was not really to test for race conditions, but just to be concurrent enough so 
that a dependent update arrives before an out of order update times out. 

bq.   Thread.sleep(10);
The point of this was to avoid situations where the shuffled list (and intended 
order for that testcase) was, say, "update1, update3, update2", but it actually 
went to the Solr server in the order "update1, update2, update3" due to 
parallel threads sending the updates at nearly the same time.

Do you think this makes sense? I am open to revise this entire logic if you 
suggest.
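As a summary of the mechanism described in this comment, here is a small, hypothetical sketch in plain Java; sendToReplica and the string "updates" stand in for the real SolrJ UpdateRequest objects and the SendUpdateToReplicaTask/AsyncUpdateWithRandomCommit tasks named in this thread, while the shuffle, the threadpool submission, and the short sleep between submissions mirror the intent described above.

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of "send shuffled updates asynchronously"; not the test code.
public class ReorderedAsyncUpdatesSketch {

  static void sendToReplica(String update) {
    // In the real test this is a potentially blocking update call against one replica:
    // an out-of-order partial update blocks until its dependent full update arrives
    // (or a timeout is reached), which is why each send happens on its own thread.
    System.out.println("sending " + update + " on " + Thread.currentThread().getName());
  }

  public static void main(String[] args) throws Exception {
    List<String> updates = new ArrayList<>(Arrays.asList("update1", "update2", "update3"));

    // Re-order the updates: one of the 6 permutations, possibly the original order.
    List<String> reordered = new ArrayList<>(updates);
    Collections.shuffle(reordered, new Random());

    ExecutorService threadpool = Executors.newFixedThreadPool(reordered.size());
    List<Future<?>> responses = new ArrayList<>();
    for (String update : reordered) {
      responses.add(threadpool.submit(() -> sendToReplica(update)));
      // Small delay between submissions so the requests reach the server in the
      // shuffled (intended) order, while each call itself stays non-blocking.
      Thread.sleep(10);
    }

    for (Future<?> response : responses) {
      response.get();                    // in the real test, failures surface here
    }
    threadpool.shutdown();
    threadpool.awaitTermination(10, TimeUnit.SECONDS);
  }
}
{code}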

> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, 
> hoss.62D328FA1DEA57FD.fail.txt, hoss.62D328FA1DEA57FD.fail2.txt, 
> hoss.62D328FA1DEA57FD.fail3.txt, hoss.D768DD9443A98DC.fail.txt, 
> hoss.D768DD9443A98DC.pass.txt
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be 
> really nice to have Solr support this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7323) Compound file writing should verify checksum of its sub-files

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323098#comment-15323098
 ] 

ASF subversion and git services commented on LUCENE-7323:
-

Commit 067fb25e4359ed8d5673e385976da7debc0e5b77 in lucene-solr's branch 
refs/heads/master from Mike McCandless
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=067fb25 ]

LUCENE-7323: compound file writing now verifies checksum and segment ID for the 
incoming sub-files, to catch hardware issues or filesystem bugs earlier


> Compound file writing should verify checksum of its sub-files
> -
>
> Key: LUCENE-7323
> URL: https://issues.apache.org/jira/browse/LUCENE-7323
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: master (7.0), 6.2
>
> Attachments: LUCENE-7323.patch, LUCENE-7323.patch
>
>
> For larger segments, there is a non-trivial window, from when IW
> writes sub-files, to when it then builds the CFS, during which the
> files can become corrupted (from external process, bad filesystem,
> hardware, etc.)
> Today we quietly build the CFS even if the sub-files are corrupted,
> but we can easily detect it, letting users catch corruption earlier
> (write time instead of read time).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5944) Support updates of numeric DocValues

2016-06-09 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323045#comment-15323045
 ] 

Hoss Man commented on SOLR-5944:



I don't understand this comment -- particularly in light of the changes you've 
made to the test since...

{quote}
bq. what's the point of using a threadpool and SendUpdateToReplicaTask here? 
why not just send the updates in a (randomly assigned) deterministic order? 

Essentially, I need a way to send three updates to the replica asynchronously. 
To achieve the effect of asynchronous updates, I used a threadpool here. Three 
updates sent one after the other, each being a blocking call, wouldn't have 
simulated the leader -> replica interaction sufficiently.
{quote}

When I posted that particular question it was about 
outOfOrderUpdatesIndividualReplicaTest -- the code in question at the time 
looked like this...

{code}
// re-order the updates for replica2
List<UpdateRequest> reorderedUpdates = new ArrayList<>(updates);
Collections.shuffle(reorderedUpdates, random());
for (UpdateRequest update : reorderedUpdates) {
  SendUpdateToReplicaTask task = new SendUpdateToReplicaTask(update, 
clients.get(1), random());
  threadpool.submit(task);
}
{code}

...My impression, based on the entirety of that method, was that the intent of 
the test was to bypass the normal distributed update logic and send carefully 
crafted "simulated" updates direct to each replica, such that one repliica got 
the (simulated from leader) updates "in order" and another replica got the 
(simulated from leader) updates "out of order"

* if the point was for replica2 to get the (simulated from leader) updates "out 
of order" then why shuffle them - why not explicitly put them in the "wrong" 
order?
* if the goal was to send them asynchronously, and try to get them to happen as 
concurrently as possible (as you indicated above in your answer to my question) 
then what was the point of the "shuffle" ?

Looking at the modified version of that code in your latest patch doesn't 
really help clarify things for me...

{code}
// re-order the updates for NONLEADER 0
List<UpdateRequest> reorderedUpdates = new ArrayList<>(updates);
Collections.shuffle(reorderedUpdates, random());
List updateResponses = new ArrayList<>();
for (UpdateRequest update : reorderedUpdates) {
  AsyncUpdateWithRandomCommit task = new AsyncUpdateWithRandomCommit(update, 
NONLEADERS.get(0), seed);
  updateResponses.add(threadpool.submit(task));
  // send the updates with a slight delay in between so that they arrive in the 
intended order
  Thread.sleep(10);
}
{code}

In the context of your answer, that it's intentional for the updates to be 
async...

* why shuffle them?
* why is there now a {{sleep}} call with an explicit comment "...so that they 
arrive in the intended order" ... if there is an "intended" order why would you 
want them to be async?

the other SendUpdateToReplicaTask/AsyncUpdateWithRandomCommit usages exhibit 
the same behavior of a "sleep" in between {{ threadpool.submit(task); }} calls 
with similar comments about wanting to "...ensure requests are sequential..." 
hence my question about why threadpools are being used at all.

> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> TestStressInPlaceUpdates.eb044ac71.beast-167-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.beast-587-failure.stdout.txt, 
> TestStressInPlaceUpdates.eb044ac71.failures.tar.gz, 
> hoss.62D328FA1DEA57FD.fail.txt, hoss.62D328FA1DEA57FD.fail2.txt, 
> hoss.62D328FA1DEA57FD.fail3.txt, hoss.D768DD9443A98DC.fail.txt, 
> hoss.D768DD9443A98DC.pass.txt
>
>
> LUCENE-5189 introduced support for updates to numeric docvalues. It would be 
> really nice to have Solr support this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-master-Solaris (64bit/jdk1.8.0) - Build # 637 - Failure!

2016-06-09 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-master-Solaris/637/
Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseSerialGC

1 tests failed.
FAILED:  
org.apache.solr.common.cloud.TestCollectionStateWatchers.testWaitForStateWatcherIsRetainedOnPredicateFailure

Error Message:
Did not see a fully active cluster after 30 seconds

Stack Trace:
java.lang.AssertionError: Did not see a fully active cluster after 30 seconds
at 
__randomizedtesting.SeedInfo.seed([39C7ABC96F36FD3E:B1F1099AB799152C]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.common.cloud.TestCollectionStateWatchers.testWaitForStateWatcherIsRetainedOnPredicateFailure(TestCollectionStateWatchers.java:227)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 13076 lines...]
   [junit4] Suite: org.apache.solr.common.cloud.TestCollectionStateWatchers
   [junit4]   2> Creating 

[jira] [Commented] (SOLR-9191) OverseerTaskQueue.peekTopN() fatally flawed

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322987#comment-15322987
 ] 

ASF subversion and git services commented on SOLR-9191:
---

Commit 7e86ba8c7327f99ca8708494b6d402af4cd0b4ec in lucene-solr's branch 
refs/heads/master from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7e86ba8 ]

SOLR-9191: use a Predicate instead of a Function


> OverseerTaskQueue.peekTopN() fatally flawed
> ---
>
> Key: SOLR-9191
> URL: https://issues.apache.org/jira/browse/SOLR-9191
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.4, 5.4.1, 5.5, 5.5.1, 6.0, 6.0.1
>Reporter: Scott Blum
>Assignee: Scott Blum
>Priority: Blocker
> Fix For: 5.6, 6.1, 5.5.2, 6.0.2, 6.2
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We rewrote DistributedQueue in SOLR-6760 to optimize its obvious use case as 
> a FIFO.  But in doing so, we broke the assumptions in 
> OverseerTaskQueue.peekTopN().
> OverseerTaskQueue.peekTopN() involves filtering out items you're already 
> working on; it's trying to peek at new items in the queue beyond what you 
> already know about.  But DistributedQueue (being designed as a FIFO) doesn't 
> know about the filtering; as long as it has any items in memory it just keeps 
> returning those over and over without ever pulling new data from ZK.  This is 
> true even if the watcher has fired and marked the state as dirty.  So 
> OverseerTaskQueue gets into a state where it can never read new items from ZK 
> because DQ keeps returning the same items that it has marked as in-progress.
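
The committed change above swaps a Function for a Predicate so that the queue itself can keep applying the caller's filter and read past its in-memory snapshot. As a rough, self-contained illustration of that idea (plain Java, not the actual Solr classes; the names FilteredPeekQueue and fetchFromZk are made up for this sketch):

{code}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Sketch only: peek with a caller-supplied Predicate so filtered-out
// (already in-progress) items no longer mask newer items still in ZK.
class FilteredPeekQueue<T> {
  private final Deque<T> inMemory = new ArrayDeque<>();
  private final Supplier<List<T>> fetchFromZk; // stands in for a ZK child fetch
  private boolean dirty = true;                // would be set by a ZK watcher

  FilteredPeekQueue(Supplier<List<T>> fetchFromZk) {
    this.fetchFromZk = fetchFromZk;
  }

  // Peek up to n accepted elements, refreshing from ZK when the in-memory
  // view cannot satisfy the request, instead of returning the same items forever.
  synchronized List<T> peekTopN(int n, Predicate<T> accept) {
    List<T> result = collect(n, accept);
    if (result.size() < n && dirty) {
      inMemory.clear();
      inMemory.addAll(fetchFromZk.get());
      dirty = false;
      result = collect(n, accept);
    }
    return result;
  }

  private List<T> collect(int n, Predicate<T> accept) {
    List<T> out = new ArrayList<>();
    for (T t : inMemory) {
      if (accept.test(t)) {
        out.add(t);
        if (out.size() == n) break;
      }
    }
    return out;
  }
}
{code}

A caller would pass something like {{item -> !inProgress.contains(item)}} as the predicate, which is exactly the filtering OverseerTaskQueue.peekTopN() needs.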



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: (was: SOLR-8744.patch)

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>                 Cluster
>                /       \
>              C1         C2
>             /  \       /  \
>           S1    S2   S1    S2
>        R1,R2  R3,R4  R1,R2  R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> move down depending upon the operation. After it reaches the right node, it 
> checks whether all the children are free of locks.  If it fails to acquire a 
> lock, it remains in the work queue. A scheduler thread waits for notification 
> from the current set of tasks. Every task would do a {{notify()}} on the 
> monitor of the scheduler thread. The thread would start from the head of the 
> queue and check each task to see whether it is able to acquire the right 
> lock. If yes, it is executed; if not, the task is left in the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and 
> just tries to schedule that task.
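
As a hedged sketch of the lock tree described above (plain Java, not the attached SOLR-8744 patch; the class and method names are invented for illustration), with lazily created nodes and a lock granted only when the node, its descendants and its ancestors are all free:

{code}
import java.util.HashMap;
import java.util.Map;

// Sketch only: captures the lazily built tree and the acquire check,
// not the work-queue/notify() scheduling described in the proposal.
class LockTreeSketch {

  static class Node {
    final String name;
    final Map<String, Node> children = new HashMap<>();
    boolean locked;

    Node(String name) { this.name = name; }

    Node child(String childName) {                 // lazy creation
      return children.computeIfAbsent(childName, Node::new);
    }

    boolean subtreeFree() {                        // node and all descendants free?
      if (locked) return false;
      for (Node c : children.values()) {
        if (!c.subtreeFree()) return false;
      }
      return true;
    }
  }

  private final Node root = new Node("cluster");

  // Try to lock e.g. tryLock("C1", "S1") for a shard-level operation.
  synchronized boolean tryLock(String... path) {
    Node node = root;
    for (String step : path) {
      if (node.locked) return false;               // an ancestor holds the lock
      node = node.child(step);
    }
    if (!node.subtreeFree()) return false;         // the node or a descendant is busy
    node.locked = true;
    return true;
  }

  synchronized void unlock(String... path) {
    Node node = root;
    for (String step : path) node = node.child(step);
    node.locked = false;
  }
}
{code}

With this shape, two SplitShard operations on C1/S1 and C1/S2 can proceed concurrently, while a collection-level operation on C1 has to wait for both.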



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>                 Cluster
>                /       \
>              C1         C2
>             /  \       /  \
>           S1    S2   S1    S2
>        R1,R2  R3,R4  R1,R2  R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> move down depending upon the operation. After it reaches the right node, it 
> checks whether all the children are free of locks.  If it fails to acquire a 
> lock, it remains in the work queue. A scheduler thread waits for notification 
> from the current set of tasks. Every task would do a {{notify()}} on the 
> monitor of the scheduler thread. The thread would start from the head of the 
> queue and check each task to see whether it is able to acquire the right 
> lock. If yes, it is executed; if not, the task is left in the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and 
> just tries to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8744) Overseer operations need more fine grained mutual exclusion

2016-06-09 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-8744:
-
Attachment: SOLR-8744.patch

[~dragonsinth] This takes care of your concerns

> Overseer operations need more fine grained mutual exclusion
> ---
>
> Key: SOLR-8744
> URL: https://issues.apache.org/jira/browse/SOLR-8744
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 5.4.1
>Reporter: Scott Blum
>Assignee: Noble Paul
>Priority: Blocker
>  Labels: sharding, solrcloud
> Fix For: 6.1
>
> Attachments: SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, 
> SOLR-8744.patch, SOLR-8744.patch, SOLR-8744.patch, SmileyLockTree.java, 
> SmileyLockTree.java
>
>
> SplitShard creates a mutex over the whole collection, but, in practice, this 
> is a big scaling problem.  Multiple split shard operations could happen at 
> the same time, as long as different shards are being split.  In practice, 
> those shards often reside on different machines, so there's no I/O bottleneck 
> in those cases, just the mutex in Overseer forcing the operations to be done 
> serially.
> Given that a single split can take many minutes on a large collection, this 
> is a bottleneck at scale.
> Here is the proposed new design
> There are various Collection operations performed at Overseer. They may need 
> exclusive access at various levels. Each operation must define the Access 
> level at which the access is required. Access level is an enum. 
> CLUSTER(0)
> COLLECTION(1)
> SHARD(2)
> REPLICA(3)
> The Overseer node maintains a tree of these locks. The lock tree would look 
> as follows. The tree can be created lazily as and when tasks come up.
> {code}
> Legend: 
> C1, C2 -> Collections
> S1, S2 -> Shards 
> R1,R2,R3,R4 -> Replicas
>                 Cluster
>                /       \
>              C1         C2
>             /  \       /  \
>           S1    S2   S1    S2
>        R1,R2  R3,R4  R1,R2  R3,R4
> {code}
> When the overseer receives a message, it tries to acquire the appropriate 
> lock from the tree. For example, if an operation needs a lock at a Collection 
> level and it needs to operate on Collection C1, the node C1 and all child 
> nodes of C1 must be free. 
> h2.Lock acquiring logic
> Each operation would start from the root of the tree (Level 0 -> Cluster) and 
> move down depending upon the operation. After it reaches the right node, it 
> checks whether all the children are free of locks.  If it fails to acquire a 
> lock, it remains in the work queue. A scheduler thread waits for notification 
> from the current set of tasks. Every task would do a {{notify()}} on the 
> monitor of the scheduler thread. The thread would start from the head of the 
> queue and check each task to see whether it is able to acquire the right 
> lock. If yes, it is executed; if not, the task is left in the work queue.  
> When a new task arrives in the work queue, the scheduler thread wakes and 
> just tries to schedule that task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-6572) Highlighter depends on analyzers-common

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-6572:
-
Attachment: LUCENE-6572.patch

Here is a patch. In the meantime, the ngram analyzer has been used in tests, 
so I added a simplified version (works on chars rather than code points, 
minLen==maxLen, does not refill) to the test case.
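
For readers curious what that simplification boils down to, here is a rough sketch of fixed-length, char-based grams over a plain String (minLen == maxLen, no code-point handling, no stream refilling); this is not the code in the attached patch:

{code}
import java.util.ArrayList;
import java.util.List;

// Sketch only: grams("solr", 2) -> [so, ol, lr]
class FixedLengthNGrams {
  static List<String> grams(String text, int len) {
    List<String> out = new ArrayList<>();
    for (int i = 0; i + len <= text.length(); i++) {
      out.add(text.substring(i, i + len));   // works on chars, not code points
    }
    return out;
  }
}
{code}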

> Highlighter depends on analyzers-common
> ---
>
> Key: LUCENE-6572
> URL: https://issues.apache.org/jira/browse/LUCENE-6572
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/highlighter
>Reporter: Robert Muir
>Priority: Blocker
> Attachments: LUCENE-6572.patch
>
>
> This is a huge WTF, just for "LimitTokenOffsetFilter" which is only useful 
> for highlighting.
> Adding all these intermodule dependencies makes things too hard to use.
> This is a 5.3 release blocker.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9181) ZkStateReaderTest failure

2016-06-09 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322917#comment-15322917
 ] 

Scott Blum commented on SOLR-9181:
--

I think I'd probably leave the APIs there, but maybe mark them experimental in 
6.1 and just not have the rest of Solr rely on them yet?

> ZkStateReaderTest failure
> -
>
> Key: SOLR-9181
> URL: https://issues.apache.org/jira/browse/SOLR-9181
> Project: Solr
>  Issue Type: Bug
>Reporter: Alan Woodward
>Assignee: Alan Woodward
> Fix For: 6.1
>
> Attachments: SOLR-9181.patch, SOLR-9181.patch, SOLR-9181.patch, 
> SOLR-9181.patch
>
>
> https://builds.apache.org/job/Lucene-Solr-Tests-6.x/243/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7317) Remove auto prefix terms

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-7317:
-
Attachment: LUCENE-7317.patch

Here is a patch. It removes writes of auto prefix terms in the block tree 
writer and the AutoPrefixTermsPostingsFormat.

> Remove auto prefix terms
> 
>
> Key: LUCENE-7317
> URL: https://issues.apache.org/jira/browse/LUCENE-7317
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7317.patch
>
>
> This was mostly superseded by the new points API so should we remove 
> auto-prefix terms?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (LUCENE-7256) PatternReplaceCharFilter can make Lucene hang

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand closed LUCENE-7256.

Resolution: Won't Fix

Closing then.

> PatternReplaceCharFilter can make Lucene hang
> -
>
> Key: LUCENE-7256
> URL: https://issues.apache.org/jira/browse/LUCENE-7256
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Affects Versions: 5.4.1
> Environment: alpine linux v3.3
>Reporter: Tom Fotherby
>Priority: Minor
>
> I'm using ElasticSearch (v2.2.0, Lucene v5.4.1) and its [Pattern Replace 
> Char 
> Filter|https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-replace-charfilter.html]
> (Lucene's PatternReplaceCharFilter). I need to filter out URLs from my query 
> text before it is tokenised. But I found that some input strings cause 
> ElasticSearch to "hang" (slowly eating more CPU and memory) until the system 
> crashes.
> 
> *Example*
> {code}
> // Character filters are used to "tidy up" a string *before* it is tokenized.
> 'char_filter' => [
> 'url_removal_pattern' => [
> 'type'=> 'pattern_replace',
> 'pattern' => 
> '(?mi)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»""'']))',
> 'replacement' => '',
> ],
> {code}
> This filter was working fine for some weeks until suddenly ElasticSearch 
> started crashing. We found someone was trying to do a javascript injection 
> attack in our search box.
> I pasted the regex and the attack string into https://regex101.com 
> * Regexp: 
>  * 
> {code}(?mi)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s!()\[\]{};:\'".,<>?«»""''])){code}
> * Test string: 
>  * 
> {code}twitter.com/widgets.js\";fjs.parentNode.insertBefore(js,fjs);}}(document,\"script\",\"twitter-wjs\"{code}
> https://regex101.com shows the problem to be "Catastrophic backtracking"
> bq. Catastrophic backtracking has been detected and the execution of your 
> expression has been halted. To find out more what this is, please read the 
> following article: [Runaway Regular 
> Expressions|http://www.regular-expressions.info/catastrophic.html].
> It would be great if Lucene could detect "Catastrophic backtracking" and 
> throw an error or return null.
> 
> As an aside, I created a unit test for our PHP application that uses the same 
> regexp and test string. (PHP can understand the same regexp, even though it's 
> obviously for Java in the ElasticSearch case.) Interestingly, in PHP the 
> regex results in `null`, which is the documented response of 
> [preg_replace|http://php.net/manual/en/function.preg-replace.php] when an 
> error occurs. If PHP can return an error rather than crashing - surely Lucene 
> / Java can too :trollface: ?
> {code}
> namespace app\tests\unit;
> use \yii\codeception\TestCase;
> class TagsControllerTest extends TestCase
> {
> public function testRegexForURLDetection()
> {
> $regex = 
> '(?mi)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»""'']))';
> // Test the Catastrophic backtracking problem
> $testString = 
> "twitter.com/widgets.js\";fjs.parentNode.insertBefore(js,fjs);}}(document,\"script\",\"twitter-wjs\"";
> // This shows the regex is not working for our test string - it gives 
> null but should give 'hello '
> $this->assertEquals(null, preg_replace("/$regex/", '', "hello 
> $testString"));
> }
> }
> {code}
> 
> (I originally [opened a 
> ticket|https://github.com/elastic/elasticsearch/issues/17934] to the 
> ElasticSearch project but got told opening it here would be more appropriate 
> - sorry if I'm wrong)
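
The filter here is backed by java.util.regex, which has no built-in detection of catastrophic backtracking. A common application-side workaround (a sketch under that assumption, not a Lucene or Elasticsearch feature; the class name and budget are illustrative) is to wrap the input CharSequence so matching aborts once it has done an unreasonable amount of work:

{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: abort a regex match after a fixed budget of character accesses,
// which a catastrophically backtracking pattern exceeds very quickly.
final class BoundedCharSequence implements CharSequence {
  private final CharSequence delegate;
  private int budget;

  BoundedCharSequence(CharSequence delegate, int budget) {
    this.delegate = delegate;
    this.budget = budget;
  }

  @Override public char charAt(int index) {
    if (--budget < 0) {
      throw new IllegalStateException("regex exceeded its matching budget");
    }
    return delegate.charAt(index);
  }

  @Override public int length() { return delegate.length(); }

  @Override public CharSequence subSequence(int start, int end) {
    return new BoundedCharSequence(delegate.subSequence(start, end), budget);
  }

  public static void main(String[] args) {
    Pattern p = Pattern.compile("(a+)+b");          // a classic backtracking trap
    String attack = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!";
    Matcher m = p.matcher(new BoundedCharSequence(attack, 10_000));
    try {
      System.out.println(m.find());
    } catch (IllegalStateException e) {
      System.out.println("aborted: " + e.getMessage());
    }
  }
}
{code}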



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7324) ExitableDirectoryReader should not check on every term

2016-06-09 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322824#comment-15322824
 ] 

Uwe Schindler commented on LUCENE-7324:
---

I'd not move it to Sandbox without a major release. I would suggest fixing it 
so that it does not check the time on every single access. As said a year ago, 
this might be much worse, e.g., on MacOSX, where it uses a very slow variant of 
nanotime!

> ExitableDirectoryReader should not check on every term
> --
>
> Key: LUCENE-7324
> URL: https://issues.apache.org/jira/browse/LUCENE-7324
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
>
> I looked at ExitableDirectoryReader and to me checking the timeout on every 
> term is a pretty heavy operation. I wonder if we can relax that a bit and 
> begin with checking when we pull Terms and maybe only every Nth term by 
> default? I wonder if we can even make it a function of the number of terms, 
> i.e. log(numTerms). I think it's pretty trappy to have something that won't 
> perform well or has the risk of not scaling well in lucene core?
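
A minimal sketch of the "only every Nth term" idea from the description (plain Java; the real ExitableDirectoryReader uses its own counters and exception types, so the names here are assumptions):

{code}
// Sketch only: keep the hot path cheap by consulting the clock every N calls.
class SampledTimeoutCheck {
  private final long deadlineNanos;
  private final int sampleEvery;
  private int calls;

  SampledTimeoutCheck(long timeoutMillis, int sampleEvery) {
    this.deadlineNanos = System.nanoTime() + timeoutMillis * 1_000_000L;
    this.sampleEvery = sampleEvery;
  }

  // Called from the hot loop (e.g. once per term); only every sampleEvery-th
  // invocation pays for System.nanoTime().
  void checkTimeout() {
    if (++calls % sampleEvery != 0) {
      return;
    }
    if (System.nanoTime() > deadlineNanos) {
      throw new RuntimeException("query timed out");
    }
  }
}
{code}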



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9176) Legacy Faceting Term Enum Method Regression

2016-06-09 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322817#comment-15322817
 ] 

Adrien Grand commented on SOLR-9176:


Thanks Alan.

> Legacy Faceting Term Enum Method Regression
> ---
>
> Key: SOLR-9176
> URL: https://issues.apache.org/jira/browse/SOLR-9176
> Project: Solr
>  Issue Type: Bug
>  Components: faceting
>Affects Versions: 5.2, 5.2.1, 5.3, 5.3.1, 5.3.2, 5.4, 5.4.1, 5.5, 5.5.1, 
> 6.0, 6.0.1
>Reporter: Alessandro Benedetti
>Assignee: Alan Woodward
> Attachments: SOLR-9176.patch, SOLR-9176.patch
>
>
> Starting from this commit :
> LUCENE-5666: get solr started
> git-svn-id: 
> https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5666@1594254 
> 13f79535-47bb-0310-9956-ffa450edef68
> https://github.com/apache/lucene-solr/commit/1489085807cb10981a7ea5b5663ada4e3f85953e#diff-5ac9dc7b128b4dd99b764060759222b2
> It is not possible to use Term Enum as a faceting method for numeric and 
> single-valued fields (org.apache.solr.request.SimpleFacets).
> We personally verified that there are use cases where this brings quite a big 
> performance regression (even with DocValues enabled).
> On the mailing list people complain from time to time about a regression with 
> the term enum method, but it is actually more likely to be the automatic 
> forcing of FCS.
> Forcing FCS, combined with the well-known regression that happened in 
> SOLR-8096, could be mistaken for a term Enum regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9176) Legacy Faceting Term Enum Method Regression

2016-06-09 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322803#comment-15322803
 ] 

Alan Woodward commented on SOLR-9176:
-

bq. This looks to me like something that would be nice to fix for 6.1?

It would!  I'll write some tests and get this committed tomorrow.

> Legacy Faceting Term Enum Method Regression
> ---
>
> Key: SOLR-9176
> URL: https://issues.apache.org/jira/browse/SOLR-9176
> Project: Solr
>  Issue Type: Bug
>  Components: faceting
>Affects Versions: 5.2, 5.2.1, 5.3, 5.3.1, 5.3.2, 5.4, 5.4.1, 5.5, 5.5.1, 
> 6.0, 6.0.1
>Reporter: Alessandro Benedetti
>Assignee: Alan Woodward
> Attachments: SOLR-9176.patch, SOLR-9176.patch
>
>
> Starting from this commit :
> LUCENE-5666: get solr started
> git-svn-id: 
> https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5666@1594254 
> 13f79535-47bb-0310-9956-ffa450edef68
> https://github.com/apache/lucene-solr/commit/1489085807cb10981a7ea5b5663ada4e3f85953e#diff-5ac9dc7b128b4dd99b764060759222b2
> It is not possible to use Term Enum as a faceting method for numeric and 
> single-valued fields (org.apache.solr.request.SimpleFacets).
> We personally verified that there are use cases where this brings quite a big 
> performance regression (even with DocValues enabled).
> On the mailing list people complain from time to time about a regression with 
> the term enum method, but it is actually more likely to be the automatic 
> forcing of FCS.
> Forcing FCS, combined with the well-known regression that happened in 
> SOLR-8096, could be mistaken for a term Enum regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9176) Legacy Faceting Term Enum Method Regression

2016-06-09 Thread Alessandro Benedetti (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322732#comment-15322732
 ] 

Alessandro Benedetti commented on SOLR-9176:


I agree about the unit test (even if some clauses in the selectFacet method are 
workarounds to avoid bugs or incompatibilities in some legacy facet methods, and 
they don't so much deserve a test as a bug report). 
Unfortunately I will be away until Monday; I can try to contribute the tests 
next week.  If you want to contribute them before then, feel free to do that! 
Cheers

> Legacy Faceting Term Enum Method Regression
> ---
>
> Key: SOLR-9176
> URL: https://issues.apache.org/jira/browse/SOLR-9176
> Project: Solr
>  Issue Type: Bug
>  Components: faceting
>Affects Versions: 5.2, 5.2.1, 5.3, 5.3.1, 5.3.2, 5.4, 5.4.1, 5.5, 5.5.1, 
> 6.0, 6.0.1
>Reporter: Alessandro Benedetti
>Assignee: Alan Woodward
> Attachments: SOLR-9176.patch, SOLR-9176.patch
>
>
> Starting from this commit :
> LUCENE-5666: get solr started
> git-svn-id: 
> https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5666@1594254 
> 13f79535-47bb-0310-9956-ffa450edef68
> https://github.com/apache/lucene-solr/commit/1489085807cb10981a7ea5b5663ada4e3f85953e#diff-5ac9dc7b128b4dd99b764060759222b2
> It is not possible to use Term Enum as a faceting method for numeric and 
> single-valued fields (org.apache.solr.request.SimpleFacets).
> We personally verified that there are use cases where this brings quite a big 
> performance regression (even with DocValues enabled).
> On the mailing list people complain from time to time about a regression with 
> the term enum method, but it is actually more likely to be the automatic 
> forcing of FCS.
> Forcing FCS, combined with the well-known regression that happened in 
> SOLR-8096, could be mistaken for a term Enum regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7286) WeightedSpanTermExtractor.extract() does not recognize SynonymQuery

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-7286:
-
Attachment: LUCENE-7286.patch

Here is a patch. I think this bug may hit many users so I would like to include 
it in 6.1.

> WeightedSpanTermExtractor.extract() does not recognize SynonymQuery
> ---
>
> Key: LUCENE-7286
> URL: https://issues.apache.org/jira/browse/LUCENE-7286
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/highlighter
>Affects Versions: 6.0
>Reporter: Piotr
> Attachments: LUCENE-7286.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Short description:
> In the WeightedSpanTermExtractor.extract(...) method there is a long list of 
> supported Queries. SynonymQuery is not among them, so it falls through to 
> extractUnknownQuery(), which does nothing. It would be really nice to have 
> SynonymQuery covered as well.
> Long description:
> I'm trying to highlight an external text using a Highlighter. The query is 
> created by QueryParser. If the created query is simple it works like a charm. 
> The problem is when the parsed query contains a SynonymQuery -- this happens 
> when the stemmer returns multiple stems, which is not uncommon for the Polish 
> language. Btw. this is my first jira issue.
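
The committed fix belongs inside WeightedSpanTermExtractor itself; purely as a hedged illustration of the missing piece, the sketch below unfolds a SynonymQuery into the per-term queries the extractor already understands (the helper class is made up, and it assumes SynonymQuery#getTerms() as in Lucene 6.x):

{code}
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.SynonymQuery;
import org.apache.lucene.search.TermQuery;

// Sketch only: a SynonymQuery is a disjunction over terms in one field,
// so each term can be highlighted like an ordinary TermQuery.
class SynonymQueryUnfolder {
  static List<Query> unfold(Query query) {
    List<Query> out = new ArrayList<>();
    if (query instanceof SynonymQuery) {
      for (Term term : ((SynonymQuery) query).getTerms()) {
        out.add(new TermQuery(term));
      }
    } else {
      out.add(query);
    }
    return out;
  }
}
{code}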



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6572) Highlighter depends on analyzers-common

2016-06-09 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322697#comment-15322697
 ] 

Michael McCandless commented on LUCENE-6572:


bq. Given how simple this token filter is, what about having a copy of it in the 
highlighter module?

+1

> Highlighter depends on analyzers-common
> ---
>
> Key: LUCENE-6572
> URL: https://issues.apache.org/jira/browse/LUCENE-6572
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/highlighter
>Reporter: Robert Muir
>Priority: Blocker
>
> This is a huge WTF, just for "LimitTokenOffsetFilter" which is only useful 
> for highlighting.
> Adding all these intermodule dependencies makes things too hard to use.
> This is a 5.3 release blocker.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7324) ExitableDirectoryReader should not check on every term

2016-06-09 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322695#comment-15322695
 ] 

Michael McCandless commented on LUCENE-7324:


bq. I wonder if we should move it to sandbox for now?

+1

> ExitableDirectoryReader should not check on every term
> --
>
> Key: LUCENE-7324
> URL: https://issues.apache.org/jira/browse/LUCENE-7324
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Simon Willnauer
>
> I looked at ExitableDirectoryReader and to me checking the timeout on every 
> term is a pretty heavy operation. I wonder if we can relax that a bit and 
> begin with checking when we pull Terms and maybe only every Nth term by 
> default? I wonder if we can even make it a function of the number of terms, 
> i.e. log(numTerms). I think it's pretty trappy to have something that won't 
> perform well or has the risk of not scaling well in lucene core?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Moved] (SOLR-9202) UninvertingReader needs multi-valued points support

2016-06-09 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless moved LUCENE-7096 to SOLR-9202:
--

Lucene Fields:   (was: New)
  Key: SOLR-9202  (was: LUCENE-7096)
  Project: Solr  (was: Lucene - Core)

> UninvertingReader needs multi-valued points support
> ---
>
> Key: SOLR-9202
> URL: https://issues.apache.org/jira/browse/SOLR-9202
> Project: Solr
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-7096.patch
>
>
> It now supports the single-valued case (deprecating the legacy encoding), but 
> the multi-valued stuff does not yet have a replacement.
> Ideally we add an FC.getSortedNumeric(Parser..) that works from points. Unlike 
> postings, points never lose frequency within a field, so it's the best fit. 
> When getDocCount() == size(), the field is single-valued, so this should call 
> getNumeric and box that in a SortedNumeric, similar to the String case.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7096) UninvertingReader needs multi-valued points support

2016-06-09 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322688#comment-15322688
 ] 

Michael McCandless commented on LUCENE-7096:


I think we should just move this issue to a Solr issue ... maybe this patch is 
helpful for Solr?

I'll move it.

> UninvertingReader needs multi-valued points support
> ---
>
> Key: LUCENE-7096
> URL: https://issues.apache.org/jira/browse/LUCENE-7096
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-7096.patch
>
>
> It now supports the single-valued case (deprecating the legacy encoding), but 
> the multi-valued stuff does not yet have a replacement.
> Ideally we add an FC.getSortedNumeric(Parser..) that works from points. Unlike 
> postings, points never lose frequency within a field, so it's the best fit. 
> When getDocCount() == size(), the field is single-valued, so this should call 
> getNumeric and box that in a SortedNumeric, similar to the String case.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7291) HeatmapFacetCounterTest.testRandom failure; random

2016-06-09 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322687#comment-15322687
 ] 

Adrien Grand commented on LUCENE-7291:
--

[~dsmiley] Pinging you in case you want to have a chance to look into it before 
we release 6.1. FYI the seed still reproduces for me on master.

> HeatmapFacetCounterTest.testRandom failure; random
> --
>
> Key: LUCENE-7291
> URL: https://issues.apache.org/jira/browse/LUCENE-7291
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial-extras
>Reporter: David Smiley
>Assignee: David Smiley
>
> Jenkins found a test failure today.
> This reproduces for me (master, java 8):
> ant test  -Dtestcase=HeatmapFacetCounterTest -Dtests.method=testRandom 
> -Dtests.seed=3EC907D1784B6F23 -Dtests.multiplier=2 -Dtests.nightly=true 
> -Dtests.slow=true 
> -Dtests.linedocsfile=/x1/jenkins/lucene-data/enwiki.random.lines.txt 
> -Dtests.locale=is-IS -Dtests.timezone=Europe/Tirane -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
> {noformat}
> java.lang.AssertionError: 
> Expected :1
> Actual   :0
>  
>   at 
> __randomizedtesting.SeedInfo.seed([3EC907D1784B6F23:A3439C5F68FEAB94]:0)
>   at org.junit.Assert.fail(Assert.java:93)
>   at org.junit.Assert.failNotEquals(Assert.java:647)
>   at org.junit.Assert.assertEquals(Assert.java:128)
>   at org.junit.Assert.assertEquals(Assert.java:472)
>   at org.junit.Assert.assertEquals(Assert.java:456)
>   at 
> org.apache.lucene.spatial.prefix.HeatmapFacetCounterTest.validateHeatmapResult(HeatmapFacetCounterTest.java:226)
>   at 
> org.apache.lucene.spatial.prefix.HeatmapFacetCounterTest.queryHeatmapRecursive(HeatmapFacetCounterTest.java:193)
>   at 
> org.apache.lucene.spatial.prefix.HeatmapFacetCounterTest.queryHeatmapRecursive(HeatmapFacetCounterTest.java:206)
>   at 
> org.apache.lucene.spatial.prefix.HeatmapFacetCounterTest.testRandom(HeatmapFacetCounterTest.java:172)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
>   at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9176) Legacy Faceting Term Enum Method Regression

2016-06-09 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322680#comment-15322680
 ] 

Adrien Grand commented on SOLR-9176:


This looks to me like something that would be nice to fix for 6.1?

> Legacy Faceting Term Enum Method Regression
> ---
>
> Key: SOLR-9176
> URL: https://issues.apache.org/jira/browse/SOLR-9176
> Project: Solr
>  Issue Type: Bug
>  Components: faceting
>Affects Versions: 5.2, 5.2.1, 5.3, 5.3.1, 5.3.2, 5.4, 5.4.1, 5.5, 5.5.1, 
> 6.0, 6.0.1
>Reporter: Alessandro Benedetti
>Assignee: Alan Woodward
> Attachments: SOLR-9176.patch, SOLR-9176.patch
>
>
> Starting from this commit :
> LUCENE-5666: get solr started
> git-svn-id: 
> https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5666@1594254 
> 13f79535-47bb-0310-9956-ffa450edef68
> https://github.com/apache/lucene-solr/commit/1489085807cb10981a7ea5b5663ada4e3f85953e#diff-5ac9dc7b128b4dd99b764060759222b2
> It is not possible to use Term Enum as a faceting method for numeric and 
> single-valued fields (org.apache.solr.request.SimpleFacets).
> We personally verified that there are use cases where this brings quite a big 
> performance regression (even with DocValues enabled).
> On the mailing list people complain from time to time about a regression with 
> the term enum method, but it is actually more likely to be the automatic 
> forcing of FCS.
> Forcing FCS, combined with the well-known regression that happened in 
> SOLR-8096, could be mistaken for a term Enum regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7325) GeoPointInBBoxQuery no longer includes boundaries?

2016-06-09 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-7325:
---
Attachment: LUCENE-7325.patch

Simple patch showing the issue ... but it's possible there are test bugs, e.g. wrong 
lat/lon order or encoding or something :)

> GeoPointInBBoxQuery no longer includes boundaries?
> --
>
> Key: LUCENE-7325
> URL: https://issues.apache.org/jira/browse/LUCENE-7325
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.1
>Reporter: Michael McCandless
> Attachments: LUCENE-7325.patch
>
>
> {{GeoPointInBBoxQuery}} is supposed to include boundaries, and it does in 5.x 
> and 6.0, but in 6.1 something broke.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7325) GeoPointInBBoxQuery no longer includes boundaries?

2016-06-09 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322668#comment-15322668
 ] 

Michael McCandless commented on LUCENE-7325:


bq.  If this is removed you will see them.

+1 to remove {{GeoPointTestUtil}} ... we need stronger testing here, to prevent 
regressions like this.

> GeoPointInBBoxQuery no longer includes boundaries?
> --
>
> Key: LUCENE-7325
> URL: https://issues.apache.org/jira/browse/LUCENE-7325
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.1
>Reporter: Michael McCandless
>Priority: Blocker
> Attachments: LUCENE-7325.patch
>
>
> {{GeoPointInBBoxQuery}} is supposed to include boundaries, and it does in 5.x 
> and 6.0, but in 6.1 something broke.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-7325) GeoPointInBBoxQuery no longer includes boundaries?

2016-06-09 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-7325:
--

 Summary: GeoPointInBBoxQuery no longer includes boundaries?
 Key: LUCENE-7325
 URL: https://issues.apache.org/jira/browse/LUCENE-7325
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 6.1
Reporter: Michael McCandless


{{GeoPointInBBoxQuery}} is supposed to include boundaries, and it does in 5.x 
and 6.0, but in 6.1 something broke.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-7325) GeoPointInBBoxQuery no longer includes boundaries?

2016-06-09 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-7325:
---
Priority: Blocker  (was: Major)

> GeoPointInBBoxQuery no longer includes boundaries?
> --
>
> Key: LUCENE-7325
> URL: https://issues.apache.org/jira/browse/LUCENE-7325
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.1
>Reporter: Michael McCandless
>Priority: Blocker
> Attachments: LUCENE-7325.patch
>
>
> {{GeoPointInBBoxQuery}} is supposed to include boundaries, and it does in 5.x 
> and 6.0, but in 6.1 something broke.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7325) GeoPointInBBoxQuery no longer includes boundaries?

2016-06-09 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322667#comment-15322667
 ] 

Michael McCandless commented on LUCENE-7325:


I think we should fix this for 6.1.0.

> GeoPointInBBoxQuery no longer includes boundaries?
> --
>
> Key: LUCENE-7325
> URL: https://issues.apache.org/jira/browse/LUCENE-7325
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.1
>Reporter: Michael McCandless
>Priority: Blocker
> Attachments: LUCENE-7325.patch
>
>
> {{GeoPointInBBoxQuery}} is supposed to include boundaries, and it does in 5.x 
> and 6.0, but in 6.1 something broke.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (SOLR-9201) New Solr Web UI does not work correctly if contextPath differs from /solr

2016-06-09 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett closed SOLR-9201.
---
Resolution: Duplicate

This seems to be a duplicate of two other already duplicated issues, SOLR-9000 
and SOLR-9054.

> New Solr Web UI does not work correctly if contextPath differs from /solr
> -
>
> Key: SOLR-9201
> URL: https://issues.apache.org/jira/browse/SOLR-9201
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 6.0
>Reporter: Andriy Binetsky
>Priority: Minor
>
> The Solr Web UI JavaScript file solr/webapp/web/js/angular/services.js uses a 
> hard-coded /solr contextPath for resources.
> Therefore the new UI does not work correctly with a contextPath other than /solr 
> (for example 
> http://stackoverflow.com/questions/34772611/override-contextpath-on-solr-start-from-command-line)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-7325) GeoPointInBBoxQuery no longer includes boundaries?

2016-06-09 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322663#comment-15322663
 ] 

Robert Muir commented on LUCENE-7325:
-

Note we already have a test and it will fail if you enable the new RNG for 
geopoint (LUCENE-7185).

Other tests fail too, that's still not fixed. GeoPoint hacks around this with a 
"GeoPointTestUtil" at the moment. If this is removed you will see them.

> GeoPointInBBoxQuery no longer includes boundaries?
> --
>
> Key: LUCENE-7325
> URL: https://issues.apache.org/jira/browse/LUCENE-7325
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.1
>Reporter: Michael McCandless
> Attachments: LUCENE-7325.patch
>
>
> {{GeoPointInBBoxQuery}} is supposed to include boundaries, and it does in 5.x 
> and 6.0, but in 6.1 something broke.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8829) Mark CDCR as experimental in 6.0

2016-06-09 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322660#comment-15322660
 ] 

Adrien Grand commented on SOLR-8829:


Is there anything we need to get done for 6.1 here?

> Mark CDCR as experimental in 6.0
> 
>
> Key: SOLR-8829
> URL: https://issues.apache.org/jira/browse/SOLR-8829
> Project: Solr
>  Issue Type: Wish
>  Components: CDCR, SolrCloud
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 6.0
>
>
> A slew of improvements to CDCR viz. SOLR-6465, SOLR-8389, SOLR-8391 are 
> planned and on which I am actively working. Unfortunately, those won't make 
> it to the 6.0 release. At the same time since both the amount of 
> configuration and mode of configuration (API vs editing solrconfig.xml) will 
> change with these planned improvements, it'd be nice to not have strong 
> back-compat guarantees which prevent us from making CDCR easier to use.
> Therefore, I propose to put a note in CHANGES.txt and any documentation in 
> the Solr reference guide to this effect as well as add an experimental 
> response key in all CDCR API responses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-3673) NativeFSLockFactory Race Condition

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-3673.
--
Resolution: Fixed

This seems to have been fixed by other changes. Testing also improved via 
LUCENE-5624.

> NativeFSLockFactory Race Condition
> --
>
> Key: LUCENE-3673
> URL: https://issues.apache.org/jira/browse/LUCENE-3673
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/store
>Affects Versions: 3.4, 3.5
> Environment: Mac OS X 10.7, JDK 1.6
> CentOS 5, JDK 1.7u1
>Reporter: Yaokai Jiang
>Priority: Critical
>  Labels: concurrency, locking
> Attachments: BreakIt.java, LUCENE-3673.patch
>
>
> When the NativeFSLock is released, it deletes the lock file after releasing 
> the lock. In a concurrent situation, it is possible for another thread or 
> process to acquire the lock before the delete happens. Any third thread or 
> process will then be able to recreate the lock file and acquire the lock 
> while a lock is already held on the deleted file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-3779) An incomplete fix for the NPE bugs in MultipleTermPositions.java

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-3779.
--
Resolution: Unresolved

MultiTermsEnum.java has changed a lot and this bug is not relevant anymore.

> An incomplete fix for the NPE bugs in MultipleTermPositions.java
> 
>
> Key: LUCENE-3779
> URL: https://issues.apache.org/jira/browse/LUCENE-3779
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 3.0
>Reporter: Guangtai Liang
>Priority: Critical
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> The fix revision 219387 (Fix for NPE (bug #35626). Fix by Hans Hjelm, test 
> case by Scotty Allen.) was aimed at removing an NPE bug on the return value of 
> "_termPositionsQueue.peek()" in the method "skipTo" of the file 
> "/lucene/java/trunk/src/java/org/apache/lucene/index/MultipleTermPositions.java", 
> but it is incomplete. 
> Since the returned value "_termPositionsQueue.peek()" could be null during 
> run-time execution, its value should also be null-checked before being 
> dereferenced in other methods. 
> The buggy code locations at which the same fix needs to be applied are as follows: 
>  
> Line 118, 124, 135 of the method "next()" : 
> public final boolean next() throws IOException {
> if (_termPositionsQueue.size() == 0)
>   return false;
> _posList.clear();
> [Line  118]_doc = _termPositionsQueue.peek().doc();
> TermPositions tp;
> do {
>   tp = _termPositionsQueue.peek();
> [Line  124]for (int i = 0; i < tp.freq(); i++) {
>   // NOTE: this can result in dup positions being added!
> _posList.add(tp.nextPosition());
> }
>   if (tp.next())
> _termPositionsQueue.updateTop();
>   else {
> _termPositionsQueue.pop();
> tp.close();
>   }
> [Line  135]} while (_termPositionsQueue.size() > 0 && 
> _termPositionsQueue.peek().doc() == _doc);
> _posList.sort();
> _freq = _posList.size();
> return true;
>   }
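
A tiny plain-Java sketch of the guard the report asks for, using java.util.PriorityQueue in place of the old _termPositionsQueue (illustrative only, not the historical Lucene code):

{code}
import java.util.PriorityQueue;

// Sketch only: check peek() for null before dereferencing, since the queue
// may be empty (or drained) at that point.
class PeekGuardExample {
  public static void main(String[] args) {
    PriorityQueue<Integer> queue = new PriorityQueue<>();
    Integer top = queue.peek();
    if (top == null) {
      System.out.println("queue is empty; nothing to advance");
      return;
    }
    System.out.println("next doc: " + top);
  }
}
{code}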



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (LUCENE-3780) An incomplete fix for the NPE bugs in ParallelReader.java

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand closed LUCENE-3780.

Resolution: Unresolved

ParallelReader.java has changed a lot and this bug is not relevant anymore.

> An incomplete fix for the NPE bugs in ParallelReader.java
> -
>
> Key: LUCENE-3780
> URL: https://issues.apache.org/jira/browse/LUCENE-3780
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 3.0
>Reporter: Guangtai Liang
>Priority: Critical
>  Labels: incomplete_fix, missing_fixes
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> The fix revision 407851 was aimed at removing an NPE bug on the return value 
> of "fieldToReader.get(field)" in the methods "getTermFreqVector", 
> "hasNorms", "norms", "doSetNorm" of the file 
> "/lucene/java/trunk/src/java/org/apache/lucene/index/ParallelReader.java", 
> but it is incomplete. 
> Since the returned value "fieldToReader.get(field)" could be null during 
> runtime execution, its value should also be null-checked before being 
> dereferenced in other methods. 
> The buggy code locations at which the same fix needs to be applied are as follows: 
>  
> Line 499  of the method "ParallelTermEnum()" : 
> public ParallelTermEnum() throws IOException {
>   try {
> field = fieldToReader.firstKey();
>   } catch(NoSuchElementException e) {
> // No fields, so keep field == null, termEnum == null
> return;
>   }
>   if (field != null)
> [Line 499]termEnum = fieldToReader.get(field).terms();
> }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (LUCENE-3782) An incomplete fix for the NPE bugs in Directory.java

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand closed LUCENE-3782.

Resolution: Unresolved

Directory.java has changed a lot and this bug is not relevant anymore.

> An incomplete fix for the NPE bugs in Directory.java
> 
>
> Key: LUCENE-3782
> URL: https://issues.apache.org/jira/browse/LUCENE-3782
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/store
>Affects Versions: 3.0
>Reporter: Guangtai Liang
>Priority: Critical
>  Labels: incomplete_fix, missing_fixes
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> The fix revision 499089 was aimed at removing an NPE bug (LUCENE-773) on the 
> value of "lockFactory" in the method "clearLock" of the file 
> "/lucene/java/trunk/src/java/org/apache/lucene/store/Directory.java", but it 
> is incomplete. 
> Since the value "lockFactory" could be null during runtime execution, 
> its value should also be null-checked before being dereferenced in other 
> methods. 
> The buggy code locations at which the same fix needs to be applied are as follows: 
>  
> Line 106 of the method "makeLock(String name)": 
>   public Lock makeLock(String name) {
> [Line 106]  return lockFactory.makeLock(name);
>   }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (LUCENE-4200) ArrayIndexOutOfBoundsException in lucene.index.SegmentTermDocs.read

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand closed LUCENE-4200.

Resolution: Unresolved

This code has changed a lot. It is very likely that this bug is not relevant 
anymore.

> ArrayIndexOutOfBoundsException in lucene.index.SegmentTermDocs.read
> ---
>
> Key: LUCENE-4200
> URL: https://issues.apache.org/jira/browse/LUCENE-4200
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 3.6
> Environment: java version "1.6.0_20"
> OpenJDK Runtime Environment (IcedTea6 1.9.13) (6b20-1.9.13-0ubuntu1~10.04.1)
> OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
> tomcat 6.0.24
> solr 3.6.0
>Reporter: Serge Negodyuck
>Priority: Critical
>
> Sometimes one of my slave tomcat/solr instances stops working with the following 
> stacktrace (on every request).
> If I restart tomcat, everything works fine. 
> Jul 6, 2012 11:30:04 AM org.apache.solr.common.SolrException log
> SEVERE: java.lang.ArrayIndexOutOfBoundsException: 12
> at org.apache.lucene.util.BitVector.get(BitVector.java:114)
> at 
> org.apache.lucene.index.SegmentTermDocs.read(SegmentTermDocs.java:161)
> at org.apache.lucene.search.TermScorer.nextDoc(TermScorer.java:112)
> at 
> org.apache.lucene.search.BooleanScorer2$SingleMatchScorer.nextDoc(BooleanScorer2.java:137)
> at 
> org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:280)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:581)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:364)
> at 
> org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:863)
> at 
> org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:635)
> at 
> org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:769)
> at 
> org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1341)
> at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1172)
> at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:375)
> at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:394)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:186)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
> at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> at 
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
> at 
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
> at 
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
> at java.lang.Thread.run(Thread.java:636)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-NightlyTests-6.x - Build # 88 - Still Failing

2016-06-09 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-6.x/88/

3 tests failed.
FAILED:  org.apache.solr.cloud.RestartWhileUpdatingTest.test

Error Message:
There are still nodes recoverying - waited for 320 seconds

Stack Trace:
java.lang.AssertionError: There are still nodes recoverying - waited for 320 
seconds
at 
__randomizedtesting.SeedInfo.seed([A0CEFEFED1780E70:289AC1247F846388]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:182)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.waitForRecoveriesToFinish(AbstractFullDistribZkTestBase.java:862)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.waitForThingsToLevelOut(AbstractFullDistribZkTestBase.java:1418)
at 
org.apache.solr.cloud.RestartWhileUpdatingTest.test(RestartWhileUpdatingTest.java:144)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 

[jira] [Resolved] (LUCENE-4730) SmartChineseAnalyzer got wrong matched offset

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-4730.
--
Resolution: Fixed

Thanks Michael for digging into it.

> SmartChineseAnalyzer got wrong matched offset
> -
>
> Key: LUCENE-4730
> URL: https://issues.apache.org/jira/browse/LUCENE-4730
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Affects Versions: 4.0, 4.1
> Environment: JDK1.7 Linux/Windows
>Reporter: Jinsong Hu
>Priority: Critical
> Attachments: LUCENE-4730.patch
>
>
> We found that SmartChineseAnalyzer got wrong matched offset with the 
> following test code:
> public void testHighlight() throws Exception {
>   String text = "My China  ";
>   String queryText = "China";
>   StringBuilder builder = new StringBuilder("");
>   Analyzer analyzer = new SmartChineseAnalyzer(Version.LUCENE_40);
>   //Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
>   QueryParser parser = new QueryParser(Version.LUCENE_40, "text", analyzer);
>   Query query = parser.parse(queryText);
>   SimpleHTMLFormatter formatter =
>       new SimpleHTMLFormatter("<span style=\"background: yellow\">", "</span>");
>   TokenStream tokens = analyzer.tokenStream("text", new StringReader(text));
>   QueryScorer scorer = new QueryScorer(query, "text");
>   Highlighter highlighter = new Highlighter(formatter, scorer);
>   highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
>   String result = highlighter.getBestFragments(tokens, text, 10, "...");
>   if (result.length() < text.length()) {
>     result = text;
>   }
>   builder.append("");
>   builder.append(result);
>   builder.append("");
>   builder.append("");
>   System.out.println(builder.toString());
> }
> This method generates highlighted text, but the highlight position is 
> obviously wrong. If we remove one space from the text, that is, change the 
> text from "My China  " (ends with two spaces) to "My China " (ends with one 
> space), it generates text with the correct highlight. If we change the 
> analyzer from SmartChineseAnalyzer to StandardAnalyzer, the highlight issue 
> disappears.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4787) The QueryScorer.getMaxWeight method is not found.

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-4787.
--
   Resolution: Fixed
Fix Version/s: 6.2
   master (7.0)

Merged, thanks (and sorry for the delay)!

> The QueryScorer.getMaxWeight method is not found.
> -
>
> Key: LUCENE-4787
> URL: https://issues.apache.org/jira/browse/LUCENE-4787
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/highlighter
>Affects Versions: 4.1
>Reporter: Hao Zhong
>Priority: Critical
> Fix For: master (7.0), 6.2
>
> Attachments: LUCENE-4787.patch
>
>
> The following API documents refer to the QueryScorer.getMaxWeight method:
> http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/package-summary.html
> "The QueryScorer.getMaxWeight method is useful when passed to the 
> GradientFormatter constructor to define the top score which is associated 
> with the top color."
> http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/GradientFormatter.html
> "See QueryScorer.getMaxWeight which can be used to calibrate scoring scale"
> However, the QueryScorer class does not declare a getMaxWeight method in 
> Lucene 4.1, according to its documentation:
> http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/QueryScorer.html
> Instead, the class declares a getMaxTermWeight method. Is that the method 
> the preceding two documents intend? If so, please revise those two 
> documents. 
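
For context, a minimal sketch (not taken from the issue) of how the existing 
getMaxTermWeight method can play the role the javadocs describe for 
getMaxWeight, assuming a standard Lucene 4.x highlighter setup; the class 
name GradientHighlightSketch, the field name "text", and the colour values 
are illustrative assumptions:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.GradientFormatter;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;

public class GradientHighlightSketch {
  // Hedged sketch: calibrate a GradientFormatter from
  // QueryScorer.getMaxTermWeight() (the method that actually exists) where
  // the javadocs mention getMaxWeight().
  public static String highlight(Analyzer analyzer, Query query, String text)
      throws Exception {
    QueryScorer scorer = new QueryScorer(query, "text");
    // One highlighting pass lets the scorer record per-term weights.
    new Highlighter(scorer).getBestFragment(analyzer, "text", text);
    GradientFormatter formatter = new GradientFormatter(
        scorer.getMaxTermWeight(),   // top score mapped to the "max" colours
        null, null,                  // no foreground gradient
        "#FFFFFF", "#FF0000");       // background gradient: white to red
    return new Highlighter(formatter, scorer)
        .getBestFragment(analyzer, "text", text);
  }
}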



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4787) The QueryScorer.getMaxWeight method is not found.

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322586#comment-15322586
 ] 

ASF subversion and git services commented on LUCENE-4787:
-

Commit bd7689b74de0c3201391e1f7d3b254b7cf3513e4 in lucene-solr's branch 
refs/heads/branch_6x from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=bd7689b ]

LUCENE-4787: Fixed some highlighting javadocs.


> The QueryScorer.getMaxWeight method is not found.
> -
>
> Key: LUCENE-4787
> URL: https://issues.apache.org/jira/browse/LUCENE-4787
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/highlighter
>Affects Versions: 4.1
>Reporter: Hao Zhong
>Priority: Critical
> Attachments: LUCENE-4787.patch
>
>
> The following API documents refer to the QueryScorer.getMaxWeight method:
> http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/package-summary.html
> "The QueryScorer.getMaxWeight method is useful when passed to the 
> GradientFormatter constructor to define the top score which is associated 
> with the top color."
> http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/GradientFormatter.html
> "See QueryScorer.getMaxWeight which can be used to calibrate scoring scale"
> However, the QueryScorer class does not declare a getMaxWeight method in 
> Lucene 4.1, according to its documentation:
> http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/QueryScorer.html
> Instead, the class declares a getMaxTermWeight method. Is that the method 
> the preceding two documents intend? If so, please revise those two 
> documents. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4787) The QueryScorer.getMaxWeight method is not found.

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322587#comment-15322587
 ] 

ASF subversion and git services commented on LUCENE-4787:
-

Commit 09fd65b0150364b2e82d1f9ab954751a38653f45 in lucene-solr's branch 
refs/heads/master from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=09fd65b ]

LUCENE-4787: Fixed some highlighting javadocs.


> The QueryScorer.getMaxWeight method is not found.
> -
>
> Key: LUCENE-4787
> URL: https://issues.apache.org/jira/browse/LUCENE-4787
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/highlighter
>Affects Versions: 4.1
>Reporter: Hao Zhong
>Priority: Critical
> Attachments: LUCENE-4787.patch
>
>
> The following API documents refer to the QueryScorer.getMaxWeight method:
> http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/package-summary.html
> "The QueryScorer.getMaxWeight method is useful when passed to the 
> GradientFormatter constructor to define the top score which is associated 
> with the top color."
> http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/GradientFormatter.html
> "See QueryScorer.getMaxWeight which can be used to calibrate scoring scale"
> However, the QueryScorer class does not declare a getMaxWeight method in 
> Lucene 4.1, according to its documentation:
> http://lucene.apache.org/core/4_1_0/highlighter/org/apache/lucene/search/highlight/QueryScorer.html
> Instead, the class declares a getMaxTermWeight method. Is that the method 
> the preceding two documents intend? If so, please revise those two 
> documents. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (LUCENE-5195) 2013-08-30 02:00:01,062 [pool-24-thread-1] ERROR [org.apache.solr.handler.ReplicationHandler] - SnapPull failed

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand closed LUCENE-5195.

Resolution: Invalid

This is very likely not a bug anymore since we now use much smaller buffers 
(either 1K for regular index inputs or 4K for merging).

> 2013-08-30 02:00:01,062 [pool-24-thread-1] ERROR 
> [org.apache.solr.handler.ReplicationHandler] - SnapPull failed 
> 
>
> Key: LUCENE-5195
> URL: https://issues.apache.org/jira/browse/LUCENE-5195
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: richie
>Priority: Critical
>  Labels: SnapPull, failed, masterSlave, sorl
>
> 2013-08-30 02:00:01,062 [pool-24-thread-1] ERROR 
> [org.apache.solr.handler.ReplicationHandler] - SnapPull failed 
> org.apache.solr.common.SolrException: Failed to create temporary config 
> folder: conf.20130830020001
>   at 
> org.apache.solr.handler.SnapPuller.downloadConfFiles(SnapPuller.java:513)
>   at 
> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:299)
>   at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:271)
>   at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at 
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:619)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6662) Resource Leaks

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322533#comment-15322533
 ] 

ASF subversion and git services commented on LUCENE-6662:
-

Commit 3de727786d15257f58eb6c1072f0e6d16f1d4126 in lucene-solr's branch 
refs/heads/master from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3de7277 ]

LUCENE-6662: Fixed typo in the CHANGES entry.


> Resource Leaks
> --
>
> Key: LUCENE-6662
> URL: https://issues.apache.org/jira/browse/LUCENE-6662
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.0
>Reporter: Rishabh Patel
>Priority: Critical
> Fix For: 6.2
>
> Attachments: LUCENE-6662.patch
>
>
> Several resource leaks were identified. I am merging all resource leak issues 
> and creating a single patch as suggested. 
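
The patch itself is attached to the issue rather than quoted here; for 
context, a minimal sketch of the general shape of such a fix in Java, 
assuming Lucene's store API (the class name CloseOnEveryPath and the file 
name are illustrative, and this is not the actual LUCENE-6662 patch):

import java.io.IOException;
import java.nio.file.Path;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;

public class CloseOnEveryPath {
  // Hedged sketch: a leak of this kind is usually fixed by closing the
  // Directory and IndexInput on every path, including exceptional ones,
  // e.g. with try-with-resources.
  public static long readFirstLong(Path indexDir, String fileName)
      throws IOException {
    try (Directory dir = FSDirectory.open(indexDir);
         IndexInput in = dir.openInput(fileName, IOContext.READONCE)) {
      return in.readLong();  // both resources are closed even if this throws
    }
  }
}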



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6662) Resource Leaks

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322532#comment-15322532
 ] 

ASF subversion and git services commented on LUCENE-6662:
-

Commit 7a4565e29896ee59210e7e664d37d833d161932c in lucene-solr's branch 
refs/heads/branch_6x from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7a4565e ]

LUCENE-6662: Fixed typo in the CHANGES entry.


> Resource Leaks
> --
>
> Key: LUCENE-6662
> URL: https://issues.apache.org/jira/browse/LUCENE-6662
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.0
>Reporter: Rishabh Patel
>Priority: Critical
> Fix For: 6.2
>
> Attachments: LUCENE-6662.patch
>
>
> Several resource leaks were identified. I am merging all resource leak issues 
> and creating a single patch as suggested. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-6662) Resource Leaks

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-6662.
--
   Resolution: Fixed
Fix Version/s: (was: 6.0)
   6.2

Merged, thanks!

> Resource Leaks
> --
>
> Key: LUCENE-6662
> URL: https://issues.apache.org/jira/browse/LUCENE-6662
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.0
>Reporter: Rishabh Patel
>Priority: Critical
> Fix For: 6.2
>
> Attachments: LUCENE-6662.patch
>
>
> Several resource leaks were identified. I am merging all resource leak issues 
> and creating a single patch as suggested. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6662) Resource Leaks

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322528#comment-15322528
 ] 

ASF subversion and git services commented on LUCENE-6662:
-

Commit b6c6d5e9ffb2f5d8a8b06ad6269de5d17b312b5f in lucene-solr's branch 
refs/heads/master from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b6c6d5e ]

LUCENE-6662: Fixd potential resource leaks.


> Resource Leaks
> --
>
> Key: LUCENE-6662
> URL: https://issues.apache.org/jira/browse/LUCENE-6662
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.0
>Reporter: Rishabh Patel
>Priority: Critical
> Fix For: 6.0
>
> Attachments: LUCENE-6662.patch
>
>
> Several resource leaks were identified. I am merging all resource leak issues 
> and creating a single patch as suggested. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6662) Resource Leaks

2016-06-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322527#comment-15322527
 ] 

ASF subversion and git services commented on LUCENE-6662:
-

Commit 04b0a459ec08eb869528fd7b6cd2e3cad12c6563 in lucene-solr's branch 
refs/heads/branch_6x from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=04b0a45 ]

LUCENE-6662: Fixd potential resource leaks.


> Resource Leaks
> --
>
> Key: LUCENE-6662
> URL: https://issues.apache.org/jira/browse/LUCENE-6662
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 6.0
>Reporter: Rishabh Patel
>Priority: Critical
> Fix For: 6.0
>
> Attachments: LUCENE-6662.patch
>
>
> Several resource leaks were identified. I am merging all resource leak issues 
> and creating a single patch as suggested. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (LUCENE-5741) IndexWriter.tryDeleteDocument does not work

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand closed LUCENE-5741.

Resolution: Won't Fix

No feedback, so we can't get to the bottom of it.

> IndexWriter.tryDeleteDocument does not work
> ---
>
> Key: LUCENE-5741
> URL: https://issues.apache.org/jira/browse/LUCENE-5741
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.3, 4.5, 4.6, 4.7, 4.8, 4.8.1
>Reporter: Zhuravskiy Vitaliy
>Assignee: Michael McCandless
>Priority: Critical
>
> I am using a "fresh", already-opened reader. 
> There is one segment and 3 documents in the index.
> tryDeleteDocument always returns false. I dug into your code and saw that 
> segmentInfos.indexOf(info)
> always returns -1 because org.apache.lucene.index.SegmentInfoPerCommit does 
> not have an equals method; see the screenshot for more information: 
> http://postimg.org/image/jvtezvqnn/
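
For context, a minimal self-contained sketch of the behaviour described 
above, with SegmentInfoPerCommit replaced by an illustrative class: 
java.util.List.indexOf relies on equals(), so without an override only the 
identical instance is ever found:

import java.util.Arrays;
import java.util.List;

public class IndexOfDemo {
  // Illustrative stand-in for a class that does not override equals().
  static final class InfoWithoutEquals {
    final String name;
    InfoWithoutEquals(String name) { this.name = name; }
  }

  public static void main(String[] args) {
    List<InfoWithoutEquals> infos =
        Arrays.asList(new InfoWithoutEquals("_0"), new InfoWithoutEquals("_1"));
    // A distinct instance describing the same segment:
    InfoWithoutEquals copy = new InfoWithoutEquals("_0");
    System.out.println(infos.indexOf(copy));         // -1: only identity matches
    System.out.println(infos.indexOf(infos.get(0))); // 0: same reference
  }
}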



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (LUCENE-5814) JVM crash When Run Lucene

2016-06-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand closed LUCENE-5814.

Resolution: Won't Fix

This was an old (1.6) VM and we won't do any new 3.6 release anyway.

> JVM crash When Run Lucene
> -
>
> Key: LUCENE-5814
> URL: https://issues.apache.org/jira/browse/LUCENE-5814
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 3.6
> Environment: JVM Info:
> java version "1.6.0_43"
> Java(TM) SE Runtime Environment (build 1.6.0_43-b01)
> Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)
>Reporter: 杨维云
>Priority: Critical
>
> The JVM crashes when running Lucene in the following scenario:
> 1. Two Lucene servers, A and B, both running Linux. 
> 2. The Lucene index files live on server A; we mount A's index path onto B 
> via NFS.
> 3. We run another program on A to refresh the index.
> 4. The Lucene program on B always crashes, but A does not.
> When B crashes it produces a JVM crash report named hs_err_pid19495.log with 
> the following content:
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGBUS (0x7) at pc=0x2b0787a0, pid=6830, tid=1129146688
> #
> # JRE version: 6.0_43-b01
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.14-b01 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # v  ~StubRoutines::jshort_disjoint_arraycopy
> #
> # If you would like to submit a bug report, please visit:
> #   http://java.sun.com/webapps/bugreport/crash.jsp
> #
> ---  T H R E A D  ---
> Current thread (0x54a49800):  JavaThread "RMI TCP 
> Connection(722)-192.168.251.56" daemon [_thread_in_Java, id=7084, 
> stack(0x433d6000,0x434d7000)]
> siginfo:si_signo=SIGBUS: si_errno=0, si_code=2 (BUS_ADRERR), 
> si_addr=0x2aaab8dbd6f6
> Registers:
> RAX=0x2aaab8dc1de8, RBX=0x, RCX=0x2379, 
> RDX=0xf726
> RSP=0x434d4b00, RBP=0x434d4b00, RSI=0x00079860fb40, 
> RDI=0x2aaab8dc1dde
> R8 =0x2aaab8dbd6f6, R9 =0x46f2, R10=0x2b078d80, 
> R11=0x000797f9c420
> R12=0x, R13=0x4702, R14=0x2aaab8dc1de8, 
> R15=0x54a49800
> RIP=0x2b0787a0, EFLAGS=0x00010282, CSGSFS=0x0033, 
> ERR=0x0004
>   TRAPNO=0x000e
> Top of Stack: (sp=0x434d4b00)
> 0x434d4b00:   46f6 2b0cbc74
> 0x434d4b10:   00079860b448 46f2
> 0x434d4b20:   000797f9c420 000797f9c998
> 0x434d4b30:   000797f7ee70 000797f480f8
> 0x434d4b40:   54a49800 000797f49c98
> 0x434d4b50:   000797f7cc08 
> 0x434d4b60:   00079860b3a8 434d4ae0
> 0x434d4b70:   000797f9cb70 2b3f7f90
> 0x434d4b80:   f2ff388446f2 000797f9c420
> 0x434d4b90:   00079860b288 000797f9cb70
> 0x434d4ba0:   f2fefd649860b288 00079858c680
> 0x434d4bb0:   54a49800 0004
> 0x434d4bc0:   006296bf6801 2b2f861c
> 0x434d4bd0:   d8c07bb6 2b3a3094
> 0x434d4be0:   00079860b408 00079860b408
> 0x434d4bf0:   00079860b288 000798586b98
> 0x434d4c00:   000797f7cc08 000797f7cc08
> 0x434d4c10:   00079858c190 000798586b38
> 0x434d4c20:   000797f9daa8 2b1cfa70
> 0x434d4c30:   00079858ccf8 
> 0x434d4c40:   000798586b98 000797f9d9f0
> 0x434d4c50:   00079858ccf8 000798586b98
> 0x434d4c60:   d82e8229 00079858ccf8
> 0x434d4c70:   0007f2fe8a48 000797f45240
> 0x434d4c80:   000798586b98 2b319440
> 0x434d4c90:   00079858ccf8 f3095552d8c32f35
> 0x434d4ca0:    2b301b6c
> 0x434d4cb0:    2b306b4c
> 0x434d4cc0:   000797f9d9f0 0007d82e8229
> 0x434d4cd0:   000746fa 0006c6041a00
> 0x434d4ce0:   0007984aaa70 045dab416ca4
> 0x434d4cf0:   000186a0 2b29f730 
> Instructions: (pc=0x2b0787a0)
> 0x2b078780:   c6 04 f7 c1 01 00 00 00 74 08 66 8b 47 08 66 89
> 0x2b078790:   46 08 48 33 c0 c9 c3 66 0f 1f 84 00 00 00 00 00
> 0x2b0787a0:   48 8b 44 d7 e8 48 89 44 d6 e8 48 8b 44 d7 f0 48
> 0x2b0787b0:   89 44 d6 f0 48 8b 44 d7 f8 48 89 44 d6 f8 48 8b 
> Register to memory mapping:
> RAX=0x2aaab8dc1de8 is an unknown value
> RBX=0x is an unknown value
> RCX=0x2379 is an unknown value
> RDX=0xf726 is an unknown value
> RSP=0x434d4b00 is pointing into 
