[JENKINS] Lucene-Solr-master-Solaris (64bit/jdk1.8.0) - Build # 919 - Still Unstable!

2016-10-21 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Solaris/919/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseParallelGC

2 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.core.HdfsDirectoryFactoryTest

Error Message:
Suite timeout exceeded (>= 720 msec).

Stack Trace:
java.lang.Exception: Suite timeout exceeded (>= 720 msec).
at __randomizedtesting.SeedInfo.seed([99E5F7D6E1DA3D1F]:0)


FAILED:  org.apache.solr.cloud.TestSolrCloudWithDelegationTokens.testDelegationTokenCancelFail

Error Message:
expected:<200> but was:<404>

Stack Trace:
java.lang.AssertionError: expected:<200> but was:<404>
at __randomizedtesting.SeedInfo.seed([99E5F7D6E1DA3D1F:F15AC2FC31402FF3]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at org.apache.solr.cloud.TestSolrCloudWithDelegationTokens.cancelDelegationToken(TestSolrCloudWithDelegationTokens.java:140)
at org.apache.solr.cloud.TestSolrCloudWithDelegationTokens.testDelegationTokenCancelFail(TestSolrCloudWithDelegationTokens.java:294)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
   

[jira] [Commented] (SOLR-2216) Highlighter query exceeds maxBooleanClause limit due to range query

2016-10-21 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15597068#comment-15597068
 ] 

Cao Manh Dat commented on SOLR-2216:


Currently, I do not think this is a problem in Solr, because the code that 
creates the QueryScorer for highlighting is quite simple:
{code}
QueryScorer scorer = new QueryScorer(query,
   hl.requireFieldMatch == true ? fieldName : null);
{code}
and
{code}
if (reqFieldMatch) {
  return new QueryTermScorer(query, request.getSearcher().getIndexReader(), fieldName);
} else {
  return new QueryTermScorer(query);
}
{code}

I think it is related to how Lucene rewrites the query; I will try to dig deeper 
into the Lucene code and figure out the changes.
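To make the suspected mechanism concrete, here is a small self-contained sketch (illustrative only, not Lucene's actual rewrite code): scoring-oriented rewrites can expand a range query into one boolean clause per matching indexed term, so the clause count tracks the number of terms in the range rather than the textual size of the query.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a scoring rewrite (illustrative, not Lucene code):
// a range over a date-like field expands into one clause per indexed term.
public class RangeRewriteModel {
    public static void main(String[] args) {
        final int maxBooleanClauses = 1024; // Solr's default limit
        List<String> clauses = new ArrayList<>();
        // Suppose the index holds one distinct term per day over several years.
        for (int day = 0; day < 2000; day++) {
            clauses.add("publish_date:day_" + day);
        }
        // With more matching terms than maxBooleanClauses, building the
        // expanded BooleanQuery would throw a TooManyClauses exception.
        System.out.println(clauses.size() + " clauses, limit " + maxBooleanClauses);
    }
}
```

With a per-day field, a multi-year range easily yields more terms than the 1024-clause default, which matches the TooManyClauses symptom reported here even when the highlighted field is not the range field.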

> Highlighter query exceeds maxBooleanClause limit due to range query
> ---
>
> Key: SOLR-2216
> URL: https://issues.apache.org/jira/browse/SOLR-2216
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 1.4.1
> Environment: Linux solr-2.bizjournals.int 2.6.18-194.3.1.el5 #1 SMP 
> Thu May 13 13:08:30 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
> java version "1.6.0_21"
> Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)
> JAVA_OPTS="-client -Dcom.sun.management.jmxremote=true 
> -Dcom.sun.management.jmxremote.port= 
> -Dcom.sun.management.jmxremote.authenticate=true 
> -Dcom.sun.management.jmxremote.access.file=/root/.jmxaccess 
> -Dcom.sun.management.jmxremote.password.file=/root/.jmxpasswd 
> -Dcom.sun.management.jmxremote.ssl=false -XX:+UseCompressedOops 
> -XX:MaxPermSize=512M -Xms10240M -Xmx15360M -XX:+UseParallelGC 
> -XX:+AggressiveOpts -XX:NewRatio=5"
> top - 11:38:49 up 124 days, 22:37,  1 user,  load average: 5.20, 4.35, 3.90
> Tasks: 220 total,   1 running, 219 sleeping,   0 stopped,   0 zombie
> Cpu(s): 47.5%us,  2.9%sy,  0.0%ni, 49.5%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:  24679008k total, 18179980k used,  6499028k free,   125424k buffers
> Swap: 26738680k total,29276k used, 26709404k free,  8187444k cached
>Reporter: Ken Stanley
>
> For a full detail of the issue, please see the mailing list: 
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201011.mbox/%3CAANLkTimE8z8yOni+u0Nsbgct1=ef7e+su0_waku2c...@mail.gmail.com%3E
> The nutshell version of the issue is that when I have a query that contains 
> ranges on a specific (non-highlighted) field, the highlighter component is 
> attempting to create a query that exceeds the value of maxBooleanClauses set 
> from solrconfig.xml. This is despite my explicit setting of hl.field, 
> hl.requireFieldMatch, and various other highlight options in the query. 
> As suggested by Koji in the follow-up response, I removed the range queries 
> from my main query, and Solr and highlighting were happy to fulfill my 
> request. It was suggested that if removing the range queries worked that this 
> might potentially be a bug, hence my filing this JIRA ticket. For what it is 
> worth, if I move my range queries into an fq, I do not get the exception 
> about exceeding maxBooleanClauses, and I get the effect that I was looking 
> for. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module

2016-10-21 Thread Lance Norskog (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596885#comment-15596885
 ] 

Lance Norskog commented on LUCENE-2899:
---

I don't remember if it's always or just seldom. It was just something I noticed 
when testing them. I'm not an NLP researcher, and I've been out of the Solr 
world for years. It sounds like Joern Kottman knows his way around this stuff.


> Add OpenNLP Analysis capabilities as a module
> -
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/analysis
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.9, 6.0
>
> Attachments: LUCENE-2899-6.1.0.patch, LUCENE-2899-RJN.patch, 
> LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, 
> LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice 
> to have a submodule (under analysis) that exposed capabilities for it. Drew 
> Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it 
> would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as 
> either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp






[jira] [Commented] (SOLR-9610) New AssertTool in SolrCLI

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596782#comment-15596782
 ] 

ASF subversion and git services commented on SOLR-9610:
---

Commit 3488f12170a6b035391fda719ce69380dc4b2882 in lucene-solr's branch 
refs/heads/master from [~janhoy]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3488f12 ]

SOLR-9610: Bugfix option key for assertTool


> New AssertTool in SolrCLI
> -
>
> Key: SOLR-9610
> URL: https://issues.apache.org/jira/browse/SOLR-9610
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
> Fix For: 6.3, master (7.0)
>
> Attachments: SOLR-9610.patch, SOLR-9610.patch
>
>
> Moving some code from SOLR-7826 over here. This is a new AssertTool which can 
> be used to assert that we are (not) the root user, and more. Usage:
> {noformat}
> usage: bin/solr assert [-m ] [-e] [-rR] [-s ] [-S ] [-u
> ] [-x ] [-X ]
>  -e,--exitcode Return an exit code instead of printing
>error message on assert fail.
>  -help Print this message
>  -m,--message Exception message to be used in place of
>the default error message
>  -R,--not-root Asserts that we are NOT the root user
>  -r,--root Asserts that we are the root user
>  -S,--not-started Asserts that Solr is NOT started on a
>certain URL
>  -s,--started Asserts that Solr is started on a certain
>URL
>  -u,--same-user Asserts that we run as same user that owns
>
>  -x,--existsAsserts that directory  exists
>  -X,--not-existsAsserts that directory  does NOT exist
> {noformat}
> This can then also be used from bin/solr through e.g. {{run_tool assert -r}}, 
> or from Java code via static methods such as 
> {{AssertTool.assertSolrRunning(String url)}}






[jira] [Commented] (SOLR-9371) Fix bin/solr script calculations - start/stop wait time and RMI_PORT

2016-10-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-9371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596777#comment-15596777
 ] 

Jan Høydahl commented on SOLR-9371:
---

Perhaps you can use the new AssertTool from the Windows script?
{code}
call :run_assert -e -S http://localhost:8983/solr/
IF errorlevel 1 
{code}

> Fix bin/solr script calculations - start/stop wait time and RMI_PORT
> 
>
> Key: SOLR-9371
> URL: https://issues.apache.org/jira/browse/SOLR-9371
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Affects Versions: 6.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Minor
> Fix For: 6.2, master (7.0)
>
> Attachments: SOLR-9371.patch, SOLR-9371.patch
>
>
> The bin/solr script doesn't wait long enough for Solr to stop before it sends 
> the KILL signal to the process.  The start could use a longer wait too.
> Also, the RMI_PORT is calculated by simply prefixing the port number with a 
> "1" instead of adding 1.  If the solr port has five digits, then the rmi 
> port will be invalid, because it will be greater than 65535.
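The two calculations can be sketched as follows (a hedged illustration; the +10000 offset stands in for an arithmetic fix and is not necessarily what the committed patch uses):

```java
// Contrast of the buggy string-prefix RMI port calculation with an
// arithmetic alternative (offset is illustrative only).
public class RmiPortCalc {
    public static void main(String[] args) {
        int solrPort = 18983; // a five-digit Solr port
        // Buggy: prefix the port with "1" as a string, as the script did.
        int buggy = Integer.parseInt("1" + solrPort);
        System.out.println("prefix: " + buggy); // 118983, above the 65535 TCP limit
        // Arithmetic instead; note a real fix would also need to keep the
        // result below 65535 for very high ports.
        int fixed = solrPort + 10000;
        System.out.println("add:    " + fixed); // 28983, a valid port
    }
}
```

For four-digit ports the two happen to agree (8983 becomes 18983 either way), which is presumably why the string prefix went unnoticed until a five-digit port was used.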






[jira] [Resolved] (SOLR-9662) New parameter -u in bin/post

2016-10-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-9662.
---
Resolution: Fixed

> New parameter -u  in bin/post
> 
>
> Key: SOLR-9662
> URL: https://issues.apache.org/jira/browse/SOLR-9662
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
> Fix For: 6.3, master (7.0)
>
> Attachments: SOLR-9662.patch
>
>
> This issue will add a new argument to bin/post:
> {noformat}
> -u or -user  (sets Basic Authentication)
> {noformat}
> Passing the param will set system property {{basicauth}}, which will be 
> picked up and used by SimplePostTool
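A hedged sketch of how a client tool could consume such a property (the {{basicauth}} property name comes from the issue; the Base64 header construction below is generic HTTP Basic authentication, not necessarily SimplePostTool's actual code):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Demonstrates a "basicauth" system property (as set by a -u flag on the
// launching script) being turned into an HTTP Basic Authorization header.
public class BasicAuthProperty {
    public static void main(String[] args) {
        // bin/post -u user:pass would set -Dbasicauth=user:pass on the JVM;
        // simulate that here with hypothetical credentials.
        System.setProperty("basicauth", "solruser:secret");
        String creds = System.getProperty("basicauth");
        if (creds != null) {
            // Standard HTTP Basic scheme: "Basic " + base64("user:pass")
            String header = "Basic " + Base64.getEncoder()
                    .encodeToString(creds.getBytes(StandardCharsets.UTF_8));
            System.out.println("Authorization: " + header);
        }
    }
}
```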






[jira] [Commented] (SOLR-9662) New parameter -u in bin/post

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596735#comment-15596735
 ] 

ASF subversion and git services commented on SOLR-9662:
---

Commit d18666b336f073df3c33faf60f48a9261e6985f8 in lucene-solr's branch 
refs/heads/branch_6x from [~janhoy]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d18666b ]

SOLR-9662: New parameter -u  in bin/post to pass basicauth 
credentials

(cherry picked from commit e3a8a0f)


> New parameter -u  in bin/post
> 
>
> Key: SOLR-9662
> URL: https://issues.apache.org/jira/browse/SOLR-9662
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
> Fix For: 6.3, master (7.0)
>
> Attachments: SOLR-9662.patch
>
>
> This issue will add a new argument to bin/post:
> {noformat}
> -u or -user  (sets Basic Authentication)
> {noformat}
> Passing the param will set system property {{basicauth}}, which will be 
> picked up and used by SimplePostTool






[jira] [Commented] (SOLR-9662) New parameter -u in bin/post

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596732#comment-15596732
 ] 

ASF subversion and git services commented on SOLR-9662:
---

Commit e3a8a0fe5f7ebff46509f51f9d490a5c801626ba in lucene-solr's branch 
refs/heads/master from [~janhoy]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e3a8a0f ]

SOLR-9662: New parameter -u  in bin/post to pass basicauth 
credentials


> New parameter -u  in bin/post
> 
>
> Key: SOLR-9662
> URL: https://issues.apache.org/jira/browse/SOLR-9662
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
> Attachments: SOLR-9662.patch
>
>
> This issue will add a new argument to bin/post:
> {noformat}
> -u or -user  (sets Basic Authentication)
> {noformat}
> Passing the param will set system property {{basicauth}}, which will be 
> picked up and used by SimplePostTool






[jira] [Updated] (SOLR-9670) Support SOLR_AUTHENTICATION_OPTS in solr.cmd

2016-10-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-9670:
--
Attachment: SOLR-9670.patch

Patch

> Support SOLR_AUTHENTICATION_OPTS in solr.cmd
> 
>
> Key: SOLR-9670
> URL: https://issues.apache.org/jira/browse/SOLR-9670
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>  Labels: authentication, security
> Attachments: SOLR-9670.patch
>
>
> Add support for SOLR_AUTHENTICATION_OPTS for basic authentication in solr.cmd 
> and solr.in.cmd






[jira] [Commented] (SOLR-7850) Move user customization out of solr.in.* scripts

2016-10-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596662#comment-15596662
 ] 

Jan Høydahl commented on SOLR-7850:
---

Minor edit to refGuide 
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=50856198=51=50

> Move user customization out of solr.in.* scripts
> 
>
> Key: SOLR-7850
> URL: https://issues.apache.org/jira/browse/SOLR-7850
> Project: Solr
>  Issue Type: Improvement
>  Components: scripts and tools
>Affects Versions: 5.2.1
>Reporter: Shawn Heisey
>Assignee: David Smiley
>Priority: Minor
> Fix For: 6.3
>
> Attachments: 
> SOLR_7850_move_bin_solr_in_sh_defaults_into_bin_solr.patch, 
> SOLR_7850_move_bin_solr_in_sh_defaults_into_bin_solr.patch
>
>
> I've seen a fair number of users customizing solr.in.* scripts to make 
> changes to their Solr installs.  I think the documentation suggests this, 
> though I haven't confirmed.
> One possible problem with this is that we might make changes in those scripts 
> which such a user would want in their setup, but if they replace the script 
> with the one in the new version, they will lose their customizations.
> I propose instead that we have the startup script look for and utilize a user 
> customization script, in a similar manner to linux init scripts that look for 
> /etc/default/packagename, but are able to function without it.  I'm not 
> entirely sure where the script should live or what it should be called.  One 
> idea is server/etc/userconfig.\{sh,cmd\} ... but I haven't put a lot of 
> thought into it yet.
> If the internal behavior of our scripts is largely replaced by a small java 
> app as detailed in SOLR-7043, then the same thing should apply there -- have 
> a config file for a user to specify settings, but work perfectly if that 
> config file is absent.






[jira] [Commented] (SOLR-9325) solr.log written to {solrRoot}/server/logs instead of location specified by SOLR_LOGS_DIR

2016-10-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-9325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596639#comment-15596639
 ] 

Jan Høydahl commented on SOLR-9325:
---

Documented in RefGuide
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=32604193=26=25
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=50856198=50=49


> solr.log written to {solrRoot}/server/logs instead of location specified by 
> SOLR_LOGS_DIR
> -
>
> Key: SOLR-9325
> URL: https://issues.apache.org/jira/browse/SOLR-9325
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: logging
>Affects Versions: 5.5.2, 6.0.1
> Environment: 64-bit CentOS 7 with latest patches, JVM 1.8.0.92
>Reporter: Tim Parker
>Assignee: Jan Høydahl
> Fix For: 6.3, master (7.0)
>
> Attachments: SOLR-9325-installscript.patch, SOLR-9325.patch, 
> SOLR-9325.patch, SOLR-9325.patch
>
>
> (6.1 is probably also affected, but we've been blocked by SOLR-9231)
> solr.log should be written to the directory specified by the SOLR_LOGS_DIR 
> environment variable, but instead it's written to {solrRoot}/server/logs.
> This results in requiring that solr is installed on a writable device, which 
> leads to two problems:
> 1) solr installation can't live on a shared device (single copy shared by two 
> or more VMs)
> 2) solr installation is more difficult to lock down
> Solr should be able to run without error in this test scenario:
> burn the Solr directory tree onto a CD-ROM
> Mount this CD as /solr
> run Solr from there (with appropriate environment variables set, of course)






[jira] [Commented] (SOLR-9325) solr.log written to {solrRoot}/server/logs instead of location specified by SOLR_LOGS_DIR

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596617#comment-15596617
 ] 

ASF subversion and git services commented on SOLR-9325:
---

Commit c9cf0eff03763d151a04baccb5530445d5d5feb5 in lucene-solr's branch 
refs/heads/master from [~janhoy]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c9cf0ef ]

SOLR-9325: Remove unnecessary search/replace in installer script


> solr.log written to {solrRoot}/server/logs instead of location specified by 
> SOLR_LOGS_DIR
> -
>
> Key: SOLR-9325
> URL: https://issues.apache.org/jira/browse/SOLR-9325
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: logging
>Affects Versions: 5.5.2, 6.0.1
> Environment: 64-bit CentOS 7 with latest patches, JVM 1.8.0.92
>Reporter: Tim Parker
>Assignee: Jan Høydahl
> Fix For: 6.3, master (7.0)
>
> Attachments: SOLR-9325-installscript.patch, SOLR-9325.patch, 
> SOLR-9325.patch, SOLR-9325.patch
>
>
> (6.1 is probably also affected, but we've been blocked by SOLR-9231)
> solr.log should be written to the directory specified by the SOLR_LOGS_DIR 
> environment variable, but instead it's written to {solrRoot}/server/logs.
> This results in requiring that solr is installed on a writable device, which 
> leads to two problems:
> 1) solr installation can't live on a shared device (single copy shared by two 
> or more VMs)
> 2) solr installation is more difficult to lock down
> Solr should be able to run without error in this test scenario:
> burn the Solr directory tree onto a CD-ROM
> Mount this CD as /solr
> run Solr from there (with appropriate environment variables set, of course)






[jira] [Updated] (SOLR-9325) solr.log written to {solrRoot}/server/logs instead of location specified by SOLR_LOGS_DIR

2016-10-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-9325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-9325:
--
Attachment: SOLR-9325-installscript.patch

The {{bin/install_solr_service.sh}} script does a search/replace to hardcode 
solr.log in log4j.properties. This is no longer necessary, and it also breaks if 
people later modify {{solr.in.sh}}; here is a patch 
(SOLR-9325-installscript.patch).

> solr.log written to {solrRoot}/server/logs instead of location specified by 
> SOLR_LOGS_DIR
> -
>
> Key: SOLR-9325
> URL: https://issues.apache.org/jira/browse/SOLR-9325
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: logging
>Affects Versions: 5.5.2, 6.0.1
> Environment: 64-bit CentOS 7 with latest patches, JVM 1.8.0.92
>Reporter: Tim Parker
>Assignee: Jan Høydahl
> Fix For: 6.3, master (7.0)
>
> Attachments: SOLR-9325-installscript.patch, SOLR-9325.patch, 
> SOLR-9325.patch, SOLR-9325.patch
>
>
> (6.1 is probably also affected, but we've been blocked by SOLR-9231)
> solr.log should be written to the directory specified by the SOLR_LOGS_DIR 
> environment variable, but instead it's written to {solrRoot}/server/logs.
> This results in requiring that solr is installed on a writable device, which 
> leads to two problems:
> 1) solr installation can't live on a shared device (single copy shared by two 
> or more VMs)
> 2) solr installation is more difficult to lock down
> Solr should be able to run without error in this test scenario:
> burn the Solr directory tree onto a CD-ROM
> Mount this CD as /solr
> run Solr from there (with appropriate environment variables set, of course)






[jira] [Commented] (SOLR-9533) Reload core config when a core is reloaded

2016-10-21 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596545#comment-15596545
 ] 

Joel Bernstein commented on SOLR-9533:
--

I've been looking for a SolrCloud hook for solrcore.properties, but there does 
not appear to be one. I suspect this is by design, as it's called an *external* 
properties file in the documentation.


> Reload core config when a core is reloaded
> --
>
> Key: SOLR-9533
> URL: https://issues.apache.org/jira/browse/SOLR-9533
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.2
>Reporter: Gethin James
>Assignee: Joel Bernstein
> Attachments: SOLR-9533.patch, SOLR-9533.patch
>
>
> I am reloading a core using {{coreContainer.reload(coreName)}}.  However it 
> doesn't seem to reload the configuration.  I have changed solrcore.properties 
> on the file system but the change doesn't get picked up.
> The coreContainer.reload method seems to call:
> {code}
> CoreDescriptor cd = core.getCoreDescriptor();
> {code}
> I can't see a way to reload CoreDescriptor, so it isn't picking up my 
> changes.  It simply reuses the existing CoreDescriptor.
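The reported behavior can be modeled with a small self-contained sketch (hypothetical classes, not Solr's actual implementation): if reload reuses the cached descriptor instead of re-reading the properties file, on-disk edits are never seen.

```java
import java.util.HashMap;
import java.util.Map;

// Hedged model of the bug report: reload() reuses the descriptor that was
// built at initial load time, so later edits to the properties are ignored.
public class ReloadModel {
    static class CoreDescriptor {
        final Map<String, String> props;
        CoreDescriptor(Map<String, String> props) {
            this.props = new HashMap<>(props); // snapshot taken at load time
        }
    }

    static Map<String, String> onDisk = new HashMap<>(); // stands in for solrcore.properties

    public static void main(String[] args) {
        onDisk.put("data.dir", "/old");
        CoreDescriptor cached = new CoreDescriptor(onDisk); // initial core load
        onDisk.put("data.dir", "/new");                     // user edits the file
        CoreDescriptor reloaded = cached;                   // reload() reuses it
        System.out.println(reloaded.props.get("data.dir")); // prints /old
    }
}
```

A fix in this model would be for reload to construct a fresh descriptor from the on-disk properties rather than handing back the cached one.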






[jira] [Updated] (SOLR-9255) Rename SOLR_AUTHENTICATION_CLIENT_CONFIGURER -> SOLR_AUTHENTICATION_CLIENT_BUILDER

2016-10-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-9255:
--
Summary: Rename SOLR_AUTHENTICATION_CLIENT_CONFIGURER -> 
SOLR_AUTHENTICATION_CLIENT_BUILDER  (was: Start Script Basic Authentication)

> Rename SOLR_AUTHENTICATION_CLIENT_CONFIGURER -> 
> SOLR_AUTHENTICATION_CLIENT_BUILDER
> --
>
> Key: SOLR-9255
> URL: https://issues.apache.org/jira/browse/SOLR-9255
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication
>Affects Versions: master (7.0)
>Reporter: Martin Löper
>Assignee: Jan Høydahl
> Fix For: master (7.0)
>
> Attachments: SOLR-9255.patch, SOLR-9255.patch
>
>
> I configured SSL and Basic Authentication with rule-based authorization.
> I noticed that since the latest changes from 6.0.1 to 6.1.0 I cannot pass the 
> Basic Authentication credentials to the Solr start script anymore. For the 
> previous release I did this via the bin/solr.in.sh shell script.
> What has happened with the SOLR_AUTHENTICATION_CLIENT_CONFIGURER and 
> SOLR_AUTHENTICATION_OPTS parameters? Are they still in use or is there a new 
> way to pass basic auth credentials on the command-line?






[jira] [Assigned] (SOLR-9670) Support SOLR_AUTHENTICATION_OPTS in solr.cmd

2016-10-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-9670:
-

Assignee: Jan Høydahl

> Support SOLR_AUTHENTICATION_OPTS in solr.cmd
> 
>
> Key: SOLR-9670
> URL: https://issues.apache.org/jira/browse/SOLR-9670
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>  Labels: authentication, security
>
> Add support for SOLR_AUTHENTICATION_OPTS for basic authentication in solr.cmd 
> and solr.in.cmd






[jira] [Commented] (SOLR-9533) Reload core config when a core is reloaded

2016-10-21 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596341#comment-15596341
 ] 

Joel Bernstein commented on SOLR-9533:
--

Ok, thanks!

> Reload core config when a core is reloaded
> --
>
> Key: SOLR-9533
> URL: https://issues.apache.org/jira/browse/SOLR-9533
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.2
>Reporter: Gethin James
>Assignee: Joel Bernstein
> Attachments: SOLR-9533.patch, SOLR-9533.patch
>
>
> I am reloading a core using {{coreContainer.reload(coreName)}}.  However it 
> doesn't seem to reload the configuration.  I have changed solrcore.properties 
> on the file system but the change doesn't get picked up.
> The coreContainer.reload method seems to call:
> {code}
> CoreDescriptor cd = core.getCoreDescriptor();
> {code}
> I can't see a way to reload CoreDescriptor, so it isn't picking up my 
> changes.  It simply reuses the existing CoreDescriptor.






[jira] [Updated] (SOLR-9533) Reload core config when a core is reloaded

2016-10-21 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-9533:
-
Attachment: SOLR-9533.patch

Added a simple test case

> Reload core config when a core is reloaded
> --
>
> Key: SOLR-9533
> URL: https://issues.apache.org/jira/browse/SOLR-9533
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.2
>Reporter: Gethin James
>Assignee: Joel Bernstein
> Attachments: SOLR-9533.patch, SOLR-9533.patch
>
>
> I am reloading a core using {{coreContainer.reload(coreName)}}.  However it 
> doesn't seem to reload the configuration.  I have changed solrcore.properties 
> on the file system but the change doesn't get picked up.
> The coreContainer.reload method seems to call:
> {code}
> CoreDescriptor cd = core.getCoreDescriptor();
> {code}
> I can't see a way to reload CoreDescriptor, so it isn't picking up my 
> changes.  It simply reuses the existing CoreDescriptor.






[jira] [Commented] (SOLR-7506) Roll over GC logs by default via bin/solr scripts

2016-10-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596281#comment-15596281
 ] 

Jan Høydahl commented on SOLR-7506:
---

Tested on windows 10. Planning to commit during weekend.

> Roll over GC logs by default via bin/solr scripts
> -
>
> Key: SOLR-7506
> URL: https://issues.apache.org/jira/browse/SOLR-7506
> Project: Solr
>  Issue Type: Improvement
>  Components: scripts and tools
>Reporter: Shalin Shekhar Mangar
>Assignee: Jan Høydahl
>Priority: Minor
>  Labels: logging
> Fix For: 6.3, master (7.0)
>
> Attachments: SOLR-7506.patch, SOLR-7506.patch
>
>
> The Oracle JDK supports rolling over GC logs. I propose to add the following 
> to the solr.in.{sh,cmd} scripts to enable it by default:
> {code}
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=20M
> {code}
> Unfortunately, the JDK doesn't have any option to append to an existing log 
> instead of overwriting, so the latest log is overwritten. Maybe we can have 
> the bin/solr script roll that over after the process is killed?






[jira] [Commented] (SOLR-9506) cache IndexFingerprint for each segment

2016-10-21 Thread Pushkar Raste (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596148#comment-15596148
 ] 

Pushkar Raste commented on SOLR-9506:
-

Don't use the patch for parallelized computation. Parallel streams use a shared 
fork-join pool. A bad actor can create havoc.
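The shared-pool concern can be sketched in plain Java: a parallel stream normally runs on `ForkJoinPool.commonPool()`, which every parallel stream in the JVM shares, so long-running tasks there can starve unrelated work. Submitting the stream from inside a dedicated pool is a known isolation idiom; this is a generic sketch, not code from the patch.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

public class IsolatedParallelSum {
    public static void main(String[] args) throws Exception {
        // Work forked from a task running inside this pool stays in this pool,
        // instead of landing on the JVM-wide common fork-join pool.
        ForkJoinPool pool = new ForkJoinPool(4);
        long sum = pool.submit(
                () -> LongStream.rangeClosed(1, 1000).parallel().sum()
        ).get();
        System.out.println(sum); // prints 500500
        pool.shutdown();
    }
}
```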

> cache IndexFingerprint for each segment
> ---
>
> Key: SOLR-9506
> URL: https://issues.apache.org/jira/browse/SOLR-9506
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
> Attachments: SOLR-9506.patch, SOLR-9506.patch, SOLR-9506.patch, 
> SOLR-9506.patch, SOLR-9506_POC.patch
>
>
> The IndexFingerprint is cached per index searcher. It is quite useless during 
> high-throughput indexing. If the fingerprint is cached per segment, it will 
> make it vastly more efficient to compute the fingerprint.






[jira] [Commented] (SOLR-9678) Group sorting by numFound

2016-10-21 Thread Awais Malik (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596010#comment-15596010
 ] 

Awais Malik commented on SOLR-9678:
---

Please guide me. Thanks.

> Group sorting by numFound
> -
>
> Key: SOLR-9678
> URL: https://issues.apache.org/jira/browse/SOLR-9678
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Awais Malik
>
> Right now in 6.1.2 there doesn't seem to be a way to sort documents by 
> numFound when grouped by a field. We're looking for this feature and can't 
> implement it since numFound isn't a valid field. We've looked for 
> alternatives but haven't found any. Please include this feature in the next 
> release or direct us towards an alternative.






[jira] [Commented] (SOLR-9678) Group sorting by numFound

2016-10-21 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595950#comment-15595950
 ] 

Erick Erickson commented on SOLR-9678:
--

Patches are always welcome. By their nature, Open Source projects are driven by 
user needs (thus contributions). If you have a great enough need to invest the 
time and energy in a patch we'd be glad to guide you in its creation and, if 
accepted by the community, add it to the code base.

> Group sorting by numFound
> -
>
> Key: SOLR-9678
> URL: https://issues.apache.org/jira/browse/SOLR-9678
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Awais Malik
>
> Right now in 6.1.2 there doesn't seem to be a way to sort documents by 
> numFound when grouped by a field. We're looking for this feature and can't 
> implement it since numFound isn't a valid field. We've looked for 
> alternatives but haven't found any. Please include this feature in the next 
> release or direct us towards an alternative.






[jira] [Created] (SOLR-9678) Group sorting by numFound

2016-10-21 Thread Awais Malik (JIRA)
Awais Malik created SOLR-9678:
-

 Summary: Group sorting by numFound
 Key: SOLR-9678
 URL: https://issues.apache.org/jira/browse/SOLR-9678
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Awais Malik


Right now in 6.1.2 there doesn't seem to be a way to sort documents by numFound 
when grouped by a field. We're looking for this feature and can't implement it 
since numFound isn't a valid field. We've looked for alternatives but haven't 
found any. Please include this feature in the next release or direct us towards 
an alternative.






[jira] [Resolved] (SOLR-9676) FastVectorHighligher log message could be improved

2016-10-21 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved SOLR-9676.

   Resolution: Fixed
Fix Version/s: 6.3

Sure; thanks Mike.

BTW, please change your name in JIRA... there are a lot of "Mike"s.

> FastVectorHighligher log message could be improved
> --
>
> Key: SOLR-9676
> URL: https://issues.apache.org/jira/browse/SOLR-9676
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: highlighter
>Affects Versions: 4.10.4
>Reporter: Mike
>Assignee: David Smiley
>Priority: Minor
> Fix For: 6.3
>
>
> If you try to use the FastVectorHighlighter on a field that doesn't have 
> TermPositions and TermOffsets enabled, you get an ok error message:
> {{WARN  org.apache.solr.highlight.DefaultSolrHighlighter  – Solr will use 
> Highlighter instead of FastVectorHighlighter because assignedTo field does 
> not store TermPositions and TermOffsets.}}
> If you heed that message, and dutifully add TermPositions and TermOffsets to 
> your schema, you get a crashing message that says:
> {code:none}
> Blah, blah, stacktrace
> 
> Caused by: java.lang.IllegalArgumentException: cannot index term vector 
> offsets when term vectors are not indexed (field="court")
> ...
> {code}
> Can we update the first message to say:
> {{Solr will use Highlighter instead of FastVectorHighlighter because 
> assignedTo field does not store TermPositions, TermOffsets, and TermVectors.}}
> That'd save at least one headache next time I screw this up...
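For reference, declaring all three term-vector components on a field looks like this in schema.xml (the field and type names here are illustrative):

```xml
<!-- FastVectorHighlighter needs term vectors plus positions and offsets. -->
<field name="court" type="text_general" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```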






[jira] [Commented] (SOLR-9676) FastVectorHighligher log message could be improved

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595846#comment-15595846
 ] 

ASF subversion and git services commented on SOLR-9676:
---

Commit e744feeeb9796ace48adeb5cb63c8116317c07b0 in lucene-solr's branch 
refs/heads/branch_6x from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e744fee ]

SOLR-9676: DefaultSolrHighlighter: clarify warning when FVH can't be used

(cherry picked from commit 91f58ac)


> FastVectorHighligher log message could be improved
> --
>
> Key: SOLR-9676
> URL: https://issues.apache.org/jira/browse/SOLR-9676
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: highlighter
>Affects Versions: 4.10.4
>Reporter: Mike
>Assignee: David Smiley
>Priority: Minor
>
> If you try to use the FastVectorHighlighter on a field that doesn't have 
> TermPositions and TermOffsets enabled, you get an ok error message:
> {{WARN  org.apache.solr.highlight.DefaultSolrHighlighter  – Solr will use 
> Highlighter instead of FastVectorHighlighter because assignedTo field does 
> not store TermPositions and TermOffsets.}}
> If you heed that message, and dutifully add TermPositions and TermOffsets to 
> your schema, you get a crashing message that says:
> {code:none}
> Blah, blah, stacktrace
> 
> Caused by: java.lang.IllegalArgumentException: cannot index term vector 
> offsets when term vectors are not indexed (field="court")
> ...
> {code}
> Can we update the first message to say:
> {{Solr will use Highlighter instead of FastVectorHighlighter because 
> assignedTo field does not store TermPositions, TermOffsets, and TermVectors.}}
> That'd save at least one headache next time I screw this up...






[jira] [Commented] (SOLR-9676) FastVectorHighligher log message could be improved

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595843#comment-15595843
 ] 

ASF subversion and git services commented on SOLR-9676:
---

Commit 91f58ac72b603bc9a66f537829c0f99dcd65fbff in lucene-solr's branch 
refs/heads/master from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=91f58ac ]

SOLR-9676: DefaultSolrHighlighter: clarify warning when FVH can't be used


> FastVectorHighligher log message could be improved
> --
>
> Key: SOLR-9676
> URL: https://issues.apache.org/jira/browse/SOLR-9676
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: highlighter
>Affects Versions: 4.10.4
>Reporter: Mike
>Assignee: David Smiley
>Priority: Minor
>
> If you try to use the FastVectorHighlighter on a field that doesn't have 
> TermPositions and TermOffsets enabled, you get an ok error message:
> {{WARN  org.apache.solr.highlight.DefaultSolrHighlighter  – Solr will use 
> Highlighter instead of FastVectorHighlighter because assignedTo field does 
> not store TermPositions and TermOffsets.}}
> If you heed that message, and dutifully add TermPositions and TermOffsets to 
> your schema, you get a crashing message that says:
> {code:none}
> Blah, blah, stacktrace
> 
> Caused by: java.lang.IllegalArgumentException: cannot index term vector 
> offsets when term vectors are not indexed (field="court")
> ...
> {code}
> Can we update the first message to say:
> {{Solr will use Highlighter instead of FastVectorHighlighter because 
> assignedTo field does not store TermPositions, TermOffsets, and TermVectors.}}
> That'd save at least one headache next time I screw this up...






[jira] [Updated] (SOLR-9546) There is a lot of unnecessary boxing/unboxing going on in {{SolrParams}} class

2016-10-21 Thread Pushkar Raste (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pushkar Raste updated SOLR-9546:

Attachment: SOLR-9546_CloudMLTQParser.patch

Patch for CloudMLTQParser

> There is a lot of unnecessary boxing/unboxing going on in {{SolrParams}} class
> --
>
> Key: SOLR-9546
> URL: https://issues.apache.org/jira/browse/SOLR-9546
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
>Priority: Minor
> Attachments: SOLR-9546.patch, SOLR-9546_CloudMLTQParser.patch
>
>
> Here is an excerpt 
> {code}
>   public Long getLong(String param, Long def) {
> String val = get(param);
> try {
>   return val== null ? def : Long.parseLong(val);
> }
> catch( Exception ex ) {
>   throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, 
> ex.getMessage(), ex );
> }
>   }
> {code}
> {{Long.parseLong()}} returns a primitive type, but since the method is expected 
> to return a {{Long}}, it needs to be wrapped. There are many more methods like 
> that. We might be creating a lot of unnecessary objects here.
> I am not sure if the JVM catches up to it and somehow optimizes it if these 
> methods are called enough times (or maybe the compiler does some modifications 
> at compile time).
> Let me know if I am thinking of some premature optimization.
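The boxing the excerpt describes can be avoided with a primitive overload. A minimal sketch (a hypothetical helper, not the SolrParams API):

```java
public class PrimitiveParams {

    // Primitive-returning overload: Long.parseLong's long result is returned
    // as-is, so no Long wrapper object is ever allocated.
    static long getLong(String val, long def) {
        return val == null ? def : Long.parseLong(val);
    }

    public static void main(String[] args) {
        System.out.println(getLong("42", 0L)); // prints 42
        System.out.println(getLong(null, 7L)); // prints 7 (the default)
    }
}
```

Callers that truly need a nullable result can keep the boxed variant, while hot paths call the primitive one.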






[jira] [Commented] (SOLR-9598) Solr RESTORE api doesn't wait for the restored collection to be fully ready for usage

2016-10-21 Thread Hrishikesh Gadre (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595753#comment-15595753
 ] 

Hrishikesh Gadre commented on SOLR-9598:


Clarification - This happens only for a *large* Solr collection (roughly 50 GB 
total index size with 6 shards).

> Solr RESTORE api doesn't wait for the restored collection to be fully ready 
> for usage
> -
>
> Key: SOLR-9598
> URL: https://issues.apache.org/jira/browse/SOLR-9598
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.2
>Reporter: Hrishikesh Gadre
>
> As part of the RESTORE operation, Solr creates a new collection and adds 
> necessary number of replicas to each shard. The problem is that this 
> operation doesn't wait for this new collection to be fully ready for usage 
> (e.g. querying and indexing). This requires extra checks on the client side 
> to make sure that the recovery is complete and reflected in cluster status 
> stored in Zookeeper. e.g. refer to the backup/restore unit test for this 
> check,
> https://github.com/apache/lucene-solr/blob/722e82712435ecf46c9868137d885484152f749b/solr/core/src/test/org/apache/solr/cloud/AbstractCloudBackupRestoreTestCase.java#L234
> Ideally this check should be implemented in the RESTORE operation itself. 
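The client-side workaround amounts to polling until the restored collection reports ready. A generic poll-until-ready sketch (the readiness predicate is a placeholder; real code would check replica states in ZooKeeper via SolrJ):

```java
import java.util.function.BooleanSupplier;

public class WaitForReady {

    // Poll the predicate until it returns true or the timeout elapses.
    static boolean waitFor(BooleanSupplier ready, long timeoutMs, long intervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (ready.getAsBoolean()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        return ready.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Stand-in for "all replicas active": becomes true after ~50 ms.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 50, 1000, 10);
        System.out.println(ok); // prints true
    }
}
```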






bin/solr still at 5 second kill

2016-10-21 Thread Erick Erickson
Is my memory failing? At one point I thought we had a fix for the fact
that the bin/solr script waits 5 seconds and then kills Solr, leading to
bogus recoveries. But the latest bin/solr scripts (6x and trunk) still
have a hard-coded 5-second wait time.

Anyone remember what's up with that?

Erick




[jira] [Commented] (SOLR-9598) Solr RESTORE api doesn't wait for the restored collection to be fully ready for usage

2016-10-21 Thread Michael Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595712#comment-15595712
 ] 

Michael Sun commented on SOLR-9598:
---

It seems Solr sometimes doesn't reopen the searcher once the collection is 
restored, so a query soon after restoring sometimes doesn't return the 
correct result.

Currently you have to restart Solr before using the restored collection to make 
sure queries return correct results.


> Solr RESTORE api doesn't wait for the restored collection to be fully ready 
> for usage
> -
>
> Key: SOLR-9598
> URL: https://issues.apache.org/jira/browse/SOLR-9598
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.2
>Reporter: Hrishikesh Gadre
>
> As part of the RESTORE operation, Solr creates a new collection and adds 
> necessary number of replicas to each shard. The problem is that this 
> operation doesn't wait for this new collection to be fully ready for usage 
> (e.g. querying and indexing). This requires extra checks on the client side 
> to make sure that the recovery is complete and reflected in cluster status 
> stored in Zookeeper. e.g. refer to the backup/restore unit test for this 
> check,
> https://github.com/apache/lucene-solr/blob/722e82712435ecf46c9868137d885484152f749b/solr/core/src/test/org/apache/solr/cloud/AbstractCloudBackupRestoreTestCase.java#L234
> Ideally this check should be implemented in the RESTORE operation itself. 






[jira] [Commented] (SOLR-8785) Use Metrics library for core metrics

2016-10-21 Thread Jeff Wartes (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595671#comment-15595671
 ] 

Jeff Wartes commented on SOLR-8785:
---

For the record, it looks like I wrote this patch against master, around 
version 6.1.
I recall I had some concern at the time that the metrics namespace generation 
was too flexible (complicated), so that's something to look at.

> Use Metrics library for core metrics
> 
>
> Key: SOLR-8785
> URL: https://issues.apache.org/jira/browse/SOLR-8785
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.1
>Reporter: Jeff Wartes
>  Labels: patch, patch-available
>
> The Metrics library (https://dropwizard.github.io/metrics/3.1.0/) is a 
> well-known way to track metrics about applications. 
> In SOLR-1972, latency percentile tracking was added. The comment list is 
> long, so here’s my synopsis:
> 1. An attempt was made to use the Metrics library
> 2. That attempt failed due to a memory leak in Metrics v2.1.1
> 3. Large parts of Metrics were then copied wholesale into the 
> org.apache.solr.util.stats package space and that was used instead.
> Copy/pasting Metrics code into Solr may have been the correct solution at the 
> time, but I submit that it isn’t correct any more. 
> The leak in Metrics was fixed even before SOLR-1972 was released, and by 
> copy/pasting a subset of the functionality, we miss access to other important 
> things that the Metrics library provides, particularly the concept of a 
> Reporter. (https://dropwizard.github.io/metrics/3.1.0/manual/core/#reporters)
> Further, Metrics v3.0.2 is already packaged with Solr anyway, because it’s 
> used in two contrib modules (map-reduce and morphlines-core).
> I’m proposing that:
> 1. Metrics as bundled with Solr be upgraded to the current v3.1.2
> 2. Most of the org.apache.solr.util.stats package space be deleted outright, 
> or gutted and replaced with simple calls to Metrics. Due to the copy/paste 
> origin, the concepts should mostly map 1:1.
> I’d further recommend a usage pattern like:
> SharedMetricRegistries.getOrCreate(System.getProperty(“solr.metrics.registry”,
>  “solr-registry”))
> There are all kinds of areas in Solr that could benefit from metrics tracking 
> and reporting. This pattern allows diverse areas of code to track metrics 
> within a single, named registry. This well-known-name then becomes a handle 
> you can use to easily attach a Reporter and ship all of those metrics off-box.
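The registry-as-handle idea can be sketched with plain JDK types (a stand-in for Metrics' SharedMetricRegistries, not the library itself): any code that knows the well-known name resolves the same registry, so counters accumulate across call sites and a reporter could later walk the same map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class SharedRegistrySketch {

    // name -> registry; a registry is just named counters in this sketch.
    private static final Map<String, Map<String, LongAdder>> REGISTRIES =
            new ConcurrentHashMap<>();

    static Map<String, LongAdder> getOrCreate(String name) {
        return REGISTRIES.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
    }

    public static void main(String[] args) {
        String name = System.getProperty("solr.metrics.registry", "solr-registry");

        // Two independent call sites resolve the same handle by name...
        getOrCreate(name).computeIfAbsent("requests", k -> new LongAdder()).increment();
        getOrCreate(name).computeIfAbsent("requests", k -> new LongAdder()).increment();

        // ...so the counts accumulate in one place.
        System.out.println(getOrCreate(name).get("requests").sum()); // prints 2
    }
}
```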






[jira] [Commented] (SOLR-8785) Use Metrics library for core metrics

2016-10-21 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595531#comment-15595531
 ] 

Shawn Heisey commented on SOLR-8785:


Perhaps it's not the right way, but currently it's the only way I know of.

In master, defaulting to persistence is probably an excellent option, 
especially if there is a reset option.  I just worry about causing problems for 
existing 6.x users.

So, combining everything into one coherent plan:

* Create a config option for 6.x to enable persistence, that defaults to false. 
 Set it to true in 6.x example configs.
* Add CoreAdmin and CollectionsAdmin actions to reset stats in 6.x and master.
* In master, the option to enable persistence will not exist.  Persistence will 
always be enabled. Users will be able to use the reset action.


> Use Metrics library for core metrics
> 
>
> Key: SOLR-8785
> URL: https://issues.apache.org/jira/browse/SOLR-8785
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.1
>Reporter: Jeff Wartes
>  Labels: patch, patch-available
>
> The Metrics library (https://dropwizard.github.io/metrics/3.1.0/) is a 
> well-known way to track metrics about applications. 
> In SOLR-1972, latency percentile tracking was added. The comment list is 
> long, so here’s my synopsis:
> 1. An attempt was made to use the Metrics library
> 2. That attempt failed due to a memory leak in Metrics v2.1.1
> 3. Large parts of Metrics were then copied wholesale into the 
> org.apache.solr.util.stats package space and that was used instead.
> Copy/pasting Metrics code into Solr may have been the correct solution at the 
> time, but I submit that it isn’t correct any more. 
> The leak in Metrics was fixed even before SOLR-1972 was released, and by 
> copy/pasting a subset of the functionality, we miss access to other important 
> things that the Metrics library provides, particularly the concept of a 
> Reporter. (https://dropwizard.github.io/metrics/3.1.0/manual/core/#reporters)
> Further, Metrics v3.0.2 is already packaged with Solr anyway, because it’s 
> used in two contrib modules (map-reduce and morphlines-core).
> I’m proposing that:
> 1. Metrics as bundled with Solr be upgraded to the current v3.1.2
> 2. Most of the org.apache.solr.util.stats package space be deleted outright, 
> or gutted and replaced with simple calls to Metrics. Due to the copy/paste 
> origin, the concepts should mostly map 1:1.
> I’d further recommend a usage pattern like:
> SharedMetricRegistries.getOrCreate(System.getProperty(“solr.metrics.registry”,
>  “solr-registry”))
> There are all kinds of areas in Solr that could benefit from metrics tracking 
> and reporting. This pattern allows diverse areas of code to track metrics 
> within a single, named registry. This well-known-name then becomes a handle 
> you can use to easily attach a Reporter and ship all of those metrics off-box.






[jira] [Comment Edited] (SOLR-8785) Use Metrics library for core metrics

2016-10-21 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595421#comment-15595421
 ] 

Shawn Heisey edited comment on SOLR-8785 at 10/21/16 3:59 PM:
--

bq. RequestHandler stats are now persistent, and will no longer reset on reload.

If I understand this correctly, then I don't think we want to do this.  Perhaps 
I don't understand it correctly?

There needs to be a way to reset the statistics to zero.  Reloading the core is 
currently the way that I do this.  (I'm not running cloud)

I don't want you to think I'm completely against persistence.  The idea is VERY 
nice, but having it turned on by default could cause issues for users with 
existing workflows. I think the way to handle this particular feature is:

 * Introduce a new config parameter to enable persistence.  Default the 
parameter to "false".
 * Discuss a new default of "true" in 7.0.  If consensus is to change the 
default in master (which probably is a good idea), then enable it in 6.x 
example configs so that most brand-new setups will use it.



was (Author: elyograg):
bq. RequestHandler stats are now persistent, and will no longer reset on reload.

If I understand this correctly, then I don't think we want to do this.  Perhaps 
I don't understand it correctly?

There needs to be a way to reset the statistics to zero.  Reloading the core is 
currently the way that I do this.  (I'm not running cloud)


> Use Metrics library for core metrics
> 
>
> Key: SOLR-8785
> URL: https://issues.apache.org/jira/browse/SOLR-8785
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.1
>Reporter: Jeff Wartes
>  Labels: patch, patch-available
>
> The Metrics library (https://dropwizard.github.io/metrics/3.1.0/) is a 
> well-known way to track metrics about applications. 
> In SOLR-1972, latency percentile tracking was added. The comment list is 
> long, so here’s my synopsis:
> 1. An attempt was made to use the Metrics library
> 2. That attempt failed due to a memory leak in Metrics v2.1.1
> 3. Large parts of Metrics were then copied wholesale into the 
> org.apache.solr.util.stats package space and that was used instead.
> Copy/pasting Metrics code into Solr may have been the correct solution at the 
> time, but I submit that it isn’t correct any more. 
> The leak in Metrics was fixed even before SOLR-1972 was released, and by 
> copy/pasting a subset of the functionality, we miss access to other important 
> things that the Metrics library provides, particularly the concept of a 
> Reporter. (https://dropwizard.github.io/metrics/3.1.0/manual/core/#reporters)
> Further, Metrics v3.0.2 is already packaged with Solr anyway, because it’s 
> used in two contrib modules (map-reduce and morphlines-core).
> I’m proposing that:
> 1. Metrics as bundled with Solr be upgraded to the current v3.1.2
> 2. Most of the org.apache.solr.util.stats package space be deleted outright, 
> or gutted and replaced with simple calls to Metrics. Due to the copy/paste 
> origin, the concepts should mostly map 1:1.
> I’d further recommend a usage pattern like:
> SharedMetricRegistries.getOrCreate(System.getProperty(“solr.metrics.registry”,
>  “solr-registry”))
> There are all kinds of areas in Solr that could benefit from metrics tracking 
> and reporting. This pattern allows diverse areas of code to track metrics 
> within a single, named registry. This well-known-name then becomes a handle 
> you can use to easily attach a Reporter and ship all of those metrics off-box.






[jira] [Commented] (LUCENE-7498) More Like This to Use BM25

2016-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595420#comment-15595420
 ] 

Michael McCandless commented on LUCENE-7498:


+1

> More Like This to Use BM25
> --
>
> Key: LUCENE-7498
> URL: https://issues.apache.org/jira/browse/LUCENE-7498
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/other
>Reporter: Alessandro Benedetti
>
> BM25 is now the default similarity, but More Like This is still using the 
> old TF-IDF.
>  
> This issue is to move to BM25 and refactor the MLT to be more organised, 
> extensible and maintainable.
> A few extensions will follow later, but the focus of this issue will be:
>  - BM25
>  - code refactor + tests






[jira] [Commented] (LUCENE-7508) [smartcn] tokens are not correctly created if text length > 1024

2016-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595368#comment-15595368
 ] 

Michael McCandless commented on LUCENE-7508:


Seems like there are two issues here.

First, there is a (hard-coded?) limit of 1024 on the maximum sentence length? 
Second, whether those additional characters should be considered safe sentence 
endings.

Can you also turn this into a patch with a failing test [~peina]?  Thank you!

> [smartcn] tokens are not correctly created if text length > 1024
> 
>
> Key: LUCENE-7508
> URL: https://issues.apache.org/jira/browse/LUCENE-7508
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Affects Versions: 6.2.1
> Environment: Mac OS X 10.10
>Reporter: peina
>  Labels: chinese, tokenization
>
> If text length is > 1024, HMMChineseTokenizer failed to split sentences 
> correctly.
> Test Sample:
> public static void main(String[] args) throws IOException{
> Analyzer analyzer = new SmartChineseAnalyzer(); /* will load stopwords */
> //String sentence = 
> "“七八个物管工作人员对我一个文弱书生拳打脚踢,我极力躲避时还被追打。”前天,微信网友爆料称,一名50多岁的江西教师在昆明被物管群殴,手指骨折,向网友求助。教师为何会被物管殴打?事情的真相又是如何?昨天,记者来到圣世一品小区,通过调查了解,事情的起因源于这名教师在小区里帮女儿散发汗蒸馆广告单,被物管保安发现后,引发冲突。对于群殴教师的说法,该小区物管保安队长称:“保安在追的过程中,确实有拉扯,但并没有殴打教师,至于手指骨折是他自己摔伤的。”爆料江西教师在昆明被物管殴打记者注意到,消息于8月27日发出,爆料者称,自己是江西宜丰崇文中学的一名中年教师黄敏。暑假期间来昆明的女儿家度假。他女儿在昆明与人合伙开了一家汗蒸馆,7月30日开业。8月9日下午6点30分许,他到昆明东二环圣世一品小区为女儿的汗蒸馆散发宣传小广告。小区物管前来制止,他就停止发放行为。黄敏称,小区物管保安人员要求他收回散发出去的广告单,他就去收了。物管要求他到办公室里去接受处理,他也配合了。让他没有想到的是,在处理的过程中,七八个年轻的物管人员突然对他拳打脚踢,他极力躲避时还被追着打,而且这一切,是在小区物管领导的注视下发生的。黄敏说,被打后,他立即报了警。除身上多处软组织挫伤外,伤得最严重的是右手大拇指粉碎性骨折,一掌骨骨折。他到云南省第三人民医院住了7天院,医生说无法手术,只能用夹板固定,也不吃药,待其自然修复,至少要3个月以上,右手大拇指还有可能伤残。为证明自己的说法,黄敏还拿出了官渡区公安分局菊花派出所出具的伤情鉴定委托书。他的伤情被鉴定为轻伤二级。说法帮女儿发宣传小广告教师在小区里被殴打昨日,记者者拨通了黄敏的电话。他说,当时他看见该小区的大门没有关,也没有保安值班。于是,他就进到了小区里帮女儿的汗蒸馆发广告单。在楼栋值班的保安没有阻止的前提下,他乘电梯来到了楼上,为了不影响住户,他将名片放在了房门的把手上。被保安发现时,他才发了四五十张。保安问他干什么?他回答,家里开了汗蒸馆,来宣传一下。两名保安叫他不要发了,并要求他到物管办公室等待领导处理。交谈中,由于对方一直在说方言,黄敏只能听清楚的一句话是,物管叫他去收回小广告。他当即同意了,准备去收。这时,小区的七八名工作人员就殴打了他,其中有穿保安服装的,也有身着便衣的。让他气愤的是,他试图逃跑躲起来,依然被追着殴打。黄敏说,女儿将他被打又维权无门的遭遇发到了微信上,希望找到相关视频和照片,还原事件真相。。";
> String sentence = 
> "“七八个物管工作人员对我一个文弱书生拳打脚踢,我极力躲避时还被追打。”前天,微信网友爆料称,一名50多岁的江西教师在昆明被物管群殴,手指骨折,向网友求助。教师为何会被物管殴打?事情的真相又是如何?昨天,记者来到圣世一品小区,通过调查了解,事情的起因源于这名教师在小区里帮女儿散发汗蒸馆广告单,被物管保安发现后,引发冲突。对于群殴教师的说法,该小区物管保安队长称:“保安在追的过程中,确实有拉扯,但并没有殴打教师,至于手指骨折是他自己摔伤的。”爆料江西教师在昆明被物管殴打记者注意到,消息于8月27日发出,爆料者称,自己是江西宜丰崇文中学的一名中年教师黄敏。暑假期间来昆明的女儿家度假。他女儿在昆明与人合伙开了一家汗蒸馆,7月30日开业。8月9日下午6点30分许,他到昆明东二环圣世一品小区为女儿的汗蒸馆散发宣传小广告。小区物管前来制止,他就停止发放行为。黄敏称,小区物管保安人员要求他收回散发出去的广告单,他就去收了。物管要求他到办公室里去接受处理,他也配合了。让他没有想到的是,在处理的过程中,七八个年轻的物管人员突然对他拳打脚踢,他极力躲避时还被追着打,而且这一切,是在小区物管领导的注视下发生的。黄敏说,被打后,他立即报了警。除身上多处软组织挫伤外,伤得最严重的是右手大拇指粉碎性骨折,一掌骨骨折。他到云南省第三人民医院住了7天院,医生说无法手术,只能用夹板固定,也不吃药,待其自然修复,至少要3个月以上,右手大拇指还有可能伤残。为证明自己的说法,黄敏还拿出了官渡区公安分局菊花派出所出具的伤情鉴定委托书。他的伤情被鉴定为轻伤二级。说法帮女儿发宣传小广告教师在小区里被殴打昨日,记者者拨通了黄敏的电话。他说,当时他看见该小区的大门没有关,也没有保安值班。于是,他就进到了小区里帮女儿的汗蒸馆发广告单。在楼栋值班的保安没有阻止的前提下,他乘电梯来到了楼上,为了不影响住户,他将名片放在了房门的把手上。被保安发现时,他才发了四五十张。保安问他干什么?他回答,家里开了汗蒸馆,来宣传一下。两名保安叫他不要发了,并要求他到物管办公室等待领导处理。交谈中,由于对方一直在说方言,黄敏只能听清楚的一句话是,物管叫他去收回小广告。他当即同意了,准备去收。这时,小区的七八名工作人员就殴打了他,其中有穿保安服装的,也有身着便衣的。让他气愤的是,他试图逃跑躲起来,依然被追着殴打。黄敏说,女儿将他被打又维权无门的遭遇发到了微信上,希望找到相关视频和照片,还原事件真相";
> System.out.println(sentence.length());
>// String sentence = "女儿将他被打又维权无门的遭遇发到了微信上,希望找到相关视频和照片,还原事件真相。";
> TokenStream tokens = analyzer.tokenStream("dummyfield", sentence);
> tokens.reset();
> CharTermAttribute termAttr = (CharTermAttribute) 
> tokens.getAttribute(CharTermAttribute.class);
> while (tokens.incrementToken()) {
>  // System.out.println(termAttr.toString());
> }
> tokens.end();
> tokens.close();
> 
> analyzer.close();
>   }
> The text length in the above sample is 1027. With this sample, the sentences 
> are split like this:
> .
> Sentence:黄敏说,女儿将他被打又维权无门的遭遇发到了微信上,希望找到相关视频和照片,还原事
> Sentence:件真相
> The last 3 characters are detected as an individual sentence, so 还原事件真相 is 
> tokenized as 还原|事|件|真相, when the correct tokens should be 还原|事件|真相。
> Overriding the isSafeEnd method in HMMChineseTokenizer fixes this issue by 
> considering ',' or '。' a safe end of text:
> public class HMMChineseTokenizer extends SegmentingTokenizerBase {
> 
>  /** For sentence tokenization, these are the unambiguous break positions. */
>   protected boolean isSafeEnd(char ch) {
> switch(ch) {
>   case 0x000D:
>   case 0x000A:
>   case 0x0085:
>   case 0x2028:
>   case 0x2029:
>+   case '。':
>+   case ',':
> return true;
>   default:
> return false;
> }
>   }
> }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (LUCENE-7509) [smartcn] Some chinese text is not tokenized correctly with Chinese punctuation marks appended

2016-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595364#comment-15595364
 ] 

Michael McCandless commented on LUCENE-7509:


Hi [~peina], could you please turn your test fragments into a test that fails?  
See e.g. https://wiki.apache.org/lucene-java/HowToContribute

Do you know how to fix this?  Is there a Unicode API we should be using to more 
generally check for punctuation, so that Chinese punctuation is included?
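One candidate is java.lang.Character's general-category API, which does classify CJK punctuation. A quick stdlib-only check (illustrative, not a proposed patch):

```java
public class PunctCheck {
    // Character.getType returns the Unicode general category; the fullwidth
    // comma (U+FF0C) and CJK full stop (U+3002) are both in punctuation
    // categories, so a category-based test covers them where an ASCII-only
    // or hard-coded list does not.
    public static boolean isUnicodePunct(char ch) {
        int t = Character.getType(ch);
        return t == Character.START_PUNCTUATION
            || t == Character.END_PUNCTUATION
            || t == Character.OTHER_PUNCTUATION
            || t == Character.CONNECTOR_PUNCTUATION
            || t == Character.DASH_PUNCTUATION
            || t == Character.INITIAL_QUOTE_PUNCTUATION
            || t == Character.FINAL_QUOTE_PUNCTUATION;
    }
}
```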

> [smartcn] Some chinese text is not tokenized correctly with Chinese 
> punctuation marks appended
> --
>
> Key: LUCENE-7509
> URL: https://issues.apache.org/jira/browse/LUCENE-7509
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Affects Versions: 6.2.1
> Environment: Mac OS X 10.10
>Reporter: peina
>  Labels: chinese, tokenization
>
> Some chinese text is not tokenized correctly with Chinese punctuation marks 
> appended.
> e.g.
> 碧绿的眼珠 is tokenized as 碧绿|的|眼珠, which is correct.
> But 
> 碧绿的眼珠,(with a Chinese punctuation mark appended) is tokenized as 碧绿|的|眼|珠,
> A similar case happens when the text has numbers appended.
> e.g.
> 生活报8月4号 -->生活|报|8|月|4|号
> 生活报-->生活报
> Test Sample:
> public static void main(String[] args) throws IOException{
> Analyzer analyzer = new SmartChineseAnalyzer(); /* will load stopwords */
> System.out.println("Sample1===");
> String sentence = "生活报8月4号";
> printTokens(analyzer, sentence);
> sentence = "生活报";
> printTokens(analyzer, sentence);
> System.out.println("Sample2===");
> 
> sentence = "碧绿的眼珠,";
> printTokens(analyzer, sentence);
> sentence = "碧绿的眼珠";
> printTokens(analyzer, sentence);
> 
> analyzer.close();
>   }
>   private static void printTokens(Analyzer analyzer, String sentence) throws 
> IOException{
> System.out.println("sentence:" + sentence);
> TokenStream tokens = analyzer.tokenStream("dummyfield", sentence);
> tokens.reset();
> CharTermAttribute termAttr = (CharTermAttribute) 
> tokens.getAttribute(CharTermAttribute.class);
> while (tokens.incrementToken()) {
>   System.out.println(termAttr.toString());
> }
> tokens.close();
>   }
> Output:
> Sample1===
> sentence:生活报8月4号
> 生活
> 报
> 8
> 月
> 4
> 号
> sentence:生活报
> 生活报
> Sample2===
> sentence:碧绿的眼珠,
> 碧绿
> 的
> 眼
> 珠
> sentence:碧绿的眼珠
> 碧绿
> 的
> 眼珠




-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4449) Enable backup requests for the internal solr load balancer

2016-10-21 Thread Jeff Wartes (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Wartes updated SOLR-4449:
--
Labels: patch patch-available  (was: patch-available)

> Enable backup requests for the internal solr load balancer
> --
>
> Key: SOLR-4449
> URL: https://issues.apache.org/jira/browse/SOLR-4449
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: philip hoy
>Priority: Minor
>  Labels: patch, patch-available
> Attachments: SOLR-4449.patch, SOLR-4449.patch, SOLR-4449.patch, 
> patch-4449.txt, solr-back-request-lb-plugin.jar
>
>
> Add the ability to configure the built-in solr load balancer such that it 
> submits a backup request to the next server in the list if the initial 
> request takes too long. Employing such an algorithm could improve the latency 
> of the 9xth percentile, albeit at the expense of increasing overall load due 
> to additional requests. 
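The timed-backup idea above can be sketched with plain java.util.concurrent. This illustrates the algorithm only; the names are hypothetical and this is not the patch's actual load-balancer code:

```java
import java.util.concurrent.*;

public class BackupRequest {
    // Issue 'primary'; if it has not completed within backupDelayMs, also
    // issue 'backup' and return whichever of the two finishes first.
    public static String queryWithBackup(ExecutorService pool,
                                         Callable<String> primary,
                                         Callable<String> backup,
                                         long backupDelayMs) throws Exception {
        CompletionService<String> cs = new ExecutorCompletionService<>(pool);
        cs.submit(primary);
        Future<String> done = cs.poll(backupDelayMs, TimeUnit.MILLISECONDS);
        if (done != null) {
            return done.get();      // primary answered within the threshold
        }
        cs.submit(backup);          // primary is slow: fire the backup request
        return cs.take().get();     // first response wins; the loser is extra load
    }
}
```

The tail-latency win and the extra-load cost are both visible here: the backup fires only when the primary is slower than the threshold, and the slower response is simply discarded.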






[jira] [Updated] (SOLR-8785) Use Metrics library for core metrics

2016-10-21 Thread Jeff Wartes (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Wartes updated SOLR-8785:
--
Labels: patch patch-available  (was: patch-available)

> Use Metrics library for core metrics
> 
>
> Key: SOLR-8785
> URL: https://issues.apache.org/jira/browse/SOLR-8785
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.1
>Reporter: Jeff Wartes
>  Labels: patch, patch-available
>
> The Metrics library (https://dropwizard.github.io/metrics/3.1.0/) is a 
> well-known way to track metrics about applications. 
> In SOLR-1972, latency percentile tracking was added. The comment list is 
> long, so here’s my synopsis:
> 1. An attempt was made to use the Metrics library
> 2. That attempt failed due to a memory leak in Metrics v2.1.1
> 3. Large parts of Metrics were then copied wholesale into the 
> org.apache.solr.util.stats package space and that was used instead.
> Copy/pasting Metrics code into Solr may have been the correct solution at the 
> time, but I submit that it isn’t correct any more. 
> The leak in Metrics was fixed even before SOLR-1972 was released, and by 
> copy/pasting a subset of the functionality, we miss access to other important 
> things that the Metrics library provides, particularly the concept of a 
> Reporter. (https://dropwizard.github.io/metrics/3.1.0/manual/core/#reporters)
> Further, Metrics v3.0.2 is already packaged with Solr anyway, because it’s 
> used in two contrib modules. (map-reduce and morphlines-core)
> I’m proposing that:
> 1. Metrics as bundled with Solr be upgraded to the current v3.1.2
> 2. Most of the org.apache.solr.util.stats package space be deleted outright, 
> or gutted and replaced with simple calls to Metrics. Due to the copy/paste 
> origin, the concepts should mostly map 1:1.
> I’d further recommend a usage pattern like:
> SharedMetricRegistries.getOrCreate(System.getProperty(“solr.metrics.registry”,
>  “solr-registry”))
> There are all kinds of areas in Solr that could benefit from metrics tracking 
> and reporting. This pattern allows diverse areas of code to track metrics 
> within a single, named registry. This well-known-name then becomes a handle 
> you can use to easily attach a Reporter and ship all of those metrics off-box.
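The value of a well-known registry name is easy to see even without the Metrics dependency. The stand-in below is a stdlib-only sketch of the getOrCreate-by-name pattern — not the real SharedMetricRegistries implementation, just the idea that the same name always yields the same shared instance:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class SharedRegistries {
    private static final Map<String, Map<String, AtomicLong>> REGISTRIES =
        new ConcurrentHashMap<>();

    // Same name -> same registry instance, from anywhere in the codebase;
    // this is the "handle" a Reporter would later attach to.
    public static Map<String, AtomicLong> getOrCreate(String name) {
        return REGISTRIES.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
    }

    // Bump a named counter in a named registry and return its new value.
    public static long count(String registry, String metric) {
        return getOrCreate(registry)
            .computeIfAbsent(metric, m -> new AtomicLong())
            .incrementAndGet();
    }
}
```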






[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer

2016-10-21 Thread Phil Hoy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595349#comment-15595349
 ] 

Phil Hoy commented on SOLR-4449:


I no longer work at Findmypast. Thank you.


> Enable backup requests for the internal solr load balancer
> --
>
> Key: SOLR-4449
> URL: https://issues.apache.org/jira/browse/SOLR-4449
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: philip hoy
>Priority: Minor
>  Labels: patch, patch-available
> Attachments: SOLR-4449.patch, SOLR-4449.patch, SOLR-4449.patch, 
> patch-4449.txt, solr-back-request-lb-plugin.jar
>
>
> Add the ability to configure the built-in solr load balancer such that it 
> submits a backup request to the next server in the list if the initial 
> request takes too long. Employing such an algorithm could improve the latency 
> of the 9xth percentile, albeit at the expense of increasing overall load due 
> to additional requests. 






[jira] [Updated] (SOLR-4449) Enable backup requests for the internal solr load balancer

2016-10-21 Thread Jeff Wartes (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Wartes updated SOLR-4449:
--
Labels: patch-available  (was: )

> Enable backup requests for the internal solr load balancer
> --
>
> Key: SOLR-4449
> URL: https://issues.apache.org/jira/browse/SOLR-4449
> Project: Solr
>  Issue Type: New Feature
>  Components: SolrCloud
>Reporter: philip hoy
>Priority: Minor
>  Labels: patch-available
> Attachments: SOLR-4449.patch, SOLR-4449.patch, SOLR-4449.patch, 
> patch-4449.txt, solr-back-request-lb-plugin.jar
>
>
> Add the ability to configure the built-in solr load balancer such that it 
> submits a backup request to the next server in the list if the initial 
> request takes too long. Employing such an algorithm could improve the latency 
> of the 9xth percentile, albeit at the expense of increasing overall load due 
> to additional requests. 






[jira] [Updated] (SOLR-8785) Use Metrics library for core metrics

2016-10-21 Thread Jeff Wartes (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Wartes updated SOLR-8785:
--
Labels: patch-available  (was: )

> Use Metrics library for core metrics
> 
>
> Key: SOLR-8785
> URL: https://issues.apache.org/jira/browse/SOLR-8785
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.1
>Reporter: Jeff Wartes
>  Labels: patch-available
>
> The Metrics library (https://dropwizard.github.io/metrics/3.1.0/) is a 
> well-known way to track metrics about applications. 
> In SOLR-1972, latency percentile tracking was added. The comment list is 
> long, so here’s my synopsis:
> 1. An attempt was made to use the Metrics library
> 2. That attempt failed due to a memory leak in Metrics v2.1.1
> 3. Large parts of Metrics were then copied wholesale into the 
> org.apache.solr.util.stats package space and that was used instead.
> Copy/pasting Metrics code into Solr may have been the correct solution at the 
> time, but I submit that it isn’t correct any more. 
> The leak in Metrics was fixed even before SOLR-1972 was released, and by 
> copy/pasting a subset of the functionality, we miss access to other important 
> things that the Metrics library provides, particularly the concept of a 
> Reporter. (https://dropwizard.github.io/metrics/3.1.0/manual/core/#reporters)
> Further, Metrics v3.0.2 is already packaged with Solr anyway, because it’s 
> used in two contrib modules. (map-reduce and morphlines-core)
> I’m proposing that:
> 1. Metrics as bundled with Solr be upgraded to the current v3.1.2
> 2. Most of the org.apache.solr.util.stats package space be deleted outright, 
> or gutted and replaced with simple calls to Metrics. Due to the copy/paste 
> origin, the concepts should mostly map 1:1.
> I’d further recommend a usage pattern like:
> SharedMetricRegistries.getOrCreate(System.getProperty(“solr.metrics.registry”,
>  “solr-registry”))
> There are all kinds of areas in Solr that could benefit from metrics tracking 
> and reporting. This pattern allows diverse areas of code to track metrics 
> within a single, named registry. This well-known-name then becomes a handle 
> you can use to easily attach a Reporter and ship all of those metrics off-box.






[jira] [Commented] (LUCENE-7512) suppress ecj-lint warnings on precommit

2016-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595329#comment-15595329
 ] 

Michael McCandless commented on LUCENE-7512:


I haven't looked closely at the warnings recently ... but, instead of turning 
them off, can't we fix them?  Or are they somehow false alarms that we can't 
fix/suppress?

Also, when there is an error buried in all those warnings, the build does 
fail, so you can't miss that there really is a problem.

> suppress ecj-lint warnings on precommit
> ---
>
> Key: LUCENE-7512
> URL: https://issues.apache.org/jira/browse/LUCENE-7512
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mikhail Khludnev
>  Labels: build
> Attachments: LUCENE-7512-solr-core-src.patch, 
> LUCENE-7512-solr-core-src.patch
>
>
> Turns out the subject warnings are too noisy and people miss significant ERRORs.






[jira] [Commented] (SOLR-6744) fl renaming / alias of uniqueKey field generates null pointer exception in SolrCloud configuration

2016-10-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595259#comment-15595259
 ] 

Mike Drob commented on SOLR-6744:
-

[~gesias] - Yes, I think it's the same issue. There were two parts to this 
JIRA: field renaming id to something else causes an error, and renaming 
something else to id returns the wrong results.

At this point it is a known bug in older versions, and it is fixed to behave 
properly in 6.2.1+.

> fl renaming / alias of uniqueKey field generates null pointer exception in 
> SolrCloud configuration
> --
>
> Key: SOLR-6744
> URL: https://issues.apache.org/jira/browse/SOLR-6744
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.10.1
> Environment: Multiple replicas on SolrCloud config.  This specific 
> example with 4 shard, 3 replica per shard config.  This bug does NOT exist 
> when query is handled by single core.
>Reporter: Garth Grimm
>Assignee: Tomás Fernández Löbbe
>Priority: Minor
> Fix For: 6.2.1, 6.3, master (7.0)
>
> Attachments: SOLR-6744.patch, SOLR-6744.patch, SOLR-6744.patch
>
>
> If trying to rename the uniqueKey field using 'fl' in a distributed query 
> (ie: SolrCloud config), an NPE is thrown.
> The workaround is to redundantly request the uniqueKey field, once with the 
> desired alias and once with the original name.
> Example...
> http://localhost:8983/solr/cloudcollection/select?q=*%3A*&wt=xml&indent=true&fl=key:id
> Work around:
> http://localhost:8983/solr/cloudcollection/select?q=*%3A*&wt=xml&indent=true&fl=key:id&fl=id
> Error w/o work around...
> {code}
> <response><lst name="responseHeader"><int name="status">500</int><int 
> name="QTime">11</int><lst name="params"><str name="q">*:*</str><str 
> name="indent">true</str><str name="fl">key:id</str><str 
> name="wt">xml</str></lst></lst><lst name="error"><str 
> name="trace">java.lang.NullPointerException
>   at 
> org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:1257)
>   at 
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:720)
>   at 
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:695)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:324)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
>   at org.eclipse.jetty.server.Server.handle(Server.java:368)
>   at 
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
>   at 
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
>   at 
> org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
>   at 
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
>   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
>   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
>   at 
> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
>   at 
> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
>   at 
> 

[jira] [Updated] (SOLR-9677) edismax treats an operator as a keyword when the query parameter 'qf' contains a nonexistent field.

2016-10-21 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-9677:
-
Affects Version/s: 6.2.1

> edismax treats an operator as a keyword when the query parameter 'qf' 
> contains a nonexistent field.
> ---
>
> Key: SOLR-9677
> URL: https://issues.apache.org/jira/browse/SOLR-9677
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.2.1, 5.5.1, 6.2.1, trunk
>Reporter: Takumi Yoshida
>
> edismax treats an operator as a keyword when the query parameter 'qf' 
> contains a nonexistent field.
> e.g. ('hoge' does not exist in the schema):
> q=Japan OR Tokyo
> defType=edismax
> qf=title hoge
> You will get results containing the keywords 'Japan', 'OR', or 'Tokyo' in 
> the title.
> Also, you can see the following parsed query with debugQuery=true.
> {code}
> +((title:Japan) (title:OR) 
> (title:Tokyo))
> {code}






[jira] [Commented] (SOLR-9677) edismax treats an operator as a keyword when the query parameter 'qf' contains a nonexistent field.

2016-10-21 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595235#comment-15595235
 ] 

Erick Erickson commented on SOLR-9677:
--

Just confirmed on trunk and 6x...

> edismax treats an operator as a keyword when the query parameter 'qf' 
> contains a nonexistent field.
> ---
>
> Key: SOLR-9677
> URL: https://issues.apache.org/jira/browse/SOLR-9677
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.2.1, 5.5.1, 6.2.1, trunk
>Reporter: Takumi Yoshida
>
> edismax treats an operator as a keyword when the query parameter 'qf' 
> contains a nonexistent field.
> e.g. ('hoge' does not exist in the schema):
> q=Japan OR Tokyo
> defType=edismax
> qf=title hoge
> You will get results containing the keywords 'Japan', 'OR', or 'Tokyo' in 
> the title.
> Also, you can see the following parsed query with debugQuery=true.
> {code}
> +((title:Japan) (title:OR) 
> (title:Tokyo))
> {code}






[jira] [Commented] (LUCENE-7517) Explore making Scorer.score() return a double

2016-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595211#comment-15595211
 ] 

Michael McCandless commented on LUCENE-7517:


bq. Seems like it should switch to doubles (that original caching was meant to 
be totally transparent).

+1

And +1 to the API change.
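The accuracy concern is easy to demonstrate with a generic (non-Lucene) example: once a float accumulator reaches 2^24, adding a small per-term score is lost entirely, while a double keeps it:

```java
public class ScorePrecision {
    // float has a 24-bit significand: at 2^24 its ulp (spacing between
    // adjacent representable values) is 2.0, so adding 1.0f to 16777216f
    // rounds straight back to 16777216f and the contribution vanishes.
    public static float floatSum() {
        return 16777216f + 1f;    // the added term is lost
    }

    // double has a 53-bit significand, so the same sum is exact.
    public static double doubleSum() {
        return 16777216d + 1d;
    }
}
```

Accumulating in double and only narrowing to float at the TopDocs boundary avoids this class of error while keeping the storage footprint unchanged.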

> Explore making Scorer.score() return a double
> -
>
> Key: LUCENE-7517
> URL: https://issues.apache.org/jira/browse/LUCENE-7517
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Priority: Minor
>
> Follow-up to 
> http://search-lucene.com/m/l6pAi1BoyPJ1vr2382=Re+JENKINS+EA+Lucene+Solr+master+Linux+64bit+jdk+9+ea+140+Build+18103+Unstable+.
> We could make Scorer.score() return a double in order to lose less accuracy 
> when combining scores together, while still using floats on TopDocs and more 
> generally all parts of the code that need to store scores.






[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module

2016-10-21 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595203#comment-15595203
 ] 

Steve Rowe commented on LUCENE-2899:


IIRC a period is added to sentences that don't already have one.

[~goksron], since you're the author of the above-referenced comment, can you 
provide more detail here?

> Add OpenNLP Analysis capabilities as a module
> -
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/analysis
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.9, 6.0
>
> Attachments: LUCENE-2899-6.1.0.patch, LUCENE-2899-RJN.patch, 
> LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, 
> LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice 
> to have a submodule (under analysis) that exposed capabilities for it. Drew 
> Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it 
> would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as 
> either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp






[jira] [Issue Comment Deleted] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module

2016-10-21 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-2899:
---
Comment: was deleted

(was: I am currently on Annual Leave, returning on the 24th of October.
If your matter is urgent please contact the office on 01414401234.
Regards,
Alex
)

> Add OpenNLP Analysis capabilities as a module
> -
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/analysis
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.9, 6.0
>
> Attachments: LUCENE-2899-6.1.0.patch, LUCENE-2899-RJN.patch, 
> LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, 
> LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice 
> to have a submodule (under analysis) that exposed capabilities for it. Drew 
> Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it 
> would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as 
> either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp






[jira] [Commented] (LUCENE-7517) Explore making Scorer.score() return a double

2016-10-21 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595168#comment-15595168
 ] 

Yonik Seeley commented on LUCENE-7517:
--

bq. Any opinions about whether things like CachingCollector should keep 
buffering floats or switch to doubles if we do that?

Seems like it should switch to doubles (that original caching was meant to be 
totally transparent).

> Explore making Scorer.score() return a double
> -
>
> Key: LUCENE-7517
> URL: https://issues.apache.org/jira/browse/LUCENE-7517
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Priority: Minor
>
> Follow-up to 
> http://search-lucene.com/m/l6pAi1BoyPJ1vr2382=Re+JENKINS+EA+Lucene+Solr+master+Linux+64bit+jdk+9+ea+140+Build+18103+Unstable+.
> We could make Scorer.score() return a double in order to lose less accuracy 
> when combining scores together, while still using floats on TopDocs and more 
> generally all parts of the code that need to store scores.






[jira] [Commented] (SOLR-9546) There is a lot of unnecessary boxing/unboxing going on in {{SolrParams}} class

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595163#comment-15595163
 ] 

ASF subversion and git services commented on SOLR-9546:
---

Commit 7131849892f07ae9ad5cb945a138078e94fcb919 in lucene-solr's branch 
refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7131849 ]

SOLR-9546: reverted some changes


> There is a lot of unnecessary boxing/unboxing going on in {{SolrParams}} class
> --
>
> Key: SOLR-9546
> URL: https://issues.apache.org/jira/browse/SOLR-9546
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
>Priority: Minor
> Attachments: SOLR-9546.patch
>
>
> Here is an excerpt 
> {code}
>   public Long getLong(String param, Long def) {
> String val = get(param);
> try {
>   return val== null ? def : Long.parseLong(val);
> }
> catch( Exception ex ) {
>   throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, 
> ex.getMessage(), ex );
> }
>   }
> {code}
> {{Long.parseLong()}} returns a primitive type, but since the method is 
> expected to return a {{Long}}, the value needs to be wrapped. There are many 
> more methods like that, so we might be creating a lot of unnecessary objects 
> here.
> I am not sure whether the JVM catches on and somehow optimizes this when 
> these methods are called enough times (or maybe the compiler does some 
> modifications at compile time).
> Let me know if I am thinking of some premature optimization.
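For illustration, the boxing can be reduced without changing behavior: box once only when a value is present, and offer a primitive overload for callers that pass a primitive default. This is a sketch against a hypothetical params map, not the actual SolrParams class, and the exception wrapping is omitted:

```java
import java.util.Map;

public class ParamsSketch {
    private final Map<String, String> params;

    public ParamsSketch(Map<String, String> params) { this.params = params; }

    // Boxed variant: the default is returned as-is (no re-boxing), and a
    // present value is boxed exactly once via Long.valueOf.
    public Long getLong(String param, Long def) {
        String val = params.get(param);
        return val == null ? def : Long.valueOf(val);
    }

    // Primitive overload: callers with a primitive default never box at all.
    public long getLong(String param, long def) {
        String val = params.get(param);
        return val == null ? def : Long.parseLong(val);
    }
}
```

Java overload resolution picks the primitive variant whenever the caller passes a `long` literal, so existing call sites that immediately unbox would stop allocating entirely.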






[jira] [Commented] (SOLR-9546) There is a lot of unnecessary boxing/unboxing going on in {{SolrParams}} class

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595162#comment-15595162
 ] 

ASF subversion and git services commented on SOLR-9546:
---

Commit f51340993aea7cca3053844284c115bddaa90215 in lucene-solr's branch 
refs/heads/branch_6x from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f513409 ]

SOLR-9546: Eliminate unnecessary boxing/unboxing going on in SolrParams


> There is a lot of unnecessary boxing/unboxing going on in {{SolrParams}} class
> --
>
> Key: SOLR-9546
> URL: https://issues.apache.org/jira/browse/SOLR-9546
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
>Priority: Minor
> Attachments: SOLR-9546.patch
>
>
> Here is an excerpt 
> {code}
>   public Long getLong(String param, Long def) {
> String val = get(param);
> try {
>   return val== null ? def : Long.parseLong(val);
> }
> catch( Exception ex ) {
>   throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, 
> ex.getMessage(), ex );
> }
>   }
> {code}
> {{Long.parseLong()}} returns a primitive type, but since the method is expected 
> to return a {{Long}}, it needs to be wrapped. There are many more methods like 
> that. We might be creating a lot of unnecessary objects here.
> I am not sure if the JVM catches up to it and somehow optimizes it if these 
> methods are called enough times (or maybe the compiler does some modifications 
> at compile time).
> Let me know if I am thinking of some premature optimization.
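For illustration, here is a minimal sketch (hypothetical class and method names, not the actual SolrParams code) of how a primitive-returning overload avoids the auto-boxing the excerpt describes:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual SolrParams code: a primitive-returning
// overload avoids creating a Long wrapper when the caller can supply a
// primitive default.
public class ParamsSketch {
    private final Map<String, String> params = new HashMap<>();

    public void set(String name, String value) { params.put(name, value); }

    public String get(String param) { return params.get(param); }

    // Boxed version, as in the excerpt: Long.parseLong's primitive result
    // is auto-boxed to match the Long return type.
    public Long getLong(String param, Long def) {
        String val = get(param);
        return val == null ? def : Long.parseLong(val);
    }

    // Primitive overload: no wrapper object is ever allocated.
    public long getPrimitiveLong(String param, long def) {
        String val = get(param);
        return val == null ? def : Long.parseLong(val);
    }

    public static void main(String[] args) {
        ParamsSketch p = new ParamsSketch();
        p.set("rows", "10");
        System.out.println(p.getPrimitiveLong("rows", 0L));  // 10
        System.out.println(p.getPrimitiveLong("start", 0L)); // 0 (default, no boxing)
    }
}
```

The actual fix committed on this issue may differ; this only illustrates the boxing mechanics being discussed.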






[jira] [Resolved] (SOLR-9326) Ability to create/delete/list snapshots for a solr collection

2016-10-21 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-9326.

   Resolution: Fixed
Fix Version/s: master (7.0)
   6.3

Committed.  Thanks Hrishikesh!

> Ability to create/delete/list snapshots for a solr collection
> -
>
> Key: SOLR-9326
> URL: https://issues.apache.org/jira/browse/SOLR-9326
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Hrishikesh Gadre
>Assignee: Yonik Seeley
> Fix For: 6.3, master (7.0)
>
>







[jira] [Commented] (LUCENE-7462) Faster search APIs for doc values

2016-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595145#comment-15595145
 ] 

Michael McCandless commented on LUCENE-7462:


+1 to the semantics and the patch.  Thanks [~jpountz]!

> Faster search APIs for doc values
> -
>
> Key: LUCENE-7462
> URL: https://issues.apache.org/jira/browse/LUCENE-7462
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: master (7.0)
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-7462-advanceExact.patch, LUCENE-7462.patch
>
>
> While the iterator API helps deal with sparse doc values more efficiently, it 
> also makes search-time operations more costly. For instance, the old 
> random-access API allowed computing facets on a given segment without any 
> conditionals, by just incrementing the counter at index {{ordinal+1}}, while 
> the new API requires advancing the iterator if necessary and then checking 
> whether it is exactly on the right document or not.
> Since it is very common for fields to exist across most documents, I suspect 
> codecs will keep an internal structure that is similar to the current codec 
> in the dense case, by having a dense representation of the data and just 
> making the iterator skip over the minority of documents that do not have a 
> value.
> I suggest that we add APIs that make things cheaper at search time. For 
> instance in the case of SORTED doc values, it could look like 
> {{LegacySortedDocValues}} with the additional restriction that documents can 
> only be consumed in order. Codecs that can implement this API efficiently 
> would hide it behind a {{SortedDocValues}} adapter, and then at search time 
> facets and comparators (which liked the {{LegacySortedDocValues}} API better) 
> would either unwrap or hide the SortedDocValues they got behind a more 
> random-access API (which would only happen in the truly sparse case if the 
> codec optimizes the dense case).
> One challenge is that we already use the same idea for hiding single-valued 
> impls behind multi-valued impls, so we would need to enforce the order in 
> which the wrapping needs to happen. At first sight, it seems that it would be 
> best to do the single-value-behind-multi-value-API wrapping above the 
> random-access-behind-iterator-API wrapping. The complexity of 
> wrapping/unwrapping in the right order could be contained in the 
> {{DocValues}} helper class.
> I think this change would also simplify search-time consumption of doc 
> values, which currently needs to spend several lines of code positioning the 
> iterator every time it needs to do something interesting with doc values.
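To make the contrast concrete, here is a toy sketch (not Lucene's actual classes; the iterator interface is invented for illustration) of the advance-and-check step a facet counter needs with an iterator API:

```java
import java.util.Arrays;

// Toy sketch, not Lucene's real classes: the iterator interface below is
// invented for illustration. It shows the advance-and-check conditional a
// facet counter needs with an iterator API, versus the old random-access
// style that could blindly do counts[ord + 1]++.
public class DocValuesSketch {
    // docs that have a value for the field, in increasing order, plus ordinals
    static final int[] DOCS = {0, 1, 3, 4, 6};
    static final int[] ORDS = {0, 2, 1, 2, 0};
    static final int MAX_DOC = 7;

    // minimal forward-only iterator in the spirit of the new API
    static class OrdIterator {
        private int idx = -1;
        int nextDoc() { idx++; return idx < DOCS.length ? DOCS[idx] : Integer.MAX_VALUE; }
        int ordValue() { return ORDS[idx]; }
    }

    // slot 0 counts "missing", slot ord+1 counts each ordinal
    static int[] countFacets() {
        int[] counts = new int[4];
        OrdIterator it = new OrdIterator();
        int pending = it.nextDoc();
        for (int doc = 0; doc < MAX_DOC; doc++) {
            if (pending == doc) {            // the extra conditional per document
                counts[it.ordValue() + 1]++;
                pending = it.nextDoc();
            } else {
                counts[0]++;                 // doc has no value for the field
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(countFacets())); // [2, 2, 1, 2]
    }
}
```

In the dense case the `else` branch is rarely taken, which is why the issue suggests codecs could expose a cheaper, more random-access-like view when every document has a value.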






[jira] [Commented] (SOLR-9326) Ability to create/delete/list snapshots for a solr collection

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595138#comment-15595138
 ] 

ASF subversion and git services commented on SOLR-9326:
---

Commit 57ba96145ce8233034c67ffaead22d3bd7f3460f in lucene-solr's branch 
refs/heads/master from [~yo...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=57ba961 ]

SOLR-9326: Ability to create/delete/list snapshots at collection level.


> Ability to create/delete/list snapshots for a solr collection
> -
>
> Key: SOLR-9326
> URL: https://issues.apache.org/jira/browse/SOLR-9326
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Hrishikesh Gadre
>Assignee: Yonik Seeley
>







[jira] [Commented] (SOLR-9546) There is a lot of unnecessary boxing/unboxing going on in {{SolrParams}} class

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595127#comment-15595127
 ] 

ASF subversion and git services commented on SOLR-9546:
---

Commit 49ca9cea7283ab54086fdedd09889d171c777052 in lucene-solr's branch 
refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=49ca9ce ]

SOLR-9546: reverted some changes


> There is a lot of unnecessary boxing/unboxing going on in {{SolrParams}} class
> --
>
> Key: SOLR-9546
> URL: https://issues.apache.org/jira/browse/SOLR-9546
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
>Priority: Minor
> Attachments: SOLR-9546.patch
>
>
> Here is an excerpt 
> {code}
>   public Long getLong(String param, Long def) {
> String val = get(param);
> try {
>   return val== null ? def : Long.parseLong(val);
> }
> catch( Exception ex ) {
>   throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, 
> ex.getMessage(), ex );
> }
>   }
> {code}
> {{Long.parseLong()}} returns a primitive type, but since the method is expected 
> to return a {{Long}}, it needs to be wrapped. There are many more methods like 
> that. We might be creating a lot of unnecessary objects here.
> I am not sure if the JVM catches up to it and somehow optimizes it if these 
> methods are called enough times (or maybe the compiler does some modifications 
> at compile time).
> Let me know if I am thinking of some premature optimization.






[jira] [Commented] (LUCENE-7407) Explore switching doc values to an iterator API

2016-10-21 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595113#comment-15595113
 ] 

Yonik Seeley commented on LUCENE-7407:
--

bq. > A quick test by hand is still more informative than having no information 
at all.
bq.  I disagree: it's reckless to run an overly synthetic benchmark and then 
present the results as if they mean we should make poor API tradeoffs.

When I did a quick test by hand, I always *disclosed* that.  It's a starting 
point, not an ending point.
And even homogeneous tests (that are prone to hotspot overspecialization) are a 
useful datapoint, if you know what they are.
Some users will have exactly those types of requests - very homogeneous.

bq. My point is that running synthetic benchmarks and mis-representing them as 
"meaningful" is borderline reckless

The implication being that you judge they are not meaningful? Wow.

You seemed to admit that the lucene benchmarks don't even cover some of these 
cases (or don't cover them adequately).
- There is no single authoritative benchmark, and it's misleading to suggest 
there is (that somehow represents the *true* performance for users)
- The lucene benchmarks are also synthetic to a degree (although based off of 
real data).  For example, the query cache is disabled.  Why?  I assume to 
better isolate what is being tested.
- More realistic tests are always nice to verify that nothing was messed up... 
but a system will *always* have a bottleneck.  The question is *which 
bottleneck are you effectively testing*?
- More tests are better.  If others have the time/ability, they should run 
their own!

bq. [...] nowhere near as helpful as, say, improving our default codec, 
profiling and removing slow spots, removing extra legacy wrappers, etc. Those 
are more positive ways to move our project forward.

The first step I'd take would be to try and realistically isolate and quantify 
the performance of what I was trying to change anyway.  I did that starting off 
with Solr faceting tests (lucene benchmarks don't test that).

I *will* get around to trying to improve things... in the meantime, putting out 
the information I did have is better than hiding it.  Take it for what it is.
If you choose to just dismiss it as meaningless... well, I guess we'll have to 
agree to disagree.


> Explore switching doc values to an iterator API
> ---
>
> Key: LUCENE-7407
> URL: https://issues.apache.org/jira/browse/LUCENE-7407
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>  Labels: docValues
> Fix For: master (7.0)
>
> Attachments: LUCENE-7407.patch
>
>
> I think it could be compelling if we restricted doc values to use an
> iterator API at read time, instead of the more general random access
> API we have today:
>   * It would make doc values disk usage more of a "you pay for what
> you actually use", like postings, which is a compelling
> reduction for sparse usage.
>   * I think codecs could compress better and maybe speed up decoding
> of doc values, even in the non-sparse case, since the read-time
> API is more restrictive "forward only" instead of random access.
>   * We could remove {{getDocsWithField}} entirely, since that's
> implicit in the iteration, and the awkward "return 0 if the
> document didn't have this field" would go away.
>   * We can remove the annoying thread locals we must make today in
> {{CodecReader}}, and close the trappy "I accidentally shared a
> single XXXDocValues instance across threads", since an iterator is
> inherently "use once".
>   * We could maybe leverage the numerous optimizations we've done for
> postings over time, since the two problems ("iterate over doc ids
> and store something interesting for each") are very similar.
> This idea has come up many times in the past, e.g. LUCENE-7253 is a recent
> example, and very early iterations of doc values started with exactly
> this ;)
> However, it's a truly enormous change, likely 7.0 only.  Or maybe we
> could have the new iterator APIs also ported to 6.x side by side with
> the deprecated existing random-access APIs.
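A tiny sketch of the first bullet (hypothetical data, not a real codec): with iterator-style consumption, a sparse field costs work proportional to the values present, not to maxDoc:

```java
// Toy illustration of the "pay for what you actually use" bullet
// (hypothetical data, not a real codec): iterator-style consumption of a
// sparse field visits only the documents that have a value, like postings,
// instead of materializing a maxDoc-sized array with 0 for missing docs.
public class SparseSketch {
    static final int MAX_DOC = 1_000_000;
    // only three documents carry a value for this field
    static final int[] DOCS = {7, 4_211, 999_999};
    static final long[] VALUES = {5, 7, 11};

    static long sumValues() {
        long sum = 0;
        for (int i = 0; i < DOCS.length; i++) {
            // DOCS[i] is the doc id being visited; cost is O(values), not O(MAX_DOC)
            sum += VALUES[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumValues()); // 23
    }
}
```

This also shows why {{getDocsWithField}} becomes redundant: which documents have a value is implicit in the iteration itself.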






[jira] [Updated] (SOLR-9677) edismax treat operator as a keyword when a query parameter 'qf' contains inexist field.

2016-10-21 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-9677:
-
Affects Version/s: trunk

> edismax treat operator as a keyword when a query parameter 'qf' contains 
> inexist field.
> ---
>
> Key: SOLR-9677
> URL: https://issues.apache.org/jira/browse/SOLR-9677
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.2.1, 5.5.1, trunk
>Reporter: Takumi Yoshida
>
> edismax treats an operator as a keyword when the query parameter 'qf' contains 
> a nonexistent field.
> e.g. ('hoge' does not exist in the schema):
> q=Japan OR Tokyo
> defType=edismax
> qf=title hoge
> You will get results containing the keywords 'Japan' or 'OR' or 'Tokyo' in Title.
> Also, you can see the following parsed query with debugQuery=true:
> {code}
> +((title:Japan) (title:OR) 
> (title:Tokyo))
> {code}






[jira] [Assigned] (SOLR-9676) FastVectorHighligher log message could be improved

2016-10-21 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley reassigned SOLR-9676:
--

Assignee: David Smiley

> FastVectorHighligher log message could be improved
> --
>
> Key: SOLR-9676
> URL: https://issues.apache.org/jira/browse/SOLR-9676
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: highlighter
>Affects Versions: 4.10.4
>Reporter: Mike
>Assignee: David Smiley
>Priority: Minor
>
> If you try to use the FastVectorHighlighter on a field that doesn't have 
> TermPositions and TermOffsets enabled, you get an ok error message:
> {{WARN  org.apache.solr.highlight.DefaultSolrHighlighter  – Solr will use 
> Highlighter instead of FastVectorHighlighter because assignedTo field does 
> not store TermPositions and TermOffsets.}}
> If you heed that message, and dutifully add TermPositions and TermOffsets to 
> your schema, you get a crashing message that says:
> {code:none}
> Blah, blah, stacktrace
> 
> Caused by: java.lang.IllegalArgumentException: cannot index term vector 
> offsets when term vectors are not indexed (field="court")
> ...
> {code}
> Can we update the first message to say:
> {{Solr will use Highlighter instead of FastVectorHighlighter because 
> assignedTo field does not store TermPositions, TermOffsets, and TermVectors.}}
> That'd save at least one headache next time I screw this up...






[jira] [Commented] (SOLR-9546) There is a lot of unnecessary boxing/unboxing going on in {{SolrParams}} class

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595075#comment-15595075
 ] 

ASF subversion and git services commented on SOLR-9546:
---

Commit ccbafdc403fb66e4becfe1b934957f6247b07a7a in lucene-solr's branch 
refs/heads/master from [~noble.paul]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ccbafdc ]

SOLR-9546: Eliminate unnecessary boxing/unboxing going on in SolrParams


> There is a lot of unnecessary boxing/unboxing going on in {{SolrParams}} class
> --
>
> Key: SOLR-9546
> URL: https://issues.apache.org/jira/browse/SOLR-9546
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Pushkar Raste
>Priority: Minor
> Attachments: SOLR-9546.patch
>
>
> Here is an excerpt 
> {code}
>   public Long getLong(String param, Long def) {
> String val = get(param);
> try {
>   return val== null ? def : Long.parseLong(val);
> }
> catch( Exception ex ) {
>   throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, 
> ex.getMessage(), ex );
> }
>   }
> {code}
> {{Long.parseLong()}} returns a primitive type, but since the method is expected 
> to return a {{Long}}, it needs to be wrapped. There are many more methods like 
> that. We might be creating a lot of unnecessary objects here.
> I am not sure if the JVM catches up to it and somehow optimizes it if these 
> methods are called enough times (or maybe the compiler does some modifications 
> at compile time).
> Let me know if I am thinking of some premature optimization.






[jira] [Commented] (LUCENE-7407) Explore switching doc values to an iterator API

2016-10-21 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595064#comment-15595064
 ] 

Joel Bernstein commented on LUCENE-7407:


I think this is progressing in a good way. The initial work was done in master 
and not backported to 6.x. The initial work had a performance impact and it's been 
noted. Now it's time to work on improving performance. I'll be happy to help 
out with the performance issues. As long as we don't need to rush out 7.0, 
we have some time to improve the performance.


> Explore switching doc values to an iterator API
> ---
>
> Key: LUCENE-7407
> URL: https://issues.apache.org/jira/browse/LUCENE-7407
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>  Labels: docValues
> Fix For: master (7.0)
>
> Attachments: LUCENE-7407.patch
>
>
> I think it could be compelling if we restricted doc values to use an
> iterator API at read time, instead of the more general random access
> API we have today:
>   * It would make doc values disk usage more of a "you pay for what
> you actually use", like postings, which is a compelling
> reduction for sparse usage.
>   * I think codecs could compress better and maybe speed up decoding
> of doc values, even in the non-sparse case, since the read-time
> API is more restrictive "forward only" instead of random access.
>   * We could remove {{getDocsWithField}} entirely, since that's
> implicit in the iteration, and the awkward "return 0 if the
> document didn't have this field" would go away.
>   * We can remove the annoying thread locals we must make today in
> {{CodecReader}}, and close the trappy "I accidentally shared a
> single XXXDocValues instance across threads", since an iterator is
> inherently "use once".
>   * We could maybe leverage the numerous optimizations we've done for
> postings over time, since the two problems ("iterate over doc ids
> and store something interesting for each") are very similar.
> This idea has come up many times in the past, e.g. LUCENE-7253 is a recent
> example, and very early iterations of doc values started with exactly
> this ;)
> However, it's a truly enormous change, likely 7.0 only.  Or maybe we
> could have the new iterator APIs also ported to 6.x side by side with
> the deprecated existing random-access APIs.






[jira] [Commented] (LUCENE-7516) consider to remove DocSet.close()

2016-10-21 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595063#comment-15595063
 ] 

David Smiley commented on LUCENE-7516:
--

Presumably this is filed against Lucene and not Solr because it's a sub-task 
and this is a JIRA limitation?

> consider to remove DocSet.close()
> -
>
> Key: LUCENE-7516
> URL: https://issues.apache.org/jira/browse/LUCENE-7516
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Mikhail Khludnev
>Priority: Trivial
>
> [~romseygeek], I'd like to do the subj but I've found
> {code}
>   /** FUTURE: for off-heap */
>   @Override
>   public void close() throws IOException {
>   }
> {code}
> and want to sync up with [~ysee...@gmail.com]. If he confirms, let's nuke it 
> from master, keeping it in 6.x since it's a public method. It should be a SOLR 
> ticket for sure.






[jira] [Commented] (LUCENE-7517) Explore making Scorer.score() return a double

2016-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595047#comment-15595047
 ] 

Uwe Schindler commented on LUCENE-7517:
---

I agree here, especially with Robert's comment: We should store and return 
scores as float to the user (a double makes no sense, because it leads to wrong 
expectations for users - who always tend to misuse scores for stuff they are 
not made for: scores are just there to compare search results and bring them 
into the right order; for that we should use the float, so minor calculation 
differences don't matter). But when we actually calculate the score, we should 
use double precision for all calculation steps.

The good thing: at the end we round everything to a float, so some differences 
caused by the order of clauses would be removed during this rounding at the end, 
before the score goes into TopDocs.

> Explore making Scorer.score() return a double
> -
>
> Key: LUCENE-7517
> URL: https://issues.apache.org/jira/browse/LUCENE-7517
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Adrien Grand
>Priority: Minor
>
> Follow-up to 
> http://search-lucene.com/m/l6pAi1BoyPJ1vr2382=Re+JENKINS+EA+Lucene+Solr+master+Linux+64bit+jdk+9+ea+140+Build+18103+Unstable+.
> We could make Scorer.score() return a double in order to lose less accuracy 
> when combining scores together, while still using floats on TopDocs and more 
> generally all parts of the code that need to store scores.






[jira] [Commented] (LUCENE-7407) Explore switching doc values to an iterator API

2016-10-21 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595043#comment-15595043
 ] 

David Smiley commented on LUCENE-7407:
--

I wouldn't _dare_ suggest to another committer how they should spend their 
time; it's entirely their prerogative.  That's crossing a line; please stop!  I 
think we should value all technical input, even if it's bad news (e.g. 
something got slower).  Building/running a benchmark is being helpful.  I 
understand if you don't like the benchmark in particular (I'm not going to 
argue it's a particularly good or bad one) but it's being helpful and it takes 
time to do these things.  I'd be depressed right now if I were in Yonik's 
shoes; but hey that's me and we need emotions of steel around here to survive.

> Explore switching doc values to an iterator API
> ---
>
> Key: LUCENE-7407
> URL: https://issues.apache.org/jira/browse/LUCENE-7407
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>  Labels: docValues
> Fix For: master (7.0)
>
> Attachments: LUCENE-7407.patch
>
>
> I think it could be compelling if we restricted doc values to use an
> iterator API at read time, instead of the more general random access
> API we have today:
>   * It would make doc values disk usage more of a "you pay for what
> you actually use", like postings, which is a compelling
> reduction for sparse usage.
>   * I think codecs could compress better and maybe speed up decoding
> of doc values, even in the non-sparse case, since the read-time
> API is more restrictive "forward only" instead of random access.
>   * We could remove {{getDocsWithField}} entirely, since that's
> implicit in the iteration, and the awkward "return 0 if the
> document didn't have this field" would go away.
>   * We can remove the annoying thread locals we must make today in
> {{CodecReader}}, and close the trappy "I accidentally shared a
> single XXXDocValues instance across threads", since an iterator is
> inherently "use once".
>   * We could maybe leverage the numerous optimizations we've done for
> postings over time, since the two problems ("iterate over doc ids
> and store something interesting for each") are very similar.
> This idea has come up many times in the past, e.g. LUCENE-7253 is a recent
> example, and very early iterations of doc values started with exactly
> this ;)
> However, it's a truly enormous change, likely 7.0 only.  Or maybe we
> could have the new iterator APIs also ported to 6.x side by side with
> the deprecated existing random-access APIs.






Re: [JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+140) - Build # 18103 - Unstable!

2016-10-21 Thread Adrien Grand
I opened https://issues.apache.org/jira/browse/LUCENE-7517.

On Fri, Oct 21, 2016 at 14:52, Robert Muir wrote:

> The problem is more than worth it. The alternative is to remove the
> optimization? I don't think being incorrect / adding leniency to tests
> is a valid option at all. In general, if we dont apply a general fix,
> it will just make more such optimizations harder: more jenkins
> failures, more deltas in tests, just a bad direction.
>
> I guess what i propose is something more like: change Scorer.score()
> to return double, and use double precision internally in all scoring
> (also similarity code).
>
> But keep it a float in e.g. ScoreDoc/TopDocs: we just "export" that to
> the user at the end. This is really best practice anyway, we shouldnt
> be storing intermediate calculations as 32-bit floats. It would just
> be a generalization of what DisjunctionSumScorer etc are already
> doing.
>
>
> On Fri, Oct 21, 2016 at 8:34 AM, Adrien Grand  wrote:
> > I suspect we could do something on the Scorer API indeed, eg. by giving
> > scorers a way to expose the double value of the score. However it's not
> > clear to me that this problem is worth making the Scorer API more
> complex?
> >
> > On Fri, Oct 21, 2016 at 12:37, Robert Muir wrote:
> >>
> >> But maybe the old "trick" can still be used somehow: just means using
> >> double precision internally to erase most differences? Maybe it means
> >> a change to scorer api or whatever, but still I think its a good
> >> practical solution (vs something more extreme like kahan summation). I
> >> am sure it does not work if someone has like 500k boolean clauses or
> >> for more extreme cases, but it prevents these problems for typical
> >> cases like keyword searches.
> >>
> >>
> >> On Fri, Oct 21, 2016 at 6:31 AM, Adrien Grand 
> wrote:
> >> > On Fri, Oct 21, 2016 at 12:20, Robert Muir wrote:
> >> >>
> >> >> What changed?
> >> >
> >> >
> >> > The issue here is ReqOptSumScorer, which computes the score of the
> MUST
> >> > and
> >> > SHOULD clauses separately and then sum them up. In that test case, in
> >> > one
> >> > case body:d is in the list of SHOULD clauses, and in the other case it
> >> > is in
> >> > the list of MUST clauses.
> >> >
> >> > For the same reason, "+a b", "+a +b" and "a +b" may return different
> >> > scores
> >> > on the same documents.
> >> >
> >> > I can undo the change if you think this is a blocker, but that would
> be
> >> > disappointing as it would mean that we cannot do other exciting
> changes
> >> > like
> >> > flattening nested disjunctions since it would cause the same problem.
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: dev-h...@lucene.apache.org
> >>
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] [Created] (LUCENE-7517) Explore making Scorer.score() return a double

2016-10-21 Thread Adrien Grand (JIRA)
Adrien Grand created LUCENE-7517:


 Summary: Explore making Scorer.score() return a double
 Key: LUCENE-7517
 URL: https://issues.apache.org/jira/browse/LUCENE-7517
 Project: Lucene - Core
  Issue Type: Wish
Reporter: Adrien Grand
Priority: Minor


Follow-up to 
http://search-lucene.com/m/l6pAi1BoyPJ1vr2382=Re+JENKINS+EA+Lucene+Solr+master+Linux+64bit+jdk+9+ea+140+Build+18103+Unstable+.

We could make Scorer.score() return a double in order to lose less accuracy 
when combining scores together, while still using floats on TopDocs and more 
generally all parts of the code that need to store scores.






[jira] [Updated] (SOLR-7506) Roll over GC logs by default via bin/solr scripts

2016-10-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-7506:
--
Attachment: SOLR-7506.patch

New patch
* Adds GC log rotation with file name {{solr_gc.log}}
* Merge with master
* Updated UtilsTool to remove/rotate {{solr_gc.\*}} as well as {{solr_gc_\*}}

Re-tested on mac, not Windows so far

> Roll over GC logs by default via bin/solr scripts
> -
>
> Key: SOLR-7506
> URL: https://issues.apache.org/jira/browse/SOLR-7506
> Project: Solr
>  Issue Type: Improvement
>  Components: scripts and tools
>Reporter: Shalin Shekhar Mangar
>Assignee: Jan Høydahl
>Priority: Minor
>  Labels: logging
> Fix For: 6.3, master (7.0)
>
> Attachments: SOLR-7506.patch, SOLR-7506.patch
>
>
> The Oracle JDK supports rolling over GC logs. I propose to add the following 
> to the solr.in.{sh,cmd} scripts to enable it by default:
> {code}
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=20M
> {code}
> Unfortunately, the JDK doesn't have any option to append to existing log 
> instead of overwriting so the latest log is overwritten. Maybe we can have 
> the bin/solr script roll that after the process is killed?






Re: [JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+140) - Build # 18103 - Unstable!

2016-10-21 Thread Robert Muir
The problem is more than worth it. The alternative is to remove the
optimization? I don't think being incorrect / adding leniency to tests
is a valid option at all. In general, if we dont apply a general fix,
it will just make more such optimizations harder: more jenkins
failures, more deltas in tests, just a bad direction.

I guess what i propose is something more like: change Scorer.score()
to return double, and use double precision internally in all scoring
(also similarity code).

But keep it a float in e.g. ScoreDoc/TopDocs: we just "export" that to
the user at the end. This is really best practice anyway, we shouldnt
be storing intermediate calculations as 32-bit floats. It would just
be a generalization of what DisjunctionSumScorer etc are already
doing.
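A small self-contained demonstration of that point (illustrative numbers, not real scores): summing in float is order-dependent, while summing in double and rounding to float once at the end is not, for these magnitudes.

```java
// Demonstrates why accumulating in double and "exporting" a float at the end
// removes order-dependence for typical cases. The numbers are illustrative,
// not real Lucene scores.
public class ScoreSumSketch {
    public static void main(String[] args) {
        float a = 1e8f, b = 1f;

        // float accumulation: 1f is below the ulp of 1e8f, so order matters
        float f1 = (a + b) + (-a); // -> 0.0f
        float f2 = (a + (-a)) + b; // -> 1.0f
        System.out.println(f1 + " vs " + f2);

        // double accumulation, rounded to float only at the end
        float d1 = (float) (((double) a + (double) b) + (double) -a);
        float d2 = (float) (((double) a + (double) -a) + (double) b);
        System.out.println(d1 + " vs " + d2); // both 1.0
    }
}
```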


On Fri, Oct 21, 2016 at 8:34 AM, Adrien Grand  wrote:
> I suspect we could do something on the Scorer API indeed, eg. by giving
> scorers a way to expose the double value of the score. However it's not
> clear to me that this problem is worth making the Scorer API more complex?
>
> On Fri, Oct 21, 2016 at 12:37, Robert Muir wrote:
>>
>> But maybe the old "trick" can still be used somehow: just means using
>> double precision internally to erase most differences? Maybe it means
>> a change to scorer api or whatever, but still I think its a good
>> practical solution (vs something more extreme like kahan summation). I
>> am sure it does not work if someone has like 500k boolean clauses or
>> for more extreme cases, but it prevents these problems for typical
>> cases like keyword searches.
>>
>>
>> On Fri, Oct 21, 2016 at 6:31 AM, Adrien Grand  wrote:
>> > Le ven. 21 oct. 2016 à 12:20, Robert Muir  a écrit :
>> >>
>> >> What changed?
>> >
>> >
>> > The issue here is ReqOptSumScorer, which computes the score of the MUST
>> > and
>> > SHOULD clauses separately and then sum them up. In that test case, in
>> > one
>> > case body:d is in the list of SHOULD clauses, and in the other case it
>> > is in
>> > the list of MUST clauses.
>> >
>> > For the same reason, "+a b", "+a +b" and "a +b" may return different
>> > scores
>> > on the same documents.
>> >
>> > I can undo the change if you think this is a blocker, but that would be
>> > disappointing as it would mean that we cannot do other exciting changes
>> > like
>> > flattening nested disjunctions since it would cause the same problem.
>>
>>
>




[jira] [Updated] (SOLR-9671) TestMiniSolrCloudCluster blowup jvm with remote /get requests

2016-10-21 Thread Mikhail Khludnev (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-9671:
---
Attachment: 
TestMiniSolrCloudCluster-testCollectionCreateSearchDelete-fail-brief.txt

[^TestMiniSolrCloudCluster-testCollectionCreateSearchDelete-fail-brief.txt] 
clarifies the case

bq. parallelCoreAdminExecutor-1321-thread-1 creates 
testcollection_shard2_replica1
bq.  parallelCoreAdminExecutor-1329-thread-1 creates  
testcollection_shard2_replica2 

but parallelCoreAdminExecutor-1321-thread-1 (replica1) never appears in the 
logs again until the JVM dies with an OOME (heap space).
Meanwhile, parallelCoreAdminExecutor-1329-thread-1 seems to try to sync 
shard2_replica2 with the stalled shard2_replica1, and then gives up:
bq. o.a.s.c.ShardLeaderElectionContext We failed sync, but we have no versions 
- we can't sync in that case - we were active before, so become leader anyway

But the problem is that it saturates the heap with:
{quote}
749534 ERROR (qtp1915946497-6736) [] o.a.s.s.HttpSolrCall 
null:org.apache.solr.common.SolrException: Error trying to proxy request for 
url: http://127.0.0.1:42320/solr/testcollection_shard2_replica1/get
   [junit4]   2>at 
org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:590)
   [junit4]   2>at 
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:444)
{quote}

To me it's strange that it issues "remote queries", i.e. talks to a replica 
through other peers; that's the only explanation for why we have so many of 
them hanging on reads. It seems like the two nodes call each other until the 
heap saturates. WDYT?

> TestMiniSolrCloudCluster blowup jvm with remote /get requests
> -
>
> Key: SOLR-9671
> URL: https://issues.apache.org/jira/browse/SOLR-9671
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mikhail Khludnev
>  Labels: cloud
> Attachments: 
> TestMiniSolrCloudCluster-testCollectionCreateSearchDelete-fail-brief.txt, 
> TestMiniSolrCloudCluster-testCollectionCreateSearchDelete-fail.zip
>
>
> this is epic https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/1994/
> There are not many cores, I checked. It seems like the cluster blows up when 
> it tries to relaunch after a collection remove. I haven't tried to reproduce it locally.






Re: [jira] [Commented] (SOLR-9399) Delete requests do not send credentials & fails for Basic Authentication

2016-10-21 Thread Susheel Kumar
This behavior of update.process() not failing with bad credentials is only
seen within the test / for the test cluster created within
BasicAuthIntegrationTest. If I point it at an external cluster, the same code,
i.e. update.process(), fails for bad credentials. Something is weird /
missing in the test. I debugged down to the SolrHttpClient yesterday, which
ultimately sends the update POST request, and it returns a 200 status when
run from BasicAuthIntegrationTest while the same returns 401 when pointed
at an external cluster. Does that tell us anything? Not sure if retry/PKI auth
plays a role.

HttpSolrClient.java

---

final HttpResponse response = httpClient.execute(method, httpClientRequestContext);

int httpStatus = response.getStatusLine().getStatusCode();

On Fri, Oct 21, 2016 at 7:59 AM, Jan Høydahl (JIRA)  wrote:

>
> [ https://issues.apache.org/jira/browse/SOLR-9399?page=
> com.atlassian.jira.plugin.system.issuetabpanels:comment-
> tabpanel=15594893#comment-15594893 ]
>
> Jan Høydahl commented on SOLR-9399:
> ---
>
> Did some more testing and managed to have the CloudSolrClient actually
> fail with 401, but only when calling update.commit() and patching
> CloudSolrClient, adding in line 799
> {code}
> 
> nonRoutableRequest.setBasicAuthCredentials(updateRequest.getBasicAuthUser(),
> updateRequest.getBasicAuthPassword());
> {code}
>
> However, when calling update.process() the update request succeeds even
> with wrong credentials. I even verified that the doc gets added/deleted
> from the index when using wrong credentials. The process() method is using
> some retry logic, could it be that the retry succeeds using PKI auth?
>
> > Delete requests do not send credentials & fails for Basic Authentication
> > 
> >
> > Key: SOLR-9399
> > URL: https://issues.apache.org/jira/browse/SOLR-9399
> > Project: Solr
> >  Issue Type: Bug
> >  Security Level: Public(Default Security Level. Issues are Public)
> >  Components: SolrJ
> >Affects Versions: 6.0, 6.0.1, 6.x
> >Reporter: Susheel Kumar
> >  Labels: security
> >
> > The getRoutes(..) method of UpdateRequest does not pass credentials to
> LBHttpSolrClient when deleteById is set, whereas for updates it passes the
> credentials.  See the code snippet below:
> >   if (deleteById != null) {
> >     Iterator<Map.Entry<String, Map<String, Object>>> entries = deleteById.entrySet()
> >         .iterator();
> >     while (entries.hasNext()) {
> >       Map.Entry<String, Map<String, Object>> entry = entries.next();
> >       String deleteId = entry.getKey();
> >       Map<String, Object> map = entry.getValue();
> >       Long version = null;
> >       if (map != null) {
> >         version = (Long) map.get(VER);
> >       }
> >       Slice slice = router.getTargetSlice(deleteId, null, null, null, col);
> >       if (slice == null) {
> >         return null;
> >       }
> >       List<String> urls = urlMap.get(slice.getName());
> >       if (urls == null) {
> >         return null;
> >       }
> >       String leaderUrl = urls.get(0);
> >       LBHttpSolrClient.Req request = routes.get(leaderUrl);
> >       if (request != null) {
> >         UpdateRequest urequest = (UpdateRequest) request.getRequest();
> >         urequest.deleteById(deleteId, version);
> >       } else {
> >         UpdateRequest urequest = new UpdateRequest();
> >         urequest.setParams(params);
> >         urequest.deleteById(deleteId, version);
> >         urequest.setCommitWithin(getCommitWithin());
> >         request = new LBHttpSolrClient.Req(urequest, urls);
> >         routes.put(leaderUrl, request);
> >       }
> >     }
> >   }
>
>
>
>


[jira] [Commented] (LUCENE-7407) Explore switching doc values to an iterator API

2016-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594967#comment-15594967
 ] 

Michael McCandless commented on LUCENE-7407:


bq. At first blush, it doesn't look like the lucenebench tests cover sorting 
and faceting that well.

bq. For example, I tested function queries (ValueSource) and sorting by 
multiple docvalue fields. Are either of these things tested at all in 
https://home.apache.org/~mikemccand/lucenebench/ ?

bq. Running diverse fields in the same JVM run is esp important to prevent 
hotspot from over-optimizing for a single field cardinality (since different 
cardinalities have different docvalues encodings).

{quote}
How many different numeric fields are concurrently sorted on for 
https://home.apache.org/~mikemccand/lucenebench/ ?
The names suggest just one: " TermQuery (date/time sort)"
If that is actually the case, then you're in danger of hotspot 
over-specializing for that single field/cardinality.
{quote}

These are all good points, all things that I would like to improve
about Lucene's nightly benchmarks
(https://home.apache.org/~mikemccand/lucenebench/).  Patches welcome ;)

I'll try to add some low cardinality faceting/sorting coverage, maybe
using month name and day-of-the-year from the last modified date.

The nightly Wikipedia benchmark facets on Date field as a hierarchy
(year/month/day), and sorts on "last modified" (seconds resolution I
think) and title.

I've also long wanted to add highlighters...

bq. A quick test by hand is still more informative than having no information 
at all.

I disagree: it's reckless to run an overly synthetic benchmark and
then present the results as if they mean we should make poor API
tradeoffs.

bq. If one is measuring performance of a faceting change, then isolate it.

In the ideal world, yes, but this is notoriously problematic to do
with java: hotspot, GC, etc. will all behave very differently if you
are testing a very narrow part of the code.

{quote}
That's an unnecessary personal dig.
I've already put in a lot of effort into benchmarking this, only to have it 
dismissed with hand waves, for cases that may not even be covered (or may be 
under stated) by your own benchmarks.
I fully intend to dig into the solr side, but I was waiting until the API 
stabilizes (LUCENE-7462)
I pointed at specific examples that reside entirely in lucene code (the sorting 
examples)
{quote}

My point is that running synthetic benchmarks and misrepresenting
them as "meaningful" is borderline reckless, and certainly nowhere
near as helpful as, say, improving our default codec, profiling and
removing slow spots, removing extra legacy wrappers, etc.  Those are
more positive ways to move our project forward.

Perhaps you feel you have put in a lot of effort here, but from where
I stand I see lots of complaining about how things got slower and little
effort to actually improve the sources.  This issue alone was a
tremendous amount of slogging for me, and I had to switch Solr over
without fully understanding its sources: you or other Solr experts
could have stepped in to help me then.

But why not do that now?  I.e. review my Solr changes or function
queries, etc.?  I could easily have done something silly: it was just
a "rote" cutover to the iterator API.

I think we could nicely optimize the browse only case, by just using
{{nextDoc}} to step through all doc values for a given field.  Does Solr
do that today?

Why not test the patch on LUCENE-7462 to see if that API change helps?

I am not disagreeing that DV access got slower: the Lucene nightly
benchmarks also show that.

Yet look at sort-by-title: at first it got slower, on initial cutover
to iterators, but then thanks to [~jpountz] (thank you Adrien!), it's
now faster than it was before:
https://home.apache.org/~mikemccand/lucenebench/TermTitleSort.html

With more iterations I expect we can do the same thing for the other
dense cases.  An iteration-only API means we can do all sorts of nice
compression improvements not possible with the random-access API, we
don't need per-lookup bounds checks, etc.  We should adopt from the
many things we do to compress postings, which have been iterator-only
forever.  And it means the sparse case, as a happy side effect,
gets to improve too.

This could lead to a point in the future where the dense cases perform
better than they did with random access API, like sort-by-title does
already.  We've only just begun down this path, and in just a few
weeks [~jpountz] has already made big gains.


> Explore switching doc values to an iterator API
> ---
>
> Key: LUCENE-7407
> URL: https://issues.apache.org/jira/browse/LUCENE-7407
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>   

[jira] [Resolved] (SOLR-9570) Logs backed up on restart are kept forever

2016-10-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-9570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-9570.
---
Resolution: Fixed

> Logs backed up on restart are kept forever
> --
>
> Key: SOLR-9570
> URL: https://issues.apache.org/jira/browse/SOLR-9570
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>  Labels: logging
> Fix For: 6.3, master (7.0)
>
> Attachments: SOLR-8370.patch
>
>
> When (re)starting Solr, the start script will backup any existing 
> {{solr.log}} or {{solr_gc.log}} to a file {{solr_log_}} and 
> {{solr_gc_log_}} respectively. That may be all good, but these old 
> copies are never cleaned up, as they are not under the control of log4j.
> This issue will instead rotate solr.log properly on startup, delete old 
> time-stamped files taking up place, back up (one generation only) of 
> console-log and solr_gc.log in $SOLR_LOGS_DIR/archived/.






Re: [JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+140) - Build # 18103 - Unstable!

2016-10-21 Thread Adrien Grand
I suspect we could do something on the Scorer API indeed, eg. by giving
scorers a way to expose the double value of the score. However it's not
clear to me that this problem is worth making the Scorer API more complex?

Le ven. 21 oct. 2016 à 12:37, Robert Muir  a écrit :

> But maybe the old "trick" can still be used somehow: just means using
> double precision internally to erase most differences? Maybe it means
> a change to scorer api or whatever, but still I think its a good
> practical solution (vs something more extreme like kahan summation). I
> am sure it does not work if someone has like 500k boolean clauses or
> for more extreme cases, but it prevents these problems for typical
> cases like keyword searches.
>
>
> On Fri, Oct 21, 2016 at 6:31 AM, Adrien Grand  wrote:
> > Le ven. 21 oct. 2016 à 12:20, Robert Muir  a écrit :
> >>
> >> What changed?
> >
> >
> > The issue here is ReqOptSumScorer, which computes the score of the MUST
> and
> > SHOULD clauses separately and then sum them up. In that test case, in one
> > case body:d is in the list of SHOULD clauses, and in the other case it
> is in
> > the list of MUST clauses.
> >
> > For the same reason, "+a b", "+a +b" and "a +b" may return different
> scores
> > on the same documents.
> >
> > I can undo the change if you think this is a blocker, but that would be
> > disappointing as it would mean that we cannot do other exciting changes
> like
> > flattening nested disjunctions since it would cause the same problem.
>
>
>


[jira] [Created] (LUCENE-7516) consider to remove DocSet.close()

2016-10-21 Thread Mikhail Khludnev (JIRA)
Mikhail Khludnev created LUCENE-7516:


 Summary: consider to remove DocSet.close()
 Key: LUCENE-7516
 URL: https://issues.apache.org/jira/browse/LUCENE-7516
 Project: Lucene - Core
  Issue Type: Sub-task
Reporter: Mikhail Khludnev
Priority: Trivial


[~romseygeek], I'd like to do the subj but I've found
{code}
  /** FUTURE: for off-heap */
  @Override
  public void close() throws IOException {
  }
{code}
and want to sync up with [~ysee...@gmail.com]. If he confirms, let's nuke it 
from master, keeping it in 6.x since it's a public method. It should be a SOLR 
ticket for sure.







[jira] [Commented] (LUCENE-7512) suppress ecj-lint warnings on precommit

2016-10-21 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594934#comment-15594934
 ] 

Alan Woodward commented on LUCENE-7512:
---

+1, this looks great.

Some of the suppression is fine, things like the linter misinterpreting casts 
as object creation, but we can probably clean up a couple of other places here. 
For example, does DocSet really need to implement Closeable?  It's only ever 
implemented in DocSetBase, and it's a no-op there...

> suppress ecj-lint warnings on precommit
> ---
>
> Key: LUCENE-7512
> URL: https://issues.apache.org/jira/browse/LUCENE-7512
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mikhail Khludnev
>  Labels: build
> Attachments: LUCENE-7512-solr-core-src.patch, 
> LUCENE-7512-solr-core-src.patch
>
>
> Turns out the subject produces too much noise and people miss significant ERRORs.






[jira] [Updated] (SOLR-9671) TestMiniSolrCloudCluster blowup jvm with remote /get requests

2016-10-21 Thread Mikhail Khludnev (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-9671:
---
Attachment: 
TestMiniSolrCloudCluster-testCollectionCreateSearchDelete-fail.zip

> TestMiniSolrCloudCluster blowup jvm with remote /get requests
> -
>
> Key: SOLR-9671
> URL: https://issues.apache.org/jira/browse/SOLR-9671
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mikhail Khludnev
>  Labels: cloud
> Attachments: 
> TestMiniSolrCloudCluster-testCollectionCreateSearchDelete-fail.zip
>
>
> this is epic https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/1994/
> There are not many cores, I checked. It seems like the cluster blows up when 
> it tries to relaunch after a collection remove. I haven't tried to reproduce it locally.






[jira] [Updated] (SOLR-9671) TestMiniSolrCloudCluster blowup jvm with remote /get requests

2016-10-21 Thread Mikhail Khludnev (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-9671:
---
Attachment: (was: TestMiniSolrCloudCluster-epic-fail.zip)

> TestMiniSolrCloudCluster blowup jvm with remote /get requests
> -
>
> Key: SOLR-9671
> URL: https://issues.apache.org/jira/browse/SOLR-9671
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mikhail Khludnev
>  Labels: cloud
>
> this is epic https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/1994/
> There are not many cores, I checked. It seems like the cluster blows up when 
> it tries to relaunch after a collection remove. I haven't tried to reproduce it locally.






[jira] [Commented] (SOLR-9399) Delete requests do not send credentials & fails for Basic Authentication

2016-10-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594893#comment-15594893
 ] 

Jan Høydahl commented on SOLR-9399:
---

Did some more testing and managed to have the CloudSolrClient actually fail 
with 401, but only when calling update.commit() and patching CloudSolrClient, 
adding in line 799
{code}

nonRoutableRequest.setBasicAuthCredentials(updateRequest.getBasicAuthUser(), 
updateRequest.getBasicAuthPassword());
{code}

However, when calling update.process() the update request succeeds even with 
wrong credentials. I even verified that the doc gets added/deleted from the 
index when using wrong credentials. The process() method is using some retry 
logic, could it be that the retry succeeds using PKI auth?

> Delete requests do not send credentials & fails for Basic Authentication
> 
>
> Key: SOLR-9399
> URL: https://issues.apache.org/jira/browse/SOLR-9399
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0, 6.0.1, 6.x
>Reporter: Susheel Kumar
>  Labels: security
>
> The getRoutes(..) method of UpdateRequest does not pass credentials to 
> LBHttpSolrClient when deleteById is set, whereas for updates it passes the 
> credentials.  See the code snippet below:
>   if (deleteById != null) {
>     Iterator<Map.Entry<String, Map<String, Object>>> entries = deleteById.entrySet()
>         .iterator();
>     while (entries.hasNext()) {
>       Map.Entry<String, Map<String, Object>> entry = entries.next();
>       String deleteId = entry.getKey();
>       Map<String, Object> map = entry.getValue();
>       Long version = null;
>       if (map != null) {
>         version = (Long) map.get(VER);
>       }
>       Slice slice = router.getTargetSlice(deleteId, null, null, null, col);
>       if (slice == null) {
>         return null;
>       }
>       List<String> urls = urlMap.get(slice.getName());
>       if (urls == null) {
>         return null;
>       }
>       String leaderUrl = urls.get(0);
>       LBHttpSolrClient.Req request = routes.get(leaderUrl);
>       if (request != null) {
>         UpdateRequest urequest = (UpdateRequest) request.getRequest();
>         urequest.deleteById(deleteId, version);
>       } else {
>         UpdateRequest urequest = new UpdateRequest();
>         urequest.setParams(params);
>         urequest.deleteById(deleteId, version);
>         urequest.setCommitWithin(getCommitWithin());
>         request = new LBHttpSolrClient.Req(urequest, urls);
>         routes.put(leaderUrl, request);
>       }
>     }
>   }
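A sketch of the kind of fix being discussed: propagate the caller's credentials onto each freshly created per-route request, mirroring the setBasicAuthCredentials call from the patch quoted earlier in the thread. The classes below are hypothetical stand-ins that only model the relevant fields, not the real SolrJ implementation:

```java
public class CredentialPropagation {

    // Hypothetical stand-in for SolrJ's UpdateRequest credential handling;
    // only the fields relevant to the bug are modeled.
    static class UpdateRequest {
        private String basicAuthUser;
        private String basicAuthPassword;

        void setBasicAuthCredentials(String user, String password) {
            basicAuthUser = user;
            basicAuthPassword = password;
        }

        String getBasicAuthUser() { return basicAuthUser; }
        String getBasicAuthPassword() { return basicAuthPassword; }
    }

    // Models the deleteById branch of getRoutes(): a new per-route request
    // is built, and the missing step is copying the original request's
    // credentials onto it. Without that copy, the routed delete is sent
    // unauthenticated and fails under Basic Authentication.
    static UpdateRequest routeDelete(UpdateRequest original) {
        UpdateRequest urequest = new UpdateRequest();
        urequest.setBasicAuthCredentials(original.getBasicAuthUser(),
                                         original.getBasicAuthPassword());
        return urequest;
    }

    public static void main(String[] args) {
        UpdateRequest original = new UpdateRequest();
        original.setBasicAuthCredentials("solr", "SolrRocks");
        UpdateRequest routed = routeDelete(original);
        System.out.println(routed.getBasicAuthUser()); // prints "solr"
    }
}
```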






[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module

2016-10-21 Thread Alex Watson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594849#comment-15594849
 ] 

Alex Watson commented on LUCENE-2899:
-

I am currently on Annual Leave, returning on the 24th of October.
If your matter is urgent please contact the office on 01414401234.
Regards,
Alex


> Add OpenNLP Analysis capabilities as a module
> -
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/analysis
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.9, 6.0
>
> Attachments: LUCENE-2899-6.1.0.patch, LUCENE-2899-RJN.patch, 
> LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, 
> LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice 
> to have a submodule (under analysis) that exposed capabilities for it. Drew 
> Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it 
> would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as 
> either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp






[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module

2016-10-21 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594847#comment-15594847
 ] 

Joern Kottmann commented on LUCENE-2899:


The patch has this comment:
"EN POS tagger sometimes tags last word as a period if no period at the end"

Do you remove punctuation from the sentence before it is passed to the POS 
Tagger, Chunker, or Name Finder?

The sourceforge models are trained with punctuation, but it shouldn't be 
difficult to retrain them if this is necessary. 

> Add OpenNLP Analysis capabilities as a module
> -
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: modules/analysis
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.9, 6.0
>
> Attachments: LUCENE-2899-6.1.0.patch, LUCENE-2899-RJN.patch, 
> LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, 
> LUCENE-2899.patch, OpenNLPFilter.java, OpenNLPTokenizer.java
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice 
> to have a submodule (under analysis) that exposed capabilities for it. Drew 
> Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it 
> would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as 
> either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp






[jira] [Closed] (SOLR-490) dismax should autoescape + and - followed by whitespace (maybe?)

2016-10-21 Thread Alexandre Rafalovitch (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandre Rafalovitch closed SOLR-490.
--
Resolution: Workaround

A new issue can be opened against the latest implementation of a relevant query 
parser, if any of this is still relevant.

> dismax should autoescape + and - followed by whitespace (maybe?)
> 
>
> Key: SOLR-490
> URL: https://issues.apache.org/jira/browse/SOLR-490
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers
>Affects Versions: 1.1.0, 1.2, 1.3
>Reporter: Hoss Man
>
> As discussed in this thread...
> Date: Tue, 26 Feb 2008 04:13:54 -0500
> From: Kevin Xiao
> To: solr-user
> Subject: solr to handle special charater
> ...the docs for dismax said that + or - followed by *non-whitespace* 
> characters had special meaning ... for some reason I thought the dismax 
> handler had code that would look for things like "xyz - abc" and 
> autoescape it to "xyz \- abc" (after calling partialEscape) so that the +/- 
> would only be special if they were true prefix operators.
> apparently this never actually existed.
> we should figure out if that's how it *should* work, and if so implement it.
> this would also be a good time to make the autoescaping behavior of dismax 
> more configurable, or at least more overridable by subclasses (it's currently 
> handled by a static method call)






[jira] [Updated] (LUCENE-7512) suppress ecj-lint warnings on precommit

2016-10-21 Thread Mikhail Khludnev (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated LUCENE-7512:
-
Attachment: LUCENE-7512-solr-core-src.patch

And now [^LUCENE-7512-solr-core-src.patch] without test failures. Is anybody 
interested in it?

> suppress ecj-lint warnings on precommit
> ---
>
> Key: LUCENE-7512
> URL: https://issues.apache.org/jira/browse/LUCENE-7512
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mikhail Khludnev
>  Labels: build
> Attachments: LUCENE-7512-solr-core-src.patch, 
> LUCENE-7512-solr-core-src.patch
>
>
> Turns out the subject produces too much noise and people miss significant ERRORs.






Re: [JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+140) - Build # 18103 - Unstable!

2016-10-21 Thread Robert Muir
But maybe the old "trick" can still be used somehow: it just means using
double precision internally to erase most differences? Maybe it means
a change to the Scorer API or whatever, but I still think it's a good
practical solution (vs something more extreme like Kahan summation). I
am sure it does not work if someone has, say, 500k boolean clauses or
for more extreme cases, but it prevents these problems for typical
cases like keyword searches.


On Fri, Oct 21, 2016 at 6:31 AM, Adrien Grand  wrote:
> Le ven. 21 oct. 2016 à 12:20, Robert Muir  a écrit :
>>
>> What changed?
>
>
> The issue here is ReqOptSumScorer, which computes the score of the MUST and
> SHOULD clauses separately and then sum them up. In that test case, in one
> case body:d is in the list of SHOULD clauses, and in the other case it is in
> the list of MUST clauses.
>
> For the same reason, "+a b", "+a +b" and "a +b" may return different scores
> on the same documents.
>
> I can undo the change if you think this is a blocker, but that would be
> disappointing as it would mean that we cannot do other exciting changes like
> flattening nested disjunctions since it would cause the same problem.




Re: [JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+140) - Build # 18103 - Unstable!

2016-10-21 Thread Adrien Grand
Le ven. 21 oct. 2016 à 12:20, Robert Muir  a écrit :

> What changed?
>

The issue here is ReqOptSumScorer, which computes the scores of the MUST and
SHOULD clauses separately and then sums them. In that test case, in one
case body:d is in the list of SHOULD clauses, and in the other case it is
in the list of MUST clauses.

For the same reason, "+a b", "+a +b" and "a +b" may return different scores
on the same documents.

I can undo the change if you think this is a blocker, but that would be
disappointing as it would mean that we cannot do other exciting changes
like flattening nested disjunctions since it would cause the same problem.


Re: [JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+140) - Build # 18103 - Unstable!

2016-10-21 Thread Robert Muir
We had similar tests before though, with BS1 vs BS2. Because BS1
scored out of order, it performed additions in a different order, too.

But we did the math in these boolean scorers with double precision
internally, and this kept the test happy (I am sure that if you had a
truly massive BQ it would have eventually failed).

What changed?

Sorry, I think as soon as you start adding leniency here, the test is
instantly worthless. You might as well delete it, because you can no
longer tell if a difference in score "matters" or not.

On Fri, Oct 21, 2016 at 6:15 AM, Adrien Grand  wrote:
> I don't think this test failure proves that the optimization is broken. The
> other failure is a bit easier to understand so I'll use it as an example,
> but this one is the same. We have an initial query which is #body:d
> (+body:b (body:d body:a))^4.0 +body:c body:d)^4.0)^8.0)^3.0)^9.0 and we are
> trying to check that we get the same scores if we rewrite it or if we don't.
>
> For that particular query, the rewritten form is:
> ((+body:b (body:d body:a))^4.0 +body:c +body:d)^864.0
>
> The only boolean simplification that happened is that body:d was made a MUST
> clause since it was both a FILTER and a SHOULD clause. The rest is only
> about simplifications of nested BoostQueries. The score is different due to
> the fact that additions are performed in a different order, which is
> expected with floats. And then the difference is amplified by the fact that
> this particular query has a high boost.
>
> Le ven. 21 oct. 2016 à 11:30, Robert Muir  a écrit :
>>
>> Personally I think that is not correct to do. Some optimization or
>> another is broken if scores differ in this way...
>>
>> Just because differences are "small" does not make them insignificant.
>> Think about the tie-break case to users and so on, especially with
>> smaller documents / less terms / etc.
>>
>> Even 1 ulp is gonna mess stuff up, i do not believe leniency in the
>> tests is the solution.
>>
>>
>>
>> On Fri, Oct 21, 2016 at 3:05 AM, Adrien Grand  wrote:
>> > It is the same one as the other day, we need to relax a bit the check on
>> > scores somehow.
>> >
>> > Le ven. 21 oct. 2016 à 08:40, Policeman Jenkins Server
>> > 
>> > a écrit :
>> >>
>> >> Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18103/
>> >> Java: 64bit/jdk-9-ea+140 -XX:+UseCompressedOops -XX:+UseParallelGC
>> >>
>> >> 1 tests failed.
>> >> FAILED:  org.apache.lucene.search.TestBooleanRewrites.testRandom
>> >>
>> >> Error Message:
>> >> expected:<2048.99267578125> but was:<2048.992919921875>
>> >>
>> >> Stack Trace:
>> >> java.lang.AssertionError: expected:<2048.99267578125> but
>> >> was:<2048.992919921875>
>> >> at
>> >>
>> >> __randomizedtesting.SeedInfo.seed([DDF676110AA095BA:AFBA531EBBC023C9]:0)
>> >> at org.junit.Assert.fail(Assert.java:93)
>> >> at org.junit.Assert.failNotEquals(Assert.java:647)
>> >> at org.junit.Assert.assertEquals(Assert.java:443)
>> >> at org.junit.Assert.assertEquals(Assert.java:512)
>> >> at
>> >>
>> >> org.apache.lucene.search.TestBooleanRewrites.assertEquals(TestBooleanRewrites.java:427)
>> >> at
>> >>
>> >> org.apache.lucene.search.TestBooleanRewrites.testRandom(TestBooleanRewrites.java:367)
>> >> at
>> >>
>> >> jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native
>> >> Method)
>> >> at
>> >>
>> >> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
>> >> at
>> >>
>> >> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
>> >> at
>> >> java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:535)
>> >> at
>> >>
>> >> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
>> >> at
>> >>
>> >> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
>> >> at
>> >>
>> >> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
>> >> at
>> >>
>> >> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
>> >> at
>> >>
>> >> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
>> >> at
>> >>
>> >> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>> >> at
>> >>
>> >> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>> >> at
>> >>
>> >> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>> >> at
>> >>
>> >> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>> >> at
>> >>
>> >> 

Re: [JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+140) - Build # 18103 - Unstable!

2016-10-21 Thread Adrien Grand
I don't think this test failure proves that the optimization is broken. The
other failure is a bit easier to understand so I'll use it as an example,
but this one is the same. We have an initial query which is #body:d
(+body:b (body:d body:a))^4.0 +body:c body:d)^4.0)^8.0)^3.0)^9.0 and we are
trying to check that we get the same scores if we rewrite it or if we don't.

For that particular query, the rewritten form is:
((+body:b (body:d body:a))^4.0 +body:c +body:d)^864.0

The only boolean simplification that happened is that body:d was made a
MUST clause since it was both a FILTER and a SHOULD clause. The rest is
only about simplifications of nested BoostQueries. The score is different
due to the fact that additions are performed in a different order, which is
expected with floats. And then the difference is amplified by the fact that
this particular query has a high boost.
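For scale: the two scores in the Jenkins failure are adjacent floats, i.e. they differ by exactly one ulp at that magnitude, consistent with a single rounding step whose absolute size is amplified by the large boost. A quick check, using the values copied from the assertion message:

```java
// Sketch: the expected and actual scores from the failure are one ulp
// apart. At magnitude ~2048 a float ulp is 2^-12 = 0.000244140625.
public class ScoreUlp {
    public static void main(String[] args) {
        float expected = 2048.99267578125f;   // from the test assertion
        float actual   = 2048.992919921875f;  // what the rewritten query scored
        assert Math.ulp(expected) == 0.000244140625f;
        assert Math.nextUp(expected) == actual;  // adjacent floats
    }
}
```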

Le ven. 21 oct. 2016 à 11:30, Robert Muir  a écrit :

> Personally I think that is not correct to do. Some optimization or
> another is broken if scores differ in this way...
>
> Just because differences are "small" does not make them insignificant.
> Think about the tie-break case to users and so on, especially with
> smaller documents / less terms / etc.
>
> Even 1 ulp is gonna mess stuff up, i do not believe leniency in the
> tests is the solution.
>
>
>
> On Fri, Oct 21, 2016 at 3:05 AM, Adrien Grand  wrote:
> > It is the same one as the other day, we need to relax a bit the check on
> > scores somehow.
> >
> > Le ven. 21 oct. 2016 à 08:40, Policeman Jenkins Server <
> jenk...@thetaphi.de>
> > a écrit :
> >>
> >> Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18103/
> >> Java: 64bit/jdk-9-ea+140 -XX:+UseCompressedOops -XX:+UseParallelGC
> >>
> >> 1 tests failed.
> >> FAILED:  org.apache.lucene.search.TestBooleanRewrites.testRandom
> >>
> >> Error Message:
> >> expected:<2048.99267578125> but was:<2048.992919921875>
> >>
> >> Stack Trace:
> >> java.lang.AssertionError: expected:<2048.99267578125> but
> >> was:<2048.992919921875>
> >> at
> >> __randomizedtesting.SeedInfo.seed([DDF676110AA095BA:AFBA531EBBC023C9]:0)
> >> at org.junit.Assert.fail(Assert.java:93)
> >> at org.junit.Assert.failNotEquals(Assert.java:647)
> >> at org.junit.Assert.assertEquals(Assert.java:443)
> >> at org.junit.Assert.assertEquals(Assert.java:512)
> >> at
> >>
> org.apache.lucene.search.TestBooleanRewrites.assertEquals(TestBooleanRewrites.java:427)
> >> at
> >>
> org.apache.lucene.search.TestBooleanRewrites.testRandom(TestBooleanRewrites.java:367)
> >> at
> >> jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea
> /Native
> >> Method)
> >> at
> >> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea
> /NativeMethodAccessorImpl.java:62)
> >> at
> >> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea
> /DelegatingMethodAccessorImpl.java:43)
> >> at java.lang.reflect.Method.invoke(java.base@9-ea
> /Method.java:535)
> >> at
> >>
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
> >> at
> >>
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
> >> at
> >>
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
> >> at
> >>
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
> >> at
> >>
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
> >> at
> >>
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
> >> at
> >>
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
> >> at
> >>
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
> >> at
> >>
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
> >> at
> >>
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> >> at
> >>
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
> >> at
> >>
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
> >> at
> >>
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
> >> at
> >>
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
> >> at
> >>
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
> >> at
> >>
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
> >> at
> >>
> 

Re: [JENKINS-EA] Lucene-Solr-master-Linux (64bit/jdk-9-ea+140) - Build # 18103 - Unstable!

2016-10-21 Thread Robert Muir
Personally I think that is not correct to do. Some optimization or
another is broken if scores differ in this way...

Just because differences are "small" does not make them insignificant.
Think about the tie-break case to users and so on, especially with
smaller documents / less terms / etc.

Even 1 ulp is gonna mess stuff up, i do not believe leniency in the
tests is the solution.



On Fri, Oct 21, 2016 at 3:05 AM, Adrien Grand  wrote:
> It is the same one as the other day, we need to relax a bit the check on
> scores somehow.
>
> Le ven. 21 oct. 2016 à 08:40, Policeman Jenkins Server 
> a écrit :
>>
>> Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18103/
>> Java: 64bit/jdk-9-ea+140 -XX:+UseCompressedOops -XX:+UseParallelGC
>>
>> 1 tests failed.
>> FAILED:  org.apache.lucene.search.TestBooleanRewrites.testRandom
>>
>> Error Message:
>> expected:<2048.99267578125> but was:<2048.992919921875>
>>
>> Stack Trace:
>> java.lang.AssertionError: expected:<2048.99267578125> but
>> was:<2048.992919921875>
>> at
>> __randomizedtesting.SeedInfo.seed([DDF676110AA095BA:AFBA531EBBC023C9]:0)
>> at org.junit.Assert.fail(Assert.java:93)
>> at org.junit.Assert.failNotEquals(Assert.java:647)
>> at org.junit.Assert.assertEquals(Assert.java:443)
>> at org.junit.Assert.assertEquals(Assert.java:512)
>> at
>> org.apache.lucene.search.TestBooleanRewrites.assertEquals(TestBooleanRewrites.java:427)
>> at
>> org.apache.lucene.search.TestBooleanRewrites.testRandom(TestBooleanRewrites.java:367)
>> at
>> jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native
>> Method)
>> at
>> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
>> at
>> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:535)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1764)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:871)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:907)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:921)
>> at
>> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
>> at
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>> at
>> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>> at
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>> at
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:809)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:460)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:880)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:781)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:816)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:827)
>> at
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
>> at
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>> at
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
>> at
>> 

[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_102) - Build # 18104 - Failure!

2016-10-21 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18104/
Java: 64bit/jdk1.8.0_102 -XX:+UseCompressedOops -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 12646 lines...]
   [junit4] JVM J0: stdout was not empty, see: 
/home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build/solr-core/test/temp/junit4-J0-20161021_085142_433.sysout
   [junit4] >>> JVM J0 emitted unexpected output (verbatim) 
   [junit4] java.lang.OutOfMemoryError: Java heap space
   [junit4] Dumping heap to 
/home/jenkins/workspace/Lucene-Solr-master-Linux/heapdumps/java_pid7460.hprof 
...
   [junit4] Heap dump file created [545982654 bytes in 0.695 secs]
   [junit4] <<< JVM J0: EOF 

[...truncated 11045 lines...]
BUILD FAILED
/home/jenkins/workspace/Lucene-Solr-master-Linux/build.xml:765: The following 
error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-master-Linux/build.xml:717: Some of the 
tests produced a heap dump, but did not fail. Maybe a suppressed 
OutOfMemoryError? Dumps created:
* java_pid7460.hprof

Total time: 54 minutes 41 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
[WARNINGS] Skipping publisher since build result is FAILURE
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any




[jira] [Created] (LUCENE-7515) RunListenerPrintReproduceInfo may try to access uninitialized rule fields resulting in an NPE

2016-10-21 Thread Dawid Weiss (JIRA)
Dawid Weiss created LUCENE-7515:
---

 Summary: RunListenerPrintReproduceInfo may try to access 
uninitialized rule fields resulting in an NPE
 Key: LUCENE-7515
 URL: https://issues.apache.org/jira/browse/LUCENE-7515
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
 Fix For: 6.x, master (7.0)






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (LUCENE-7515) RunListenerPrintReproduceInfo may try to access uninitialized rule fields resulting in an NPE

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594600#comment-15594600
 ] 

ASF subversion and git services commented on LUCENE-7515:
-

Commit f379dde2d7206bd406b43a4ed7d0f8671b2f2e7b in lucene-solr's branch 
refs/heads/branch_6x from [~dawid.weiss]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f379dde ]

LUCENE-7515: RunListenerPrintReproduceInfo may try to access static rule fields 
without
the rule being called. This flag is needed to ensure this isn't the case.


> RunListenerPrintReproduceInfo may try to access uninitialized rule fields 
> resulting in an NPE
> -
>
> Key: LUCENE-7515
> URL: https://issues.apache.org/jira/browse/LUCENE-7515
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 6.x, master (7.0)
>
>







[jira] [Resolved] (LUCENE-7515) RunListenerPrintReproduceInfo may try to access uninitialized rule fields resulting in an NPE

2016-10-21 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-7515.
-
Resolution: Fixed

> RunListenerPrintReproduceInfo may try to access uninitialized rule fields 
> resulting in an NPE
> -
>
> Key: LUCENE-7515
> URL: https://issues.apache.org/jira/browse/LUCENE-7515
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 6.x, master (7.0)
>
>







[jira] [Commented] (LUCENE-7515) RunListenerPrintReproduceInfo may try to access uninitialized rule fields resulting in an NPE

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594601#comment-15594601
 ] 

ASF subversion and git services commented on LUCENE-7515:
-

Commit bc0116af6928ef921c03b6533c29f230a0fa193e in lucene-solr's branch 
refs/heads/master from [~dawid.weiss]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=bc0116a ]

LUCENE-7515: RunListenerPrintReproduceInfo may try to access static rule fields 
without
the rule being called. This flag is needed to ensure this isn't the case.


> RunListenerPrintReproduceInfo may try to access uninitialized rule fields 
> resulting in an NPE
> -
>
> Key: LUCENE-7515
> URL: https://issues.apache.org/jira/browse/LUCENE-7515
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 6.x, master (7.0)
>
>







[jira] [Commented] (LUCENE-7513) Update to randomizedtesting 2.4.0

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594591#comment-15594591
 ] 

ASF subversion and git services commented on LUCENE-7513:
-

Commit a19ec194d25692f13e03d92450c1f261670e938a in lucene-solr's branch 
refs/heads/master from [~dawid.weiss]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a19ec19 ]

LUCENE-7513: Update to randomizedtesting 2.4.0.


> Update to randomizedtesting 2.4.0
> -
>
> Key: LUCENE-7513
> URL: https://issues.apache.org/jira/browse/LUCENE-7513
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 6.x, master (7.0)
>
>
> Update to randomizedtesting 2.4.0. Should help us diagnose issues with 
> hanging JVMs (SOLR-9618). 
> There's also an addition of "biased" (evil) random number generation 
> routines. Perhaps they'll be interesting to some of you.
> https://github.com/randomizedtesting/randomizedtesting/releases/tag/release%2F2.4.0






[jira] [Commented] (LUCENE-7513) Update to randomizedtesting 2.4.0

2016-10-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594590#comment-15594590
 ] 

ASF subversion and git services commented on LUCENE-7513:
-

Commit a08a2a2965ebf1ce25edabb97aaffc38fd78910a in lucene-solr's branch 
refs/heads/branch_6x from [~dawid.weiss]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a08a2a2 ]

LUCENE-7513: Update to randomizedtesting 2.4.0.


> Update to randomizedtesting 2.4.0
> -
>
> Key: LUCENE-7513
> URL: https://issues.apache.org/jira/browse/LUCENE-7513
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 6.x, master (7.0)
>
>
> Update to randomizedtesting 2.4.0. Should help us diagnose issues with 
> hanging JVMs (SOLR-9618). 
> There's also an addition of "biased" (evil) random number generation 
> routines. Perhaps they'll be interesting to some of you.
> https://github.com/randomizedtesting/randomizedtesting/releases/tag/release%2F2.4.0






[jira] [Resolved] (LUCENE-7513) Update to randomizedtesting 2.4.0

2016-10-21 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-7513.
-
Resolution: Fixed

> Update to randomizedtesting 2.4.0
> -
>
> Key: LUCENE-7513
> URL: https://issues.apache.org/jira/browse/LUCENE-7513
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 6.x, master (7.0)
>
>
> Update to randomizedtesting 2.4.0. Should help us diagnose issues with 
> hanging JVMs (SOLR-9618). 
> There's also an addition of "biased" (evil) random number generation 
> routines. Perhaps they'll be interesting to some of you.
> https://github.com/randomizedtesting/randomizedtesting/releases/tag/release%2F2.4.0






[jira] [Commented] (SOLR-8542) Integrate Learning to Rank into Solr

2016-10-21 Thread adeppa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594543#comment-15594543
 ] 

adeppa commented on SOLR-8542:
--

Hi Team,

Could anyone help with how to integrate LTR into Solr 5.1.0? If any patch
needs to be applied, please help me.

Thanks
Adeppa

> Integrate Learning to Rank into Solr
> 
>
> Key: SOLR-8542
> URL: https://issues.apache.org/jira/browse/SOLR-8542
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joshua Pantony
>Assignee: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-8542-branch_5x.patch, SOLR-8542-trunk.patch
>
>
> This is a ticket to integrate learning to rank machine learning models into 
> Solr. Solr Learning to Rank (LTR) provides a way for you to extract features 
> directly inside Solr for use in training a machine learned model. You can 
> then deploy that model to Solr and use it to rerank your top X search 
> results. This concept was previously [presented by the authors at Lucene/Solr 
> Revolution 
> 2015|http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp].
> [Read through the 
> README|https://github.com/bloomberg/lucene-solr/tree/master-ltr-plugin-release/solr/contrib/ltr]
>  for a tutorial on using the plugin, in addition to how to train your own 
> external model.






[jira] [Commented] (SOLR-2212) NoMergePolicy class does not load

2016-10-21 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594525#comment-15594525
 ] 

Cao Manh Dat commented on SOLR-2212:


Now mergePolicy accepts a MergePolicyFactory, so I think we can close this 
issue here and open another ticket, such as "NoMergePolicyFactory", to add a 
NoMergePolicyFactory for Solr.

> NoMergePolicy class does not load
> -
>
> Key: SOLR-2212
> URL: https://issues.apache.org/jira/browse/SOLR-2212
> Project: Solr
>  Issue Type: Bug
>  Components: multicore
>Affects Versions: 3.1, 4.0-ALPHA
>Reporter: Lance Norskog
>
> Solr cannot use the Lucene NoMergePolicy class. It will not instantiate 
> correctly when loading the core.
> Other MergePolicy classes work, including the BalancedSegmentMergePolicy.
> This is in trunk and 3.x.






[jira] [Comment Edited] (SOLR-2212) NoMergePolicy class does not load

2016-10-21 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594525#comment-15594525
 ] 

Cao Manh Dat edited comment on SOLR-2212 at 10/21/16 8:55 AM:
--

Now mergePolicy accepts a MergePolicyFactory, so I think we can close this 
issue here and open another ticket, such as "NoMergePolicyFactory", to add a new 
NoMergePolicyFactory for Solr.


was (Author: caomanhdat):
Now, mergePolicy accept a MergePolicyFactory. So I think we can close this 
issue here and open another ticket like "NoMergePolicyFactory" to add a 
NoMergePolicyFactory for Solr.

> NoMergePolicy class does not load
> -
>
> Key: SOLR-2212
> URL: https://issues.apache.org/jira/browse/SOLR-2212
> Project: Solr
>  Issue Type: Bug
>  Components: multicore
>Affects Versions: 3.1, 4.0-ALPHA
>Reporter: Lance Norskog
>
> Solr cannot use the Lucene NoMergePolicy class. It will not instantiate 
> correctly when loading the core.
> Other MergePolicy classes work, including the BalancedSegmentMergePolicy.
> This is in trunk and 3.x.






[jira] [Comment Edited] (SOLR-9671) TestMiniSolrCloudCluster blowup jvm with remote /get requests

2016-10-21 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594495#comment-15594495
 ] 

Mikhail Khludnev edited comment on SOLR-9671 at 10/21/16 8:44 AM:
--

thread dump is full of hanging remote /get requests
{quote}
 696703 ERROR (qtp1915946497-10408) [] o.a.s.s.HttpSolrCall 
null:org.apache.solr.common.SolrException: Error trying to proxy request for 
url: http://127.0.0.1:42320/solr/testcollection_shard2_replica1/get
   [junit4]   2>at 
org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:590)
   [junit4]   2>at 
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:444)
{quote}
There are two containers which are stuck on those ports, :42320 and :36441. There 
is no stack of the thread that originated these requests.


was (Author: mkhludnev):
thread dump is full of hanging remote /get requests
 696703 ERROR (qtp1915946497-10408) [] o.a.s.s.HttpSolrCall 
null:org.apache.solr.common.SolrException: Error trying to proxy request for 
url: http://127.0.0.1:42320/solr/testcollection_shard2_replica1/get
   [junit4]   2>at 
org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:590)
   [junit4]   2>at 
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:444)
There are two container which are stuck on that :42320 and :36441

> TestMiniSolrCloudCluster blowup jvm with remote /get requests
> -
>
> Key: SOLR-9671
> URL: https://issues.apache.org/jira/browse/SOLR-9671
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mikhail Khludnev
>  Labels: cloud
> Attachments: TestMiniSolrCloudCluster-epic-fail.zip
>
>
> this is epic https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/1994/
> There are not many cores, I checked. It seems like the cluster blows up when it 
> tries to relaunch after a collection remove. I haven't tried to reproduce it locally.






[jira] [Commented] (SOLR-9671) TestMiniSolrCloudCluster blowup jvm with remote /get requests

2016-10-21 Thread Mikhail Khludnev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594495#comment-15594495
 ] 

Mikhail Khludnev commented on SOLR-9671:


thread dump is full of hanging remote /get requests
 696703 ERROR (qtp1915946497-10408) [] o.a.s.s.HttpSolrCall 
null:org.apache.solr.common.SolrException: Error trying to proxy request for 
url: http://127.0.0.1:42320/solr/testcollection_shard2_replica1/get
   [junit4]   2>at 
org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:590)
   [junit4]   2>at 
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:444)
There are two containers which are stuck on those ports, :42320 and :36441

> TestMiniSolrCloudCluster blowup jvm with remote /get requests
> -
>
> Key: SOLR-9671
> URL: https://issues.apache.org/jira/browse/SOLR-9671
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mikhail Khludnev
>  Labels: cloud
> Attachments: TestMiniSolrCloudCluster-epic-fail.zip
>
>
> this is epic https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/1994/
> There are not many cores, I checked. It seems like the cluster blows up when it 
> tries to relaunch after a collection remove. I haven't tried to reproduce it locally.






[jira] [Created] (LUCENE-7514) TestLatLonPointQueries fails with biased (evil) numbers

2016-10-21 Thread Dawid Weiss (JIRA)
Dawid Weiss created LUCENE-7514:
---

 Summary: TestLatLonPointQueries fails with biased (evil) numbers
 Key: LUCENE-7514
 URL: https://issues.apache.org/jira/browse/LUCENE-7514
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Dawid Weiss
Priority: Minor


After I committed LUCENE-7513 and switched to evil numbers, some tests fail in 
TestLatLonPointQueries. It could be that I made a mistake somewhere in BiasedNumbers, 
but a verification would be nice.

Example failing seed: 
-Dtests.seed=B6740F75309ABA5D

but it fails with multiple seeds, actually. The output for the seed above:

{code}
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TestLatLonPointQueries -Dtests.method=testAllLatEqual 
-Dtests.seed=B6740F75309ABA5D -Dtests.slow=true -Dtests.locale=lv-LV 
-Dtests.timezone=Antarctica/McMurdo -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] FAILURE 1.42s | TestLatLonPointQueries.testAllLatEqual <<<
   [junit4]> Throwable #1: java.lang.AssertionError: wrong hit (first of 
possibly more):
   [junit4]> FAIL: id=6 should not match but did
   [junit4]>   box=Rectangle(lat=0.0 TO 1.401298464324817E-45 
lon=179.97 TO 180.0)
   [junit4]>   query=point:[0.0 TO 0.0],[179.9991618097 TO 
179.9991618097] docID=6
   [junit4]>   lat=0.0 lon=179.9991618097
   [junit4]>   deleted?=false
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([B6740F75309ABA5D:BE159FB39579850]:0)
   [junit4]>at 
org.apache.lucene.geo.BaseGeoPointTestCase.verifyRandomRectangles(BaseGeoPointTestCase.java:858)
   [junit4]>at 
org.apache.lucene.geo.BaseGeoPointTestCase.verify(BaseGeoPointTestCase.java:740)
   [junit4]>at 
org.apache.lucene.geo.BaseGeoPointTestCase.testAllLatEqual(BaseGeoPointTestCase.java:449)
   [junit4]>at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): {id=FST50}, 
docValues:{id=DocValuesFormat(name=Asserting), 
point=DocValuesFormat(name=Direct)}, maxPointsInLeafNode=1823, 
maxMBSortInHeap=7.309388819818781, sim=RandomSimilarity(queryNorm=false): {}, 
locale=lv-LV, timezone=Antarctica/McMurdo
   [junit4]   2> NOTE: Windows 10 10.0 amd64/Oracle Corporation 1.8.0_102 
(64-bit)/cpus=8,threads=1,free=182766440,total=257425408
   [junit4]   2> NOTE: All tests run in this JVM: [TestLatLonPointQueries]
   [junit4] Completed [1/1 (1!)] in 1.98s, 1 test, 1 failure <<< FAILURES!
{code}





