[jira] [Commented] (LUCENE-8660) Include totalHitsThreshold when tracking total hits in TopDocsCollector

2019-01-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754707#comment-16754707
 ] 

ASF subversion and git services commented on LUCENE-8660:
-

Commit a269a4d1cb7889e5a69aa042316dffdf2a1050a5 in lucene-solr's branch 
refs/heads/master from Jim Ferenczi
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a269a4d ]

LUCENE-8660: TopDocsCollectors now return an accurate count (instead of a lower 
bound) if the total hit count is equal to the provided threshold.
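
For context, a hedged sketch (not part of the commit; {{searcher}} and {{query}} are assumed to exist) of how the new semantics surface to callers through {{TotalHits}}:

{code:java}
// Count hits accurately up to 1,000 and report a lower bound beyond that.
TopScoreDocCollector collector = TopScoreDocCollector.create(10, 1000);
searcher.search(query, collector);
TotalHits totalHits = collector.topDocs().totalHits;
if (totalHits.relation == TotalHits.Relation.EQUAL_TO) {
  // Exact count -- with this change, also when the count is exactly 1000.
} else {
  // GREATER_THAN_OR_EQUAL_TO: totalHits.value is only a lower bound.
}
{code}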


> Include totalHitsThreshold when tracking total hits in TopDocsCollector
> ---
>
> Key: LUCENE-8660
> URL: https://issues.apache.org/jira/browse/LUCENE-8660
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Jim Ferenczi
>Priority: Minor
> Attachments: LUCENE-8660.patch, LUCENE-8660.patch
>
>
> Today the total hits threshold in the top docs collector is not inclusive, 
> which means that total hits are tracked only up to totalHitsThreshold-1. After 
> discussing with @jpountz we agreed that it is not intuitive to return a 
> lower bound equal to totalHitsThreshold even when the count is accurate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness

2019-01-28 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754690#comment-16754690
 ] 

Lucene/Solr QA commented on SOLR-13132:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 1m 10s{color} | {color:red} core in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 10s{color} | {color:red} core in the patch failed. {color} |
| {color:red}-1{color} | {color:red} Release audit (RAT) {color} | {color:red} 1m 10s{color} | {color:red} core in the patch failed. {color} |
| {color:red}-1{color} | {color:red} Check forbidden APIs {color} | {color:red} 1m 10s{color} | {color:red} core in the patch failed. {color} |
| {color:red}-1{color} | {color:red} Validate source patterns {color} | {color:red} 1m 10s{color} | {color:red} core in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 26s{color} | {color:red} core in the patch failed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 3m 34s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-13132 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12956591/SOLR-13132-with-cache.patch |
| Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
| uname | Linux lucene1-us-west 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon Sep 24 17:14:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / 8bee03f |
| ant | version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018 |
| Default Java | 1.8.0_191 |
| compile | https://builds.apache.org/job/PreCommit-SOLR-Build/276/artifact/out/patch-compile-solr_core.txt |
| javac | https://builds.apache.org/job/PreCommit-SOLR-Build/276/artifact/out/patch-compile-solr_core.txt |
| Release audit (RAT) | https://builds.apache.org/job/PreCommit-SOLR-Build/276/artifact/out/patch-compile-solr_core.txt |
| Check forbidden APIs | https://builds.apache.org/job/PreCommit-SOLR-Build/276/artifact/out/patch-compile-solr_core.txt |
| Validate source patterns | https://builds.apache.org/job/PreCommit-SOLR-Build/276/artifact/out/patch-compile-solr_core.txt |
| unit | https://builds.apache.org/job/PreCommit-SOLR-Build/276/artifact/out/patch-unit-solr_core.txt |
| Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/276/testReport/ |
| modules | C: solr/core U: solr/core |
| Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/276/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Improve JSON "terms" facet performance when sorted by relatedness 
> --
>
> Key: SOLR-13132
> URL: https://issues.apache.org/jira/browse/SOLR-13132
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.4, master (9.0)
>Reporter: Michael Gibney
>Priority: Major
> Attachments: SOLR-13132-with-cache.patch, SOLR-13132.patch
>
>
> When sorting buckets by {{relatedness}}, the JSON "terms" facet must calculate 
> {{relatedness}} for every term.
> The current implementation uses a standard uninverted approach (either 
> {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain 
> base docSet, and then uses that initial pass as a pre-filter for a 
> second-pass, inverted approach: fetching docSets for each relevant term 
> (i.e., {{count > minCount}}) and calculating the intersection size of those 
> sets with the domain base docSet.
> Over high-cardinality fields, the overhead of per-term docSet creation and 
> set intersection operations increases request latency to the point where 
> relatedness sort may not be usable in practice.
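> For illustration, a hedged sketch (variable names hypothetical) of this 
> second, inverted pass:
> {code:java}
> // Pass 2: for each term that survived pass 1 (count > minCount), fetch its
> // docSet and intersect it with the facet domain. This per-term work is what
> // dominates latency over high-cardinality fields.
> for (BytesRef term : candidateTerms) {
>   DocSet termDocs = searcher.getDocSet(new TermQuery(new Term(field, term)));
>   int intersectionSize = termDocs.intersectionSize(domainBaseDocSet);
>   // relatedness(term) is then derived from intersectionSize together with
>   // the corresponding background-set counts.
> }
> {code}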

[jira] [Commented] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754688#comment-16754688
 ] 

Adrien Grand commented on LUCENE-8662:
--

Thanks for the ping [~markrmil...@gmail.com].

[~yuanyun.cn] I'd rather override {{seekExact}} in {{ExitableTermsEnum}} to 
delegate to the wrapped instance. In general we only delegate abstract methods 
in Filter* classes, so that there is a minimal set of methods that need to be 
extended in order to write a well-behaved terms enum wrapper. For instance, if 
FilterTermsEnum delegated seekExact, then any subclass that modifies the 
content of the terms enum (e.g. filtering some terms out) would have to override 
both seekCeil and seekExact, while today overriding seekCeil is enough. If we 
think that it is a performance trap, we could discuss making 
{{TermsEnum#seekExact}} abstract and delegating it in {{FilterTermsEnum}}, but 
this wouldn't be a small change, so we might want a stronger case for it.
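
A minimal sketch of that suggestion, assuming {{ExitableTermsEnum}} reuses its 
existing timeout check (here called {{checkAndThrow()}}, as in {{next()}}):

{code:java}
// In ExitableDirectoryReader.ExitableTermsEnum: delegate seekExact to the
// wrapped enum (`in`, inherited from FilterTermsEnum) instead of falling
// back to TermsEnum's seekCeil-based default.
@Override
public boolean seekExact(BytesRef text) throws IOException {
  checkAndThrow(); // timeout check, assumed to mirror next()
  return in.seekExact(text);
}
{code}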

> Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum
> 
>
> Key: LUCENE-8662
> URL: https://issues.apache.org/jira/browse/LUCENE-8662
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 5.5.5, 6.6.5, 7.6, 8.0
>Reporter: jefferyyuan
>Priority: Major
>  Labels: query
> Fix For: 8.0, 7.7
>
> Attachments: output of test program.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Recently in our production, we found that Solr uses a lot of memory (more than 
> 10g) during recovery or commit for a small index (3.5gb).
>  The stack trace is:
>  
> {code:java}
> Thread 0x4d4b115c0 
>   at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
>   at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
> (SegmentTermsEnumFrame.java:157) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:786) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:538) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnum.java:757) 
>   at 
> org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (FilterLeafReader.java:185) 
>   at 
> org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z
>  (TermsEnum.java:74) 
>   at 
> org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
>  (SolrIndexSearcher.java:823) 
>   at 
> org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:204) 
>   at 
> org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (UpdateLog.java:786) 
>   at 
> org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:194) 
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
>  (DistributedUpdateProcessor.java:1051)  
> {code}
> We reproduced the problem locally with the following code, which uses only the 
> Lucene API.
> {code:java}
> public static void main(String[] args) throws IOException {
>   FSDirectory index = FSDirectory.open(Paths.get("the-index"));
>   try (IndexReader reader = new ExitableDirectoryReader(DirectoryReader.open(index),
>       new QueryTimeoutImpl(1000 * 60 * 5))) {
> String id = "the-id";
> BytesRef text = new BytesRef(id);
> for (LeafReaderContext lf : reader.leaves()) {
>   TermsEnum te = lf.reader().terms("id").iterator();
>   System.out.println(te.seekExact(text));
> }
>   }
> }
> {code}
>  
> I added System.out.println("ord: " + ord); in 
> codecs.blocktree.SegmentTermsEnum.getFrame(int).
> Please check the attached output of test program.txt. 
>  
> We found the root cause:
> we didn't override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum, so it uses the base class 
> TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in 
> this case.
> {code:java}
> public boolean seekExact(BytesRef text) throws IOException {
>   return seekCeil(text) == SeekStatus.FOUND;
> }
> {code}
> The fix is simple: just override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum to delegate to the wrapped enum:
> {code:java}
> @Override
> public boolean seekExact(BytesRef text) throws IOException {
>   return in.seekExact(text);
> }
> {code}

[jira] [Commented] (SOLR-11883) NPE on missing nested query in QueryValueSource

2019-01-28 Thread Munendra S N (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754680#comment-16754680
 ] 

Munendra S N commented on SOLR-11883:
-

 [^SOLR-11883.patch] 
Fixed assertions in tests.
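
For context, a hedged, simplified view (not the patch itself) of where the NPE 
originates: when {{query($qq)}} references an undefined parameter, 
{{QueryValueSource}} is constructed with a null query, and hashing it for the 
cache key ({{QueryResultKey}}) dereferences null:

{code:java}
// Simplified from Lucene's QueryValueSource (line 63 in the trace):
// `q` is the nested query; a missing $qq leaves q == null.
@Override
public int hashCode() {
  return q.hashCode() * 29; // NPE when q == null
}
{code}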

> NPE on missing nested query in QueryValueSource
> ---
>
> Key: SOLR-11883
> URL: https://issues.apache.org/jira/browse/SOLR-11883
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Munendra S N
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-11883.patch, SOLR-11883.patch
>
>
> When a nested query or query de-referencing is used but the query isn't 
> specified, Solr throws an NPE.
> For the following request, 
> {code:java}
> http://localhost:8983/solr/blockjoin70001-1492010056/select?q=*&boost=query($qq)&defType=edismax
> {code}
> Solr returned a 500 with the following stack trace:
> {code:java}
> java.lang.NullPointerException
>   at 
> org.apache.lucene.queries.function.valuesource.QueryValueSource.hashCode(QueryValueSource.java:63)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.hashCode(ValueSource.java:275)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.hashCode(FunctionScoreQuery.java:269)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.FunctionScoreQuery.hashCode(FunctionScoreQuery.java:130)
>   at org.apache.solr.search.QueryResultKey.(QueryResultKey.java:46)
>   at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1326)
>   at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:583)
>   at 
> org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1435)
>   at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:375)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:380)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at org.eclipse.jetty.server.Server.handle(Server.j

[jira] [Updated] (SOLR-11883) NPE on missing nested query in QueryValueSource

2019-01-28 Thread Munendra S N (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Munendra S N updated SOLR-11883:

Attachment: SOLR-11883.patch

> NPE on missing nested query in QueryValueSource
> ---
>
> Key: SOLR-11883
> URL: https://issues.apache.org/jira/browse/SOLR-11883
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Munendra S N
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-11883.patch, SOLR-11883.patch
>
>
> When a nested query or query de-referencing is used but the query isn't 
> specified, Solr throws an NPE.
> For the following request, 
> {code:java}
> http://localhost:8983/solr/blockjoin70001-1492010056/select?q=*&boost=query($qq)&defType=edismax
> {code}
> Solr returned a 500 with the following stack trace:
> {code:java}
> java.lang.NullPointerException
>   at 
> org.apache.lucene.queries.function.valuesource.QueryValueSource.hashCode(QueryValueSource.java:63)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.hashCode(ValueSource.java:275)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.hashCode(FunctionScoreQuery.java:269)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.FunctionScoreQuery.hashCode(FunctionScoreQuery.java:130)
>   at org.apache.solr.search.QueryResultKey.(QueryResultKey.java:46)
>   at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1326)
>   at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:583)
>   at 
> org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1435)
>   at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:375)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:380)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at org.eclipse.jetty.server.Server.handle(Server.java:530)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChanne

[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0

2019-01-28 Thread JIRA


[ 
https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754660#comment-16754660
 ] 

Björn Häuser commented on SOLR-12743:
-

[~markus17] sorry for coming back to you this late, but we can confirm: no 
leaks anymore. Sorry for the inconvenience caused.

 

> Memory leak introduced in Solr 7.3.0
> 
>
> Key: SOLR-12743
> URL: https://issues.apache.org/jira/browse/SOLR-12743
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.3, 7.3.1, 7.4
>Reporter: Tomás Fernández Löbbe
>Priority: Critical
>
> Reported initially by [~markus17] ([1], [2]), but other users have had the 
> same issue [3]. Some of the key parts:
> {noformat}
> Some facts:
> * problem started after upgrading from 7.2.1 to 7.3.0;
> * it occurs only in our main text search collection, all other collections 
> are unaffected;
> * despite what i said earlier, it is so far unreproducible outside 
> production, even when mimicking production as good as we can;
> * SortedIntDocSet instances and ConcurrentLRUCache$CacheEntry instances are 
> both leaked on commit;
> * filterCache is enabled using FastLRUCache;
> * filter queries are simple field:value using strings, and three filter query 
> for time range using [NOW/DAY TO NOW+1DAY/DAY] syntax for 'today', 'last 
> week' and 'last month', but rarely used;
> * reloading the core manually frees OldGen;
> * custom URP's don't cause the problem, disabling them doesn't solve it;
> * the collection uses custom extensions for QueryComponent and 
> QueryElevationComponent, ExtendedDismaxQParser and MoreLikeThisQParser, a 
> whole bunch of TokenFilters, and several DocTransformers and due it being 
> only reproducible on production, i really cannot switch these back to 
> Solr/Lucene versions;
> * useFilterForSortedQuery is/was not defined in schema so it was default 
> (true?), SOLR-11769 could be the culprit, i disabled it just now only for the 
> node running 7.4.0, rest of collection runs 7.2.1;
> {noformat}
> {noformat}
> You were right, it was leaking exactly one SolrIndexSearcher instance on each 
> commit. 
> {noformat}
> And from Björn Häuser ([3]):
> {noformat}
> Problem Suspect 1
> 91 instances of "org.apache.solr.search.SolrIndexSearcher", loaded by 
> "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6807d1048" occupy 
> 1.981.148.336 (38,26%) bytes. 
> Biggest instances:
>         • org.apache.solr.search.SolrIndexSearcher @ 0x6ffd47ea8 - 70.087.272 
> (1,35%) bytes. 
>         • org.apache.solr.search.SolrIndexSearcher @ 0x79ea9c040 - 65.678.264 
> (1,27%) bytes. 
>         • org.apache.solr.search.SolrIndexSearcher @ 0x6855ad680 - 63.050.600 
> (1,22%) bytes. 
> Problem Suspect 2
> 223 instances of "org.apache.solr.util.ConcurrentLRUCache", loaded by 
> "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6807d1048" occupy 
> 1.373.110.208 (26,52%) bytes. 
> {noformat}
> More details in the email threads.
> [1] 
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201804.mbox/%3Czarafa.5ae201c6.2f85.218a781d795b07b1%40mail1.ams.nl.openindex.io%3E]
>  [2] 
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201806.mbox/%3Czarafa.5b351537.7b8c.647ddc93059f68eb%40mail1.ams.nl.openindex.io%3E]
>  [3] 
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201809.mbox/%3c7b5e78c6-8cf6-42ee-8d28-872230ded...@gmail.com%3E]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12330) Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) either reported too little and even might be ignored

2019-01-28 Thread Mikhail Khludnev (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754653#comment-16754653
 ] 

Mikhail Khludnev commented on SOLR-12330:
-

fair

> Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) 
> either reported too little and even might be ignored 
> ---
>
> Key: SOLR-12330
> URL: https://issues.apache.org/jira/browse/SOLR-12330
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.3
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-12330.patch, SOLR-12330.patch, SOLR-12330.patch
>
>
> Just encountered such weird behaviour; will recheck and follow up. 
> {{"filter":["\{!v=$bogus}"]}} responds with just an NPE, which makes it 
> impossible to guess the reason.
> It might be even worse, since {{"filter":[\{"param":"bogus"}]}} seems to be 
> just silently ignored.
> Once again, I'll double-check. 
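> For illustration, a hypothetical reproduction (collection, field, and facet 
> names are placeholders):
> {code:java}
> curl 'http://localhost:8983/solr/techproducts/query' -d '
> {
>   "query": "*:*",
>   "facet": {
>     "cats": {
>       "type": "terms",
>       "field": "cat",
>       "domain": { "filter": ["{!v=$bogus}"] }
>     }
>   }
> }'
> {code}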



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13129) Document nested child docs in the ref guide

2019-01-28 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754640#comment-16754640
 ] 

David Smiley commented on SOLR-13129:
-

A fine start, mosh!  My CR is on the PR.

> Document nested child docs in the ref guide
> ---
>
> Key: SOLR-13129
> URL: https://issues.apache.org/jira/browse/SOLR-13129
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Affects Versions: 8.0
>Reporter: David Smiley
>Priority: Major
> Fix For: 8.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Solr 8.0 will have nicer support for nested child documents than its 
> predecessors.  This should be documented in one place in the ref guide (to 
> the extent that makes sense). Users need to know the schema ramifications (incl. 
> special fields, and that some aspects are optional and when), what a nested 
> document "looks like" (XML, JSON, SolrJ), how to use the child doc 
> transformer, how to use block join queries, and get some overview of how this 
> all works.  Maybe mention some plausible future enhancements / direction this 
> is going in (e.g. path query language?).  Some of this is already done but 
> it's in various places and could be moved.  Unlike other features which 
> conveniently fit into one spot in the documentation (like a query parser), 
> this is a more complex issue that has multiple aspects – more 
> "cross-cutting", and so IMO doesn't belong in the current doc pigeon holes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129

2019-01-28 Thread GitBox
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r251703561
 
 

 ##
 File path: solr/solr-ref-guide/src/nested-documents.adoc
 ##
 @@ -0,0 +1,299 @@
+= Nested Child Documents
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr supports indexing nested documents such as a blog post parent document 
and comments as child documents -- or products as parent documents and sizes, 
colors, or other variations as child documents.
+The parent with all children is referred to as a "block" and it explains some 
of the nomenclature of related features.
+At query time, the <> can search these relationships,
+ and the `[child]` 
<> can attach child documents to the result documents.
+In terms of performance, indexing the relationships between documents usually 
yields much faster queries than an equivalent "query time join",
+ since the relationships are already stored in the index and do not need to be 
computed.
+However, nested documents are less flexible than query time joins as it 
imposes rules that some applications may not be able to accept.
+
+.Note
+[NOTE]
+
+A big limitation is that the whole block of parent-children documents must be 
updated or deleted together, not separately.
+In other words, even if a single child document or the parent document is 
changed, the whole block of parent-child documents must be indexed together.
+_Solr does not enforce this rule_; if it's violated, you may get sporadic 
query failures or incorrect results.
+
+
+Nested documents may be indexed via either the XML or JSON data syntax, and is 
also supported by <> with javabin.
+
+=== Schema Notes
+
+ * The schema must include indexed fields field `\_root_`. The value of that 
field is populated automatically and is the same for all documents in the 
block, regardless of the inheritance depth.
+ Fields `\_nest_path_`, `\_nest_parent_` can be configured to store the path 
of the document in the hierarchy, and the unique `id` of the parent in the 
previous level.
+ These 2 fields will be used by NestedUpdateProcessor URP, which is implicitly 
configured under Solr 8, when `\_root_` field is defined.
+ * Nested documents are very much documents in their own right even if certain 
nested documents hold different information from the parent.
+   Therefore:
+ ** the schema must be able to represent the fields of any document
+ ** it may be infeasible to use `required`
+ ** even child documents need a unique `id`
+
+
+=== Legacy Schema Notes
+ * The schema must include an indexed, non-stored field `\_root_`. The value 
of that field is populated automatically and is the same for all documents in 
the block, regardless of the inheritance depth.
+ * You must include a field that identifies the parent document as a parent; 
it can be any field that suits this purpose, and it will be used as input for 
the <>.
+ * If you associate a child document as a field (e.g., comment), that field 
need not be defined in the schema, and probably
+   shouldn't be as it would be confusing.  There is no child document field 
type.
+
+=== XML Examples
+
+For example, here are two documents and their child documents.
+It illustrates two styles of adding child documents; the first is associated 
via a field "comment" (preferred),
+and the second is done in the classic way now referred to as an "anonymous" or 
"unlabelled" child document.
+This field label relationship is available to the URP chain in Solr but is 
ultimately discarded.
+Solr 8 will save the relationship.
+
+[source,xml]
+----
+<add>
+  <doc>
+    <field name="id">1</field>
+    <field name="title">Solr adds block join support</field>
+    <field name="content_type">parentDocument</field>
+    <field name="comment">
+      <doc>
+        <field name="id">2</field>
+        <field name="comments">SolrCloud supports it too!</field>
+      </doc>
+    </field>
+  </doc>
+  <doc>
+    <field name="id">3</field>
+    <field name="title">New Lucene and Solr release is out</field>
+    <field name="content_type">parentDocument</field>
+    <doc>
+      <field name="id">4</field>
+      <field name="comments">Lots of new features</field>
+    </doc>
+  </doc>
+</add>
+----
+
+In this example, we have indexed the parent documents with the field 
`content_type`, which has the value "parentDocument".
+We could have also used a boolean field, such as `isParent`, with a value of 
"true", or any other similar approach.
+
+=== JSON Examples
+
+This example is equivalent to the XML example above.
+Again, the fie

[GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129

2019-01-28 Thread GitBox
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r251699618
 
 

 ##
 File path: solr/solr-ref-guide/src/json-facet-api.adoc
 ##
 @@ -664,7 +664,7 @@ NOTE: While a `query` domain can be combined with an 
additional domain `filter`,
 
 === Block Join Domain Changes
 
-When a collection contains 
<>, the `blockChildren` or `blockParent` domain options can be 
used transform an existing domain containing one type of document, into a 
domain containing the documents with the specified relationship (child or 
parent of) to the documents from the original domain.
 
 Review comment:
   Instead of the phrase "block join child documents", could we just say 
"child/nested documents"?  Same as seen for elsewhere; I won't repeat this 
comment.  I think you explained the "block" historical name well elsewhere.
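
A hedged illustration of the domain option under discussion (field and facet 
names are assumptions):

    {
      "facet": {
        "comment_authors": {
          "type": "terms",
          "field": "author_s",
          "domain": { "blockChildren": "content_type:parentDocument" }
        }
      }
    }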


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129

2019-01-28 Thread GitBox
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r251702607
 
 

 ##
 File path: solr/solr-ref-guide/src/nested-documents.adoc
 ##
 @@ -0,0 +1,299 @@
+= Nested Child Documents
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr supports indexing nested documents such as a blog post parent document 
and comments as child documents -- or products as parent documents and sizes, 
colors, or other variations as child documents.
+The parent with all children is referred to as a "block" and it explains some 
of the nomenclature of related features.
+At query time, the <> can search these relationships,
+ and the `[child]` 
<> can attach child documents to the result documents.
+In terms of performance, indexing the relationships between documents usually 
yields much faster queries than an equivalent "query time join",
+ since the relationships are already stored in the index and do not need to be 
computed.
+However, nested documents are less flexible than query time joins as it 
imposes rules that some applications may not be able to accept.
+
+.Note
+[NOTE]
+
+A big limitation is that the whole block of parent-children documents must be 
updated or deleted together, not separately.
+In other words, even if a single child document or the parent document is 
changed, the whole block of parent-child documents must be indexed together.
+_Solr does not enforce this rule_; if it's violated, you may get sporadic 
query failures or incorrect results.
+
+
+Nested documents may be indexed via either the XML or JSON data syntax, and is 
also supported by <> with javabin.
+
+=== Schema Notes
+
+ * The schema must include indexed fields field `\_root_`. The value of that 
field is populated automatically and is the same for all documents in the 
block, regardless of the inheritance depth.
+ Fields `\_nest_path_`, `\_nest_parent_` can be configured to store the path 
of the document in the hierarchy, and the unique `id` of the parent in the 
previous level.
+ These 2 fields will be used by NestedUpdateProcessor URP, which is implicitly 
configured under Solr 8, when `\_root_` field is defined.
+ * Nested documents are very much documents in their own right even if certain 
nested documents hold different information from the parent.
+   Therefore:
+ ** the schema must be able to represent the fields of any document
+ ** it may be infeasible to use `required`
+ ** even child documents need a unique `id`
+
+
+=== Legacy Schema Notes
 
 Review comment:
   "Legacy Schema Notes" confuses me, both the title and it's mostly redundant 
information.  Perhaps "Rudimentary Root-only schemas" and begin with describing 
what this is about.  I'm not sure Legacy is the right word since it's still 
valid.  People can tune down things to what they need that is more minimal or 
DIY.  It needn't redefine the root field's definition & purpose; you already 
defined it therefore here you can just refer to it.
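
For reference, a hedged sketch of the nest-related schema fields being 
discussed (types and flags are assumptions, not the PR's exact text):

    <field name="_root_" type="string" indexed="true" stored="false"/>
    <fieldType name="_nest_path_" class="solr.NestPathField"/>
    <field name="_nest_path_" type="_nest_path_"/>
    <field name="_nest_parent_" type="string" indexed="true" stored="false"/>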


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129

2019-01-28 Thread GitBox
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r251704085
 
 

 ##
 File path: solr/solr-ref-guide/src/nested-documents.adoc
 ##
 @@ -0,0 +1,299 @@
+= Nested Child Documents
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr supports indexing nested documents such as a blog post parent document 
and comments as child documents -- or products as parent documents and sizes, 
colors, or other variations as child documents.
+The parent with all children is referred to as a "block" and it explains some 
of the nomenclature of related features.
+At query time, the <> can search these relationships,
+ and the `[child]` 
<> can attach child documents to the result documents.
+In terms of performance, indexing the relationships between documents usually 
yields much faster queries than an equivalent "query time join",
+ since the relationships are already stored in the index and do not need to be 
computed.
+However, nested documents are less flexible than query time joins as it 
imposes rules that some applications may not be able to accept.
+
+.Note
+[NOTE]
+
+A big limitation is that the whole block of parent-children documents must be 
updated or deleted together, not separately.
+In other words, even if a single child document or the parent document is 
changed, the whole block of parent-child documents must be indexed together.
+_Solr does not enforce this rule_; if it's violated, you may get sporadic 
query failures or incorrect results.
+
+
+Nested documents may be indexed via either the XML or JSON data syntax, and is 
also supported by <> with javabin.
+
+=== Schema Notes
+
+ * The schema must include indexed fields field `\_root_`. The value of that 
field is populated automatically and is the same for all documents in the 
block, regardless of the inheritance depth.
+ Fields `\_nest_path_`, `\_nest_parent_` can be configured to store the path 
of the document in the hierarchy, and the unique `id` of the parent in the 
previous level.
+ These 2 fields will be used by NestedUpdateProcessor URP, which is implicitly 
configured under Solr 8, when `\_root_` field is defined.
+ * Nested documents are very much documents in their own right even if certain 
nested documents hold different information from the parent.
+   Therefore:
+ ** the schema must be able to represent the fields of any document
+ ** it may be infeasible to use `required`
+ ** even child documents need a unique `id`
+
+
+=== Legacy Schema Notes
+ * The schema must include an indexed, non-stored field `\_root_`. The value 
of that field is populated automatically and is the same for all documents in 
the block, regardless of the inheritance depth.
+ * You must include a field that identifies the parent document as a parent; 
it can be any field that suits this purpose, and it will be used as input for 
the <>.
+ * If you associate a child document as a field (e.g., comment), that field 
need not be defined in the schema, and probably
+   shouldn't be as it would be confusing.  There is no child document field 
type.
+
+=== XML Examples
+
+For example, here are two documents and their child documents.
+It illustrates two styles of adding child documents; the first is associated 
via a field "comment" (preferred),
+and the second is done in the classic way now referred to as an "anonymous" or 
"unlabelled" child document.
+This field label relationship is available to the URP chain in Solr but is 
ultimately discarded.
+Solr 8 will save the relationship.
+
+[source,xml]
+----
+<add>
+  <doc>
+    <field name="id">1</field>
+    <field name="title">Solr adds block join support</field>
+    <field name="content_type">parentDocument</field>
+    <field name="comment">
+      <doc>
+        <field name="id">2</field>
+        <field name="comments">SolrCloud supports it too!</field>
+      </doc>
+    </field>
+  </doc>
+  <doc>
+    <field name="id">3</field>
+    <field name="title">New Lucene and Solr release is out</field>
+    <field name="content_type">parentDocument</field>
+    <doc>
+      <field name="id">4</field>
+      <field name="comments">Lots of new features</field>
+    </doc>
+  </doc>
+</add>
+----
+
+In this example, we have indexed the parent documents with the field 
`content_type`, which has the value "parentDocument".
+We could have also used a boolean field, such as `isParent`, with a value of 
"true", or any other similar approach.
+
+=== JSON Examples
+
+This example is equivalent to the XML example above.
+Again, the fie

[GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129

2019-01-28 Thread GitBox
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r251703154
 
 

 ##
 File path: solr/solr-ref-guide/src/nested-documents.adoc
 ##
 @@ -0,0 +1,299 @@
+= Nested Child Documents
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr supports indexing nested documents such as a blog post parent document 
and comments as child documents -- or products as parent documents and sizes, 
colors, or other variations as child documents.
+The parent with all children is referred to as a "block" and it explains some 
of the nomenclature of related features.
+At query time, the <> can search these relationships,
+ and the `[child]` 
<> can attach child documents to the result documents.
+In terms of performance, indexing the relationships between documents usually 
yields much faster queries than an equivalent "query time join",
+ since the relationships are already stored in the index and do not need to be 
computed.
+However, nested documents are less flexible than query time joins as it 
imposes rules that some applications may not be able to accept.
+
+.Note
+[NOTE]
+
+A big limitation is that the whole block of parent-children documents must be 
updated or deleted together, not separately.
+In other words, even if a single child document or the parent document is 
changed, the whole block of parent-child documents must be indexed together.
+_Solr does not enforce this rule_; if it's violated, you may get sporadic 
query failures or incorrect results.
+
+
+Nested documents may be indexed via either the XML or JSON data syntax, and is 
also supported by <> with javabin.
+
+=== Schema Notes
+
+ * The schema must include indexed fields field `\_root_`. The value of that 
field is populated automatically and is the same for all documents in the 
block, regardless of the inheritance depth.
+ Fields `\_nest_path_`, `\_nest_parent_` can be configured to store the path 
of the document in the hierarchy, and the unique `id` of the parent in the 
previous level.
+ These 2 fields will be used by NestedUpdateProcessor URP, which is implicitly 
configured under Solr 8, when `\_root_` field is defined.
+ * Nested documents are very much documents in their own right even if certain 
nested documents hold different information from the parent.
+   Therefore:
+ ** the schema must be able to represent the fields of any document
+ ** it may be infeasible to use `required`
+ ** even child documents need a unique `id`
+
+
+=== Legacy Schema Notes
+ * The schema must include an indexed, non-stored field `\_root_`. The value 
of that field is populated automatically and is the same for all documents in 
the block, regardless of the inheritance depth.
+ * You must include a field that identifies the parent document as a parent; 
it can be any field that suits this purpose, and it will be used as input for 
the <>.
+ * If you associate a child document as a field (e.g., comment), that field 
need not be defined in the schema, and probably
+   shouldn't be as it would be confusing.  There is no child document field 
type.
+
+=== XML Examples
+
+For example, here are two documents and their child documents.
+It illustrates two styles of adding child documents; the first is associated 
via a field "comment" (preferred),
+and the second is done in the classic way now referred to as an "anonymous" or 
"unlabelled" child document.
+This field label relationship is available to the URP chain in Solr but is 
ultimately discarded.
+Solr 8 will save the relationship.
 
 Review comment:
   Our ref guide should assume the current version, and only rarely/sparingly 
refer backwards.  In the case above, I would just drop these last 2 sentences.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129

2019-01-28 Thread GitBox
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r251700652
 
 

 ##
 File path: solr/solr-ref-guide/src/nested-documents.adoc
 ##
 @@ -0,0 +1,299 @@
+= Nested Child Documents
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr supports indexing nested documents such as a blog post parent document 
and comments as child documents -- or products as parent documents and sizes, 
colors, or other variations as child documents.
+The parent with all children is referred to as a "block" and it explains some 
of the nomenclature of related features.
+At query time, the <> can search these relationships,
+ and the `[child]` 
<> can attach child documents to the result documents.
+In terms of performance, indexing the relationships between documents usually 
yields much faster queries than an equivalent "query time join",
+ since the relationships are already stored in the index and do not need to be 
computed.
+However, nested documents are less flexible than query time joins as it 
imposes rules that some applications may not be able to accept.
+
+.Note
+[NOTE]
+
+A big limitation is that the whole block of parent-children documents must be 
updated or deleted together, not separately.
+In other words, even if a single child document or the parent document is 
changed, the whole block of parent-child documents must be indexed together.
+_Solr does not enforce this rule_; if it's violated, you may get sporadic 
query failures or incorrect results.
+
+
+Nested documents may be indexed via either the XML or JSON data syntax, and is 
also supported by <> with javabin.
+
+=== Schema Notes
+
+ * The schema must include indexed fields field `\_root_`. The value of that 
field is populated automatically and is the same for all documents in the 
block, regardless of the inheritance depth.
 
 Review comment:
   fields->field


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129

2019-01-28 Thread GitBox
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r251704066
 
 

 ##
 File path: solr/solr-ref-guide/src/nested-documents.adoc
 ##
 @@ -0,0 +1,299 @@
+= Nested Child Documents
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr supports indexing nested documents such as a blog post parent document 
and comments as child documents -- or products as parent documents and sizes, 
colors, or other variations as child documents.
+The parent with all children is referred to as a "block" and it explains some 
of the nomenclature of related features.
+At query time, the <> can search these relationships,
+ and the `[child]` 
<> can attach child documents to the result documents.
+In terms of performance, indexing the relationships between documents usually 
yields much faster queries than an equivalent "query time join",
+ since the relationships are already stored in the index and do not need to be 
computed.
+However, nested documents are less flexible than query time joins as it 
imposes rules that some applications may not be able to accept.
+
+.Note
+[NOTE]
+
+A big limitation is that the whole block of parent-children documents must be 
updated or deleted together, not separately.
+In other words, even if a single child document or the parent document is 
changed, the whole block of parent-child documents must be indexed together.
+_Solr does not enforce this rule_; if it's violated, you may get sporadic 
query failures or incorrect results.
+
+
+Nested documents may be indexed via either the XML or JSON data syntax, and is 
also supported by <> with javabin.
+
+=== Schema Notes
+
+ * The schema must include indexed fields field `\_root_`. The value of that 
field is populated automatically and is the same for all documents in the 
block, regardless of the inheritance depth.
+ Fields `\_nest_path_`, `\_nest_parent_` can be configured to store the path 
of the document in the hierarchy, and the unique `id` of the parent in the 
previous level.
+ These 2 fields will be used by NestedUpdateProcessor URP, which is implicitly 
configured under Solr 8, when `\_root_` field is defined.
+ * Nested documents are very much documents in their own right even if certain 
nested documents hold different information from the parent.
+   Therefore:
+ ** the schema must be able to represent the fields of any document
+ ** it may be infeasible to use `required`
+ ** even child documents need a unique `id`
+
+
+=== Legacy Schema Notes
+ * The schema must include an indexed, non-stored field `\_root_`. The value 
of that field is populated automatically and is the same for all documents in 
the block, regardless of the inheritance depth.
+ * You must include a field that identifies the parent document as a parent; 
it can be any field that suits this purpose, and it will be used as input for 
the <>.
+ * If you associate a child document as a field (e.g., comment), that field 
need not be defined in the schema, and probably
+   shouldn't be as it would be confusing.  There is no child document field 
type.
+
+=== XML Examples
+
+For example, here are two documents and their child documents.
+It illustrates two styles of adding child documents; the first is associated 
via a field "comment" (preferred),
+and the second is done in the classic way now referred to as an "anonymous" or 
"unlabelled" child document.
+This field label relationship is available to the URP chain in Solr but is 
ultimately discarded.
+Solr 8 will save the relationship.
+
+[source,xml]
+----
+<add>
+  <doc>
+    <field name="id">1</field>
+    <field name="title">Solr adds block join support</field>
+    <field name="content_type">parentDocument</field>
+    <field name="comment">
+      <doc>
+        <field name="id">2</field>
+        <field name="comments">SolrCloud supports it too!</field>
+      </doc>
+    </field>
+  </doc>
+  <doc>
+    <field name="id">3</field>
+    <field name="title">New Lucene and Solr release is out</field>
+    <field name="content_type">parentDocument</field>
+    <doc>
+      <field name="id">4</field>
+      <field name="comments">Lots of new features</field>
+    </doc>
+  </doc>
+</add>
+----
+
+In this example, we have indexed the parent documents with the field 
`content_type`, which has the value "parentDocument".
+We could have also used a boolean field, such as `isParent`, with a value of 
"true", or any other similar approach.
+
+=== JSON Examples
+
+This example is equivalent to the XML example above.
+Again, the fie

[GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129

2019-01-28 Thread GitBox
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r251703790
 
 

 ##
 File path: solr/solr-ref-guide/src/nested-documents.adoc
 ##
 @@ -0,0 +1,299 @@
+= Nested Child Documents
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr supports indexing nested documents such as a blog post parent document 
and comments as child documents -- or products as parent documents and sizes, 
colors, or other variations as child documents.
+The parent together with all of its children is referred to as a "block", 
which explains some of the nomenclature of related features.
+At query time, the <> can search these relationships,
 
 Review comment:
   why link elsewhere; you have the content on this page now


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129

2019-01-28 Thread GitBox
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r251699246
 
 

 ##
 File path: solr/solr-ref-guide/src/index.adoc
 ##
 @@ -90,6 +90,27 @@ The *<>* section guides yo
 
 --
 
+[.row.match-my-cols]
 
 Review comment:
   Did you add some sections here or something?  I'll have to view this in a 
viewer to see what's going on.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] dsmiley commented on a change in pull request #549: WIP:SOLR-13129

2019-01-28 Thread GitBox
dsmiley commented on a change in pull request #549: WIP:SOLR-13129
URL: https://github.com/apache/lucene-solr/pull/549#discussion_r251701264
 
 

 ##
 File path: solr/solr-ref-guide/src/nested-documents.adoc
 ##
 @@ -0,0 +1,299 @@
+= Nested Child Documents
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+Solr supports indexing nested documents such as a blog post parent document 
and comments as child documents -- or products as parent documents and sizes, 
colors, or other variations as child documents.
+The parent together with all of its children is referred to as a "block", 
which explains some of the nomenclature of related features.
+At query time, the <> can search these relationships,
+ and the `[child]` 
<> can attach child documents to the result documents.
+In terms of performance, indexing the relationships between documents usually 
yields much faster queries than an equivalent "query time join",
+ since the relationships are already stored in the index and do not need to be 
computed.
+However, nested documents are less flexible than query time joins, as they 
impose rules that some applications may not be able to accept.
+
+.Note
+[NOTE]
+
+A big limitation is that the whole block of parent and child documents must be 
updated or deleted together, not separately.
+In other words, even if a single child document or the parent document is 
changed, the whole block of parent-child documents must be re-indexed together.
+_Solr does not enforce this rule_; if it's violated, you may get sporadic 
query failures or incorrect results.
+
+
+Nested documents may be indexed via either the XML or JSON data syntax, and 
are also supported by <> with javabin.
+
+=== Schema Notes
+
+ * The schema must include an indexed field `\_root_`. The value of that 
field is populated automatically and is the same for all documents in the 
block, regardless of the inheritance depth.
+ Fields `\_nest_path_` and `\_nest_parent_` can be configured to store the 
path of the document in the hierarchy and the unique `id` of its parent at 
the previous level.
+ These two fields are used by the NestedUpdateProcessor URP, which is 
implicitly configured in Solr 8 when the `\_root_` field is defined.
 
 Review comment:
   Maybe not even mention it's an URP.  It's hidden, just as populating root 
is.  Here I think we should say loosely what they are for and the ramification 
of not having them.  Also mention what "root" holds.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12330) Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) either reported too little and even might be ignored

2019-01-28 Thread Munendra S N (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754637#comment-16754637
 ] 

Munendra S N commented on SOLR-12330:
-

[~mkhludnev]
 I feel handling a generic exception is a bad idea. I would prefer hunting 
down all the cases which can throw an NPE or unexpected errors and handling 
them, even though this would be time-consuming.

> Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) 
> either reported too little and even might be ignored 
> ---
>
> Key: SOLR-12330
> URL: https://issues.apache.org/jira/browse/SOLR-12330
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.3
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-12330.patch, SOLR-12330.patch, SOLR-12330.patch
>
>
> Just encountered such weird behaviour, will recheck and follow up. 
> {{"filter":["\{!v=$bogus}"]}} responds back with just an NPE, which makes it 
> impossible to guess the reason.
> It might be even worse, since {{"filter":[\{"param":"bogus"}]}} seems to be 
> just silently ignored.
> Once again, I'll double check. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-11883) NPE on missing nested query in QueryValueSource

2019-01-28 Thread Mikhail Khludnev (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev reassigned SOLR-11883:
---

Assignee: Mikhail Khludnev

> NPE on missing nested query in QueryValueSource
> ---
>
> Key: SOLR-11883
> URL: https://issues.apache.org/jira/browse/SOLR-11883
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Munendra S N
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-11883.patch
>
>
> When the nested query or query de-referencing is used but the query isn't 
> specified, Solr throws an NPE.
> For following request, 
> {code:java}
> http://localhost:8983/solr/blockjoin70001-1492010056/select?q=*&boost=query($qq)&defType=edismax
> {code}
> Solr returned 500 with stack trace
> {code:java}
> java.lang.NullPointerException
>   at 
> org.apache.lucene.queries.function.valuesource.QueryValueSource.hashCode(QueryValueSource.java:63)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.hashCode(ValueSource.java:275)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.hashCode(FunctionScoreQuery.java:269)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.FunctionScoreQuery.hashCode(FunctionScoreQuery.java:130)
>   at org.apache.solr.search.QueryResultKey.(QueryResultKey.java:46)
>   at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1326)
>   at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:583)
>   at 
> org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1435)
>   at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:375)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:380)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at org.eclipse.jetty.server.Server.handle(Server.java:530)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.jav

[jira] [Commented] (SOLR-11883) NPE on missing nested query in QueryValueSource

2019-01-28 Thread Mikhail Khludnev (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754625#comment-16754625
 ] 

Mikhail Khludnev commented on SOLR-11883:
-

+1

> NPE on missing nested query in QueryValueSource
> ---
>
> Key: SOLR-11883
> URL: https://issues.apache.org/jira/browse/SOLR-11883
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Munendra S N
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-11883.patch
>
>
> When the nested query or query de-referencing is used but the query isn't 
> specified, Solr throws an NPE.
> For following request, 
> {code:java}
> http://localhost:8983/solr/blockjoin70001-1492010056/select?q=*&boost=query($qq)&defType=edismax
> {code}
> Solr returned 500 with stack trace
> {code:java}
> java.lang.NullPointerException
>   at 
> org.apache.lucene.queries.function.valuesource.QueryValueSource.hashCode(QueryValueSource.java:63)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.ValueSource$WrappedDoubleValuesSource.hashCode(ValueSource.java:275)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.FunctionScoreQuery$MultiplicativeBoostValuesSource.hashCode(FunctionScoreQuery.java:269)
>   at java.util.Arrays.hashCode(Arrays.java:4146)
>   at java.util.Objects.hash(Objects.java:128)
>   at 
> org.apache.lucene.queries.function.FunctionScoreQuery.hashCode(FunctionScoreQuery.java:130)
>   at org.apache.solr.search.QueryResultKey.(QueryResultKey.java:46)
>   at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1326)
>   at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:583)
>   at 
> org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1435)
>   at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:375)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:380)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at org.eclipse.jetty.server.Server.handle(Server.java:530)
>   at org.eclipse.jetty.server.HttpChannel

[jira] [Commented] (SOLR-12330) Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) either reported too little and even might be ignored

2019-01-28 Thread Mikhail Khludnev (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754631#comment-16754631
 ] 

Mikhail Khludnev commented on SOLR-12330:
-

[~munendrasn], +1 for the patch. 

What's your thought regarding 
bq. I suppose we can't hunt for those NPE rows one by one, but rather wrap 
FacetModule invocation with catch(Exception e) {throw new SolrException(...,e);}
?
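
Roughly this shape, I mean (a sketch only; the entry point and the chosen 
error code are illustrative):
{code:java}
// Sketch: rethrow SolrExceptions untouched, and convert any other unexpected
// exception (e.g. an NPE caused by a bogus $param reference) into a
// SolrException that carries some request context back to the client.
try {
  facetModule.process(rb);  // illustrative entry point
} catch (SolrException e) {
  throw e;
} catch (Exception e) {
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
      "Exception during facet processing: " + e, e);
}
{code}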

> Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) 
> either reported too little and even might be ignored 
> ---
>
> Key: SOLR-12330
> URL: https://issues.apache.org/jira/browse/SOLR-12330
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.3
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-12330.patch, SOLR-12330.patch, SOLR-12330.patch
>
>
> Just encountered such weird behaviour, will recheck and follow up. 
> {{"filter":["\{!v=$bogus}"]}} responds back with just an NPE, which makes it 
> impossible to guess the reason.
> It might be even worse, since {{"filter":[\{"param":"bogus"}]}} seems to be 
> just silently ignored.
> Once again, I'll double check. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Reopened] (SOLR-7672) introduce implicit _parent_:true

2019-01-28 Thread Mikhail Khludnev (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev reopened SOLR-7672:

  Assignee: (was: Mikhail Khludnev)

I'm sorry, [~dsmiley], I'm out of touch with the recent work in this field. 

> introduce implicit _parent_:true  
> --
>
> Key: SOLR-7672
> URL: https://issues.apache.org/jira/browse/SOLR-7672
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers, update
>Affects Versions: 5.2
>Reporter: Mikhail Khludnev
>Priority: Major
>
> Solr provides block join support in a non-invasive manner. It turns out, it 
> gives users a chance to shoot themselves in the foot. As advised by 
> [~thetaphi] at SOLR-7606, 
> let AddUpdateCommand add a _parent_:true field to the document (not to 
> children). Do it *always*, no matter whether it has children or not.
> Also, introduce default values for the block join qparsers \{!parent 
> *which=\_parent\_:true*} and \{!child *of=\_parent\_:true*} (sometimes, I'd 
> rather hide them from the user, because they are misunderstood quite often). 
>  
> Please share your concerns and vote.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13174) NPE in Json Facet API for Facet range

2019-01-28 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754610#comment-16754610
 ] 

Lucene/Solr QA commented on SOLR-13174:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
41s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  1m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  1m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | 
{color:green}  1m 38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 42m  
6s{color} | {color:green} core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-13174 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956579/SOLR-13174.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  |
| uname | Linux lucene1-us-west 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon 
Sep 24 17:14:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / 8bee03f |
| ant | version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018 |
| Default Java | 1.8.0_191 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-SOLR-Build/275/testReport/ |
| modules | C: solr/core U: solr/core |
| Console output | 
https://builds.apache.org/job/PreCommit-SOLR-Build/275/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> NPE in Json Facet API for Facet range
> -
>
> Key: SOLR-13174
> URL: https://issues.apache.org/jira/browse/SOLR-13174
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Munendra S N
>Priority: Minor
> Attachments: SOLR-13174.patch
>
>
> There is a mismatch in the error and status code between JSON Facet's facet 
> range and classical facet range.
> When start, end, or gap is not specified in the request, classical faceting 
> returns a Bad Request, whereas JSON facet returns a 500 with the below trace
> {code:java}
> {
> "trace": "java.lang.NullPointerException\n\tat 
> org.apache.solr.search.facet.FacetRangeProcessor.createRangeList(FacetRange.java:216)\n\tat
>  
> org.apache.solr.search.facet.FacetRangeProcessor.getRangeCounts(FacetRange.java:206)\n\tat
>  
> org.apache.solr.search.facet.FacetRangeProcessor.process(FacetRange.java:98)\n\tat
>  
> org.apache.solr.search.facet.FacetProcessor.processSubs(FacetProcessor.java:460)\n\tat
>  
> org.apache.solr.search.facet.FacetProcessor.fillBucket(FacetProcessor.java:407)\n\tat
>  
> org.apache.solr.search.facet.FacetQueryProcessor.process(FacetQuery.java:64)\n\tat
>  org.apache.solr.search.facet.FacetModule.process(FacetModule.java:154)\n\tat 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296)\n\tat
>  
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)\n\tat
>  org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)\n\tat 
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)\n\tat 
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)\n\tat 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)\n\tat
>  
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)\n\tat
>  
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)\n\tat
>  
> org.eclipse.jetty.servlet.Ser

[jira] [Commented] (SOLR-12304) Interesting Terms parameter is ignored by MLT Component

2019-01-28 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754609#comment-16754609
 ] 

Lucene/Solr QA commented on SOLR-12304:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  3m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  3m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | 
{color:green}  3m 25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 56s{color} 
| {color:red} core in the patch failed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | solr.cloud.LeaderTragicEventTest |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-12304 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956556/SOLR-12304.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  |
| uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP 
Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / 8bee03f |
| ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 |
| Default Java | 1.8.0_191 |
| unit | 
https://builds.apache.org/job/PreCommit-SOLR-Build/274/artifact/out/patch-unit-solr_core.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-SOLR-Build/274/testReport/ |
| modules | C: solr/core U: solr/core |
| Console output | 
https://builds.apache.org/job/PreCommit-SOLR-Build/274/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Interesting Terms parameter is ignored by MLT Component
> ---
>
> Key: SOLR-12304
> URL: https://issues.apache.org/jira/browse/SOLR-12304
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: MoreLikeThis
>Affects Versions: 7.2
>Reporter: Alessandro Benedetti
>Priority: Major
> Attachments: SOLR-12304.patch, SOLR-12304.patch, SOLR-12304.patch, 
> SOLR-12304.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the More Like This component just ignores the mlt.InterestingTerms 
> parameter (which is usable by the MoreLikeThisHandler).
> The scope of this issue is to fix the bug and add related tests (which will 
> succeed after the fix).
> *N.B.* MoreLikeThisComponent and MoreLikeThisHandler are very coupled, and 
> the tests for the MoreLikeThisHandler are intersecting the 
> MoreLikeThisComponent ones.
>  Any consideration or refactoring of that is out of scope for this issue.
>  Other issues will follow.
> *N.B.* out of scope for this issue is the distributed case, which is much 
> more complicated and requires much deeper investigation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12330) Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) either reported too little and even might be ignored

2019-01-28 Thread Munendra S N (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754590#comment-16754590
 ] 

Munendra S N commented on SOLR-12330:
-

[~mkhludnev] 
[^SOLR-12330.patch]
This patch is a refactored version of your patch which handles the case of a 
param getting silently ignored.

> Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) 
> either reported too little and even might be ignored 
> ---
>
> Key: SOLR-12330
> URL: https://issues.apache.org/jira/browse/SOLR-12330
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.3
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-12330.patch, SOLR-12330.patch, SOLR-12330.patch
>
>
> Just encountered such weird behaviour, will recheck and follow up. 
> {{"filter":["\{!v=$bogus}"]}} responds back with just an NPE, which makes it 
> impossible to guess the reason.
> It might be even worse, since {{"filter":[\{"param":"bogus"}]}} seems to be 
> just silently ignored.
> Once again, I'll double check. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13176) Testing of TLOG Replicas needs to be re-instated, may be hiding bugs

2019-01-28 Thread JIRA


[ 
https://issues.apache.org/jira/browse/SOLR-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754596#comment-16754596
 ] 

Tomás Fernández Löbbe commented on SOLR-13176:
--

Much of the TLOG testing was added for the "onlyLeaderIndexes" changes, so 
[~caomanhdat] can probably comment on it more. I'm not sure I follow exactly, 
but if the logic in {{waitForInSyncWithLeader}} is commented out and just 
returns immediately, I expect lots of tests to fail; something like:
 * add document
 * commit
 * search

won't work (see the sketch below). All those other tests were not modified to 
handle TLOG replicas; they assume the same behavior as NRT. (Except for 
{{TestTlogReplica}} and maybe {{ChaosMonkeySafeLeaderWithPullReplicasTest}}.)
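
To be concrete, the shape I mean is the usual SolrJ sequence (a sketch; it 
assumes a SolrClient already bound to the collection under test):
{code:java}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.common.SolrInputDocument;

class AddCommitSearch {
  // The basic add -> commit -> search expectation that NRT-oriented tests
  // rely on; with TLOG replicas the search step may not see the document.
  static void check(SolrClient client) throws Exception {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1");
    client.add(doc);
    client.commit();
    long found = client.query(new SolrQuery("id:1")).getResults().getNumFound();
    if (found != 1) throw new AssertionError("doc not visible after commit");
  }
}
{code}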

> Testing of TLOG Replicas needs to be re-instated, may be hiding bugs
> 
>
> Key: SOLR-13176
> URL: https://issues.apache.org/jira/browse/SOLR-13176
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Priority: Major
>
> As part of Mark Miller's push to clean up tests, one change he made as part 
> of his _big_ SOLR-12801 commit (circa Nov 2018) was to disable the randomized 
> use of TLOG replicas in a lot of tests.
> His comments at the time were that he suspected a lot of the problems he was 
> seeing were due to a poor implementation of 
> {{TestInjection.waitForInSyncWithLeader()}} (which only comes into play for 
> TLOG replicas), ultimately leading to him creating SOLR-12313.
> But based on some limited experimentation I made w/trying to re-enable TLOG 
> replica randomization in some tests after (essentially) removing 
> {{TestInjection.waitForInSyncWithLeader()}} in SOLR-13168, I'm still seeing a 
> lot of sporadic test failures when TLOG replicas get used... the only change 
> is that instead of "failing slow" because of the stalls introduced by 
> {{TestInjection.waitForInSyncWithLeader()}} they started failing quickly.
> *It's not clear if these failures are because the tests have bugs; or if the 
> tests don't account for the expected behavior of the TLOG replica types in 
> certain situations; or if the code paths being tested have bugs when dealing 
> with TLOG replicas.*
> 
> Bottom line: As things stand today, TLOG replicas aren't being very 
> thoroughly tested, particularly in edge cases (http partitions, LIR, leader 
> election, mixed use of replica types, etc...)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12330) Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) either reported too little and even might be ignored

2019-01-28 Thread Munendra S N (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754594#comment-16754594
 ] 

Munendra S N commented on SOLR-12330:
-

Also, there are other cases where referencing a non-existent parameter will 
give an NPE with status 500. One such case which I encountered is in function 
queries, SOLR-11883. I suspect there are more cases where the same issue occurs.
Should the cases be handled individually, or should they be handled in the 
*parse* method so that all the cases are covered?
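
For illustration, the kind of guard I have in mind in the parsing path 
(hypothetical placement; the names are illustrative and the real patch may 
differ):
{code:java}
// Hypothetical guard: if a dereferenced parameter (e.g. $qq in query($qq))
// resolves to no query, fail fast with a 400 instead of letting a null Query
// reach QueryValueSource and NPE later in hashCode().
Query nested = subQuery(paramValue, null).getQuery();
if (nested == null) {
  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
      "Missing or empty query referenced by '" + paramValue + "'");
}
return new QueryValueSource(nested, 0.0f);
{code}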

> Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) 
> either reported too little and even might be ignored 
> ---
>
> Key: SOLR-12330
> URL: https://issues.apache.org/jira/browse/SOLR-12330
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.3
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-12330.patch, SOLR-12330.patch, SOLR-12330.patch
>
>
> Just encountered such weird behaviour, will recheck and follow up. 
> {{"filter":["\{!v=$bogus}"]}} responds back with just an NPE, which makes it 
> impossible to guess the reason.
> It might be even worse, since {{"filter":[\{"param":"bogus"}]}} seems to be 
> just silently ignored.
> Once again, I'll double check. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-12330) Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) either reported too little and even might be ignored

2019-01-28 Thread Munendra S N (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Munendra S N updated SOLR-12330:

Attachment: SOLR-12330.patch

> Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) 
> either reported too little and even might be ignored 
> ---
>
> Key: SOLR-12330
> URL: https://issues.apache.org/jira/browse/SOLR-12330
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.3
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-12330.patch, SOLR-12330.patch, SOLR-12330.patch
>
>
> Just encountered such weird behaviour, will recheck and follow up. 
> {{"filter":["\{!v=$bogus}"]}} responds back with just an NPE, which makes it 
> impossible to guess the reason.
> It might be even worse, since {{"filter":[\{"param":"bogus"}]}} seems to be 
> just silently ignored.
> Once again, I'll double check. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6687) MLT term frequency calculation bug

2019-01-28 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754584#comment-16754584
 ] 

Lucene/Solr QA commented on LUCENE-6687:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | 
{color:green}  0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | 
{color:green}  0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | 
{color:green}  0m 25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
25s{color} | {color:green} queries in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}  3m 49s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-6687 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956555/LUCENE-6687.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  
validatesourcepatterns  |
| uname | Linux lucene1-us-west 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon 
Sep 24 17:14:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
 |
| git revision | master / 8bee03f |
| ant | version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018 |
| Default Java | 1.8.0_191 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-LUCENE-Build/158/testReport/ |
| modules | C: lucene/queries U: lucene/queries |
| Console output | 
https://builds.apache.org/job/PreCommit-LUCENE-Build/158/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> MLT term frequency calculation bug
> --
>
> Key: LUCENE-6687
> URL: https://issues.apache.org/jira/browse/LUCENE-6687
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/query/scoring, core/queryparser
>Affects Versions: 5.2.1, 6.0
> Environment: OS X v10.10.4; Solr 5.2.1
>Reporter: Marko Bonaci
>Priority: Major
> Fix For: 5.2.2
>
> Attachments: LUCENE-6687.patch, LUCENE-6687.patch, LUCENE-6687.patch, 
> LUCENE-6687.patch, buggy-method-usage.png, 
> solr-mlt-tf-doubling-bug-results.png, 
> solr-mlt-tf-doubling-bug-verify-accumulator-mintf14.png, 
> solr-mlt-tf-doubling-bug-verify-accumulator-mintf15.png, 
> solr-mlt-tf-doubling-bug.png, terms-accumulator.png, terms-angry.png, 
> terms-glass.png, terms-how.png
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In {{org.apache.lucene.queries.mlt.MoreLikeThis}}, there's a method 
> {{retrieveTerms}} that receives a {{Map}} of fields, i.e. a document 
> basically, but it doesn't have to be an existing doc.
> !solr-mlt-tf-doubling-bug.png|height=500!
> There are 2 for loops, one inside the other, which both loop through the same 
> set of fields.
> That effectively doubles the term frequency for all the terms from fields 
> that we provide in the MLT QP {{qf}} parameter. 
> It basically goes over the list of fields twice and accumulates the term 
> frequencies from all fields into {{termFreqMap}} both times.
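> A self-contained toy showing that accumulation shape (made-up names; this is 
> not the actual Lucene code):
> {code:java}
> import java.util.*;
>
> public class TfDoublingDemo {
>   public static void main(String[] args) {
>     Map<String, String> fields = new LinkedHashMap<>();
>     fields.put("title", "angry glass");
>     fields.put("body", "glass how");
>     Map<String, Integer> termFreqMap = new HashMap<>();
>     // Buggy shape: the inner loop re-walks every field on each outer pass,
>     // so each term ends up counted fields.size() times instead of once.
>     for (String outer : fields.keySet()) {
>       for (String inner : fields.keySet()) {
>         for (String term : fields.get(inner).split(" ")) {
>           termFreqMap.merge(term, 1, Integer::sum);
>         }
>       }
>     }
>     System.out.println(termFreqMap); // every count doubled, e.g. glass=4
>   }
> }
> {code}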
> The private method {{retrieveTerms}} is only called from one public method, 
> the version of overloaded method {{like}} that receives a Map: so that 
> private class member {{fieldNames}} is always derived from 
> {{retrieveTerms}}'s argument {{fields}}.
>  
> Uh, I don't understand what I wrote myself, but that basically means that, by 
> the time {{retrieveTerms}} method gets called, its parameter fields and 
> private member {{fieldNames}} always contain the same list of fields.
> Here's the proof:
> These are the final results of the calculation:
> !solr-mlt-tf-doubling-bug-results.png|h
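
For illustration, here is a minimal, self-contained sketch of the double-counting pattern described above; the field names and frequencies are hypothetical, and this is not the actual {{MoreLikeThis}} code:

{code:java}
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the bug: the outer and inner loops iterate the same field list,
// so every field's term frequencies are accumulated twice.
public class DoubleCountSketch {
  public static void main(String[] args) {
    List<String> fields = Arrays.asList("title", "body");
    Map<String, Map<String, Integer>> tfPerField = new HashMap<>();
    tfPerField.put("title", new HashMap<>());
    tfPerField.get("title").put("solr", 1);
    tfPerField.put("body", new HashMap<>());
    tfPerField.get("body").put("solr", 3);

    Map<String, Integer> termFreqMap = new HashMap<>();
    for (String outer : fields) {     // outer loop over the field list
      for (String field : fields) {   // inner loop over the SAME list
        for (Map.Entry<String, Integer> e : tfPerField.get(field).entrySet()) {
          termFreqMap.merge(e.getKey(), e.getValue(), Integer::sum);
        }
      }
    }
    // Prints 8; the correct accumulated frequency is 4 (1 from title + 3 from body).
    System.out.println(termFreqMap.get("solr"));
  }
}
{code}
Iterating the field list exactly once yields the expected single accumulation.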

Re: wip branch pattern

2019-01-28 Thread Gus Heck
https://issues.apache.org/jira/browse/INFRA-17771

On Mon, Jan 28, 2019 at 8:26 PM David Smiley 
wrote:

> I suspect the move to gitbox may be a factor too. I suggest filing an
> infra ticket
> On Mon, Jan 28, 2019 at 6:15 PM Gus Heck  wrote:
>
>> I just created this branch
>> https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=shortlog;h=refs/heads/solr-13131
>> which seems to conform to the pattern, but it still generated mails. Did
>> the ignore pattern get lost in the move to gitbox?
>>
>> On Sat, Jan 26, 2019 at 9:43 PM Gus Heck  wrote:
>>
>>> Is the pattern described here:
>>> https://issues.apache.org/jira/browse/INFRA-11198 case sensitive?
>>>
>>> -Gus?
>>>
>>
>>
>> --
>> http://www.the111shift.com
>>
> --
> Lucene/Solr Search Committer (PMC), Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>


-- 
http://www.the111shift.com


[JENKINS] Lucene-Solr-Tests-master - Build # 3161 - Unstable

2019-01-28 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/3161/

2 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.handler.admin.AutoscalingHistoryHandlerTest

Error Message:
1 thread leaked from SUITE scope at 
org.apache.solr.handler.admin.AutoscalingHistoryHandlerTest: 1) 
Thread[id=6804, name=zkConnectionManagerCallback-1814-thread-1, state=WAITING, 
group=TGRP-AutoscalingHistoryHandlerTest] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
 at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) 
at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)   
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748)

Stack Trace:
com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE 
scope at org.apache.solr.handler.admin.AutoscalingHistoryHandlerTest: 
   1) Thread[id=6804, name=zkConnectionManagerCallback-1814-thread-1, 
state=WAITING, group=TGRP-AutoscalingHistoryHandlerTest]
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
at __randomizedtesting.SeedInfo.seed([DC6FBEBE97287ADA]:0)


FAILED:  
junit.framework.TestSuite.org.apache.solr.handler.admin.AutoscalingHistoryHandlerTest

Error Message:
There are still zombie threads that couldn't be terminated:1) 
Thread[id=6804, name=zkConnectionManagerCallback-1814-thread-1, state=WAITING, 
group=TGRP-AutoscalingHistoryHandlerTest] at 
sun.misc.Unsafe.park(Native Method) at 
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
 at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) 
at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)   
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748)

Stack Trace:
com.carrotsearch.randomizedtesting.ThreadLeakError: There are still zombie 
threads that couldn't be terminated:
   1) Thread[id=6804, name=zkConnectionManagerCallback-1814-thread-1, 
state=WAITING, group=TGRP-AutoscalingHistoryHandlerTest]
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
at __randomizedtesting.SeedInfo.seed([DC6FBEBE97287ADA]:0)




Build Log:
[...truncated 13054 lines...]
   [junit4] Suite: org.apache.solr.handler.admin.AutoscalingHistoryHandlerTest
   [junit4]   2> 809452 INFO  
(SUITE-AutoscalingHistoryHandlerTest-seed#[DC6FBEBE97287ADA]-worker) [] 
o.a.s.SolrTestCaseJ4 SecureRandom sanity checks: 
test.solr.allowed.securerandom=null & java.security.egd=file:/dev/./urandom
   [junit4]   2> Creating dataDir: 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Tests-master/solr/build/solr-core/test/J2/temp/solr.handler.admin.AutoscalingHistoryHandlerTest_DC6FBEBE97287ADA-001/init-core-data-001
   [junit4]   2> 809453 INFO  
(SUITE-AutoscalingHistoryHandlerTest-seed#[DC6FBEBE97287ADA]-worker) [] 
o.a.s.SolrTestCaseJ4 Using PointFields (NUMERIC_POINTS_SYSPROP=true) 
w/NUMERIC_DOCVALUES_SYSPROP=true
   [junit4]   2> 809468 INFO  
(SUITE-AutoscalingHistoryHandlerTest-seed#[DC6FBEBE97287ADA]-worker) [] 
o.a.s.SolrTestCaseJ4 Randomized 

[jira] [Comment Edited] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness

2019-01-28 Thread Michael Gibney (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754164#comment-16754164
 ] 

Michael Gibney edited comment on SOLR-13132 at 1/29/19 3:03 AM:


I've refined the earlier patch (implementing parallel facet count collection 
for sort-by-relatedness). For consideration, the [^SOLR-13132-with-cache.patch] 
also implements a per-segment (and top-level) cache of facet counts (and inline 
"missing" bucket collection, fwiw).

As described [in a more discursive blog 
post|https://michaelgibney.net/2019/01/solr-terms-skg-performance/], the facet 
cache is something that's been in the back of my mind for a while, but would 
have a particular impact on sort-by-relatedness with parallel facet count 
collection, so I modified an initial implementation from simple facets 
{{DocValuesFacets}} to make it compatible with JSON facets as well.

For my use case this yields anywhere from 5x-450x latency reduction for high- 
and even modestly-high-cardinality domain queries with sort-by-relatedness. 
Facet cache alone yields ~10x latency reduction for simple sort-by-count facets 
over common/cached high-cardinality domains (e.g., {{*:*}}). More detail (rough 
benchmarks, etc.) can be found in the blog post linked above.

To enable facet cache, in {{solrconfig.xml}}:
{code:xml}

{code}
(I realize the "facet cache" should probably be a separate issue, but given its 
particular relevance as a complement to this issue, I opted to include it in 
this patch. I hope that's ok ...)
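
As additional context, here is a minimal SolrJ sketch of the kind of request whose latency is at issue: a JSON "terms" facet sorted by the {{relatedness()}} (SKG) aggregation. The collection name, facet field, and foreground query below are made up for illustration:

{code:java}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.ModifiableSolrParams;

public class RelatednessFacetSketch {
  public static void main(String[] args) throws Exception {
    try (SolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/my_collection").build()) {
      ModifiableSolrParams p = new ModifiableSolrParams();
      p.set("q", "*:*");                // high-cardinality domain, e.g. the whole index
      p.set("rows", "0");
      p.set("fore", "subject_s:cats");  // foreground set (hypothetical)
      p.set("back", "*:*");             // background set
      // Sorting the terms facet by the relatedness aggregation forces
      // relatedness to be computed for every term in the field.
      p.set("json.facet",
          "{ terms_facet: { type: terms, field: keywords_s, limit: 10,"
        + "  sort: \"skg desc\","
        + "  facet: { skg: \"relatedness($fore,$back)\" } } }");
      QueryResponse rsp = client.query(p);
      System.out.println(rsp.getResponse().get("facets"));
    }
  }
}
{code}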


was (Author: mgibney):
I've refined the earlier patch (implementing parallel facet count collection 
for sort-by-relatedness). For consideration, the [new 
patch|^SOLR-13132-with-cache.patch] also implements a per-segment (and 
top-level) cache of facet counts (and inline "missing" bucket collection, fwiw).

As described [in a more discursive blog 
post|https://michaelgibney.net/2019/01/solr-terms-skg-performance/], the facet 
cache is something that's been in the back of my mind for a while, but would 
have a particular impact on sort-by-relatedness with parallel facet count 
collection, so I modified an initial implementation from simple facets 
{{DocValuesFacets}} to make it compatible with JSON facets as well.

FYI, for my (real-world) test use case this yields anywhere from 5x-450x 
latency reduction for high- and even modestly-high-cardinality domain queries 
with sort-by-relatedness. Facet cache alone yields ~10x latency reduction for 
simple sort-by-count facets over common/cached high-cardinality domains (e.g., 
{{*:*}}). More detail (rough benchmarks, etc.) can be found in the blog post 
linked above.

To enable facet cache, in {{solrconfig.xml}}:
{code:xml}

{code}
(I realize the "facet cache" should probably be a separate issue, but given its 
particular relevance as a complement to this issue, I opted to include it in 
this patch. I hope that's ok ...)

> Improve JSON "terms" facet performance when sorted by relatedness 
> --
>
> Key: SOLR-13132
> URL: https://issues.apache.org/jira/browse/SOLR-13132
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.4, master (9.0)
>Reporter: Michael Gibney
>Priority: Major
> Attachments: SOLR-13132-with-cache.patch, SOLR-13132.patch
>
>
> When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate 
> {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either 
> {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain 
> base docSet, and then uses that initial pass as a pre-filter for a 
> second-pass, inverted approach of fetching docSets for each relevant term 
> (i.e., {{count > minCount}}?) and calculating intersection size of those sets 
> with the domain base docSet.
> Over high-cardinality fields, the overhead of per-term docSet creation and 
> set intersection operations increases request latency to the point where 
> relatedness sort may not be usable in practice (for my use case, even after 
> applying the patch for SOLR-13108, for a field with ~220k unique terms per 
> core, QTime for high-cardinality domain docSets were, e.g.: cardinality 
> 1816684=9000ms, cardinality 5032902=18000ms).
> The attached patch brings the above example QTimes down to a manageable 
> ~300ms and ~250ms respectively. The approach calculates uninverted facet 
> counts over domain base, foreground, and background docSets in parallel in a 
> single pass. This allows us to take advantage of the efficiencies built into 
> the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}), and avoids 
> the per-term docSet cr

[jira] [Commented] (SOLR-13176) Testing of TLOG Replicas needs to be re-instated, may be hiding bugs

2019-01-28 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754521#comment-16754521
 ] 

Hoss Man commented on SOLR-13176:
-

Ping: [~tomasflobbe] / [~caomanhdat] ... i know you guys did a lot of this 
initial TLOG replica work and setup the randomized replica types in these tests.

> Testing of TLOG Replicas needs to be re-instated, may be hiding bugs
> 
>
> Key: SOLR-13176
> URL: https://issues.apache.org/jira/browse/SOLR-13176
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Priority: Major
>
> As part of Mark Miller's push to clean up tests, one change he made as part 
> of his _big_ SOLR-12801 commit (circa Nov 2018) was to disable the randomized 
> use of TLOG replicas in a lot of tests.
> His comments at the time were that he suspected a lot of the problems he was 
> seeing were due to a poor implementation of 
> {{TestInjection.waitForInSyncWithLeader()}} (which only comes into play for 
> TLOG replicas), ultimately leading to him creating SOLR-12313.
> But based on some limited experimentation I did with trying to re-enable TLOG 
> replica randomization in some tests after (essentially) removing 
> {{TestInjection.waitForInSyncWithLeader()}} in SOLR-13168, I'm still seeing a 
> lot of sporadic test failures when TLOG replicas get used... the only change 
> is that instead of "failing slow" because of the stalls introduced by 
> {{TestInjection.waitForInSyncWithLeader()}}, they started failing quickly.
> *It's not clear if these failures are because the tests have bugs; or if the 
> tests don't account for the expected behavior of the TLOG replica types in 
> certain situations; or if the code paths being tested have bugs when dealing 
> with TLOG replicas.*
> 
> Bottom line: As things stand today, TLOG replicas aren't being very 
> thoroughly tested, particularly in edge cases (http partitions, LIR, leader 
> election, mixed use of replica types, etc...)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13176) Testing of TLOG Replicas needs to be re-instated, may be hiding bugs

2019-01-28 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754517#comment-16754517
 ] 

Hoss Man commented on SOLR-13176:
-


A quick and dirty (non-exhaustive) list of just some of the places that tlog 
replicas were originally being tested but are not currently (typically because 
a randomized "boolean" is now hardcoded) ...

{noformat}
$ find solr/ -name \*.java | grep test | xargs egrep 
'SOLR-12313|waitForInSyncWithLeader|TODO:?\s*tlog'
solr/core/src/test/org/apache/solr/update/TestInPlaceUpdatesDistrib.java:
return false; // TODO: tlog replicas makes commits take way to long due to what 
is likely a bug and it's TestInjection use
solr/core/src/test/org/apache/solr/cloud/ForceLeaderTest.java:  // TODO: 
SOLR-12313 tlog replicas makes commits take way to long due to what is likely a 
bug and it's TestInjection use
solr/core/src/test/org/apache/solr/cloud/ChaosMonkeyNothingIsSafeWithPullReplicasTest.java:
return false; // TODO: tlog replicas makes commits take way to long due to 
what is likely a bug and it's TestInjection use
solr/core/src/test/org/apache/solr/cloud/TestTlogReplica.java:@AwaitsFix(bugUrl 
= "https://issues.apache.org/jira/browse/SOLR-12313";)
solr/core/src/test/org/apache/solr/cloud/HttpPartitionTest.java:return 
false; // TODO: tlog replicas makes commits take way to long due to what is 
likely a bug and it's TestInjection use
solr/core/src/test/org/apache/solr/cloud/ChaosMonkeySafeLeaderWithPullReplicasTest.java:
return false; // TODO: tlog replicas makes commits take way to long due to 
what is likely a bug and it's TestInjection use
solr/core/src/test/org/apache/solr/cloud/ReplaceNodeTest.java:// TODO: tlog 
replicas do not work correctly in tests due to fault 
TestInjection#waitForInSyncWithLeader
solr/core/src/test/org/apache/solr/cloud/ChaosMonkeyNothingIsSafeTest.java:
return false; // TODO: tlog replicas makes commits take way to long due to what 
is likely a bug and it's TestInjection use
solr/core/src/test/org/apache/solr/cloud/api/collections/ShardSplitTest.java:   
 CollectionAdminRequest.Create create = 
CollectionAdminRequest.createCollection(collectionName, "conf1", 1, 2, 0, 2); 
// TODO tlog replicas disabled right now.
solr/core/src/test/org/apache/solr/cloud/HttpPartitionOnCommitTest.java:
return false; // TODO: tlog replicas makes commits take way to long due to what 
is likely a bug and it's TestInjection use
solr/core/src/test/org/apache/solr/cloud/BasicDistributedZk2Test.java:
return false; // TODO: tlog replicas makes commits take way to long due to what 
is likely a bug and it's TestInjection use
solr/core/src/test/org/apache/solr/cloud/RecoveryAfterSoftCommitTest.java:
return false; // TODO: tlog replicas makes commits take way to long due to what 
is likely a bug and it's TestInjection use
solr/core/src/test/org/apache/solr/cloud/BasicDistributedZkTest.java:return 
false; // TODO: tlog replicas makes commits take way to long due to what is 
likely a bug and it's TestInjection use
solr/core/src/test/org/apache/solr/cloud/TestCloudRecovery.java:
tlogReplicas = 0; // onlyLeaderIndexes?2:0; TODO: SOLR-12313 tlog replicas 
break tests because
solr/core/src/test/org/apache/solr/cloud/TestCloudRecovery.java:
  // TestInjection#waitForInSyncWithLeader is broken

{noformat}

> Testing of TLOG Replicas needs to be re-instated, may be hiding bugs
> 
>
> Key: SOLR-13176
> URL: https://issues.apache.org/jira/browse/SOLR-13176
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Priority: Major
>
> As part of Mark Miller's push to clean up tests, one change he made as part 
> of his _big_ SOLR-12801 commit (circa Nov 2018) was to disable the randomized 
> use of TLOG replicas in a lot of tests.
> His comments at the time were that he suspected a lot of the problems he was 
> seeing were due to a poor implementation of 
> {{TestInjection.waitForInSyncWithLeader()}} (which only comes into play for 
> TLOG replicas), ultimately leading to him creating SOLR-12313.
> But based on some limited experimentation I did with trying to re-enable TLOG 
> replica randomization in some tests after (essentially) removing 
> {{TestInjection.waitForInSyncWithLeader()}} in SOLR-13168, I'm still seeing a 
> lot of sporadic test failures when TLOG replicas get used... the only change 
> is that instead of "failing slow" because of the stalls introduced by 
> {{TestInjection.waitForInSyncWithLeader()}}, they started failing quickly.
> *It's not clear if these failures are because the tests have bugs; or if the 
> tests don't account for the expected behavior of the TLOG replica types in 
> certa

[jira] [Created] (SOLR-13176) Testing of TLOG Replicas needs to be re-instated, may be hiding bugs

2019-01-28 Thread Hoss Man (JIRA)
Hoss Man created SOLR-13176:
---

 Summary: Testing of TLOG Replicas needs to be re-instated, may be 
hiding bugs
 Key: SOLR-13176
 URL: https://issues.apache.org/jira/browse/SOLR-13176
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Hoss Man


As part of Mark Miller's push to clean up tests, one change he made as part of 
his _big_ SOLR-12801 commit (circa Nov 2018) was to disable the randomized use 
of TLOG replicas in a lot of tests.

His comments at the time were that he suspected a lot of the problems he was 
seeing were due to a poor implementation of 
{{TestInjection.waitForInSyncWithLeader()}} (which only comes into play for 
TLOG replicas), ultimately leading to him creating SOLR-12313.

But based on some limited experimentation I did with trying to re-enable TLOG 
replica randomization in some tests after (essentially) removing 
{{TestInjection.waitForInSyncWithLeader()}} in SOLR-13168, I'm still seeing a 
lot of sporadic test failures when TLOG replicas get used... the only change is 
that instead of "failing slow" because of the stalls introduced by 
{{TestInjection.waitForInSyncWithLeader()}}, they started failing quickly.

*It's not clear if these failures are because the tests have bugs; or if the 
tests don't account for the expected behavior of the TLOG replica types in 
certain situations; or if the code paths being tested have bugs when dealing 
with TLOG replicas.*


Bottom line: As things stand today, TLOG replicas aren't being very thoroughly 
tested, particularly in edge cases (http partitions, LIR, leader election, 
mixed use of replica types, etc...)




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-SmokeRelease-8.x - Build # 13 - Failure

2019-01-28 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-8.x/13/

No tests ran.

Build Log:
[...truncated 23464 lines...]
[asciidoctor:convert] asciidoctor: ERROR: about-this-guide.adoc: line 1: 
invalid part, must have at least one section (e.g., chapter, appendix, etc.)
[asciidoctor:convert] asciidoctor: ERROR: solr-glossary.adoc: line 1: invalid 
part, must have at least one section (e.g., chapter, appendix, etc.)
 [java] Processed 2467 links (2018 relative) to 3229 anchors in 247 files
 [echo] Validated Links & Anchors via: 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/solr/build/solr-ref-guide/bare-bones-html/

-dist-changes:
 [copy] Copying 4 files to 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/solr/package/changes

package:

-unpack-solr-tgz:

-ensure-solr-tgz-exists:
[mkdir] Created dir: 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/solr/build/solr.tgz.unpacked
[untar] Expanding: 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/solr/package/solr-8.0.0.tgz
 into 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/solr/build/solr.tgz.unpacked

generate-maven-artifacts:

resolve:

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-8.x/lucene/top-level-ivy-settings.xml

resolve:

ivy-availability-check:
[loadresource] Do not set property disallowed.ivy.jars.list as its length is 0.

-ivy-fail-disallowed-ivy-version:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: 

Re: wip branch pattern

2019-01-28 Thread David Smiley
I suspect the move to gitbox may be a factor too. I suggest filing an infra
ticket
On Mon, Jan 28, 2019 at 6:15 PM Gus Heck  wrote:

> I just created this branch
> https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=shortlog;h=refs/heads/solr-13131
> which seems to conform to the pattern, but it still generated mails. Did
> the ignore pattern get lost in the move to gitbox?
>
> On Sat, Jan 26, 2019 at 9:43 PM Gus Heck  wrote:
>
>> Is the pattern described here:
>> https://issues.apache.org/jira/browse/INFRA-11198 case sensitive?
>>
>> -Gus?
>>
>
>
> --
> http://www.the111shift.com
>
-- 
Lucene/Solr Search Committer (PMC), Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


[jira] [Commented] (SOLR-13175) Rollup does not roll up NULL values.

2019-01-28 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754485#comment-16754485
 ] 

Erick Erickson commented on SOLR-13175:
---

This appears to be because NULL and zero are treated equivalently.

> Rollup does not roll up NULL values.
> 
>
> Key: SOLR-13175
> URL: https://issues.apache.org/jira/browse/SOLR-13175
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Priority: Major
>
> Steps to reproduce:
> - Create a 2-shard collection for sample tech products configs and index all 
> the example documents to it (I always use "eoe" as my collection ;) )
> - Submit this expression:
> {code}
> rollup(
>   search(eoe, q="*:*", fl="popularity", qt="/export", sort="popularity 
> asc"),
> over="popularity",count(*))
> {code}
> Results: Note that the "NULL" bucket is repeated (15 and 2), presumably once 
> from each shard, rather than being rolled up into a single bucket of 17. The 
> rest of the rollup is aggregated appropriately.
> {code}
> {
>   "result-set": {
> "docs": [
>   {
> "count(*)": 15,
> "popularity": "NULL"
>   },
>   {
> "count(*)": 1,
> "popularity": 0
>   },
>   {
> "count(*)": 2,
> "popularity": "NULL"
>   },
>   {
> "count(*)": 2,
> "popularity": 1
>   },
> etc.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-13175) Rollup does not roll up NULL values.

2019-01-28 Thread Erick Erickson (JIRA)
Erick Erickson created SOLR-13175:
-

 Summary: Rollup does not roll up NULL values.
 Key: SOLR-13175
 URL: https://issues.apache.org/jira/browse/SOLR-13175
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Erick Erickson


Steps to reproduce:
- Create a 2-shard collection for sample tech products configs and index all 
the example documents to it (I always use "eoe" as my collection ;) )
- Submit this expression:
{code}
rollup(
  search(eoe, q="*:*", fl="popularity", qt="/export", sort="popularity 
asc"),
over="popularity",count(*))
{code}

Results: Note that the "NULL" bucket is repeated (15 and 2), presumably once 
from each shard, rather than being rolled up into a single bucket of 17. The 
rest of the rollup is aggregated appropriately.

{code}
{
  "result-set": {
"docs": [
  {
"count(*)": 15,
"popularity": "NULL"
  },
  {
"count(*)": 1,
"popularity": 0
  },
  {
"count(*)": 2,
"popularity": "NULL"
  },
  {
"count(*)": 2,
"popularity": 1
  },
etc.
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754458#comment-16754458
 ] 

Mark Miller commented on LUCENE-8662:
-

[~jpountz] any thoughts on this one?

> Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum
> 
>
> Key: LUCENE-8662
> URL: https://issues.apache.org/jira/browse/LUCENE-8662
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 5.5.5, 6.6.5, 7.6, 8.0
>Reporter: jefferyyuan
>Priority: Major
>  Labels: query
> Fix For: 8.0, 7.7
>
> Attachments: output of test program.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Recently in our production, we found that Solr uses a lot of memory (more 
> than 10 GB) during recovery or commit for a small index (3.5 GB).
>  The stack trace is:
>  
> {code:java}
> Thread 0x4d4b115c0 
>   at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
>   at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
> (SegmentTermsEnumFrame.java:157) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:786) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:538) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnum.java:757) 
>   at 
> org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (FilterLeafReader.java:185) 
>   at 
> org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z
>  (TermsEnum.java:74) 
>   at 
> org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
>  (SolrIndexSearcher.java:823) 
>   at 
> org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:204) 
>   at 
> org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (UpdateLog.java:786) 
>   at 
> org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:194) 
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
>  (DistributedUpdateProcessor.java:1051)  
> {code}
> We reproduced the problem locally with the following code, using the Lucene API directly.
> {code:java}
> public static void main(String[] args) throws IOException {
>   FSDirectory index = FSDirectory.open(Paths.get("the-index"));
>   try (IndexReader reader = new   
> ExitableDirectoryReader(DirectoryReader.open(index),
> new QueryTimeoutImpl(1000 * 60 * 5))) {
> String id = "the-id";
> BytesRef text = new BytesRef(id);
> for (LeafReaderContext lf : reader.leaves()) {
>   TermsEnum te = lf.reader().terms("id").iterator();
>   System.out.println(te.seekExact(text));
> }
>   }
> }
> {code}
>  
> I added System.out.println("ord: " + ord); in 
> codecs.blocktree.SegmentTermsEnum.getFrame(int).
> Please check the attached output of test program.txt. 
>  
> We found the root cause:
> we didn't implement the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum, so it uses the base class 
> TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in 
> this case.
> {code:java}
> public boolean seekExact(BytesRef text) throws IOException {
>   return seekCeil(text) == SeekStatus.FOUND;
> }
> {code}
> The fix is simple: just override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum
> {code:java}
> @Override
> public boolean seekExact(BytesRef text) throws IOException {
>   return in.seekExact(text);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-BadApples-Tests-8.x - Build # 13 - Still Unstable

2019-01-28 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-BadApples-Tests-8.x/13/

1 tests failed.
FAILED:  org.apache.solr.client.solrj.TestLBHttpSolrClient.testReliability

Error Message:
No live SolrServers available to handle this request

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request
at 
__randomizedtesting.SeedInfo.seed([6F8B56201AC6F5E9:AE438B66BBA02440]:0)
at 
org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:669)
at 
org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:581)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:207)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:983)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:998)
at 
org.apache.solr.client.solrj.TestLBHttpSolrClient.testReliability(TestLBHttpSolrClient.java:221)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)

[jira] [Commented] (SOLR-13131) Category Routed Aliases

2019-01-28 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754439#comment-16754439
 ] 

Gus Heck commented on SOLR-13131:
-

Started a feature branch for this group of tickets: 
https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=shortlog;h=refs/heads/solr-13131

> Category Routed Aliases
> ---
>
> Key: SOLR-13131
> URL: https://issues.apache.org/jira/browse/SOLR-13131
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: master (9.0)
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Major
>
> This ticket is to add a second type of routed alias in addition to the 
> current time routed aliases. The new type of alias will allow data driven 
> creation of collections based on the values of a field and automated 
> organization of these collections under an alias that allows the collections 
> to also be searched as a whole.
> The use case in mind at present is IoT device-type segregation, but I could 
> also see this leading to the ability to direct updates to tenant-specific 
> hardware (in cooperation with autoscaling). 
> This ticket also looks forward to (but does not include) the creation of a 
> Dimensionally Routed Alias, which would allow organizing time routed data 
> also segregated by device.
> Further design details to be added in comments.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13148) Move time based logic into TimeRoutedAlias class

2019-01-28 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754442#comment-16754442
 ] 

Gus Heck commented on SOLR-13148:
-

Pushed a feature branch with most of the suggested changes, but I changed the 
way the factory worked to make it a regular static method, which seemed to make 
it easier to read/maintain (IMHO). Also adjusted references to TimeRoutedAlias 
in Collections Handler. That could possibly have been left for SOLR-13152, but 
I wanted to move to non-public constructors on implementations of RoutedAlias 
to enforce the use of the factory. I think all the major work here is done. 
Going to call this ticket done and move on to the next one.
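
For illustration, a minimal sketch of the design choice described above (a regular static factory plus non-public constructors); the names and the dispatch condition are hypothetical, not the actual Solr code:

{code:java}
import java.util.Map;

// Package-private constructors force callers through the static factory, so
// there is exactly one place that decides which RoutedAlias type to build.
interface RoutedAlias {
  static RoutedAlias fromProps(String aliasName, Map<String, String> props) {
    // Hypothetical dispatch: pick the implementation from the alias properties.
    if (props.containsKey("router.start")) {
      return new TimeRoutedAlias(aliasName, props);
    }
    return new CategoryRoutedAlias(aliasName, props);
  }
}

class TimeRoutedAlias implements RoutedAlias {
  TimeRoutedAlias(String aliasName, Map<String, String> props) {
    // package-private: only reachable via RoutedAlias.fromProps(...)
  }
}

class CategoryRoutedAlias implements RoutedAlias {
  CategoryRoutedAlias(String aliasName, Map<String, String> props) {
    // package-private: only reachable via RoutedAlias.fromProps(...)
  }
}
{code}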

> Move time based logic into TimeRoutedAlias class
> 
>
> Key: SOLR-13148
> URL: https://issues.apache.org/jira/browse/SOLR-13148
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: UpdateRequestProcessors
>Affects Versions: master (9.0)
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Major
> Attachments: SOLR-13148.patch, SOLR-13148.patch, SOLR-13148.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To pave the way for new types of routed aliases, we need to get any and all 
> time-related logic out of the URP and into TimeRoutedAlias. This ticket will 
> do that, rename the URP, and extract an initial proposed generic RoutedAlias 
> interface implemented by both TimeRoutedAlias and a skeleton placeholder for 
> CategoryRoutedAlias.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: wip branch pattern

2019-01-28 Thread Gus Heck
I just created this branch
https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=shortlog;h=refs/heads/solr-13131
which seems to conform to the pattern, but it still generated mails. Did
the ignore pattern get lost in the move to gitbox?

On Sat, Jan 26, 2019 at 9:43 PM Gus Heck  wrote:

> Is the pattern described here:
> https://issues.apache.org/jira/browse/INFRA-11198 case sensitive?
>
> -Gus?
>


-- 
http://www.the111shift.com


[jira] [Updated] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jefferyyuan updated LUCENE-8662:

Description: 
Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
 The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (UpdateLog.java:786) 
  at 
org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:194) 
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
 (DistributedUpdateProcessor.java:1051)  
{code}
We reproduced the problem locally with the following code, using the Lucene API directly.
{code:java}
public static void main(String[] args) throws IOException {
  FSDirectory index = FSDirectory.open(Paths.get("the-index"));
  try (IndexReader reader = new   
ExitableDirectoryReader(DirectoryReader.open(index),
new QueryTimeoutImpl(1000 * 60 * 5))) {
String id = "the-id";

BytesRef text = new BytesRef(id);
for (LeafReaderContext lf : reader.leaves()) {
  TermsEnum te = lf.reader().terms("id").iterator();
  System.out.println(te.seekExact(text));
}
  }
}
{code}
 

 I added System.out.println("ord: " + ord); in 
codecs.blocktree.SegmentTermsEnum.getFrame(int).

Please check the attached output of test program.txt. 

 

We found the root cause:

we didn't implement the seekExact(BytesRef) method in 
FilterLeafReader.FilterTermsEnum, so it uses the base class 
TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in 
this case.
{code:java}
public boolean seekExact(BytesRef text) throws IOException {
  return seekCeil(text) == SeekStatus.FOUND;
}
{code}
The fix is simple: just override the seekExact(BytesRef) method in 
FilterLeafReader.FilterTermsEnum
{code:java}
@Override
public boolean seekExact(BytesRef text) throws IOException {
  return in.seekExact(text);
}
{code}

  was:
Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
 The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/ap

[jira] [Updated] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jefferyyuan updated LUCENE-8662:

Description: 
Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
 The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (UpdateLog.java:786) 
  at 
org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:194) 
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
 (DistributedUpdateProcessor.java:1051)  
{code}
We reproduced the problem locally with the following code, using the Lucene API directly.
{code:java}
public static void main(String[] args) throws IOException {
  FSDirectory index = FSDirectory.open(Paths.get("the-index"));
  try (IndexReader reader = new   
ExitableDirectoryReader(DirectoryReader.open(index),
new QueryTimeoutImpl(1000 * 60 * 5))) {
String id = "the-id";

BytesRef text = new BytesRef(id);
for (LeafReaderContext lf : reader.leaves()) {
  TermsEnum te = lf.reader().terms("id").iterator();
  System.out.println(te.seekExact(text));
}
  }
}
{code}
 

Please check the attached 

We found the root cause:

we didn't implement the seekExact(BytesRef) method in 
FilterLeafReader.FilterTermsEnum, so it uses the base class 
TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in 
this case.
{code:java}
public boolean seekExact(BytesRef text) throws IOException {
  return seekCeil(text) == SeekStatus.FOUND;
}
{code}
The fix is simple: just override the seekExact(BytesRef) method in 
FilterLeafReader.FilterTermsEnum
{code:java}
@Override
public boolean seekExact(BytesRef text) throws IOException {
  return in.seekExact(text);
}
{code}

  was:
Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
 The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (UpdateLog.java:786) 
  at 
org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apa

[jira] [Comment Edited] (SOLR-7672) introduce implicit _parent_:true

2019-01-28 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754413#comment-16754413
 ] 

David Smiley edited comment on SOLR-7672 at 1/28/19 10:29 PM:
--

How so [~mkhludnev]?  I _wish_ it were so.  The recent improvements to nested 
docs in Solr have yet to touch the block join queries.  In 8.0, I think these 
query parsers could assume {{which:-\_nest_path_:*}} but only if this field is 
defined.  This approach is taken with the updated child doc transformer.


was (Author: dsmiley):
How so [~mkhludnev]?  I _wish_ it were so.  The recent improvements to nested 
docs in Solr have yet to touch the block join queries.  In 8.0, I think these 
query parsers could assume {{parent:-\_nest_path_:*}} but only if this field is 
defined.  This approach is taken with the updated child doc transformer.

> introduce implicit _parent_:true  
> --
>
> Key: SOLR-7672
> URL: https://issues.apache.org/jira/browse/SOLR-7672
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers, update
>Affects Versions: 5.2
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
>
> Solr provides block join support in a non-invasive manner. It turns out this 
> gives users a chance to shoot themselves in the foot. As advised by 
> [~thetaphi] at SOLR-7606, let AddUpdateCommand add a _parent_:true field to 
> the document (not to children). Do it *always*, no matter whether it has 
> children or not.
> Also, introduce default values for the block join qparsers \{!parent 
> *which=\_parent\_:true*} and \{!child *of=\_parent\_:true*} (sometimes I'd 
> rather hide them from the user, because they are misunderstood quite often). 
>  
> Please share your concerns and vote.
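
For readers less familiar with these parsers, a minimal SolrJ sketch of a parent block join query using the proposed marker field; the collection and child field names are made up:

{code:java}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.params.ModifiableSolrParams;

public class BlockJoinDefaultSketch {
  public static void main(String[] args) throws Exception {
    try (SolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/my_collection").build()) {
      ModifiableSolrParams p = new ModifiableSolrParams();
      // With an implicit _parent_:true marker on every parent document, the
      // "which" filter no longer depends on a user-maintained field and could
      // become the parser's default.
      p.set("q", "{!parent which=_parent_:true}comment_t:great");
      System.out.println("matching parents: "
          + client.query(p).getResults().getNumFound());
    }
  }
}
{code}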



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7672) introduce implicit _parent_:true

2019-01-28 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754413#comment-16754413
 ] 

David Smiley commented on SOLR-7672:


How so [~mkhludnev]?  I _wish_ it were so.  The recent improvements to nested 
docs in Solr have yet to touch the block join queries.  In 8.0, I think these 
query parsers could assume {{parent:-\_nest_path_:*}} but only if this field is 
defined.  This approach is taken with the updated child doc transformer.

> introduce implicit _parent_:true  
> --
>
> Key: SOLR-7672
> URL: https://issues.apache.org/jira/browse/SOLR-7672
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers, update
>Affects Versions: 5.2
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
>
> Solr provides block join support in a non-invasive manner. It turns out this 
> gives users a chance to shoot themselves in the foot. As advised by 
> [~thetaphi] at SOLR-7606, let AddUpdateCommand add a _parent_:true field to 
> the document (not to children). Do it *always*, no matter whether it has 
> children or not.
> Also, introduce default values for the block join qparsers \{!parent 
> *which=\_parent\_:true*} and \{!child *of=\_parent\_:true*} (sometimes I'd 
> rather hide them from the user, because they are misunderstood quite often). 
>  
> Please share your concerns and vote.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jefferyyuan updated LUCENE-8662:

Description: 
Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
 The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (UpdateLog.java:786) 
  at 
org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:194) 
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
 (DistributedUpdateProcessor.java:1051)  
{code}
We reproduced the problem locally with the following code, using the Lucene API directly.
{code:java}
public static void main(String[] args) throws IOException {
  FSDirectory index = FSDirectory.open(Paths.get("the-index"));
  try (IndexReader reader = new   
ExitableDirectoryReader(DirectoryReader.open(index),
new QueryTimeoutImpl(1000 * 60 * 5))) {
String id = "the-id";

BytesRef text = new BytesRef(id);
for (LeafReaderContext lf : reader.leaves()) {
  TermsEnum te = lf.reader().terms("id").iterator();
  System.out.println(te.seekExact(text));
}
  }
}
{code}
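For anyone who wants to run the reproduction end to end, here is the same 
snippet as a self-contained class (a sketch: the index path and the "id" field 
are the placeholders from above):
{code:java}
import java.io.IOException;
import java.nio.file.Paths;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.ExitableDirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.QueryTimeoutImpl;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;

public class SeekExactRepro {
  public static void main(String[] args) throws IOException {
    FSDirectory index = FSDirectory.open(Paths.get("the-index"));
    // ExitableDirectoryReader wraps every TermsEnum in a FilterTermsEnum,
    // which is what routes seekExact through the slow seekCeil fallback.
    try (IndexReader reader = new ExitableDirectoryReader(
        DirectoryReader.open(index), new QueryTimeoutImpl(1000 * 60 * 5))) {
      BytesRef text = new BytesRef("the-id");
      for (LeafReaderContext lf : reader.leaves()) {
        // assumes every segment has an indexed "id" field
        TermsEnum te = lf.reader().terms("id").iterator();
        System.out.println(te.seekExact(text));
      }
    }
  }
}
{code}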
 

I added System.out.println("ord: " + ord); in 
codecs.blocktree.SegmentTermsEnum.getFrame(int).

Please check the attached "output of test program.txt". 

 

We found out the root cause:

we didn't override the seekExact(BytesRef) method in 
FilterLeafReader.FilterTermsEnum, so it uses the base class 
TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in this 
case.
{code:java}
public boolean seekExact(BytesRef text) throws IOException {
  return seekCeil(text) == SeekStatus.FOUND;
}
{code}
The fix is simple: just override the seekExact(BytesRef) method in 
FilterLeafReader.FilterTermsEnum:
{code:java}
@Override
public boolean seekExact(BytesRef text) throws IOException {
  return in.seekExact(text);
}
{code}
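One hedged caveat (our note, not part of the patch): delegation is only safe 
because FilterTermsEnum exposes the wrapped enum's term space unchanged. A 
subclass that hides or transforms terms must keep the seekCeil-based behavior, 
roughly:
{code:java}
// Hypothetical subclass that filters out some terms; it must NOT delegate,
// because the wrapped enum would accept terms this view excludes.
@Override
public boolean seekExact(BytesRef text) throws IOException {
  return seekCeil(text) == SeekStatus.FOUND;
}
{code}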

  was:
Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
 The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apa

[jira] [Updated] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jefferyyuan updated LUCENE-8662:

Attachment: output of test program.txt

> Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum
> 
>
> Key: LUCENE-8662
> URL: https://issues.apache.org/jira/browse/LUCENE-8662
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 5.5.5, 6.6.5, 7.6, 8.0
>Reporter: jefferyyuan
>Priority: Major
>  Labels: query
> Fix For: 8.0, 7.7
>
> Attachments: output of test program.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Recently in our production, we found that Solr uses a lot of memory (more than 
> 10 GB) during recovery or commit for a small index (3.5 GB).
>  The stack trace is:
>  
> {code:java}
> Thread 0x4d4b115c0 
>   at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
>   at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
> (SegmentTermsEnumFrame.java:157) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:786) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:538) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnum.java:757) 
>   at 
> org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (FilterLeafReader.java:185) 
>   at 
> org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z
>  (TermsEnum.java:74) 
>   at 
> org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
>  (SolrIndexSearcher.java:823) 
>   at 
> org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:204) 
>   at 
> org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (UpdateLog.java:786) 
>   at 
> org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:194) 
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
>  (DistributedUpdateProcessor.java:1051)  
> {code}
> We reproduced the problem locally with the following code, using the Lucene API directly.
> {code:java}
> public static void main(String[] args) throws IOException {
>   FSDirectory index = FSDirectory.open(Paths.get("the-index"));
>   try (IndexReader reader = new   
> ExitableDirectoryReader(DirectoryReader.open(index),
> new QueryTimeoutImpl(1000 * 60 * 5))) {
> String id = "the-id";
> BytesRef text = new BytesRef(id);
> for (LeafReaderContext lf : reader.leaves()) {
>   TermsEnum te = lf.reader().terms("id").iterator();
>   System.out.println(te.seekExact(text));
> }
>   }
> }
> {code}
>  
>  I added System.out.println("ord: " + ord); in 
> codecs.blocktree.SegmentTermsEnum.getFrame(int).
> Please check the attached "output of test program.txt". 
>  
> We found out the root cause:
> we didn't override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum, so it uses the base class 
> TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in 
> this case.
> {code:java}
> public boolean seekExact(BytesRef text) throws IOException {
>   return seekCeil(text) == SeekStatus.FOUND;
> }
> {code}
> The fix is simple: just override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum:
> {code:java}
> @Override
> public boolean seekExact(BytesRef text) throws IOException {
>   return in.seekExact(text);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jefferyyuan updated LUCENE-8662:

Attachment: (was: output of test program.txt)

> Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum
> 
>
> Key: LUCENE-8662
> URL: https://issues.apache.org/jira/browse/LUCENE-8662
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 5.5.5, 6.6.5, 7.6, 8.0
>Reporter: jefferyyuan
>Priority: Major
>  Labels: query
> Fix For: 8.0, 7.7
>
> Attachments: output of test program.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Recently in our production, we found that Solr uses a lot of memory (more than 
> 10 GB) during recovery or commit for a small index (3.5 GB).
>  The stack trace is:
>  
> {code:java}
> Thread 0x4d4b115c0 
>   at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
>   at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
> (SegmentTermsEnumFrame.java:157) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:786) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:538) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnum.java:757) 
>   at 
> org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (FilterLeafReader.java:185) 
>   at 
> org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z
>  (TermsEnum.java:74) 
>   at 
> org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
>  (SolrIndexSearcher.java:823) 
>   at 
> org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:204) 
>   at 
> org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (UpdateLog.java:786) 
>   at 
> org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:194) 
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
>  (DistributedUpdateProcessor.java:1051)  
> {code}
> We reproduced the problem locally with the following code, using the Lucene API directly.
> {code:java}
> public static void main(String[] args) throws IOException {
>   FSDirectory index = FSDirectory.open(Paths.get("the-index"));
>   try (IndexReader reader = new   
> ExitableDirectoryReader(DirectoryReader.open(index),
> new QueryTimeoutImpl(1000 * 60 * 5))) {
> String id = "the-id";
> BytesRef text = new BytesRef(id);
> for (LeafReaderContext lf : reader.leaves()) {
>   TermsEnum te = lf.reader().terms("id").iterator();
>   System.out.println(te.seekExact(text));
> }
>   }
> }
> {code}
>  
>  I added System.out.println("ord: " + ord); in 
> codecs.blocktree.SegmentTermsEnum.getFrame(int).
> Please check the attached "output of test program.txt". 
>  
> We found out the root cause:
> we didn't override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum, so it uses the base class 
> TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in 
> this case.
> {code:java}
> public boolean seekExact(BytesRef text) throws IOException {
>   return seekCeil(text) == SeekStatus.FOUND;
> }
> {code}
> The fix is simple: just override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum:
> {code:java}
> @Override
> public boolean seekExact(BytesRef text) throws IOException {
>   return in.seekExact(text);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jefferyyuan updated LUCENE-8662:

Attachment: output of test program.txt

> Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum
> 
>
> Key: LUCENE-8662
> URL: https://issues.apache.org/jira/browse/LUCENE-8662
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 5.5.5, 6.6.5, 7.6, 8.0
>Reporter: jefferyyuan
>Priority: Major
>  Labels: query
> Fix For: 8.0, 7.7
>
> Attachments: output of test program.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Recently in our production, we found that Solr uses a lot of memory (more than 
> 10 GB) during recovery or commit for a small index (3.5 GB).
>  The stack trace is:
>  
> {code:java}
> Thread 0x4d4b115c0 
>   at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
>   at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
> (SegmentTermsEnumFrame.java:157) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:786) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:538) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnum.java:757) 
>   at 
> org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (FilterLeafReader.java:185) 
>   at 
> org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z
>  (TermsEnum.java:74) 
>   at 
> org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
>  (SolrIndexSearcher.java:823) 
>   at 
> org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:204) 
>   at 
> org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (UpdateLog.java:786) 
>   at 
> org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:194) 
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
>  (DistributedUpdateProcessor.java:1051)  
> {code}
> We reproduced the problem locally with the following code, using the Lucene API directly.
> {code:java}
> public static void main(String[] args) throws IOException {
>   FSDirectory index = FSDirectory.open(Paths.get("the-index"));
>   try (IndexReader reader = new   
> ExitableDirectoryReader(DirectoryReader.open(index),
> new QueryTimeoutImpl(1000 * 60 * 5))) {
> String id = "the-id";
> BytesRef text = new BytesRef(id);
> for (LeafReaderContext lf : reader.leaves()) {
>   TermsEnum te = lf.reader().terms("id").iterator();
>   System.out.println(te.seekExact(text));
> }
>   }
> }
> {code}
>  
>  
> We found out the root cause:
> we didn't override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum, so it uses the base class 
> TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in 
> this case.
> {code:java}
> public boolean seekExact(BytesRef text) throws IOException {
>   return seekCeil(text) == SeekStatus.FOUND;
> }
> {code}
> The fix is simple: just override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum:
> {code:java}
> @Override
> public boolean seekExact(BytesRef text) throws IOException {
>   return in.seekExact(text);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jefferyyuan updated LUCENE-8662:

Description: 
Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
 The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (UpdateLog.java:786) 
  at 
org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:194) 
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
 (DistributedUpdateProcessor.java:1051)  
{code}
We reproduced the problem locally with the following code, using the Lucene API directly.
{code:java}
public static void main(String[] args) throws IOException {
  FSDirectory index = FSDirectory.open(Paths.get("the-index"));
  try (IndexReader reader = new   
ExitableDirectoryReader(DirectoryReader.open(index),
new QueryTimeoutImpl(1000 * 60 * 5))) {
String id = "the-id";

BytesRef text = new BytesRef(id);
for (LeafReaderContext lf : reader.leaves()) {
  TermsEnum te = lf.reader().terms("id").iterator();
  System.out.println(te.seekExact(text));
}
  }
}
{code}
 

 

We found out the root cause:

we didn't override the seekExact(BytesRef) method in 
FilterLeafReader.FilterTermsEnum, so it uses the base class 
TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in this 
case.
{code:java}
public boolean seekExact(BytesRef text) throws IOException {
  return seekCeil(text) == SeekStatus.FOUND;
}
{code}
The fix is simple: just override the seekExact(BytesRef) method in 
FilterLeafReader.FilterTermsEnum:
{code:java}
@Override
public boolean seekExact(BytesRef text) throws IOException {
  return in.seekExact(text);
}
{code}

  was:
Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
 The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (UpdateLog.java:786) 
  at 
org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef

[jira] [Comment Edited] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754405#comment-16754405
 ] 

jefferyyuan edited comment on LUCENE-8662 at 1/28/19 10:06 PM:
---

At https://issues.apache.org/jira/browse/LUCENE-4874:
 - Don't override non-abstract methods that have an implementation through other 
abstract methods in FilterAtomicReader and related classes
 
[https://github.com/apache/lucene-solr/commit/9588a84dec9fe5da210a9210cb0efbe3221c9f9e]

 
 - Should we add an exception for seekExact(BytesRef) in 
FilterLeafReader.FilterTermsEnum due to this performance issue?
 - FilterLeafReader.FilterTermsEnum delegates all calls to its wrapped TermsEnum 
field {{in}}, so it seems to make sense to override seekExact(BytesRef) there.
  


was (Author: yuanyun.cn):
At https://issues.apache.org/jira/browse/LUCENE-4874:

- Don't override non-abstract methods that have an implementation through other 
abstract methods in FilterAtomicReader and related classes
[https://github.com/apache/lucene-solr/commit/9588a84dec9fe5da210a9210cb0efbe3221c9f9e]
 
 
- Should we add an exception for seekExact(BytesRef) in 
FilterLeafReader.FilterTermsEnum due to this performance issue?
- FilterLeafReader.FilterTermsEnum delegates all calls to its wrapped TermsEnum 
field {{in}}, so it seems to make sense to override seekExact(BytesRef) there.
 

> Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum
> 
>
> Key: LUCENE-8662
> URL: https://issues.apache.org/jira/browse/LUCENE-8662
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 5.5.5, 6.6.5, 7.6, 8.0
>Reporter: jefferyyuan
>Priority: Major
>  Labels: query
> Fix For: 8.0, 7.7
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Recently in our production, we found that Solr uses a lot of memory (more than 
> 10 GB) during recovery or commit for a small index (3.5 GB).
>  The stack trace is:
>  
> {code:java}
> Thread 0x4d4b115c0 
>   at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
>   at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
> (SegmentTermsEnumFrame.java:157) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:786) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:538) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnum.java:757) 
>   at 
> org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (FilterLeafReader.java:185) 
>   at 
> org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z
>  (TermsEnum.java:74) 
>   at 
> org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
>  (SolrIndexSearcher.java:823) 
>   at 
> org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:204) 
>   at 
> org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (UpdateLog.java:786) 
>   at 
> org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:194) 
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
>  (DistributedUpdateProcessor.java:1051)  
> {code}
> We reproduced the problem locally with the following code, using the Lucene API directly.
> {code:java}
> public static void main(String[] args) throws IOException {
>   FSDirectory index = FSDirectory.open(Paths.get("the-index"));
>   try (IndexReader reader = new   
> ExitableDirectoryReader(DirectoryReader.open(index),
> new QueryTimeoutImpl(1000 * 60 * 5))) {
> String id = "the-id";
> BytesRef text = new BytesRef(id);
> for (LeafReaderContext lf : reader.leaves()) {
>   TermsEnum te = lf.reader().terms("id").iterator();
>   System.out.println(te.seekExact(text));
> }
>   }
> }
> {code}
> We found out the root cause: we didn't override the seekExact(BytesRef) method 
> in FilterLeafReader.FilterTermsEnum, so it uses the base class 
> TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in 
> this case.
> {code:java}
> public boolean seekExact(BytesRef text) throws IOException {
>   return seekCeil(text) == SeekStatus.FOUND;
> }
> {code}
> The fix is simple, just override seekE

[jira] [Commented] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754405#comment-16754405
 ] 

jefferyyuan commented on LUCENE-8662:
-

At https://issues.apache.org/jira/browse/LUCENE-4874:

- Don't override non-abstract methods that have an implementation through other 
abstract methods in FilterAtomicReader and related classes
[https://github.com/apache/lucene-solr/commit/9588a84dec9fe5da210a9210cb0efbe3221c9f9e]
 
 
- Should we add an exception for seekExact(BytesRef) in 
FilterLeafReader.FilterTermsEnum due to this performance issue?
- FilterLeafReader.FilterTermsEnum delegates all calls to its wrapped TermsEnum 
field {{in}}, so it seems to make sense to override seekExact(BytesRef) there.
 

> Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum
> 
>
> Key: LUCENE-8662
> URL: https://issues.apache.org/jira/browse/LUCENE-8662
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 5.5.5, 6.6.5, 7.6, 8.0
>Reporter: jefferyyuan
>Priority: Major
>  Labels: query
> Fix For: 8.0, 7.7
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Recently in our production, we found that Solr uses a lot of memory (more than 
> 10 GB) during recovery or commit for a small index (3.5 GB).
>  The stack trace is:
>  
> {code:java}
> Thread 0x4d4b115c0 
>   at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
>   at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
> (SegmentTermsEnumFrame.java:157) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:786) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:538) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnum.java:757) 
>   at 
> org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (FilterLeafReader.java:185) 
>   at 
> org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z
>  (TermsEnum.java:74) 
>   at 
> org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
>  (SolrIndexSearcher.java:823) 
>   at 
> org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:204) 
>   at 
> org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (UpdateLog.java:786) 
>   at 
> org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:194) 
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
>  (DistributedUpdateProcessor.java:1051)  
> {code}
> We reproduced the problem locally with the following code, using the Lucene API directly.
> {code:java}
> public static void main(String[] args) throws IOException {
>   FSDirectory index = FSDirectory.open(Paths.get("the-index"));
>   try (IndexReader reader = new   
> ExitableDirectoryReader(DirectoryReader.open(index),
> new QueryTimeoutImpl(1000 * 60 * 5))) {
> String id = "the-id";
> BytesRef text = new BytesRef(id);
> for (LeafReaderContext lf : reader.leaves()) {
>   TermsEnum te = lf.reader().terms("id").iterator();
>   System.out.println(te.seekExact(text));
> }
>   }
> }
> {code}
> We found out the root cause: we didn't override the seekExact(BytesRef) method 
> in FilterLeafReader.FilterTermsEnum, so it uses the base class 
> TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in 
> this case.
> {code:java}
> public boolean seekExact(BytesRef text) throws IOException {
>   return seekCeil(text) == SeekStatus.FOUND;
> }
> {code}
> The fix is simple: just override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum:
> {code:java}
> @Override
> public boolean seekExact(BytesRef text) throws IOException {
>   return in.seekExact(text);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5211) updating parent as childless makes old children orphans

2019-01-28 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754367#comment-16754367
 ] 

David Smiley commented on SOLR-5211:


I attached a patch for documentation.  It updates the SolrClient deleteById 
calls to note that they don't work on child IDs.  I also added a note to the ref 
guide for XML & JSON.  The JSON-side diff might suggest I replaced different 
text, but that same text was duplicated a few lines up.

In the CHANGES.txt I added a new "Upgrade Notes" section while leaving the 
other "Improvement" issue reference mostly unchanged.  I'll post the addition 
here so you all can see it:
{noformat}
* SOLR-5211: Deleting (or updating) documents by their uniqueKey is now scoped 
to only consider root documents, not
  child/nested documents.  Thus a delete-by-id won't work on a child doc 
(no-op), and an attempt to update a child doc
  by providing a new doc with the same ID would add a new doc (probably 
erroneous).  Both these actions were and still
  are problematic.  In-place-updates are safe though.  If you want to delete 
certain child documents and if you know
  they don't themselves have nested children then you must do so with a 
delete-by-query technique.
{noformat}
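For readers who want the delete-by-query technique spelled out, a hedged SolrJ 
sketch (collection name and IDs are made up; \_root\_ is the field Solr uses to 
tie a block to its root document):
{code:java}
// Hypothetical: delete the children of root document "parent-1" while
// keeping the root itself; assumes those children have no nested children.
solrClient.deleteByQuery("techproducts", "_root_:parent-1 AND -id:parent-1");
solrClient.commit("techproducts");
{code}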

> updating parent as childless makes old children orphans
> ---
>
> Key: SOLR-5211
> URL: https://issues.apache.org/jira/browse/SOLR-5211
> Project: Solr
>  Issue Type: Sub-task
>  Components: update
>Affects Versions: 4.5, 6.0
>Reporter: Mikhail Khludnev
>Assignee: David Smiley
>Priority: Blocker
> Fix For: 8.0
>
> Attachments: SOLR-5211.patch, SOLR-5211.patch, SOLR-5211.patch, 
> SOLR-5211.patch, SOLR-5211.patch, SOLR-5211_docs.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> If I have a parent with children in the index, I can send an update omitting 
> the children. As a result, the old children become orphaned. 
> I suppose the separate \_root_ field causes much trouble. I propose to extend 
> the notion of uniqueKey and let it span across blocks, which makes updates 
> unambiguous.  
> WDYT? Would you like to see a test that proves this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5211) updating parent as childless makes old children orphans

2019-01-28 Thread David Smiley (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-5211:
---
Attachment: SOLR-5211_docs.patch

> updating parent as childless makes old children orphans
> ---
>
> Key: SOLR-5211
> URL: https://issues.apache.org/jira/browse/SOLR-5211
> Project: Solr
>  Issue Type: Sub-task
>  Components: update
>Affects Versions: 4.5, 6.0
>Reporter: Mikhail Khludnev
>Assignee: David Smiley
>Priority: Blocker
> Fix For: 8.0
>
> Attachments: SOLR-5211.patch, SOLR-5211.patch, SOLR-5211.patch, 
> SOLR-5211.patch, SOLR-5211.patch, SOLR-5211_docs.patch
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> If I have a parent with children in the index, I can send an update omitting 
> the children. As a result, the old children become orphaned. 
> I suppose the separate \_root_ field causes much trouble. I propose to extend 
> the notion of uniqueKey and let it span across blocks, which makes updates 
> unambiguous.  
> WDYT? Would you like to see a test that proves this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754356#comment-16754356
 ] 

jefferyyuan commented on LUCENE-8662:
-

PR here:

[https://github.com/apache/lucene-solr/pull/551]

Thanks.

> Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum
> 
>
> Key: LUCENE-8662
> URL: https://issues.apache.org/jira/browse/LUCENE-8662
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 5.5.5, 6.6.5, 7.6, 8.0
>Reporter: jefferyyuan
>Priority: Major
>  Labels: query
> Fix For: 8.0, 7.7
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Recently in our production, we found that Solr uses a lot of memory (more than 
> 10 GB) during recovery or commit for a small index (3.5 GB).
>  The stack trace is:
>  
> {code:java}
> Thread 0x4d4b115c0 
>   at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
>   at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
> (SegmentTermsEnumFrame.java:157) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:786) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnumFrame.java:538) 
>   at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (SegmentTermsEnum.java:757) 
>   at 
> org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
>  (FilterLeafReader.java:185) 
>   at 
> org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z
>  (TermsEnum.java:74) 
>   at 
> org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
>  (SolrIndexSearcher.java:823) 
>   at 
> org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:204) 
>   at 
> org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (UpdateLog.java:786) 
>   at 
> org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
>  (VersionInfo.java:194) 
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
>  (DistributedUpdateProcessor.java:1051)  
> {code}
> We reproduced the problem locally with the following code, using the Lucene API directly.
> {code:java}
> public static void main(String[] args) throws IOException {
>   FSDirectory index = FSDirectory.open(Paths.get("the-index"));
>   try (IndexReader reader = new   
> ExitableDirectoryReader(DirectoryReader.open(index),
> new QueryTimeoutImpl(1000 * 60 * 5))) {
> String id = "the-id";
> BytesRef text = new BytesRef(id);
> for (LeafReaderContext lf : reader.leaves()) {
>   TermsEnum te = lf.reader().terms("id").iterator();
>   System.out.println(te.seekExact(text));
> }
>   }
> }
> {code}
> We found out the root cause: we didn't override the seekExact(BytesRef) method 
> in FilterLeafReader.FilterTermsEnum, so it uses the base class 
> TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in 
> this case.
> {code:java}
> public boolean seekExact(BytesRef text) throws IOException {
>   return seekCeil(text) == SeekStatus.FOUND;
> }
> {code}
> The fix is simple: just override the seekExact(BytesRef) method in 
> FilterLeafReader.FilterTermsEnum:
> {code:java}
> @Override
> public boolean seekExact(BytesRef text) throws IOException {
>   return in.seekExact(text);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] jefferyyuan opened a new pull request #551: LUCENE-8662: Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread GitBox
jefferyyuan opened a new pull request #551: LUCENE-8662: Override 
seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum
URL: https://github.com/apache/lucene-solr/pull/551
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13159) Autoscaling not distributing collection evenly

2019-01-28 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754353#comment-16754353
 ] 

Gus Heck commented on SOLR-13159:
-

One additional bit of info that I've begun to suspect may be relevant: this 
cluster is built (and rebuilt) on spot instances. I've begun to wonder whether 
anything in autoscaling is caching IP addresses instead of domain names (I 
looked briefly, but didn't see it).

> Autoscaling not distributing collection evenly
> --
>
> Key: SOLR-13159
> URL: https://issues.apache.org/jira/browse/SOLR-13159
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 8.0
>Reporter: Gus Heck
>Priority: Major
> Attachments: autoscaling.json, clstat.json
>
>
> I recently ran into a very strange behavior described in detail in the mail 
> linked at the bottom of this description. In short: 
>  # Default settings didn't distribute the collection evenly on a brand-new 
> 50-node cluster
>  # Can't seem to write rules producing suggestions that distribute it evenly 
>  # Suggestions are made that then fail, despite a quiet cluster with no 
> changes.
> Also of note was diagnostic output containing this seemingly impossible 
> result with 2 cores counted and no replicas listed:
> {code:java}
> {
> "node": "solr-2.customer.redacted.com:8983_solr",
> "isLive": true,
> "cores": 2,
> "freedisk": 140.03918838500977,
> "totaldisk": 147.5209503173828,
> "replicas": {}
> },{code}
> I will attach anonymized cluster status output and autoscaling.json shortly 
> This issue may be related to SOLR-13142
> http://mail-archives.apache.org/mod_mbox/lucene-dev/201901.mbox/%3CCAEUNc48HRZA7qo-uKtJQEtZnO9VG9OErQZGzoOmCTBe7C9zvNw%40mail.gmail.com%3E
>  
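As context, the kind of policy rule one would typically try for even 
distribution looks something like this (a hedged sketch of the autoscaling 
policy syntax, not the contents of the attached autoscaling.json):
{code}
POST /api/cluster/autoscaling
{
  "set-cluster-policy": [
    {"replica": "#EQUAL", "shard": "#EACH", "node": "#ANY"}
  ]
}
{code}
"#EQUAL" asks the framework to spread replicas of each shard equally across all 
nodes; per the report above, suggestions derived from such rules were still 
failing.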



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jefferyyuan updated LUCENE-8662:

Description: 
Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
 The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (UpdateLog.java:786) 
  at 
org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:194) 
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
 (DistributedUpdateProcessor.java:1051)  
{code}
We reproduced the problem locally with the following code, using the Lucene API directly.
{code:java}
public static void main(String[] args) throws IOException {
  FSDirectory index = FSDirectory.open(Paths.get("the-index"));
  try (IndexReader reader = new   
ExitableDirectoryReader(DirectoryReader.open(index),
new QueryTimeoutImpl(1000 * 60 * 5))) {
String id = "the-id";

BytesRef text = new BytesRef(id);
for (LeafReaderContext lf : reader.leaves()) {
  TermsEnum te = lf.reader().terms("id").iterator();
  System.out.println(te.seekExact(text));
}
  }
}
{code}
We found out the root cause: we didn't override the seekExact(BytesRef) method 
in FilterLeafReader.FilterTermsEnum, so it uses the base class 
TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in this 
case.
{code:java}
public boolean seekExact(BytesRef text) throws IOException {
  return seekCeil(text) == SeekStatus.FOUND;
}
{code}
The fix is simple: just override the seekExact(BytesRef) method in 
FilterLeafReader.FilterTermsEnum:
{code:java}
@Override
public boolean seekExact(BytesRef text) throws IOException {
  return in.seekExact(text);
}
{code}

  was:
Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (UpdateLog.java:786) 
  at 
org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava

Re: [JENKINS] Lucene-Solr-NightlyTests-master - Build # 1761 - Unstable

2019-01-28 Thread Dawid Weiss
This looks strange, and like something I might have caused by changes
to the NRT directory (or an existing problem revealed by stricter checks
in the byte buffers directory). I'll take a look, but tomorrow.

Dawid

On Mon, Jan 28, 2019 at 9:16 PM Apache Jenkins Server
 wrote:
>
> Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1761/
>
> 3 tests failed.
> FAILED:  org.apache.lucene.util.TestOfflineSorter.testThreadSafety
>
> Error Message:
> Captured an uncaught exception in thread: Thread[id=2665, name=Thread-2493, 
> state=RUNNABLE, group=TGRP-TestOfflineSorter]
>
> Stack Trace:
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
> uncaught exception in thread: Thread[id=2665, name=Thread-2493, 
> state=RUNNABLE, group=TGRP-TestOfflineSorter]
> Caused by: java.lang.RuntimeException: java.nio.file.AccessDeniedException: 
> Can't open a file still open for writing: unsorted_tmp_0.tmp
> at __randomizedtesting.SeedInfo.seed([972EEC94E272C842]:0)
> at 
> org.apache.lucene.util.TestOfflineSorter$2.run(TestOfflineSorter.java:271)
> Caused by: java.nio.file.AccessDeniedException: Can't open a file still open 
> for writing: unsorted_tmp_0.tmp
> at 
> org.apache.lucene.store.ByteBuffersDirectory$FileEntry.openInput(ByteBuffersDirectory.java:250)
> at 
> org.apache.lucene.store.ByteBuffersDirectory.openInput(ByteBuffersDirectory.java:222)
> at 
> org.apache.lucene.store.NRTCachingDirectory.slowFileExists(NRTCachingDirectory.java:292)
> at 
> org.apache.lucene.store.NRTCachingDirectory.createTempOutput(NRTCachingDirectory.java:266)
> at 
> org.apache.lucene.store.MockDirectoryWrapper.createTempOutput(MockDirectoryWrapper.java:697)
> at 
> org.apache.lucene.util.TestOfflineSorter.checkSort(TestOfflineSorter.java:183)
> at 
> org.apache.lucene.util.TestOfflineSorter.access$100(TestOfflineSorter.java:49)
> at 
> org.apache.lucene.util.TestOfflineSorter$2.run(TestOfflineSorter.java:267)
>
>
> FAILED:  
> org.apache.solr.cloud.autoscaling.sim.TestSimTriggerIntegration.testEventQueue
>
> Error Message:
> action wasn't interrupted
>
> Stack Trace:
> java.lang.AssertionError: action wasn't interrupted
> at 
> __randomizedtesting.SeedInfo.seed([BBE49E48E44CDBA7:7251DCE6ED2B1D52]:0)
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at 
> org.apache.solr.cloud.autoscaling.sim.TestSimTriggerIntegration.testEventQueue(TestSimTriggerIntegration.java:757)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
> at 
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
> at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
> at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
> at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
> at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
> at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>

[jira] [Created] (LUCENE-8662) Override seekExact(BytesRef) in FilterLeafReader.FilterTermsEnum

2019-01-28 Thread jefferyyuan (JIRA)
jefferyyuan created LUCENE-8662:
---

 Summary: Override seekExact(BytesRef) in 
FilterLeafReader.FilterTermsEnum
 Key: LUCENE-8662
 URL: https://issues.apache.org/jira/browse/LUCENE-8662
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Affects Versions: 7.6, 6.6.5, 5.5.5, 8.0
Reporter: jefferyyuan
 Fix For: 8.0, 7.7


Recently in our production, we found that Solr uses a lot of memory (more than 
10 GB) during recovery or commit for a small index (3.5 GB).
The stack trace is:

 
{code:java}
Thread 0x4d4b115c0 
  at org.apache.lucene.store.DataInput.readVInt()I (DataInput.java:125) 
  at org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.loadBlock()V 
(SegmentTermsEnumFrame.java:157) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermNonLeaf(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:786) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(Lorg/apache/lucene/util/BytesRef;Z)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnumFrame.java:538) 
  at 
org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (SegmentTermsEnum.java:757) 
  at 
org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekCeil(Lorg/apache/lucene/util/BytesRef;)Lorg/apache/lucene/index/TermsEnum$SeekStatus;
 (FilterLeafReader.java:185) 
  at 
org.apache.lucene.index.TermsEnum.seekExact(Lorg/apache/lucene/util/BytesRef;)Z 
(TermsEnum.java:74) 
  at 
org.apache.solr.search.SolrIndexSearcher.lookupId(Lorg/apache/lucene/util/BytesRef;)J
 (SolrIndexSearcher.java:823) 
  at 
org.apache.solr.update.VersionInfo.getVersionFromIndex(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:204) 
  at 
org.apache.solr.update.UpdateLog.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (UpdateLog.java:786) 
  at 
org.apache.solr.update.VersionInfo.lookupVersion(Lorg/apache/lucene/util/BytesRef;)Ljava/lang/Long;
 (VersionInfo.java:194) 
  at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(Lorg/apache/solr/update/AddUpdateCommand;)Z
 (DistributedUpdateProcessor.java:1051)  
{code}
We reproduced the problem locally with the following code, using the Lucene API directly.
{code:java}
public static void main(String[] args) throws IOException {
  FSDirectory index = FSDirectory.open(Paths.get("the-index"));
  try (IndexReader reader = new   
ExitableDirectoryReader(DirectoryReader.open(index),
new QueryTimeoutImpl(1000 * 60 * 5))) {
String id = "the-id";

BytesRef text = new BytesRef(id);
for (LeafReaderContext lf : reader.leaves()) {
  TermsEnum te = lf.reader().terms("id").iterator();
  System.out.println(te.seekExact(text));
}
  }
}
{code}

We found out the root cause: we didn't override the seekExact(BytesRef) method 
in FilterLeafReader.FilterTermsEnum, so it uses the base class 
TermsEnum.seekExact(BytesRef) implementation, which is very inefficient in this 
case.
{code:java}
public boolean seekExact(BytesRef text) throws IOException {
  return seekCeil(text) == SeekStatus.FOUND;
}
{code}

The fix is simple: just override the seekExact(BytesRef) method in 
FilterLeafReader.FilterTermsEnum:
{code:java}
@Override
public boolean seekExact(BytesRef text) throws IOException {
  return in.seekExact(text);
}
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-NightlyTests-master - Build # 1761 - Unstable

2019-01-28 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1761/

3 tests failed.
FAILED:  org.apache.lucene.util.TestOfflineSorter.testThreadSafety

Error Message:
Captured an uncaught exception in thread: Thread[id=2665, name=Thread-2493, 
state=RUNNABLE, group=TGRP-TestOfflineSorter]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=2665, name=Thread-2493, state=RUNNABLE, 
group=TGRP-TestOfflineSorter]
Caused by: java.lang.RuntimeException: java.nio.file.AccessDeniedException: 
Can't open a file still open for writing: unsorted_tmp_0.tmp
at __randomizedtesting.SeedInfo.seed([972EEC94E272C842]:0)
at 
org.apache.lucene.util.TestOfflineSorter$2.run(TestOfflineSorter.java:271)
Caused by: java.nio.file.AccessDeniedException: Can't open a file still open 
for writing: unsorted_tmp_0.tmp
at 
org.apache.lucene.store.ByteBuffersDirectory$FileEntry.openInput(ByteBuffersDirectory.java:250)
at 
org.apache.lucene.store.ByteBuffersDirectory.openInput(ByteBuffersDirectory.java:222)
at 
org.apache.lucene.store.NRTCachingDirectory.slowFileExists(NRTCachingDirectory.java:292)
at 
org.apache.lucene.store.NRTCachingDirectory.createTempOutput(NRTCachingDirectory.java:266)
at 
org.apache.lucene.store.MockDirectoryWrapper.createTempOutput(MockDirectoryWrapper.java:697)
at 
org.apache.lucene.util.TestOfflineSorter.checkSort(TestOfflineSorter.java:183)
at 
org.apache.lucene.util.TestOfflineSorter.access$100(TestOfflineSorter.java:49)
at 
org.apache.lucene.util.TestOfflineSorter$2.run(TestOfflineSorter.java:267)


FAILED:  
org.apache.solr.cloud.autoscaling.sim.TestSimTriggerIntegration.testEventQueue

Error Message:
action wasn't interrupted

Stack Trace:
java.lang.AssertionError: action wasn't interrupted
at 
__randomizedtesting.SeedInfo.seed([BBE49E48E44CDBA7:7251DCE6ED2B1D52]:0)
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.solr.cloud.autoscaling.sim.TestSimTriggerIntegration.testEventQueue(TestSimTriggerIntegration.java:757)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsear

[JENKINS] Lucene-Solr-NightlyTests-7.x - Build # 443 - Still Unstable

2019-01-28 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-7.x/443/

2 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.ChaosMonkeyNothingIsSafeWithPullReplicasTest

Error Message:
ObjectTracker found 6 object(s) that were not released!!! [MMapDirectory, 
InternalHttpClient, MMapDirectory, SolrCore, MMapDirectory, MMapDirectory] 
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.MMapDirectory  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:348)
  at 
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:503)  
at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:346) 
 at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:425) 
 at 
org.apache.solr.handler.ReplicationHandler.lambda$setupPolling$13(ReplicationHandler.java:1171)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)  
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)  at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
  at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
 at java.lang.Thread.run(Thread.java:748)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.http.impl.client.InternalHttpClient  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:321)
  at 
org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330)
  at 
org.apache.solr.handler.IndexFetcher.createHttpClient(IndexFetcher.java:225)  
at org.apache.solr.handler.IndexFetcher.(IndexFetcher.java:267)  at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:421) 
 at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:237) 
 at 
org.apache.solr.cloud.RecoveryStrategy.doReplicateOnlyRecovery(RecoveryStrategy.java:382)
  at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:328)  
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:307)  at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)  
at java.util.concurrent.FutureTask.run(FutureTask.java:266)  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
 at java.lang.Thread.run(Thread.java:748)  
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.lucene.store.MMapDirectory  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at 
org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:348)
  at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:359)  at 
org.apache.solr.core.SolrCore.initIndex(SolrCore.java:738)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:967)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:874)  at 
org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1178)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1088)  at 
org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:92)
  at 
org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:360)
  at 
org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:395)
  at 
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180)
  at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
  at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:735)  
at 
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:716)  
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:496)  at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:395)
  at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:341)
  at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
  at 
org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:158)
  at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(S

[jira] [Resolved] (SOLR-7672) introduce implicit _parent_:true

2019-01-28 Thread Mikhail Khludnev (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev resolved SOLR-7672.

   Resolution: Won't Fix
Fix Version/s: (was: 6.0)
   (was: 5.5)

this feature is superseded by some of the more recent issues

> introduce implicit _parent_:true  
> --
>
> Key: SOLR-7672
> URL: https://issues.apache.org/jira/browse/SOLR-7672
> Project: Solr
>  Issue Type: Improvement
>  Components: query parsers, update
>Affects Versions: 5.2
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
>
> Solr provides block join support in a non-invasive manner. It turns out this 
> gives users a chance to shoot themselves in the foot. As advised by 
> [~thetaphi] at SOLR-7606, let AddUpdateCommand add a _parent_:true field to 
> the document (not to the children). Do it *always*, no matter whether the 
> document has children or not.
> Also, introduce default values for the block join qparsers \{!parent 
> *which=\_parent\_:true*} \{!child *of=\_parent\_:true*} (sometimes I would 
> rather hide them from the user, because they are misunderstood quite often). 
>  
> Please share your concerns and vote.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13159) Autoscaling not distributing collection evenly

2019-01-28 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754243#comment-16754243
 ] 

Andrzej Bialecki  commented on SOLR-13159:
--

FWIW I couldn't reproduce this on a local cluster or using a simulator. I'm 
leaving this issue open for now; maybe more data will become available.

> Autoscaling not distributing collection evenly
> --
>
> Key: SOLR-13159
> URL: https://issues.apache.org/jira/browse/SOLR-13159
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 8.0
>Reporter: Gus Heck
>Priority: Major
> Attachments: autoscaling.json, clstat.json
>
>
> I recently ran into a very strange behavior described in detail in the mail 
> linked at the bottom of this description. In short: 
>  # Default settings didn't distribute the collection evenly on a brand new 
> 50-node cluster
>  # Can't seem to write rules producing suggestions to distribute them evenly 
>  # Suggestions are made that then fail, despite a quiet cluster with no changes.
> Also of note was diagnostic output containing this seemingly impossible 
> result with 2 cores counted and no replicas listed:
> {code:java}
> {
> "node": "solr-2.customer.redacted.com:8983_solr",
> "isLive": true,
> "cores": 2,
> "freedisk": 140.03918838500977,
> "totaldisk": 147.5209503173828,
> "replicas": {}
> },{code}
> I will attach anonymized cluster status output and autoscaling.json shortly. 
> This issue may be related to SOLR-13142
> http://mail-archives.apache.org/mod_mbox/lucene-dev/201901.mbox/%3CCAEUNc48HRZA7qo-uKtJQEtZnO9VG9OErQZGzoOmCTBe7C9zvNw%40mail.gmail.com%3E
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13172) Deprecate MoreLikeThisHandler

2019-01-28 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754236#comment-16754236
 ] 

David Smiley commented on SOLR-13172:
-

Yeah, I've come around to agree with Dawid. I suppose the distinctions between 
the three choices should be better documented so it's clear to our users (and 
us :) ) what their distinct values are.

> Deprecate MoreLikeThisHandler
> -
>
> Key: SOLR-13172
> URL: https://issues.apache.org/jira/browse/SOLR-13172
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: MoreLikeThis
>Reporter: Alessandro Benedetti
>Priority: Major
>
> Following the discussions with [~dsmi...@mac.com]
> Currently the Lucene More Like This functionality is offered in Apache Solr 
> through:
>  * More Like This Handler
>  * More Like This Component
>  * More Like This Query Parser
> The query parser is the most flexible approach and is well supported; it is 
> a good candidate to become the main entry point if a user wants the MLT 
> functionality (see the example below).
> The More Like This component is quite coupled with the others, but it makes 
> sense on its own and offers slightly different features from the query 
> parser (*Using MoreLikeThis as a search component returns similar documents 
> for each document in the response set.*)
> So the proposal here is to deprecate and remove the More Like This Handler, 
> to ease the maintenance of the functionality and to simplify the way new 
> users approach it.
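
For reference, a typical request through the More Like This query parser looks 
like this (the field and parameter values are illustrative):
{code}
q={!mlt qf=genre mintf=1 mindf=1}some-document-id
{code}
Here the query value is the id of the document to find similar documents for, 
and qf names the fields the similarity is computed over.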



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-12768) Determine how _nest_path_ should be analyzed to support various use-cases

2019-01-28 Thread David Smiley (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-12768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved SOLR-12768.
-
Resolution: Fixed

> Determine how _nest_path_ should be analyzed to support various use-cases
> -
>
> Key: SOLR-12768
> URL: https://issues.apache.org/jira/browse/SOLR-12768
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Blocker
> Fix For: 8.0
>
> Attachments: SOLR-12768.patch, SOLR-12768.patch, SOLR-12768.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We know we need {{\_nest\_path\_}} in the schema for the new nested documents 
> support, and we loosely know what goes in it.  From a DocValues perspective, 
> we've got it down; though we might tweak it.  From an indexing (text 
> analysis) perspective, we're not quite sure yet, though we've got a test 
> schema, {{schema-nest.xml}} with a decent shot at it.  Ultimately, how we 
> index it will depend on the query/filter use-cases we need to support.  So 
> we'll review some of them here.
> TBD: Not sure if the outcome of this task is just a "decide" or whether we 
> also potentially add a few tests for some of these cases, and/or if we also 
> add a FieldType to make declaring it as easy as a one-liner.  A FieldType 
> would have other benefits too once we're ready to make querying on the path 
> easier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12768) Determine how _nest_path_ should be analyzed to support various use-cases

2019-01-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754227#comment-16754227
 ] 

ASF subversion and git services commented on SOLR-12768:


Commit 8413b105c200d7e602fb10935565a39e23a8c96b in lucene-solr's branch 
refs/heads/branch_8x from David Wayne Smiley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8413b10 ]

SOLR-12768: added _nest_path_ to the default schema (thereby enabling nested 
docs)
* new NestPathField encapsulating details for how _nest_path_ is indexed
** tweaked the analysis to index 1 token instead of variable
* TokenizerChain has new CustomAnalyzer copy-constructor

(cherry picked from commit 381a30b26ca1737123b65aefc685367d1aa038b9)


> Determine how _nest_path_ should be analyzed to support various use-cases
> -
>
> Key: SOLR-12768
> URL: https://issues.apache.org/jira/browse/SOLR-12768
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Blocker
> Fix For: 8.0
>
> Attachments: SOLR-12768.patch, SOLR-12768.patch, SOLR-12768.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We know we need {{\_nest\_path\_}} in the schema for the new nested documents 
> support, and we loosely know what goes in it.  From a DocValues perspective, 
> we've got it down; though we might tweak it.  From an indexing (text 
> analysis) perspective, we're not quite sure yet, though we've got a test 
> schema, {{schema-nest.xml}} with a decent shot at it.  Ultimately, how we 
> index it will depend on the query/filter use-cases we need to support.  So 
> we'll review some of them here.
> TBD: Not sure if the outcome of this task is just a "decide" or whether we 
> also potentially add a few tests for some of these cases, and/or if we also 
> add a FieldType to make declaring it as easy as a one-liner.  A FieldType 
> would have other benefits too once we're ready to make querying on the path 
> easier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12768) Determine how _nest_path_ should be analyzed to support various use-cases

2019-01-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754225#comment-16754225
 ] 

ASF subversion and git services commented on SOLR-12768:


Commit 381a30b26ca1737123b65aefc685367d1aa038b9 in lucene-solr's branch 
refs/heads/master from David Wayne Smiley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=381a30b ]

SOLR-12768: added _nest_path_ to the default schema (thereby enabling nested 
docs)
* new NestPathField encapsulating details for how _nest_path_ is indexed
** tweaked the analysis to index 1 token instead of variable
* TokenizerChain has new CustomAnalyzer copy-constructor


> Determine how _nest_path_ should be analyzed to support various use-cases
> -
>
> Key: SOLR-12768
> URL: https://issues.apache.org/jira/browse/SOLR-12768
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Blocker
> Fix For: 8.0
>
> Attachments: SOLR-12768.patch, SOLR-12768.patch, SOLR-12768.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We know we need {{\_nest\_path\_}} in the schema for the new nested documents 
> support, and we loosely know what goes in it.  From a DocValues perspective, 
> we've got it down; though we might tweak it.  From an indexing (text 
> analysis) perspective, we're not quite sure yet, though we've got a test 
> schema, {{schema-nest.xml}} with a decent shot at it.  Ultimately, how we 
> index it will depend on the query/filter use-cases we need to support.  So 
> we'll review some of them here.
> TBD: Not sure if the outcome of this task is just a "decide" or whether we 
> also potentially add a few tests for some of these cases, and/or if we also 
> add a FieldType to make declaring it as easy as a one-liner.  A FieldType 
> would have other benefits too once we're ready to make querying on the path 
> easier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8659) Upgrade OpenNLP to 1.9.1

2019-01-28 Thread Tommaso Teofili (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-8659:

Fix Version/s: 8.0

> Upgrade OpenNLP to 1.9.1
> 
>
> Key: LUCENE-8659
> URL: https://issues.apache.org/jira/browse/LUCENE-8659
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
>Priority: Major
> Fix For: 8.0, master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Since Apache OpenNLP 1.9.1 has been released it would be nice to upgrade 
> Lucene/Solr to use that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8659) Upgrade OpenNLP to 1.9.1

2019-01-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754188#comment-16754188
 ] 

ASF subversion and git services commented on LUCENE-8659:
-

Commit fbb28406fc6c3fba84ba38b76a274aa5eec72d16 in lucene-solr's branch 
refs/heads/branch_8x from Tommaso Teofili
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fbb2840 ]

LUCENE-8659 - updated sha1 for OpenNLP dependency

(cherry picked from commit 000785e68e69480743128b59c8838e0983e196c3)


> Upgrade OpenNLP to 1.9.1
> 
>
> Key: LUCENE-8659
> URL: https://issues.apache.org/jira/browse/LUCENE-8659
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Since Apache OpenNLP 1.9.1 has been released it would be nice to upgrade 
> Lucene/Solr to use that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8659) Upgrade OpenNLP to 1.9.1

2019-01-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754187#comment-16754187
 ] 

ASF subversion and git services commented on LUCENE-8659:
-

Commit be34f0837b59224f22922ea5050ea741c0577a4c in lucene-solr's branch 
refs/heads/branch_8x from Tommaso Teofili
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=be34f08 ]

LUCENE-8659 - upgrade Lucene/Solr to use OpenNLP 1.9.1

(cherry picked from commit 48073a9778730bed93cd9a33723a99679182ad0f)


> Upgrade OpenNLP to 1.9.1
> 
>
> Key: LUCENE-8659
> URL: https://issues.apache.org/jira/browse/LUCENE-8659
> Project: Lucene - Core
>  Issue Type: Task
>  Components: modules/analysis
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Since Apache OpenNLP 1.9.1 has been released it would be nice to upgrade 
> Lucene/Solr to use that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken

2019-01-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754185#comment-16754185
 ] 

ASF subversion and git services commented on SOLR-13072:


Commit 5e1b08878f070baae459b110380ff95e77d0d7bc in lucene-solr's branch 
refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5e1b088 ]

SOLR-13072: Make sure the new overseer leader is present.


> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 7.5, 7.6, 8.0
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Fix For: 8.0, 7.7, master (9.0)
>
>
> In order to prevent {{nodeLost}} events from being lost when it's the 
> Overseer leader that is the node that was lost a mechanism was added to 
> record markers for these events by any other live node, in 
> {{ZkController.registerLiveNodesListener()}}. A similar mechanism also 
> exists for {{nodeAdded}} events.
> On Overseer leader restart, if the autoscaling configuration didn't contain 
> any triggers that consume {{nodeLost}} events, then these markers are removed. 
> If there are 1 or more trigger configs that consume {{nodeLost}} events then 
> these triggers would read the markers, remove them and generate appropriate 
> events.
> However, as the {{NodeMarkersRegistrationTest}} shows this mechanism is 
> broken and susceptible to race conditions.
> It's not unusual to have more than 1 {{nodeLost}} trigger because in addition 
> to any user-defined triggers there's always one that is automatically defined 
> if missing: {{.auto_add_replicas}}. However, if there's more than 1 
> {{nodeLost}} trigger then the process of consuming and removing the markers 
> becomes non-deterministic - each trigger may pick up (and delete) all, none, 
> or some of the markers.
> So as it is now this mechanism is broken if more than 1 {{nodeLost}} or more 
> than 1 {{nodeAdded}} trigger is defined.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken

2019-01-28 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754166#comment-16754166
 ] 

ASF subversion and git services commented on SOLR-13072:


Commit 692e6381934739626db03c30fe398594f7d5ef33 in lucene-solr's branch 
refs/heads/branch_7x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=692e638 ]

SOLR-13072: Make sure the new overseer leader is present.


> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 7.5, 7.6, 8.0
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Fix For: 8.0, 7.7, master (9.0)
>
>
> In order to prevent {{nodeLost}} events from being lost when it's the 
> Overseer leader that is the node that was lost a mechanism was added to 
> record markers for these events by any other live node, in 
> {{ZkController.registerLiveNodesListener()}}. A similar mechanism also 
> exists for {{nodeAdded}} events.
> On Overseer leader restart, if the autoscaling configuration didn't contain 
> any triggers that consume {{nodeLost}} events, then these markers are removed. 
> If there are 1 or more trigger configs that consume {{nodeLost}} events then 
> these triggers would read the markers, remove them and generate appropriate 
> events.
> However, as the {{NodeMarkersRegistrationTest}} shows this mechanism is 
> broken and susceptible to race conditions.
> It's not unusual to have more than 1 {{nodeLost}} trigger because in addition 
> to any user-defined triggers there's always one that is automatically defined 
> if missing: {{.auto_add_replicas}}. However, if there's more than 1 
> {{nodeLost}} trigger then the process of consuming and removing the markers 
> becomes non-deterministic - each trigger may pick up (and delete) all, none, 
> or some of the markers.
> So as it is now this mechanism is broken if more than 1 {{nodeLost}} or more 
> than 1 {{nodeAdded}} trigger is defined.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Reporting multiple issues triggering an HTTP 500 in Solr

2019-01-28 Thread César Rodríguez
Thanks, we will do that.

Just to be clear, we are talking about opening from 50 to 70 jira
tickets. We found 77 unique points in the source code where an
exception is thrown that causes an HTTP 500, but I'm guessing that
some of them will not be serious enough to be reported.

We can provide patches for the two issues described below. We will do
our best to describe the probable cause of the error on each
individual report, but we won't be able to provide patches for most of
them.

More information about this testing effort can be found here:
https://www.diffblue.com/blog/2018/12/19/diffblue-microservice-testing-a-sneak-peek-at-our-early-product-and-results

On Mon, Jan 28, 2019 at 3:03 PM Mikhail Khludnev  wrote:
>
> Yes. Please create jiras and attach patches. Tests are highly appreciated.
>
>
> On Mon, Jan 28, 2019 at 4:49 PM César Rodríguez 
>  wrote:
>>
>> Hi there,
>>
>> We analyzed the source code of Apache Solr and found a number of
>> issues that we would like to report. We configured Solr using the
>> films collection from the quick start tutorial [2]. For every issue
>> that we found we can provide a request URL for which Solr returns an
>> HTTP 500 error (usually due to an uncaught exception).
>>
>> Please find below two example issues we found, together with an
>> explanation of the cause and a patch fixing the bug (attached patches
>> pass the unit tests). The issues we would like to report are similar
>> to those. We found them with help from Diffblue Microservice Testing
>> [1].
>>
>> We would like to know:
>>  * Should we open JIRA tickets for each one of them?
>>  * We can provide a URL that triggers the problem, a stack trace, and
>> information about the server setup. What other information would you
>> like to see?
>>
>> I look forward to your response.
>>
>> Best regards,
>> César
>>
>> PS. If this question is not suitable for this list, please indicate
>> where to follow up.
>>
>> === Issue 1: Null pointer exception due to unhandled case in
>> ComplexPhraseQueryParser  ===
>>
>> Assume that we request the following URL:
>>
>> /solr/films/select?q={!complexphrase}genre:"-om*"
>>
>> Handling this query involves constructing a SpanQuery, which happens
>> in the rewrite method of ComplexPhraseQueryParser. In particular, the
>> expression is decomposed into a BooleanQuery, which has exactly one
>> clause, namely the negative clause -genre:”om*”. The rewrite method
>> then further transforms this into a SpanQuery; in this case, it goes
>> into the path that handles complex queries with both positive and
>> negative clauses. It extracts the subset of positive clauses - note
>> that this set of clauses is empty for this query. The positive clauses
>> are then combined into a SpanNearQuery (around line 340), which is
>> then used to build a SpanNotQuery.
>>
>> Further down the line, the field attribute of the SpanNearQuery is
>> accessed and used as an index into a TreeMap. But since we had an
>> empty set of positive clauses, the SpanNearQuery does not have its
>> field attribute set, so we get a null here - this leads to an
>> exception.
>>
>> A possible fix would be to detect the situation where we have an empty
>> set of positive clauses and include a single synthetic clause that
>> matches either everything or nothing. See attached file
>> 0001-Fix-NullPointerException.patch.
>>
>> === Issue 2: StringIndexOutOfBoundsException when expanding macros ===
>>
>> Assume that we request the following URL:
>>
>> /solr/films/select?a=${${b}}
>>
>> Parameter macro expansion [3]  seems to take place in
>> org.apache.solr.request.macro.MacroExpander._expand(String val).
>> However, this method throws a StringIndexOutOfBoundsException for the
>> URL above. From reading the code it seems that macros are not expanded
>> inside curly brackets ${...}, and so the “${b}” inside “${${b}}”
>> should not be expanded. But the function seems to fail to detect this
>> specific case and gracefully refuse to expand it. Instead, it 
>> unexpectedly throws the exception. A possible fix could be updating
>> the ‘idx’ variable when the StrParser detects that no valid identifier
>> can be found inside the brackets. See attached file
>> 0001-Macro-expander-fail-gracefully-on-unsupported-syntax.patch.
>>
>> References:
>> [1] https://www.diffblue.com/labs/
>> [2] https://lucene.apache.org/solr/guide/7_6/solr-tutorial.html
>> [3] http://yonik.com/solr-query-parameter-substitution/
>>
>> --
>> Diffblue Limited, a company registered in England and Wales number
>> 09958102, with its registered office at Ramsey House, 10 St. Ebbes Street,
>> Oxford, OX1 1PT, England
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>
> Diffblue Limited, a company registered in England and Wales number 09958102, 
> with its registered office at Rams

[JENKINS] Lucene-Solr-master-Linux (64bit/jdk-9.0.4) - Build # 23586 - Unstable!

2019-01-28 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/23586/
Java: 64bit/jdk-9.0.4 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  
org.apache.solr.client.solrj.io.stream.StreamDecoratorTest.testParallelDaemonUpdateStream

Error Message:
Failed while waiting for active collection Timeout waiting to see state for 
collection=parallelDestinationCollection1 :null Live Nodes: 
[127.0.0.1:34093_solr, 127.0.0.1:40187_solr, 127.0.0.1:42225_solr, 
127.0.0.1:42521_solr] Last available state: null

Stack Trace:
java.lang.RuntimeException: Failed while waiting for active collection
Timeout waiting to see state for collection=parallelDestinationCollection1 :null
Live Nodes: [127.0.0.1:34093_solr, 127.0.0.1:40187_solr, 127.0.0.1:42225_solr, 
127.0.0.1:42521_solr]
Last available state: null
at 
__randomizedtesting.SeedInfo.seed([ADCC7A53EDAC3502:DD735B7305414CBF]:0)
at 
org.apache.solr.cloud.MiniSolrCloudCluster.waitForActiveCollection(MiniSolrCloudCluster.java:728)
at 
org.apache.solr.cloud.MiniSolrCloudCluster.waitForActiveCollection(MiniSolrCloudCluster.java:734)
at 
org.apache.solr.client.solrj.io.stream.StreamDecoratorTest.testParallelDaemonUpdateStream(StreamDecoratorTest.java:2601)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(T

[jira] [Commented] (LUCENE-8660) Include totalHitsThreshold when tracking total hits in TopDocsCollector

2019-01-28 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754076#comment-16754076
 ] 

Adrien Grand commented on LUCENE-8660:
--

+1

> Include totalHitsThreshold when tracking total hits in TopDocsCollector
> ---
>
> Key: LUCENE-8660
> URL: https://issues.apache.org/jira/browse/LUCENE-8660
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Jim Ferenczi
>Priority: Minor
> Attachments: LUCENE-8660.patch, LUCENE-8660.patch
>
>
> Today the total hits threshold in the top docs collector is not inclusive, 
> this means that total hits are tracked up to totalHitsThreshold-1. After 
> discussing with @jpountz we agreed that this is not intuitive to return a 
> lower bound that is equal to totalHitsThreshold even if the count is accurate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Reporting multiple issues triggering an HTTP 500 in Solr

2019-01-28 Thread Mikhail Khludnev
Yes. Please create jiras and attach patches. Tests are highly appreciated.


On Mon, Jan 28, 2019 at 4:49 PM César Rodríguez <
cesar.rodrig...@diffblue.com> wrote:

> Hi there,
>
> We analyzed the source code of Apache Solr and found a number of
> issues that we would like to report. We configured Solr using the
> films collection from the quick start tutorial [2]. For every issue
> that we found we can provide a request URL for which Solr returns an
> HTTP 500 error (usually due to an uncaught exception).
>
> Please find below two example issues we found, together with an
> explanation of the cause and a patch fixing the bug (attached patches
> pass the unit tests). The issues we would like to report are similar
> to those. We found them with help from Diffblue Microservice Testing
> [1].
>
> We would like to know:
>  * Should we open JIRA tickets for each one of them?
>  * We can provide a URL that triggers the problem, a stack trace, and
> information about the server setup. What other information would you
> like to see?
>
> I look forward to your response.
>
> Best regards,
> César
>
> PS. If this question is not suitable for this list, please indicate
> where to follow up.
>
> === Issue 1: Null pointer exception due to unhandled case in
> ComplexPhraseQueryParser  ===
>
> Assume that we request the following URL:
>
> /solr/films/select?q={!complexphrase}genre:"-om*"
>
> Handling this query involves constructing a SpanQuery, which happens
> in the rewrite method of ComplexPhraseQueryParser. In particular, the
> expression is decomposed into a BooleanQuery, which has exactly one
> clause, namely the negative clause -genre:”om*”. The rewrite method
> then further transforms this into a SpanQuery; in this case, it goes
> into the path that handles complex queries with both positive and
> negative clauses. It extracts the subset of positive clauses - note
> that this set of clauses is empty for this query. The positive clauses
> are then combined into a SpanNearQuery (around line 340), which is
> then used to build a SpanNotQuery.
>
> Further down the line, the field attribute of the SpanNearQuery is
> accessed and used as an index into a TreeMap. But since we had an
> empty set of positive clauses, the SpanNearQuery does not have its
> field attribute set, so we get a null here - this leads to an
> exception.
>
> A possible fix would be to detect the situation where we have an empty
> set of positive clauses and include a single synthetic clause that
> matches either everything or nothing. See attached file
> 0001-Fix-NullPointerException.patch.
>
> === Issue 2: StringIndexOutOfBoundsException when expanding macros ===
>
> Assume that we request the following URL:
>
> /solr/films/select?a=${${b}}
>
> Parameter macro expansion [3]  seems to take place in
> org.apache.solr.request.macro.MacroExpander._expand(String val).
> However, this method throws a StringIndexOutOfBoundsException for the
> URL above. From reading the code it seems that macros are not expanded
> inside curly brackets ${...}, and so the “${b}” inside “${${b}}”
> should not be expanded. But the function seems to fail to detect this
> specific case and gracefully refuse to expand it. Instead, it 
> unexpectedly throws the exception. A possible fix could be updating
> the ‘idx’ variable when the StrParser detects that no valid identifier
> can be found inside the brackets. See attached file
> 0001-Macro-expander-fail-gracefully-on-unsupported-syntax.patch.
>
> References:
> [1] https://www.diffblue.com/labs/
> [2] https://lucene.apache.org/solr/guide/7_6/solr-tutorial.html
> [3] http://yonik.com/solr-query-parameter-substitution/
>
> --
> Diffblue Limited, a company registered in England and Wales number
> 09958102, with its registered office at Ramsey House, 10 St. Ebbes Street,
> Oxford, OX1 1PT, England
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org



-- 
Sincerely yours
Mikhail Khludnev
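
Regarding Issue 1 above, the null field is easy to see directly against the 
Lucene span API. A minimal sketch (illustrative only, not the attached patch):
{code:java}
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;

public class EmptyPositiveClausesDemo {
  public static void main(String[] args) {
    // The positive-clause subset extracted from {!complexphrase}genre:"-om*"
    // is empty: the only clause is negative.
    SpanQuery[] positive = new SpanQuery[0];
    SpanNearQuery near = new SpanNearQuery(positive, 0, true);
    // The field is derived from the clauses, so with zero clauses it stays
    // null; using it later as a TreeMap key triggers the NullPointerException.
    System.out.println(near.getField()); // prints "null"
  }
}
{code}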


[JENKINS] Lucene-Solr-repro - Build # 2745 - Unstable

2019-01-28 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-repro/2745/

[...truncated 28 lines...]
[repro] Jenkins log URL: 
https://builds.apache.org/job/Lucene-Solr-NightlyTests-7.x/442/consoleText

[repro] Revision: b7a8ca98b6e42e1d48952cd20f1957c19cf3b73b

[repro] Ant options: -Dtests.multiplier=2 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-7.x/test-data/enwiki.random.lines.txt
[repro] Repro line:  ant test  -Dtestcase=TestStressCloudBlindAtomicUpdates 
-Dtests.method=test_dv_stored -Dtests.seed=1F0BCE896BC15E68 
-Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-7.x/test-data/enwiki.random.lines.txt
 -Dtests.locale=nl-BE -Dtests.timezone=Etc/Greenwich -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8

[repro] git rev-parse --abbrev-ref HEAD
[repro] git rev-parse HEAD
[repro] Initial local git branch/revision: 
7713a4f2458c77de08193dc548807b9e90214aaf
[repro] git fetch
[repro] git checkout b7a8ca98b6e42e1d48952cd20f1957c19cf3b73b

[...truncated 2 lines...]
[repro] git merge --ff-only

[...truncated 1 lines...]
[repro] ant clean

[...truncated 6 lines...]
[repro] Test suites by module:
[repro]solr/core
[repro]   TestStressCloudBlindAtomicUpdates
[repro] ant compile-test

[...truncated 3583 lines...]
[repro] ant test-nocompile -Dtests.dups=5 -Dtests.maxfailures=5 
-Dtests.class="*.TestStressCloudBlindAtomicUpdates" -Dtests.showOutput=onerror 
-Dtests.multiplier=2 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-7.x/test-data/enwiki.random.lines.txt
 -Dtests.seed=1F0BCE896BC15E68 -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-7.x/test-data/enwiki.random.lines.txt
 -Dtests.locale=nl-BE -Dtests.timezone=Etc/Greenwich -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8

[...truncated 428889 lines...]
[repro] Setting last failure code to 256

[repro] Failures:
[repro]   3/5 failed: org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates
[repro] git checkout 7713a4f2458c77de08193dc548807b9e90214aaf

[...truncated 2 lines...]
[repro] Exiting with code 256

[...truncated 6 lines...]

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-13101) Shared storage support in SolrCloud

2019-01-28 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754041#comment-16754041
 ] 

Yonik Seeley commented on SOLR-13101:
-

Thinking about how to kick this off... 
At the most basic level, looking at the HDFS layout scheme we see this ("test" 
is the name of the collection):
{code}
local_file_system://.../node1/test_shard1_replica_n1/core.properties
hdfs://.../data/test/core_node2/data/
{code}
And core.properties looks like:
{code}
numShards=1
collection.configName=conf1
name=test_shard1_replica_n1
replicaType=NRT
shard=shard1
collection=test
coreNodeName=core_node2
{code}

It seems like the most basic desirable change would be to the naming scheme for 
collections with shared storage.
Instead of .../<collection>/<coreNodeName>/data
it should be .../<collection>/<shard>/data
since there is only one canonical index per shard.
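
Concretely, for the "test" example above (assuming shard1 is the shard id):
{code}
hdfs://.../data/test/core_node2/data/   (current layout, keyed by coreNodeName)
hdfs://.../data/test/shard1/data/       (proposed layout, keyed by shard)
{code}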



> Shared storage support in SolrCloud
> ---
>
> Key: SOLR-13101
> URL: https://issues.apache.org/jira/browse/SOLR-13101
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: Yonik Seeley
>Priority: Major
>
> Solr should have first-class support for shared storage (blob/object stores 
> like S3, google cloud storage, etc. and shared filesystems like HDFS, NFS, 
> etc).
> The key component will likely be a new replica type for shared storage.  It 
> would have many of the benefits of the current "pull" replicas (not indexing 
> on all replicas, all shards identical with no shards getting out-of-sync, 
> etc), but would have additional benefits:
>  - Any shard could become leader (the blob store always has the index)
>  - Better elasticity scaling down
>- durability not linked to number of replicas... a single replica could be 
> common for write workloads
>- could drop to 0 replicas for a shard when not needed (blob store always 
> has index)
>  - Allow for higher performance write workloads by skipping the transaction 
> log
>- don't pay for what you don't need
>- a commit will be necessary to flush to stable storage (blob store)
>  - A lot of the complexity and failure modes go away
> An additional component is a Directory implementation that will work well with 
> blob stores.  We probably want one that treats local disk as a cache since 
> the latency to remote storage is so large.  I think there are still some 
> "locking" issues to be solved here (ensuring that more than one writer to the 
> same index won't corrupt it).  This should probably be pulled out into a 
> different JIRA issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13155) CLI tool for testing autoscaling suggestions against a live cluster

2019-01-28 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754034#comment-16754034
 ] 

Andrzej Bialecki  commented on SOLR-13155:
--

Updated patch with some improvements in the RedactionUtils.

I'll leave it here for now without committing - I need someone else to add the 
Windows script support.

> CLI tool for testing autoscaling suggestions against a live cluster
> ---
>
> Key: SOLR-13155
> URL: https://issues.apache.org/jira/browse/SOLR-13155
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Fix For: 8.0, master (9.0)
>
> Attachments: SOLR-13155.patch, SOLR-13155.patch
>
>
> Solr already provides /autoscaling/diagnostics and /autoscaling/suggestions 
> endpoints. In some situations it would be very helpful to be able to run 
> "what if" scenarios using data about nodes and replicas taken from a 
> production cluster but with a different autoscaling policy than the one that 
> is deployed, without also worrying that the calculations would negatively 
> impact a production cluster's Overseer leader.
> All necessary classes (including the Policy engine) are self-contained in the 
> SolrJ component, so it's just a matter of packaging and writing a CLI tool + 
> a wrapper script.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-13155) CLI tool for testing autoscaling suggestions against a live cluster

2019-01-28 Thread Andrzej Bialecki (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated SOLR-13155:
-
Attachment: SOLR-13155.patch

> CLI tool for testing autoscaling suggestions against a live cluster
> ---
>
> Key: SOLR-13155
> URL: https://issues.apache.org/jira/browse/SOLR-13155
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Reporter: Andrzej Bialecki 
>Assignee: Andrzej Bialecki 
>Priority: Major
> Fix For: 8.0, master (9.0)
>
> Attachments: SOLR-13155.patch, SOLR-13155.patch
>
>
> Solr already provides /autoscaling/diagnostics and /autoscaling/suggestions 
> endpoints. In some situations it would be very helpful to be able to run 
> "what if" scenarios using data about nodes and replicas taken from a 
> production cluster but with a different autoscaling policy than the one that 
> is deployed, without also worrying that the calculations would negatively 
> impact a production cluster's Overseer leader.
> All necessary classes (including the Policy engine) are self-contained in the 
> SolrJ component, so it's just a matter of packaging and writing a CLI tool + 
> a wrapper script.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12330) Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) either reported too little and even might be ignored

2019-01-28 Thread Munendra S N (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754028#comment-16754028
 ] 

Munendra S N commented on SOLR-12330:
-

[~mkhludnev]
I came across the same issue: when start, end, or gap is not passed, JSON 
facet throws an NPE. I have created SOLR-13174 with a small fix for this 
particular issue.
How should the filter case be handled? I will also try to find other cases 
which can give an NPE in the Facet Module.

> Referencing non existing parameter in JSON Facet "filter" (and/or other NPEs) 
> either reported too little and even might be ignored 
> ---
>
> Key: SOLR-12330
> URL: https://issues.apache.org/jira/browse/SOLR-12330
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Affects Versions: 7.3
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Attachments: SOLR-12330.patch, SOLR-12330.patch
>
>
> Just encountered such weird behaviour; will recheck and follow up. 
> {{"filter":["\{!v=$bogus}"]}} responds back with just an NPE, which makes it 
> impossible to guess the reason.
> It might be even worse, since {{"filter":[\{"param":"bogus"}]}} seems to be 
> just silently ignored.
> Once again, I'll double check. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Unicode Quotes in query parser

2019-01-28 Thread John Ryan
Thanks Michael,

The dismax handler does indeed run and escape all non-standard characters 
before handing them off to the analysers and tokenisers. This fix looks like it 
belongs more in the handler than in the parser. I wrote a SearchComponent 
handler to do the same thing at that level and can drop it in as a plugin, 
although it seems like something that should be out-of-the-box.

I think I’ll close it and reconsider the implementation.

John

> On 22 Jan 2019, at 16:20, Michael Sokolov  wrote:
> 
> Right - QueryParsers generally do a first pass, parsing incoming Strings 
> using their operator characters to tokenize the input and only after that do 
> they pass the tokens (or phrases) to an Analyzer. I haven't checked Dismax - 
> not sure how it does its parsing exactly, but I doubt you can just "turn on 
> the right Analyzer" to get it to recognize curly quotes as phrase operators, 
> eg.
> 
> On Tue, Jan 22, 2019 at 10:39 AM Mikhail Khludnev wrote:
> My impression is that these quotes are ones which are part of the dismax query 
> syntax, i.e. they should be handled before the analysis happens. 
> 
> On Mon, Jan 21, 2019 at 8:09 PM Walter Underwood wrote:
> First, check which transforms are already handled by Unicode normalization. 
> Put this in all of your analyzer chains:
> 
> <charFilter class="solr.ICUNormalizer2CharFilterFactory"/>
> 
> Probably need this in solrconfig.xml:
> 
> <lib dir="${solr.install.dir:../../../..}/contrib/analysis-extras/lib" 
> regex=".*\.jar" />
> <lib dir="${solr.install.dir:../../../..}/contrib/analysis-extras/lucene-libs" 
> regex=".*\.jar" />
> 
> I really cannot think of a reason to use unnormalized Unicode in Solr. That 
> should be in all the sample files.
> 
> For search character matching, yes, all spaces should be normalized. I have 
> too many hacks fixing non-breaking spaces spread around the code. When 
> matching, there is zero use for stuff like ideographic space (U+3000).
> 
> I’m not sure if quotes are normalized. I did some searching around without 
> success. That might come under character folding. There was a draft, now 
> withdrawn, for standard character folding. I’d probably start there for a 
> Unicode folding char filter.
> 
> https://www.unicode.org/reports/tr30/tr30-4.html 
> 
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org 
> http://observer.wunderwood.org/   (my blog)
> 
>> On Jan 21, 2019, at 7:43 AM, Michael Sokolov wrote:
>> 
>> I think this is probably better to discuss on solr-user, or maybe solr-dev, 
>> since it is the dismax parser you are talking about, which really lives in 
>> Solr. However, my 2c - this seems somewhat dubious. Maybe people want to 
>> include those in their terms? Also, it leads to a kind of slippery slope: 
>> would you also want to convert all the various whitespace characters 
>> (no-break space, thin space, em space, etc.) to vanilla ASCII 32? How about 
>> all the other "operator" characters like brackets?
>> 
>> On Mon, Jan 21, 2019 at 9:50 AM John Ryan wrote:
>> I'm looking to create an issue to add support for Unicode Double Quotes to 
>> the dismax parser. 
>> 
>> I want to replace all types of double quotes with standard ones before they 
>> get stripped 
>> 
>> i.e.
>> “ ” „ “ „ « » ‟ ❝ ❞ ⹂ "
>> 
>> With 
>> "
>> I presume this has been discussed before?
>> 
>> I have a POC here: 
>> https://github.com/apache/lucene-solr/compare/branch_7x...jnyryan:branch_7x 
>> 
>> 
>> Thanks, 
>> 
>> John
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org 
>> 
>> For additional commands, e-mail: dev-h...@lucene.apache.org 
>> 
>> 
> 
> 
> 
> -- 
> Sincerely yours
> Mikhail Khludnev
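
On Walter's point about checking what normalization already covers: a quick
probe with ICU4J (the library the analysis-extras jars above provide) shows
that NFKC folds compatibility characters such as ideographic space but leaves
curly quotes untouched, which is why a separate folding step would still be
needed. A small sketch, assuming ICU4J on the classpath:

import com.ibm.icu.text.Normalizer2;

public class NormCheck {
  public static void main(String[] args) {
    Normalizer2 nfkc = Normalizer2.getNFKCInstance();
    // U+3000 IDEOGRAPHIC SPACE has a compatibility mapping to a plain space...
    System.out.println(nfkc.normalize("a\u3000b"));       // -> "a b"
    // ...but the typographic quotes U+201C/U+201D pass through unchanged.
    System.out.println(nfkc.normalize("\u201Chi\u201D")); // -> still curly
  }
}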



[jira] [Commented] (SOLR-13174) NPE in Json Facet API for Facet range

2019-01-28 Thread Munendra S N (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754020#comment-16754020
 ] 

Munendra S N commented on SOLR-13174:
-

[^SOLR-13174.patch]

Patch with a simple fix. Here I am trying to mimic the classical facets' behavior.
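
For context, the request shapes on either side of the mismatch (field name
hypothetical):
{code}
// JSON Facet API: range facet without start/end/gap
//   json.facet={ prices : { type: range, field: price } }
//   -> 500 + NullPointerException (trace in the description below)
// Classical faceting with the same omission
//   facet=true&facet.range=price
//   -> 400 Bad Request
{code}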

> NPE in Json Facet API for Facet range
> -
>
> Key: SOLR-13174
> URL: https://issues.apache.org/jira/browse/SOLR-13174
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Munendra S N
>Priority: Minor
> Attachments: SOLR-13174.patch
>
>
> There is a mismatch in the error and status code between JSON facet's facet 
> range and classical facet range.
> When start, end, or gap is not specified in the request, classical faceting 
> returns a Bad Request whereas JSON facet returns a 500 with the below trace
> {code:java}
> {
> "trace": "java.lang.NullPointerException\n\tat 
> org.apache.solr.search.facet.FacetRangeProcessor.createRangeList(FacetRange.java:216)\n\tat
>  
> org.apache.solr.search.facet.FacetRangeProcessor.getRangeCounts(FacetRange.java:206)\n\tat
>  
> org.apache.solr.search.facet.FacetRangeProcessor.process(FacetRange.java:98)\n\tat
>  
> org.apache.solr.search.facet.FacetProcessor.processSubs(FacetProcessor.java:460)\n\tat
>  
> org.apache.solr.search.facet.FacetProcessor.fillBucket(FacetProcessor.java:407)\n\tat
>  
> org.apache.solr.search.facet.FacetQueryProcessor.process(FacetQuery.java:64)\n\tat
>  org.apache.solr.search.facet.FacetModule.process(FacetModule.java:154)\n\tat 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296)\n\tat
>  
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)\n\tat
>  org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)\n\tat 
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)\n\tat 
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)\n\tat 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)\n\tat
>  
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)\n\tat
>  
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)\n\tat
>  
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)\n\tat
>  
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
>  
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
>  
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
>  
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\n\tat
>  
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\n\tat
>  
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
>  
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\n\tat
>  
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
>  
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
>  
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
>  
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
>  
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
>  
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
>  org.eclipse.jetty.server.Server.handle(Server.java:534)\n\tat 
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\n\tat 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\n\tat
>  
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\n\tat
>  org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\n\tat 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat
>  
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\n\tat
>  
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\n\tat
>  
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\n\tat
>  
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)\n\tat
>  
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)\n\tat
>  java.lang.Thread.run(Thread.java:748)\n",
> "code": 500
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reporting multiple issues triggering an HTTP 500 in Solr

2019-01-28 Thread César Rodríguez
Hi there,

We analyzed the source code of Apache Solr and found a number of
issues that we would like to report. We configured Solr using the
films collection from the quick start tutorial [2]. For every issue
that we found we can provide a request URL for which Solr returns an
HTTP 500 error (usually due to an uncaught exception).

Please find below two example issues we found, together with an
explanation of the cause and a patch fixing the bug (attached patches
pass the unit tests). The issues we would like to report are similar
to those. We found them with help from Diffblue Microservice Testing
[1].

We would like to know:
 * Should we open JIRA tickets for each one of them?
 * We can provide a URL that triggers the problem, a stack trace, and
information about the server setup. What other information would you
like to see?

I look forward to your response.

Best regards,
César

PS. If this question is not suitable for this list, please indicate
where to follow up.

=== Issue 1: Null pointer exception due to unhandled case in
ComplexPhraseQueryParser  ===

Assume that we request the following URL:

/solr/films/select?q={!complexphrase}genre:"-om*"

Handling this query involves constructing a SpanQuery, which happens
in the rewrite method of ComplexPhraseQueryParser. In particular, the
expression is decomposed into a BooleanQuery, which has exactly one
clause, namely the negative clause -genre:”om*”. The rewrite method
then further transforms this into a SpanQuery; in this case, it goes
into the path that handles complex queries with both positive and
negative clauses. It extracts the subset of positive clauses - note
that this set of clauses is empty for this query. The positive clauses
are then combined into a SpanNearQuery (around line 340), which is
then used to build a SpanNotQuery.

Further down the line, the field attribute of the SpanNearQuery is
accessed and used as an index into a TreeMap. But since we had an
empty set of positive clauses, the SpanNearQuery does not have its
field attribute set, so we get a null here - this leads to an
exception.

A possible fix would be to detect the situation where we have an empty
set of positive clauses and include a single synthetic clause that
matches either everything or nothing. See attached file
0001-Fix-NullPointerException.patch.
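
To make the failure mode concrete, a small self-contained sketch (our reading
of the current code, not the attached patch):

import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;

public class EmptySpanNearDemo {
  public static void main(String[] args) {
    // With zero clauses the constructor never sets a field, so getField()
    // returns null and the later TreeMap lookup dereferences that null.
    SpanQuery[] positives = new SpanQuery[0];
    SpanNearQuery near = new SpanNearQuery(positives, 0, true);
    System.out.println(near.getField()); // prints "null"
  }
}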

=== Issue 2: StringIndexOutOfBoundsException when expanding macros ===

Assume that we request the following URL:

/solr/films/select?a=${${b}}

Parameter macro expansion [3]  seems to take place in
org.apache.solr.request.macro.MacroExpander._expand(String val).
However, this method throws a StringIndexOutOfBoundsException for the
URL above. From reading the code it seems that macros are not expanded
inside curly brackets ${...}, and so the “${b}” inside “${${b}}”
should not be expanded. But the function fails to detect this specific
case and to gracefully refuse the expansion; instead, it unexpectedly
throws the exception. A possible fix could be updating
the ‘idx’ variable when the StrParser detects that no valid identifier
can be found inside the brackets. See attached file
0001-Macro-expander-fail-gracefully-on-unsupported-syntax.patch.
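
The behaviour can also be reproduced directly against the expander class (a
sketch; assumes solr-core on the classpath and the constructor/expand
signatures we read in the source):

import java.util.Collections;
import org.apache.solr.request.macro.MacroExpander;

public class MacroRepro {
  public static void main(String[] args) {
    MacroExpander expander = new MacroExpander(Collections.emptyMap());
    // Throws StringIndexOutOfBoundsException instead of returning the input
    // unexpanded or failing with a clear error:
    System.out.println(expander.expand("${${b}}"));
  }
}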

References:
[1] https://www.diffblue.com/labs/
[2] https://lucene.apache.org/solr/guide/7_6/solr-tutorial.html
[3] http://yonik.com/solr-query-parameter-substitution/

-- 
Diffblue Limited, a company registered in England and Wales number 
09958102, with its registered office at Ramsey House, 10 St. Ebbes Street, 
Oxford, OX1 1PT, England
From 5cf5371c1d6febf1a6423aa65aecde8be7eed77c Mon Sep 17 00:00:00 2001
From: Johannes Kloos 
Date: Thu, 24 Jan 2019 16:54:20 +
Subject: [PATCH] Fix NullPointerException.

---
 .../src/java/org/apache/lucene/search/spans/SpanNearQuery.java | 2 ++
 .../lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.java | 7 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/lucene/core/src/java/org/apache/lucene/search/spans/SpanNearQuery.java b/lucene/core/src/java/org/apache/lucene/search/spans/SpanNearQuery.java
index 17b9e51..a312ad2 100644
--- a/lucene/core/src/java/org/apache/lucene/search/spans/SpanNearQuery.java
+++ b/lucene/core/src/java/org/apache/lucene/search/spans/SpanNearQuery.java
@@ -130,6 +130,8 @@ public class SpanNearQuery extends SpanQuery implements Cloneable {
    * @param inOrder true if order is important
    */
   public SpanNearQuery(SpanQuery[] clausesIn, int slop, boolean inOrder) {
+    if (clausesIn.length == 0)
+      throw new IllegalArgumentException("SpanNearQuery with no clauses");
     this.clauses = new ArrayList<>(clausesIn.length);
     for (SpanQuery clause : clausesIn) {
       if (this.field == null) {   // check field
diff --git a/lucene/queryparser/src/java/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.java b/lucene/queryparser/src/java/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.java
index ffe0066..574589f 

[jira] [Updated] (SOLR-13174) NPE in Json Facet API for Facet range

2019-01-28 Thread Munendra S N (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-13174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Munendra S N updated SOLR-13174:

Attachment: SOLR-13174.patch

> NPE in Json Facet API for Facet range
> -
>
> Key: SOLR-13174
> URL: https://issues.apache.org/jira/browse/SOLR-13174
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Facet Module
>Reporter: Munendra S N
>Priority: Minor
> Attachments: SOLR-13174.patch
>
>
> There is a mismatch in the error and status code between JSON facet's facet 
> range and classical facet range.
> When start, end, or gap is not specified in the request, classical faceting 
> returns a Bad Request whereas JSON facet returns a 500 with the below trace
> {code:java}
> {
> "trace": "java.lang.NullPointerException\n\tat 
> org.apache.solr.search.facet.FacetRangeProcessor.createRangeList(FacetRange.java:216)\n\tat
>  
> org.apache.solr.search.facet.FacetRangeProcessor.getRangeCounts(FacetRange.java:206)\n\tat
>  
> org.apache.solr.search.facet.FacetRangeProcessor.process(FacetRange.java:98)\n\tat
>  
> org.apache.solr.search.facet.FacetProcessor.processSubs(FacetProcessor.java:460)\n\tat
>  
> org.apache.solr.search.facet.FacetProcessor.fillBucket(FacetProcessor.java:407)\n\tat
>  
> org.apache.solr.search.facet.FacetQueryProcessor.process(FacetQuery.java:64)\n\tat
>  org.apache.solr.search.facet.FacetModule.process(FacetModule.java:154)\n\tat 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296)\n\tat
>  
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)\n\tat
>  org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)\n\tat 
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)\n\tat 
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)\n\tat 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)\n\tat
>  
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)\n\tat
>  
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)\n\tat
>  
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)\n\tat
>  
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
>  
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
>  
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
>  
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\n\tat
>  
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\n\tat
>  
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
>  
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\n\tat
>  
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
>  
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
>  
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
>  
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
>  
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
>  
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
>  org.eclipse.jetty.server.Server.handle(Server.java:534)\n\tat 
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\n\tat 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\n\tat
>  
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\n\tat
>  org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\n\tat 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat
>  
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\n\tat
>  
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\n\tat
>  
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\n\tat
>  
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)\n\tat
>  
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)\n\tat
>  java.lang.Thread.run(Thread.java:748)\n",
> "code": 500
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (SOLR-13174) NPE in Json Facet API for Facet range

2019-01-28 Thread Munendra S N (JIRA)
Munendra S N created SOLR-13174:
---

 Summary: NPE in Json Facet API for Facet range
 Key: SOLR-13174
 URL: https://issues.apache.org/jira/browse/SOLR-13174
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Facet Module
Reporter: Munendra S N


There is a mismatch in the error and status code between JSON facet's facet range 
and classical facet range.
When start, end, or gap is not specified in the request, classical faceting 
returns a Bad Request whereas JSON facet returns a 500 with the below trace
{code:java}
{
"trace": "java.lang.NullPointerException\n\tat 
org.apache.solr.search.facet.FacetRangeProcessor.createRangeList(FacetRange.java:216)\n\tat
 
org.apache.solr.search.facet.FacetRangeProcessor.getRangeCounts(FacetRange.java:206)\n\tat
 
org.apache.solr.search.facet.FacetRangeProcessor.process(FacetRange.java:98)\n\tat
 
org.apache.solr.search.facet.FacetProcessor.processSubs(FacetProcessor.java:460)\n\tat
 
org.apache.solr.search.facet.FacetProcessor.fillBucket(FacetProcessor.java:407)\n\tat
 
org.apache.solr.search.facet.FacetQueryProcessor.process(FacetQuery.java:64)\n\tat
 org.apache.solr.search.facet.FacetModule.process(FacetModule.java:154)\n\tat 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296)\n\tat
 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)\n\tat
 org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)\n\tat 
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)\n\tat 
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)\n\tat 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)\n\tat
 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)\n\tat
 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)\n\tat
 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)\n\tat
 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\n\tat
 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\n\tat 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\n\tat
 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
 org.eclipse.jetty.server.Server.handle(Server.java:534)\n\tat 
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\n\tat 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\n\tat
 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\n\tat
 org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\n\tat 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat
 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\n\tat
 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\n\tat
 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\n\tat
 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)\n\tat
 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)\n\tat
 java.lang.Thread.run(Thread.java:748)\n",
"code": 500
}
{code}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-13148) Move time based logic into TimeRoutedAlias class

2019-01-28 Thread Gus Heck (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-13148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754013#comment-16754013
 ] 

Gus Heck commented on SOLR-13148:
-

Ah yes, I hadn't yet tracked down references to TimeRoutedAlias there... good 
catch. Though I'm sure it would have washed out in future work, it belongs here 
in this ticket. I'm not sure I like the factory; I'd like to make things simpler 
than what I see, if possible (not keen on BiFunction here), but we'll see what 
happens when I fiddle with it a bit. Maybe I'll agree in the end.

> Move time based logic into TimeRoutedAlias class
> 
>
> Key: SOLR-13148
> URL: https://issues.apache.org/jira/browse/SOLR-13148
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: UpdateRequestProcessors
>Affects Versions: master (9.0)
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Major
> Attachments: SOLR-13148.patch, SOLR-13148.patch, SOLR-13148.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To pave the way for new types of routed aliases we need to get any and all 
> time-related logic out of the URP and into TimeRoutedAlias. This ticket will 
> do that, rename the URP, and extract an initial proposed generic RoutedAlias 
> interface implemented by both TimeRoutedAlias and a skeleton placeholder for 
> CategoryRoutedAlias.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-8406) Make ByteBufferIndexInput public

2019-01-28 Thread Dawid Weiss (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-8406.
-
Resolution: Duplicate

Replaced by LUCENE-8661.

> Make ByteBufferIndexInput public
> 
>
> Key: LUCENE-8406
> URL: https://issues.apache.org/jira/browse/LUCENE-8406
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
>
> The logic of handling byte buffers splits, their proper closing (cleaner) and 
> all the trickery involved in slicing, cloning and proper exception handling 
> is quite daunting. 
> While ByteBufferIndexInput.newInstance(..) is public, the parent class 
> ByteBufferIndexInput is not. I think we should make the parent class public 
> to allow advanced users to make use of this (complex) piece of code to create 
> IndexInput based on a sequence of ByteBuffers.
> One particular example here is RAMDirectory, which currently uses a custom 
> IndexInput implementation, which in turn reaches to RAMFile's synchronized 
> methods. This is the cause of quite dramatic congestions on multithreaded 
> systems. While we clearly discourage RAMDirectory from being used in 
> production environments, there really is no need for it to be slow. If 
> modified only slightly (to use ByteBuffer-based input), the performance is on 
> par with FSDirectory. Here's a sample log comparing FSDirectory with 
> RAMDirectory and the "modified" RAMDirectory making use of the ByteBuffer 
> input:
> {code}
> 14:26:40 INFO  console: FSDirectory index.
> 14:26:41 INFO  console: Opened with 299943 documents.
> 14:26:50 INFO  console: Finished: 8.820 s, 24 matches.
> 14:26:50 INFO  console: RAMDirectory index.
> 14:26:50 INFO  console: Opened with 299943 documents.
> 14:28:50 INFO  console: Finished: 2.012 min, 24 matches.
> 14:28:50 INFO  console: RAMDirectory2 index (wrapped byte[] buffers).
> 14:28:50 INFO  console: Opened with 299943 documents.
> 14:29:00 INFO  console: Finished: 9.215 s, 24 matches.
> 14:29:00 INFO  console: RAMDirectory2 index (direct memory buffers).
> 14:29:00 INFO  console: Opened with 299943 documents.
> 14:29:08 INFO  console: Finished: 8.817 s, 24 matches.
> {code}
> Note the performance difference is an order of magnitude on this 32-CPU 
> system (2 minutes vs. 9 seconds). The tiny performance difference between the 
> implementation based on direct memory buffers vs. those acquired via 
> ByteBuffer.wrap(byte[]) is due to the fact that direct buffers access their 
> data via unsafe and the wrapped counterpart uses regular java array access 
> (my best guess).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-8661) Replace ByteBuffersIndexInput with ByteBufferIndexInput (replace and rename)

2019-01-28 Thread Dawid Weiss (JIRA)
Dawid Weiss created LUCENE-8661:
---

 Summary: Replace ByteBuffersIndexInput with ByteBufferIndexInput 
(replace and rename)
 Key: LUCENE-8661
 URL: https://issues.apache.org/jira/browse/LUCENE-8661
 Project: Lucene - Core
  Issue Type: Task
Reporter: Dawid Weiss
Assignee: Dawid Weiss
 Fix For: master (9.0)


This is a follow-up to the removal of RAMDirectory. We can now clean up and 
finalize the API of the byte buffer abstractions. I will clean up the 
duplications there on master.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-8.x-Linux (64bit/jdk-10.0.1) - Build # 94 - Unstable!

2019-01-28 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-8.x-Linux/94/
Java: 64bit/jdk-10.0.1 -XX:+UseCompressedOops -XX:+UseG1GC

1 tests failed.
FAILED:  
org.apache.solr.cloud.api.collections.CollectionsAPIAsyncDistributedZkTest.testAsyncRequests

Error Message:
AddReplica did not complete expected same: was not:

Stack Trace:
java.lang.AssertionError: AddReplica did not complete expected same: 
was not:
at 
__randomizedtesting.SeedInfo.seed([B28CC3AC122CFDC7:56C8FF1BB484B318]:0)
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotSame(Assert.java:828)
at org.junit.Assert.assertSame(Assert.java:771)
at 
org.apache.solr.cloud.api.collections.CollectionsAPIAsyncDistributedZkTest.testAsyncRequests(CollectionsAPIAsyncDistributedZkTest.java:162)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.base/java.lang.Thread.run(Thread.java:844)




Build Log:
[...truncated 13509 lines...]
   [junit4] Suite: 
org.apache.solr.cloud.api.collections.CollectionsAPIAsyncDistributedZkTest
   [jun

Re: Lucene/Solr 8.0

2019-01-28 Thread Tommaso Teofili
sure, thanks Jim!

Tommaso

Il giorno lun 28 gen 2019 alle ore 10:35 jim ferenczi
 ha scritto:
>
> Go ahead Tommaso, the branch is not created yet.
> The plan is to create the branches (7.7 and 8.0) tomorrow or Wednesday and 
> to announce the feature freeze the same day.
> For blocker issues that are still open, this leaves another week to work on a 
> patch, and we can update the status at the end of the week in order to decide 
> whether we can start the first build candidate
> early next week. Would that work for you?
>
> Le lun. 28 janv. 2019 à 10:19, Tommaso Teofili  a 
> écrit :
>>
>> I'd like to backport https://issues.apache.org/jira/browse/LUCENE-8659
>> (upgrade to OpenNLP 1.9.1) to 8x branch, if there's still time.
>>
>> Regards,
>> Tommaso
>>
>> Il giorno lun 28 gen 2019 alle ore 07:59 Adrien Grand
>>  ha scritto:
>> >
>> > Hi Noble,
>> >
> No, it hasn't been created yet.
>> >
>> > On Mon, Jan 28, 2019 at 3:55 AM Noble Paul  wrote:
>> > >
>> > > Is the branch already cut for 8.0? Which is it?
>> > >
>> > > On Mon, Jan 28, 2019 at 4:03 AM David Smiley  
>> > > wrote:
>> > > >
>> > > > I finally have a patch up for 
>> > > > https://issues.apache.org/jira/browse/SOLR-12768 (already marked as 
>> > > > 8.0 blocker) that I feel pretty good about.  This provides a key part 
>> > > > of the nested document support.
>> > > > I will work on some documentation for it this week -- SOLR-13129
>> > > >
>> > > > On Fri, Jan 25, 2019 at 3:07 PM Jan Høydahl  
>> > > > wrote:
>> > > >>
>> > > >> I don't think it is critical for this to be a blocker for 8.0. If it 
>> > > >> gets fixed in 8.0.1 that's ok too, given this is an ooold bug.
>> > > >> I think we should simply remove the buffering feature in the UI and 
>> > > >> replace it with an error message popup or something.
>> > > >> I'll try to take a look next week.
>> > > >>
>> > > >> --
>> > > >> Jan Høydahl, search solution architect
>> > > >> Cominvent AS - www.cominvent.com
>> > > >>
>> > > >> 25. jan. 2019 kl. 20:39 skrev Tomás Fernández Löbbe 
>> > > >> :
>> > > >>
>> > > >> I think the UI is an important Solr feature. As long as there is a 
>> > > >> reasonable time horizon for the issue being resolved I'm +1 on making 
>> > > >> it a blocker. I'm not familiar enough with the UI code to help either 
>> > > >> unfortunately.
>> > > >>
>> > > >> On Fri, Jan 25, 2019 at 11:24 AM Gus Heck  wrote:
>> > > >>>
>> > > >>> It looks like someone tried to make it a blocker once before... And 
>> > > >>> it's actually a duplicate of an earlier issue 
>> > > >>> (https://issues.apache.org/jira/browse/SOLR-9818). I guess it's a 
>> > > >>> question of whether or not overall quality has a bearing on the 
>> > > >>> decision to release. As it turns out the screen shot I posted to the 
>> > > >>> issue is less than half of the shards that eventually got created 
>> > > >>> since there was an outstanding queue of requests still processing at 
>> > > >>> the time. I'm now having to delete 50 or so cores, which luckily are 
>> > > >>> small 100 MB initial testing cores, not the 20 GB cores we'll be 
>> > > >>> testing on in the near future. It more or less makes it impossible 
>> > > >>> to recommend the use of the admin UI for anything other than read 
>> > > >>> only observation of the cluster. Now imagine someone leaves a 
>> > > >>> browser window open and forgets about it rather than browsing away 
>> > > >>> or closing the window, not knowing that it's silently pumping out 
>> > > >>> requests after showing an error... would completely hose a node, and 
>> > > >>> until they tracked down the source of the requests, (hope he didn't 
>> > > >>> go home) it would be impossible to resolve...
>> > > >>>
>> > > >>> On Fri, Jan 25, 2019 at 1:25 PM Adrien Grand  
>> > > >>> wrote:
>> > > 
>> > >  Releasing a new major is very challenging on its own, I'd rather not
>> > >  call it a blocker and delay the release for it since this isn't a 
>> > >  new
>> > >  regression in 8.0: it looks like a problem that has affected Solr
>> > >  since at least 6.3? I'm not familiar with the UI code at all, but
>> > >  maybe this is something that could get fixed before we build a RC?
>> > > 
>> > > 
>> > > 
>> > > 
>> > >  On Fri, Jan 25, 2019 at 6:06 PM Gus Heck  wrote:
>> > >  >
>> > >  > I'd like to suggest that 
>> > >  > https://issues.apache.org/jira/browse/SOLR-10211 be promoted to 
>> > >  > block 8.0. I just got burned by it a second time.
>> > >  >
>> > >  > On Thu, Jan 24, 2019 at 1:05 PM Uwe Schindler  
>> > >  > wrote:
>> > >  >>
>> > >  >> Cool,
>> > >  >>
>> > >  >> I am working on giving my best release time guess as possible on 
>> > >  >> the FOSDEM conference!
>> > >  >>
>> > >  >> Uwe
>> > >  >>
>> > >  >> -
>> > >  >> Uwe Schindler
>> > >  >> Achterdiek 19, D-28357 Bremen
>> > >  >> http://www.thetaphi.de
>> > >  >> eMail: u...@thetaphi.de
>> > >

[jira] [Resolved] (LUCENE-8438) RAMDirectory speed improvements and cleanup

2019-01-28 Thread Dawid Weiss (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-8438.
-
   Resolution: Fixed
Fix Version/s: master (9.0)

> RAMDirectory speed improvements and cleanup
> ---
>
> Key: LUCENE-8438
> URL: https://issues.apache.org/jira/browse/LUCENE-8438
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
> Fix For: master (9.0)
>
> Attachments: capture-1.png, capture-4.png
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> RAMDirectory screams for a cleanup. It is used and abused in many places and 
> even if we discourage its use in favor of native (mmapped) buffers, there 
> seem to be benefits of keeping RAMDirectory available (quick throw-away 
> indexes without the need to setup external tmpfs, for example).
> Currently RAMDirectory performs very poorly under concurrent loads. The 
> implementation is also open for all sorts of abuses – the streams can be 
> reset and are used all around the place as temporary buffers, even without 
> the presence of RAMDirectory itself. This complicates the implementation and 
> is pretty confusing.
> As an example of how dramatically slow RAMDirectory is under concurrent load, 
> consider this PoC pseudo-benchmark. It creates a single monolithic segment 
> with 500K very short documents (single field, with norms). The index is ~60MB 
> once created. We then run semi-complex Boolean queries on top of that index 
> from N concurrent threads. The attached capture-4 shows the result (queries 
> per second over 5-second spans) for a varying number of concurrent threads on 
> an AWS machine with 32 CPUs available (of which 16 seem to be real and 16 
> hyper-threaded). That red line at the bottom (which drops compared to 
> single-threaded performance) is the current RAMDirectory. RAMDirectory2 is an 
> alternative implementation I wrote that uses ByteBuffers. Yes, it's slower 
> than the native mmapped implementation, but a *lot* faster than the current 
> RAMDirectory (and more GC-friendly because it uses dynamic progressive block 
> scaling internally).
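
For reference, a short usage sketch of the ByteBuffer-based directory that came
out of this line of work (Lucene master/8.x API assumed):

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class ByteBuffersDirectoryDemo {
  public static void main(String[] args) throws Exception {
    // Heap-resident, ByteBuffer-backed drop-in replacement for RAMDirectory.
    Directory dir = new ByteBuffersDirectory();
    try (IndexWriter w = new IndexWriter(dir, new IndexWriterConfig())) {
      // index documents as usual; reads do not funnel through a global lock
    }
    try (DirectoryReader r = DirectoryReader.open(dir)) {
      System.out.println("docs: " + r.numDocs());
    }
  }
}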



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


