[jira] [Commented] (SOLR-5894) Speed up high-cardinality facets with sparse counters

2016-12-05 Thread Toke Eskildsen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15724715#comment-15724715
 ] 

Toke Eskildsen commented on SOLR-5894:
--

No, those are fully separate issues.

Faceting is a bit of a mess with multiple implementations at multiple levels. 
See SOLR-7296 for more on this.

> Speed up high-cardinality facets with sparse counters
> -
>
> Key: SOLR-5894
> URL: https://issues.apache.org/jira/browse/SOLR-5894
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Affects Versions: 4.7.1
>Reporter: Toke Eskildsen
>Priority: Minor
>  Labels: faceted-search, faceting, memory, performance
> Attachments: SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, 
> SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, 
> SOLR-5894.patch, SOLR-5894.patch, SOLR-5894_test.zip, SOLR-5894_test.zip, 
> SOLR-5894_test.zip, SOLR-5894_test.zip, SOLR-5894_test.zip, 
> author_7M_tags_1852_logged_queries_warmed.png, 
> sparse_200docs_fc_cutoff_20140403-145412.png, 
> sparse_500docs_20140331-151918_multi.png, 
> sparse_500docs_20140331-151918_single.png, 
> sparse_5051docs_20140328-152807.png
>
>
> Multiple performance enhancements to Solr String faceting.
> * Sparse counters, replacing the constant-time overhead of extracting the 
> top-X terms with an overhead linear in the result-set size
> * Counter re-use for reduced garbage collection and lower per-call overhead
> * Optional counter packing, trading speed for space
> * Improved distribution count logic, greatly improving the performance of 
> distributed faceting
> * In-segment threaded faceting
> * Regexp based white- and black-listing of facet terms
> * Heuristic faceting for large result sets
> Currently implemented for Solr 4.10. Source, detailed description and 
> directly usable WAR at http://tokee.github.io/lucene-solr/
> This project has grown beyond a simple patch and will require a fair amount 
> of co-operation with a committer to get into Solr. Splitting into smaller 
> issues is a possibility.
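The sparse-counter and counter re-use points above can be sketched in a few lines. This is a hypothetical illustration in Python, not the patch's actual Java code: keep a flat counts array plus a list of touched ordinals, so counting stays O(1) per hit while top-X extraction only visits terms the result set actually hit, and clearing only resets the touched slots.

```python
# Hypothetical sketch of a sparse facet counter (not the SOLR-5894 patch code).
# Counting stays O(1) per hit; extracting counts is O(touched) instead of
# O(cardinality), which pays off when the result set hits few unique terms.
class SparseCounter:
    def __init__(self, cardinality):
        self.counts = [0] * cardinality   # one slot per term ordinal
        self.touched = []                 # ordinals with a non-zero count

    def increment(self, ordinal):
        if self.counts[ordinal] == 0:     # first hit: remember the ordinal
            self.touched.append(ordinal)
        self.counts[ordinal] += 1

    def top(self, x):
        # Only iterate the touched ordinals, not the full term dictionary.
        return sorted(((self.counts[o], o) for o in self.touched), reverse=True)[:x]

    def clear(self):
        # Counter re-use: reset only the touched slots instead of reallocating,
        # which is what cuts the per-call garbage collection overhead.
        for o in self.touched:
            self.counts[o] = 0
        self.touched.clear()

c = SparseCounter(1_000_000)
for ordinal in [42, 7, 42, 42, 7, 99]:
    c.increment(ordinal)
print(c.top(2))  # [(3, 42), (2, 7)]
```

The optional counter packing mentioned above would swap the plain `counts` list for a bit-packed structure, trading extraction speed for memory.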



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-9829) Solr cannot provide index service after a large GC pause

2016-12-05 Thread Forest Soup (JIRA)
Forest Soup created SOLR-9829:
-

 Summary: Solr cannot provide index service after a large GC pause
 Key: SOLR-9829
 URL: https://issues.apache.org/jira/browse/SOLR-9829
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: update
Affects Versions: 5.3.2
 Environment: Redhat enterprise server 64bit 
Reporter: Forest Soup


When Solr hits a large GC pause like the one in 
https://issues.apache.org/jira/browse/SOLR-9828 , the collections on that node 
stop providing service and never recover until a restart.

In ZooKeeper, however, the cores on that server still show as active.

Some /update requests got HTTP 500 due to "IndexWriter is closed". Others got 
HTTP 400 due to "possible analysis error", whose root cause is also 
"IndexWriter is closed"; we think those should return 500 instead (documented 
in https://issues.apache.org/jira/browse/SOLR-9825).

Our questions in this JIRA:
1. Should Solr mark the node as down when it cannot provide index service?
2. Could Solr re-open the IndexWriter so indexing can resume?
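The status-code complaint matters to clients because 5xx and 4xx imply different recovery strategies: a 5xx is transient server state worth retrying, while a 4xx says the request itself is bad and will fail identically every time. A hedged sketch (hypothetical helper, not Solr or SolrJ code):

```python
# Sketch of why the status code matters to clients (hypothetical helper, not
# Solr code): 5xx signals transient server state worth retrying; 4xx signals a
# bad request that will fail identically on every retry.
import time

def send_update(post, doc, retries=3, backoff=0.0):
    for attempt in range(retries):
        status = post(doc)
        if status < 400:
            return status                       # success
        if 400 <= status < 500:
            raise ValueError(f"rejected with {status}: fix the request")
        time.sleep(backoff * (2 ** attempt))    # 5xx: back off and retry
    raise RuntimeError(f"gave up after {retries} attempts (last status {status})")

# A closed IndexWriter reported as 400 makes the client drop a perfectly good
# document; reported as 500, the client retries and can succeed once the core
# recovers.
responses = iter([500, 500, 200])
assert send_update(lambda d: next(responses), {"id": "1"}) == 200
```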

solr log snippets:
2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 
r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
org.apache.solr.common.SolrException: Exception writing document id 
Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C 
to the index; possible analysis error.
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at 

[jira] [Updated] (SOLR-9825) Solr should not return HTTP 400 for some cases

2016-12-05 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9825:
--
Affects Version/s: (was: 5.3)
   5.3.2

> Solr should not return HTTP 400 for some cases
> --
>
> Key: SOLR-9825
> URL: https://issues.apache.org/jira/browse/SOLR-9825
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.3.2
>Reporter: Forest Soup
>
> In some cases Solr returns HTTP 400 when it should not. We have met several; 
> here are the two most recent:
> Case 1: When adding a doc, a runtime error, even one caused by a Solr-internal 
> issue, is returned as HTTP 400, which confuses the client. The request itself 
> is fine; the real problem is that the IndexWriter is closed.
> The exception stack is:
> 2016-11-22 21:23:32.858 ERROR (qtp2011912080-83) [c:collection12 s:shard1 
> r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
> org.apache.solr.common.SolrException: Exception writing document id 
> Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20824042!8918AB024CF638F685257DDC00074D78 
> to the index; possible analysis error.
>   at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
>   at 
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
>   at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
>   at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>   at org.eclipse.jetty.server.Server.handle(Server.java:499)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
>   at 
> 

[jira] [Updated] (SOLR-9828) Very long young generation stop the world GC pause

2016-12-05 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9828:
--
Description: 
We are using oracle jdk8u92 64bit.
The jvm memory related options:
-Xms32768m 
-Xmx32768m 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/solrdata1/log 
-XX:+UseG1GC 
-XX:+PerfDisableSharedMem 
-XX:+ParallelRefProcEnabled 
-XX:G1HeapRegionSize=8m 
-XX:MaxGCPauseMillis=100 
-XX:InitiatingHeapOccupancyPercent=35 
-XX:+AggressiveOpts 
-XX:+AlwaysPreTouch 
-XX:ConcGCThreads=16 
-XX:ParallelGCThreads=18 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/solrdata1/log 
-verbose:gc 
-XX:+PrintHeapAtGC 
-XX:+PrintGCDetails 
-XX:+PrintGCDateStamps 
-XX:+PrintGCTimeStamps 
-XX:+PrintTenuringDistribution 
-XX:+PrintGCApplicationStoppedTime 
-Xloggc:/mnt/solrdata1/log/solr_gc.log

It usually works fine, but recently we have seen very long stop-the-world young 
generation GC pauses. Some snippets of the GC log are below:
2016-11-22T20:43:16.436+: 2942054.483: Total time for which application 
threads were stopped: 0.0005510 seconds, Stopping threads took: 0.894 
seconds
2016-11-22T20:43:16.463+: 2942054.509: Total time for which application 
threads were stopped: 0.0029195 seconds, Stopping threads took: 0.804 
seconds
{Heap before GC invocations=2246 (full 0):
 garbage-first heap   total 26673152K, used 4683965K [0x7f0c1000, 
0x7f0c108065c0, 0x7f141000)
  region size 8192K, 162 young (1327104K), 17 survivors (139264K)
 Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 59392K
2016-11-22T20:43:16.555+: 2942054.602: [GC pause (G1 Evacuation Pause) 
(young)
Desired survivor size 88080384 bytes, new threshold 15 (max 15)
- age   1:   28176280 bytes,   28176280 total
- age   2:5632480 bytes,   33808760 total
- age   3:9719072 bytes,   43527832 total
- age   4:6219408 bytes,   49747240 total
- age   5:4465544 bytes,   54212784 total
- age   6:3417168 bytes,   57629952 total
- age   7:5343072 bytes,   62973024 total
- age   8:2784808 bytes,   65757832 total
- age   9:6538056 bytes,   72295888 total
- age  10:6368016 bytes,   78663904 total
- age  11: 695216 bytes,   79359120 total
, 97.2044320 secs]
   [Parallel Time: 19.8 ms, GC Workers: 18]
  [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: 
2942054612.7, Diff: 10.6]
  [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: 
43.5]
  [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0]
 [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192]
  [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6]
  [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 
0.0]
  [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: 165.9]
  [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1]
 [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27]
  [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.6]
  [GC Worker Total (ms): Min: 9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: 
308.7]
  [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: 
2942054621.8, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.2 ms]
   [Other: 97184.3 ms]
  [Choose CSet: 0.0 ms]
  [Ref Proc: 8.5 ms]
  [Ref Enq: 0.2 ms]
  [Redirty Cards: 0.2 ms]
  [Humongous Register: 0.1 ms]
  [Humongous Reclaim: 0.1 ms]
  [Free CSet: 0.4 ms]
   [Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: 
4574.2M(25.4G)->3450.8M(26.8G)]
Heap after GC invocations=2247 (full 0):
 garbage-first heap   total 28049408K, used 3533601K [0x7f0c1000, 
0x7f0c10806b00, 0x7f141000)
  region size 8192K, 21 young (172032K), 21 survivors (172032K)
 Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 59392K
}
 [Times: user=0.00 sys=94.28, real=97.19 secs] 
2016-11-22T20:44:53.760+: 2942151.806: Total time for which application 
threads were stopped: 97.2053747 seconds, Stopping threads took: 0.0001373 
seconds
2016-11-22T20:44:53.762+: 2942151.809: Total time for which application 
threads were stopped: 0.0008138 seconds, Stopping threads took: 0.0001258 
seconds

CPU reached nearly 100% during the GC, yet the request load was normal at that 
time according to the stats of the Solr update/select/delete handlers and the 
Jetty request log.
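With `-XX:+PrintGCApplicationStoppedTime` enabled as above, the 97-second pause can be found mechanically. A small sketch (assuming only the "Total time for which application threads were stopped" line format shown in this report):

```python
# Sketch: scan a GC log for long stop-the-world pauses like the 97 s one above.
# Assumes the -XX:+PrintGCApplicationStoppedTime line format shown in this
# report; real logs may vary by JVM version.
import re

PAUSE_RE = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds")

def long_pauses(lines, threshold_secs=1.0):
    """Yield pause durations (in seconds) exceeding the threshold."""
    for line in lines:
        m = PAUSE_RE.search(line)
        if m and float(m.group(1)) > threshold_secs:
            yield float(m.group(1))

log = [
    "Total time for which application threads were stopped: 0.0005510 seconds",
    "Total time for which application threads were stopped: 97.2053747 seconds",
]
print(list(long_pauses(log)))  # [97.2053747]
```

Note also the `[Times: user=0.00 sys=94.28, real=97.19 secs]` line in the log: a pause dominated by system time usually points at the operating system (swapping, transparent huge pages, or GC logging to slow storage) rather than at G1 itself.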



  was:
We are using oracle jdk8u92 64bit.
The jvm memory related options:
-Xms32768m 
-Xmx32768m 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/solrdata1/log 
-XX:+UseG1GC 
-XX:+PerfDisableSharedMem 
-XX:+ParallelRefProcEnabled 
-XX:G1HeapRegionSize=8m 
-XX:MaxGCPauseMillis=100 
-XX:InitiatingHeapOccupancyPercent=35 
-XX:+AggressiveOpts 
-XX:+AlwaysPreTouch 

[jira] [Created] (SOLR-9828) Very long young generation stop the world GC pause

2016-12-05 Thread Forest Soup (JIRA)
Forest Soup created SOLR-9828:
-

 Summary: Very long young generation stop the world GC pause 
 Key: SOLR-9828
 URL: https://issues.apache.org/jira/browse/SOLR-9828
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 5.3.2
 Environment: Linux Redhat 64bit
Reporter: Forest Soup


We are using oracle jdk8u92 64bit.
The jvm memory related options:
-Xms32768m 
-Xmx32768m 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/solrdata1/log 
-XX:+UseG1GC 
-XX:+PerfDisableSharedMem 
-XX:+ParallelRefProcEnabled 
-XX:G1HeapRegionSize=8m 
-XX:MaxGCPauseMillis=100 
-XX:InitiatingHeapOccupancyPercent=35 
-XX:+AggressiveOpts 
-XX:+AlwaysPreTouch 
-XX:ConcGCThreads=16 
-XX:ParallelGCThreads=18 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/solrdata1/log 
-verbose:gc 
-XX:+PrintHeapAtGC 
-XX:+PrintGCDetails 
-XX:+PrintGCDateStamps 
-XX:+PrintGCTimeStamps 
-XX:+PrintTenuringDistribution 
-XX:+PrintGCApplicationStoppedTime 
-Xloggc:/mnt/solrdata1/log/solr_gc.log

It usually works fine, but recently we have seen very long stop-the-world young 
generation GC pauses. Some snippets of the GC log are below:
2016-11-22T20:43:16.436+: 2942054.483: Total time for which application 
threads were stopped: 0.0005510 seconds, Stopping threads took: 0.894 
seconds
2016-11-22T20:43:16.463+: 2942054.509: Total time for which application 
threads were stopped: 0.0029195 seconds, Stopping threads took: 0.804 
seconds
{Heap before GC invocations=2246 (full 0):
 garbage-first heap   total 26673152K, used 4683965K [0x7f0c1000, 
0x7f0c108065c0, 0x7f141000)
  region size 8192K, 162 young (1327104K), 17 survivors (139264K)
 Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 59392K
2016-11-22T20:43:16.555+: 2942054.602: [GC pause (G1 Evacuation Pause) 
(young)
Desired survivor size 88080384 bytes, new threshold 15 (max 15)
- age   1:   28176280 bytes,   28176280 total
- age   2:5632480 bytes,   33808760 total
- age   3:9719072 bytes,   43527832 total
- age   4:6219408 bytes,   49747240 total
- age   5:4465544 bytes,   54212784 total
- age   6:3417168 bytes,   57629952 total
- age   7:5343072 bytes,   62973024 total
- age   8:2784808 bytes,   65757832 total
- age   9:6538056 bytes,   72295888 total
- age  10:6368016 bytes,   78663904 total
- age  11: 695216 bytes,   79359120 total
, 97.2044320 secs]
   [Parallel Time: 19.8 ms, GC Workers: 18]
  [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: 
2942054612.7, Diff: 10.6]
  [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: 
43.5]
  [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0]
 [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192]
  [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6]
  [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 
0.0]
  [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: 165.9]
  [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1]
 [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27]
  [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.6]
  [GC Worker Total (ms): Min: 9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: 
308.7]
  [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: 
2942054621.8, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.2 ms]
   [Other: 97184.3 ms]
  [Choose CSet: 0.0 ms]
  [Ref Proc: 8.5 ms]
  [Ref Enq: 0.2 ms]
  [Redirty Cards: 0.2 ms]
  [Humongous Register: 0.1 ms]
  [Humongous Reclaim: 0.1 ms]
  [Free CSet: 0.4 ms]
   [Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: 
4574.2M(25.4G)->3450.8M(26.8G)]
Heap after GC invocations=2247 (full 0):
 garbage-first heap   total 28049408K, used 3533601K [0x7f0c1000, 
0x7f0c10806b00, 0x7f141000)
  region size 8192K, 21 young (172032K), 21 survivors (172032K)
 Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 59392K
}
 [Times: user=0.00 sys=94.28, real=97.19 secs] 
2016-11-22T20:44:53.760+: 2942151.806: Total time for which application 
threads were stopped: 97.2053747 seconds, Stopping threads took: 0.0001373 
seconds
2016-11-22T20:44:53.762+: 2942151.809: Total time for which application 
threads were stopped: 0.0008138 seconds, Stopping threads took: 0.0001258 
seconds

CPU reached nearly 100% during the GC, although the load was not visibly high 
at that time.






[jira] [Commented] (SOLR-9251) Allow a tag role:!overseer in replica placement rules

2016-12-05 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15724425#comment-15724425
 ] 

David Smiley commented on SOLR-9251:


_Thanks for your help._  I realize that most apps/use-cases call for additional 
replicas, but mine doesn't -- it's an identified and acceptable limitation given 
a slim operational budget. The system can be reloaded if need be.

To simplify this a bit, the example below uses a {{host}} tag instead of the 
role. I don't see an error, but I do see a shard going where I don't want it to 
go. In particular, I want the "RT" shard on a specified host $rtHostName -- 
that part worked. But once I got to the 3rd "s" shard, I saw it on the host 
$rtHostName. I repeated the experiment after deleting the collection and 
switching which host was the designated RT host, and observed that this time it 
was the 4th numbered shard that was co-located with RT (the thing I'm trying to 
avoid), not the 3rd. Interesting. The cluster I am trying this on has 3 Solr 
nodes.

{noformat}curl -XPOST --fail "$SOLR_URL/admin/collections" -F action=CREATE -F 
name="$COLLECTION" \
  -F router.name=implicit -F shards=RT -F 
createNodeSet="${rtHostName}:8983_solr" -F maxShardsPerNode=4 \
  -F rule="shard:RT,host:$rtHostName" -F rule="shard:\!RT,host:\!$rtHostName"
// note escaping of the exclamation marks to make Bash happy

curl -XPOST --fail "$SOLR_URL/admin/collections" -F action=CREATESHARD \
  -F collection="$COLLECTION" -F shard=s1

//repeat above several times varying shard name: s1, s2, s3
{noformat}
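The intended semantics of the two rules above can be modeled in a few lines. This is a deliberately simplified sketch of rule matching, not Solr's actual Rule engine; names like `rt-host` are made up for illustration:

```python
# Hypothetical model of the placement rules in the commands above: "!" negates
# a condition, and a (shard, host) assignment is rejected if any rule whose
# shard condition matches has an unsatisfied host condition.
def matches(condition, value):
    if condition.startswith("!"):
        return value != condition[1:]
    return value == condition

def allowed(shard, host, rules):
    for rule in rules:
        if matches(rule["shard"], shard) and not matches(rule["host"], host):
            return False
    return True

rules = [
    {"shard": "RT",  "host": "rt-host"},    # RT must live on rt-host
    {"shard": "!RT", "host": "!rt-host"},   # everything else must not
]
assert allowed("RT", "rt-host", rules)
assert not allowed("s3", "rt-host", rules)  # the misplacement described above
print("rules behave as intended")
```

Under this model the 3rd (or 4th) "s" shard landing on the RT host is a rule violation, which suggests the rule was not applied to shards added later via CREATESHARD rather than the rule itself being wrong.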

> Allow a tag role:!overseer in replica placement rules
> -
>
> Key: SOLR-9251
> URL: https://issues.apache.org/jira/browse/SOLR-9251
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 6.2
>
> Attachments: SOLR-9251.patch
>
>
> The reason to assign an overseer role to a node is to ensure that the node 
> is used exclusively as overseer. Replica placement should support a tag 
> called {{role}}.
> So if a collection is created with {{rule=role:!overseer}}, no replica should 
> be created on nodes designated as overseer.






[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_102) - Build # 18454 - Unstable!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18454/
Java: 64bit/jdk1.8.0_102 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.cloud.LeaderFailoverAfterPartitionTest.test

Error Message:
Expected 2 of 3 replicas to be active but only found 1; 
[core_node2:{"core":"c8n_1x3_lf_shard1_replica2","base_url":"http://127.0.0.1:35339","node_name":"127.0.0.1:35339_","state":"active","leader":"true"}];
 clusterState: 
DocCollection(c8n_1x3_lf//collections/c8n_1x3_lf/state.json/17)={ 
"replicationFactor":"3", "shards":{"shard1":{ 
"range":"8000-7fff", "state":"active", "replicas":{ 
"core_node1":{ "state":"down", 
"base_url":"http://127.0.0.1:39287", 
"core":"c8n_1x3_lf_shard1_replica1", "node_name":"127.0.0.1:39287_"}, 
"core_node2":{ "core":"c8n_1x3_lf_shard1_replica2", 
"base_url":"http://127.0.0.1:35339", "node_name":"127.0.0.1:35339_", 
"state":"active", "leader":"true"}, "core_node3":{ 
"core":"c8n_1x3_lf_shard1_replica3", 
"base_url":"http://127.0.0.1:35639", "node_name":"127.0.0.1:35639_", 
"state":"down"}}}}, "router":{"name":"compositeId"}, 
"maxShardsPerNode":"1", "autoAddReplicas":"false"}

Stack Trace:
java.lang.AssertionError: Expected 2 of 3 replicas to be active but only found 
1; 
[core_node2:{"core":"c8n_1x3_lf_shard1_replica2","base_url":"http://127.0.0.1:35339","node_name":"127.0.0.1:35339_","state":"active","leader":"true"}];
 clusterState: DocCollection(c8n_1x3_lf//collections/c8n_1x3_lf/state.json/17)={
  "replicationFactor":"3",
  "shards":{"shard1":{
  "range":"8000-7fff",
  "state":"active",
  "replicas":{
"core_node1":{
  "state":"down",
  "base_url":"http://127.0.0.1:39287;,
  "core":"c8n_1x3_lf_shard1_replica1",
  "node_name":"127.0.0.1:39287_"},
"core_node2":{
  "core":"c8n_1x3_lf_shard1_replica2",
  "base_url":"http://127.0.0.1:35339;,
  "node_name":"127.0.0.1:35339_",
  "state":"active",
  "leader":"true"},
"core_node3":{
  "core":"c8n_1x3_lf_shard1_replica3",
  "base_url":"http://127.0.0.1:35639;,
  "node_name":"127.0.0.1:35639_",
  "state":"down",
  "router":{"name":"compositeId"},
  "maxShardsPerNode":"1",
  "autoAddReplicas":"false"}
at 
__randomizedtesting.SeedInfo.seed([3D354D726C0C7887:B56172A8C2F0157F]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.solr.cloud.LeaderFailoverAfterPartitionTest.testRf3WithLeaderFailover(LeaderFailoverAfterPartitionTest.java:170)
at 
org.apache.solr.cloud.LeaderFailoverAfterPartitionTest.test(LeaderFailoverAfterPartitionTest.java:57)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:985)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:960)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 

[JENKINS] Lucene-Solr-6.x-Windows (64bit/jdk1.8.0_102) - Build # 605 - Still Unstable!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Windows/605/
Java: 64bit/jdk1.8.0_102 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.cloud.CdcrBootstrapTest.testBootstrapWithSourceCluster

Error Message:
Document mismatch on target after sync expected:<1> but was:<0>

Stack Trace:
java.lang.AssertionError: Document mismatch on target after sync 
expected:<1> but was:<0>
at 
__randomizedtesting.SeedInfo.seed([8558272DFF6B1132:5C0E76E9FC0F0278]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.solr.cloud.CdcrBootstrapTest.testBootstrapWithSourceCluster(CdcrBootstrapTest.java:206)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 12106 lines...]
   [junit4] Suite: 

[jira] [Commented] (SOLR-9251) Allow a tag role:!overseer in replica placement rules

2016-12-05 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724268#comment-15724268
 ] 

Noble Paul commented on SOLR-9251:
--

The rule {{rule=shard:RT,role:overseer}} just does not look right. It means 
all replicas of shard {{RT}} must live on the node designated as overseer. 
Why would you want all replicas of a given shard to live on one node? 
Wouldn't it conflict with {{maxShardsPerNode}}?

Anyway, please share your full create command.

> Allow a tag role:!overseer in replica placement rules
> -
>
> Key: SOLR-9251
> URL: https://issues.apache.org/jira/browse/SOLR-9251
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 6.2
>
> Attachments: SOLR-9251.patch
>
>
> The reason to assign an overseer role to a node is to ensure that the node 
> is used exclusively as overseer. Replica placement should support a tag called 
> {{role}}.
> So if a collection is created with {{rule=role:!overseer}}, no replica should 
> be created on nodes designated as overseer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-6.x - Build # 585 - Still Unstable

2016-12-05 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-6.x/585/

1 tests failed.
FAILED:  
org.apache.solr.schema.PreAnalyzedFieldManagedSchemaCloudTest.testAdd2Fields

Error Message:
No live SolrServers available to handle this 
request:[https://127.0.0.1:52246/solr/managed-preanalyzed, 
https://127.0.0.1:57262/solr/managed-preanalyzed]

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request:[https://127.0.0.1:52246/solr/managed-preanalyzed, 
https://127.0.0.1:57262/solr/managed-preanalyzed]
at __randomizedtesting.SeedInfo.seed([B5255F59868644E5:1D30EDB25A80D713]:0)
at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:414)
at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1344)
at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1095)
at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1037)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at org.apache.solr.schema.PreAnalyzedFieldManagedSchemaCloudTest.addField(PreAnalyzedFieldManagedSchemaCloudTest.java:61)
at org.apache.solr.schema.PreAnalyzedFieldManagedSchemaCloudTest.testAdd2Fields(PreAnalyzedFieldManagedSchemaCloudTest.java:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
   

[JENKINS] Lucene-Solr-Tests-6.x - Build # 584 - Unstable

2016-12-05 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-6.x/584/

1 tests failed.
FAILED:  org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.test_dv

Error Message:
There are still nodes recoverying - waited for 330 seconds

Stack Trace:
java.lang.AssertionError: There are still nodes recoverying - waited for 330 
seconds
at __randomizedtesting.SeedInfo.seed([85025F84010A9786:B3163DC28B57AD97]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:184)
at org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.waitForRecoveriesToFinish(TestStressCloudBlindAtomicUpdates.java:459)
at org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.checkField(TestStressCloudBlindAtomicUpdates.java:304)
at org.apache.solr.cloud.TestStressCloudBlindAtomicUpdates.test_dv(TestStressCloudBlindAtomicUpdates.java:193)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
   

[jira] [Updated] (SOLR-5944) Support updates of numeric DocValues

2016-12-05 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-5944:
---
Attachment: SOLR-5944.patch


The only diff between this patch and the previous one is some improvements to 
TestInPlaceUpdatesStandalone and some necessary additions to 
schema-inplace-updates.xml to support them...


* replace the hackish getFieldValueRTG with a regular SolrClient.getById, 
simplifying most caller code
* refactor getFieldValueIndex / getDocId
** the only usage of getFieldValueIndex was in getDocId or in places that should 
have been calling getDocId instead
** refactored the logic to use SolrClient.query instead of hackish low-level access
* added new randomized test methods leveraging checkReplay
** this helped uncover a few minor bugs in checkReplay which I fixed
** in the process I also cleaned up the datatypes used in the existing callers 
of checkReplay to reduce a lot of String/Number parsing/formatting logic in 
checkReplay
** this uncovered a ClassCastException when in-place updates are mixed with 
non-in-place atomic updates (see below)
*** added testReplay_MixOfInplaceAndNonInPlaceAtomicUpdates to demonstrate this 
directly
* new schema assertions warranted by fields added for the above changes

Here's an example of the ClassCastException that shows up in the logs when 
running testReplay_MixOfInplaceAndNonInPlaceAtomicUpdates ...

{noformat}
   [junit4]   2> 2514 ERROR (TEST-TestInPlaceUpdatesStandalone.testReplay_MixOfInplaceAndNonInPlaceAtomicUpdates-seed#[70DBFB363B6DA180]) [] o.a.s.h.RequestHandlerBase java.lang.ClassCastException: org.apache.solr.common.SolrDocument cannot be cast to org.apache.solr.common.SolrInputDocument
   [junit4]   2>at org.apache.solr.handler.component.RealTimeGetComponent.getInputDocumentFromTlog(RealTimeGetComponent.java:512)
   [junit4]   2>at org.apache.solr.handler.component.RealTimeGetComponent.getInputDocument(RealTimeGetComponent.java:568)
   [junit4]   2>at org.apache.solr.handler.component.RealTimeGetComponent.getInputDocument(RealTimeGetComponent.java:546)
   [junit4]   2>at org.apache.solr.update.processor.DistributedUpdateProcessor.getUpdatedDocument(DistributedUpdateProcessor.java:1424)
   [junit4]   2>at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1072)
   [junit4]   2>at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:751)
   [junit4]   2>at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
   [junit4]   2>at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:492)
   [junit4]   2>at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:139)
   [junit4]   2>at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:115)
   [junit4]   2>at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:78)
   [junit4]   2>at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
   [junit4]   2>at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
   [junit4]   2>at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:152)
   [junit4]   2>at org.apache.solr.core.SolrCore.execute(SolrCore.java:2228)
   [junit4]   2>at org.apache.solr.servlet.DirectSolrConnection.request(DirectSolrConnection.java:124)
   [junit4]   2>at org.apache.solr.SolrTestCaseJ4.updateJ(SolrTestCaseJ4.java:1173)
   [junit4]   2>at org.apache.solr.SolrTestCaseJ4.addAndGetVersion(SolrTestCaseJ4.java:1319)
   [junit4]   2>at org.apache.solr.update.TestInPlaceUpdatesStandalone.checkReplay(TestInPlaceUpdatesStandalone.java:823)
   [junit4]   2>at org.apache.solr.update.TestInPlaceUpdatesStandalone.testReplay_MixOfInplaceAndNonInPlaceAtomicUpdates(TestInPlaceUpdatesStandalone.java:570)

{noformat}



> Support updates of numeric DocValues
> 
>
> Key: SOLR-5944
> URL: https://issues.apache.org/jira/browse/SOLR-5944
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Shalin Shekhar Mangar
> Attachments: DUP.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, SOLR-5944.patch, 
> SOLR-5944.patch, SOLR-5944.patch, 

[jira] [Updated] (SOLR-6203) cast exception while searching with sort function and result grouping

2016-12-05 Thread Judith Silverman (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judith Silverman updated SOLR-6203:
---
Attachment: SOLR-6203.patch

Hi, Christine.  I'm posting what I have so far; please review at your
convenience.  This afternoon's patch includes the changes from the   
patch of Dec. 3.  I added calls to two new utility functions (which  
turned out to be identical, so if you approve them, they should be   
combined and put in a good place) and copied over unit tests from last   
year's patch, which pass.
 
Cheers,  
Judith

> cast exception while searching with sort function and result grouping
> -
>
> Key: SOLR-6203
> URL: https://issues.apache.org/jira/browse/SOLR-6203
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 4.7, 4.8
>Reporter: Nate Dire
>Assignee: Christine Poerschke
> Attachments: README, SOLR-6203-unittest.patch, 
> SOLR-6203-unittest.patch, SOLR-6203.patch, SOLR-6203.patch, SOLR-6203.patch, 
> SOLR-6203.patch, SOLR-6203.patch, SOLR-6203.patch
>
>
> After upgrading from 4.5.1 to 4.7+, a schema including a {{"*"}} dynamic 
> field as text gets a cast exception when using a sort function and result 
> grouping.  
> Repro (with example config):
> # Add {{"*"}} dynamic field as a {{TextField}}, eg:
> {noformat}
> 
> {noformat}
> # Create a sharded collection
> {noformat}
> curl 
> 'http://localhost:8983/solr/admin/collections?action=CREATE=test=2=2'
> {noformat}
> # Add example docs (query must have some results)
> # Submit query which sorts on a function result and uses result grouping:
> {noformat}
> {
>   "responseHeader": {
> "status": 500,
> "QTime": 50,
> "params": {
>   "sort": "sqrt(popularity) desc",
>   "indent": "true",
>   "q": "*:*",
>   "_": "1403709010008",
>   "group.field": "manu",
>   "group": "true",
>   "wt": "json"
> }
>   },
>   "error": {
> "msg": "java.lang.Double cannot be cast to 
> org.apache.lucene.util.BytesRef",
> "code": 500
>   }
> }
> {noformat}
> Source exception from log:
> {noformat}
> ERROR - 2014-06-25 08:10:10.055; org.apache.solr.common.SolrException; 
> java.lang.ClassCastException: java.lang.Double cannot be cast to 
> org.apache.lucene.util.BytesRef
> at org.apache.solr.schema.FieldType.marshalStringSortValue(FieldType.java:981)
> at org.apache.solr.schema.TextField.marshalSortValue(TextField.java:176)
> at org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.serializeSearchGroup(SearchGroupsResultTransformer.java:125)
> at org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:65)
> at org.apache.solr.search.grouping.distributed.shardresultserializer.SearchGroupsResultTransformer.transform(SearchGroupsResultTransformer.java:43)
> at org.apache.solr.search.grouping.CommandHandler.processResult(CommandHandler.java:193)
> at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:340)
> at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>   ...
> {noformat}
> It looks like {{serializeSearchGroup}} is matching the sort expression as the 
> {{"*"}} dynamic field, which is a TextField in the repro.






[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs

2016-12-05 Thread Michael Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723605#comment-15723605
 ] 

Michael Sun commented on SOLR-9764:
---

Hmmm, I think a query with q=*:* doesn't use MatchAllDocSets with the current 
patch. Let me see if there is a way to optimize this use case as well.
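For context, the match-all case needs no per-document storage at all: membership is fully determined by maxDoc. A minimal sketch of such a DocSet (hypothetical class name, not the patch's actual code):

```java
/**
 * Sketch of a match-all DocSet (illustrative only, not Solr's class):
 * when a query matches every document, membership and size depend only
 * on maxDoc, so memory is constant, versus BitDocSet's maxDoc/8 bytes
 * of bits that are all set to 1.
 */
class MatchAllDocSetSketch {
    private final int maxDoc;

    MatchAllDocSetSketch(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    /** Every valid doc id is a member. */
    boolean exists(int doc) {
        return doc >= 0 && doc < maxDoc;
    }

    /** Cardinality is simply maxDoc. */
    int size() {
        return maxDoc;
    }
}
```

A filter-cache entry of this shape would shrink from megabytes (for a large index) to a few bytes.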


> Design a memory efficient DocSet if a query returns all docs
> 
>
> Key: SOLR-9764
> URL: https://issues.apache.org/jira/browse/SOLR-9764
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Sun
> Attachments: SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR_9764_no_cloneMe.patch
>
>
> In some use cases, particularly use cases with time series data that use a 
> collection alias and partition data into multiple small collections by 
> timestamp, a filter query can match all documents in a collection. Currently 
> BitDocSet is used, which contains a large array of long integers with every 
> bit set to 1. After querying, the resulting DocSet saved in the filter cache is 
> large and becomes one of the main memory consumers in these use cases.
> For example, suppose a Solr setup has 14 collections for data from the last 14 
> days, each collection with one day of data. A filter query for the last week of 
> data would result in at least six DocSets in the filter cache, each matching all 
> documents in one of six collections.
> This issue is to design a new DocSet that is memory efficient for such a use case. 
> The new DocSet removes the large array, reducing memory usage and GC pressure 
> without losing the advantage of a large filter cache.
> In particular, for use cases with time series data, a collection alias, and data 
> partitioned into multiple small collections by timestamp, the gain can be large.
> For further optimization, it may be helpful to design a DocSet with run 
> length encoding. Thanks [~mmokhtar] for the suggestion. 






[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs

2016-12-05 Thread Michael Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723591#comment-15723591
 ] 

Michael Sun commented on SOLR-9764:
---

Inspired by all the nice discussions, another good optimization would be to store 
the inverse of the matched docSet if all or most docs are matched by a query. 
If the number of docs matched is close to maxDoc, a HashDocSet would be very 
efficient. (Thanks [~yo...@apache.org] for the suggestion.)
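The inverse-encoding idea can be sketched in a few lines (hypothetical names, not the patch's code): keep only the non-matching ids and negate the membership test.

```java
import java.util.Set;

/**
 * Sketch of "inverse encoding" (illustrative only): when nearly all of
 * maxDoc documents match, store just the non-matching doc ids.
 * Membership is the negation of a lookup in that small set.
 */
class InvertedDocSetSketch {
    private final int maxDoc;
    private final Set<Integer> nonMatching; // small when almost everything matches

    InvertedDocSetSketch(int maxDoc, Set<Integer> nonMatching) {
        this.maxDoc = maxDoc;
        this.nonMatching = nonMatching;
    }

    boolean exists(int doc) {
        return doc >= 0 && doc < maxDoc && !nonMatching.contains(doc);
    }

    int size() {
        return maxDoc - nonMatching.size();
    }
}
```

Memory is proportional to the number of misses, so a query matching all but a handful of documents costs almost nothing to cache.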


> Design a memory efficient DocSet if a query returns all docs
> 
>
> Key: SOLR-9764
> URL: https://issues.apache.org/jira/browse/SOLR-9764
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Sun
> Attachments: SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR_9764_no_cloneMe.patch
>
>






[jira] [Commented] (LUCENE-7575) UnifiedHighlighter: add requireFieldMatch=false support

2016-12-05 Thread Ferenczi Jim (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723586#comment-15723586
 ] 

Ferenczi Jim commented on LUCENE-7575:
--

{quote}
I was thinking a bit more about the wastefulness of re-creating SpanQueries 
with different field that are otherwise identical. Some day we could refactor 
out from WSTE a Query -> SpanQuery conversion utility that furthermore allows 
you to re-target the field. With that in place, we could avoid the waste for 
PhraseQuery and MultiPhraseQuery – the most typical position-sensitive queries.
{quote}

I agree, I'll work on this shortly. Thanks for the hint ;)
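The "re-target the field" idea is independent of the highlighter itself; a plain-Java sketch with hypothetical stand-in types (not Lucene's SpanQuery API) of rebuilding an otherwise identical query against a different field:

```java
/**
 * Sketch (hypothetical types, not Lucene's API) of field re-targeting:
 * instead of reconverting the whole query, an otherwise identical
 * position-sensitive query is rebuilt with only the field swapped.
 */
class SpanTermSketch {
    final String field;
    final String term;

    SpanTermSketch(String field, String term) {
        this.field = field;
        this.term = term;
    }

    /** Return an identical query targeting {@code newField}; reuse this if unchanged. */
    SpanTermSketch retarget(String newField) {
        return field.equals(newField) ? this : new SpanTermSketch(newField, term);
    }
}
```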

> UnifiedHighlighter: add requireFieldMatch=false support
> ---
>
> Key: LUCENE-7575
> URL: https://issues.apache.org/jira/browse/LUCENE-7575
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 6.4
>
> Attachments: LUCENE-7575.patch, LUCENE-7575.patch, LUCENE-7575.patch
>
>
> The UnifiedHighlighter (like the PostingsHighlighter) only supports 
> highlighting queries for the same fields that are being highlighted.  The 
> original Highlighter and FVH support loosening this, AKA 
> requireFieldMatch=false.






[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs

2016-12-05 Thread Michael Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723569#comment-15723569
 ] 

Michael Sun commented on SOLR-9764:
---

Ah, yes, you are right. Thanks [~varunthacker] for the suggestion. The 'inverse 
encoding' is a good idea.

bq.Do you think this will be good enough for this case
On the memory-saving side, RoaringIdDocSet looks like a good solution. It would 
only use a small amount of memory in this use case.

On the other hand, there are some implications for CPU usage, mainly in 
constructing the DocSet. RoaringIdDocSet saves memory by choosing a different 
data structure based on the matched documents in a chunk. However, the code 
doesn't know which data structure to use before it has iterated over all 
documents in a chunk, which can result in some expensive 'shifts' in data 
structure and 'resizing'. For example, in this use case, for each chunk the 
code basically starts filling a large short[], then shifts to a bitmap and 
converts the data from short[] to bitmap, then fills the bitmap, and later 
switches back to a small short[]. All these steps can be expensive unless 
optimized for specific use cases. In addition, all these steps use an iterator 
to get the matched docs one by one.

Union and intersection of RoaringIdDocSets can also be more expensive, in 
addition to the cost of construction. Of course, it's hard to fully understand 
the performance implications without testing a prototype. Any suggestion is 
welcome.
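The per-chunk container choice being discussed can be sketched as follows (illustrative only, not the RoaringIdDocSet code; the 4096-entry threshold is the one used by Roaring-style bitmaps, where a chunk covers 2^16 doc ids):

```java
import java.util.BitSet;

/**
 * Roaring-style per-chunk container choice (sketch only): a chunk covers
 * 2^16 doc ids. With <= 4096 matches, a sorted array of 16-bit offsets is
 * smaller (2 bytes per match); beyond that, a fixed 8 KB bitmap wins.
 */
class ChunkSketch {
    static final int CHUNK_SIZE = 1 << 16;
    static final int ARRAY_LIMIT = 4096; // beyond this, the bitmap is cheaper

    /** Offsets are chunk-local values in [0, 65535], stored as raw 16-bit shorts. */
    static Object build(int[] sortedOffsets) {
        if (sortedOffsets.length <= ARRAY_LIMIT) {
            short[] arr = new short[sortedOffsets.length];
            for (int i = 0; i < arr.length; i++) {
                arr[i] = (short) sortedOffsets[i]; // sparse container
            }
            return arr;
        }
        BitSet bits = new BitSet(CHUNK_SIZE); // dense container
        for (int off : sortedOffsets) {
            bits.set(off);
        }
        return bits;
    }
}
```

The CPU concern in the comment is exactly the case where a builder starts in the sparse form and crosses the threshold mid-chunk, forcing a conversion to the dense form.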



> Design a memory efficient DocSet if a query returns all docs
> 
>
> Key: SOLR-9764
> URL: https://issues.apache.org/jira/browse/SOLR-9764
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Sun
> Attachments: SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR_9764_no_cloneMe.patch
>
>






[jira] [Commented] (LUCENE-7579) Sorting on flushed segment

2016-12-05 Thread Ferenczi Jim (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723545#comment-15723545
 ] 

Ferenczi Jim commented on LUCENE-7579:
--

Thanks Mike, 

{quote}
Can we rename freezed to frozen in BinaryDocValuesWriter?
But: why would freezed ever be true when we call flush?
Shouldn't it only be called once, even in the sorting case?
{quote}

This was a leftover that is not needed. The naming was wrong ;) and it was 
useless, so I removed it.

{quote}
I also like how you were able to re-use the SortingXXX from
SortingLeafReader. Later on we can maybe optimize some of these;
e.g. SortingFields and CachedXXXDVs should be able to take
advantage of the fact that the things they are sorting are all already
in heap (the indexing buffer), the way you did with
MutableSortingPointValues (cool).
{quote}

Totally agree, we can revisit later and see if we can optimize memory. I think 
it's already an optimization vs master in terms of memory usage, since we only 
"sort" the segment to be flushed instead of all "unsorted" segments during the merge.

{quote}
Can we block creating a SortingLeafReader now (make its
constructor private)? We only now ever use its inner classes I think?
And it is a dangerous class in the first place... if we can do that,
maybe we rename it SortingCodecUtils or something, just for its
inner classes.
{quote}

We still need to wrap unsorted segments during the merge for BWC, so 
SortingLeafReader should remain. I have no idea when we can remove it, since 
indices from older versions should still be compatible with this new one?


{quote}
Do any of the exceptions tests for IndexWriter get angry? Seems like
if we hit an IOException e.g. during the renaming that
SortingStoredFieldsConsumer.flush does we may leave undeleted
files? Hmm or perhaps IW takes care of that by wrapping the directory
itself...
{quote}

Honestly I have no idea. I will dig.

{quote}
Can't you just pass sortMap::newToOld directly (method reference)
instead of making the lambda here?:
{quote}

Indeed, thanks.
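For readers following along, the two forms are interchangeable whenever the lambda only forwards its argument. `SortMap` below is a minimal stand-in with a `newToOld(int)` method, not Lucene's actual sort-map class:

```java
import java.util.function.IntUnaryOperator;

public class MethodRefDemo {
    // Minimal stand-in for a sort map exposing a newToOld(int) method.
    public static class SortMap {
        private final int[] newToOld;
        public SortMap(int[] newToOld) { this.newToOld = newToOld; }
        public int newToOld(int newDoc) { return newToOld[newDoc]; }
    }

    public static void main(String[] args) {
        SortMap sortMap = new SortMap(new int[] {2, 0, 1});

        // A lambda that merely forwards to the method...
        IntUnaryOperator viaLambda = newDoc -> sortMap.newToOld(newDoc);
        // ...can be replaced by the equivalent, shorter method reference.
        IntUnaryOperator viaMethodRef = sortMap::newToOld;

        System.out.println(viaLambda.applyAsInt(0) + " " + viaMethodRef.applyAsInt(0)); // 2 2
    }
}
```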

{quote}
I think the 6.x back port here is going to be especially tricky 
{quote}

I bet, but as it is the main part is done by reusing the SortingLeafReader inner 
classes that exist in 6.x. 

I've also removed a nocommit in the AssertingLiveDocsFormat that now checks 
live docs even when they are sorted.



 

> Sorting on flushed segment
> --
>
> Key: LUCENE-7579
> URL: https://issues.apache.org/jira/browse/LUCENE-7579
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ferenczi Jim
>
> Today flushed segments built by an index writer with an index sort specified 
> are not sorted. The merge is responsible for sorting these segments, 
> potentially with others that are already sorted (resulting from another 
> merge). 
> I'd like to investigate the cost of sorting the segment directly during the 
> flush. This could make the merge faster, since there are some cheap 
> optimizations that can be done only if all segments to be merged are sorted.
>  For instance, the merge of the points could use the bulk merge instead of 
> rebuilding the points from scratch.
> I made a small prototype which sorts the segment on flush here:
> https://github.com/apache/lucene-solr/compare/master...jimczi:flush_sort
> The idea is simple: for points, norms, docvalues and terms I use the 
> SortingLeafReader implementation to translate the values that we have in RAM 
> into a sorted enumeration for the writers.
> For stored fields I use a two-pass scheme where the documents are first 
> written to disk unsorted and then copied to another file with the correct 
> sorting. I use the same stored field format for the two steps and just remove 
> the file produced by the first pass at the end of the process.
> This prototype has no implementation for index sorting that uses term vectors 
> yet. I'll add this later if the tests are good enough.
> Speaking of testing, I tried this branch on [~mikemccand]'s benchmark scripts 
> and compared master with index sorting against my branch with index sorting 
> on flush. I tried with sparsetaxis and wikipedia, and the first results are 
> weird. When I use the SerialScheduler and only one thread to write the docs, 
> index sorting on flush is slower. But when I use two threads, the sorting on 
> flush is much faster, even with the SerialScheduler. I'll continue to run the 
> tests in order to be able to share something more meaningful.
> The tests are passing except one about concurrent DV updates. I don't know 
> this part at all, so I did not fix the test yet. I don't even know if we can 
> make it work with index sorting ;).
>  [~mikemccand] I would love to have your feedback about the prototype. Could 
> you please take a look? I am sure there are plenty of bugs, but I think 
> it's a good start to evaluate the feasibility of this feature.
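The two-pass stored-fields scheme described in the issue can be sketched as follows. The class and file names are illustrative stand-ins, and plain text lines stand in for the real stored-fields format; only the shape of the algorithm (spill unsorted, copy in sorted order, delete the first-pass file) follows the description:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Illustration of the two-pass flush: documents are first written to disk
// in arrival order, then copied to a second file in index-sort order, and
// the unsorted first-pass file is removed at the end.
public class TwoPassSortedFlush {
    // newToOld[newDocId] = oldDocId: which original doc fills each sorted slot.
    public static List<String> flushSorted(List<String> docs, int[] newToOld) throws IOException {
        Path unsorted = Files.createTempFile("stored-pass1", ".tmp");
        Path sorted = Files.createTempFile("stored-pass2", ".tmp");
        try {
            Files.write(unsorted, docs);                       // pass 1: spill in arrival order
            List<String> onDisk = Files.readAllLines(unsorted);
            String[] reordered = new String[newToOld.length];
            for (int newDoc = 0; newDoc < newToOld.length; newDoc++) {
                reordered[newDoc] = onDisk.get(newToOld[newDoc]); // pass 2: copy in sorted order
            }
            Files.write(sorted, Arrays.asList(reordered));
            return Files.readAllLines(sorted);
        } finally {
            Files.deleteIfExists(unsorted);                    // first-pass file is discarded
            Files.deleteIfExists(sorted);
        }
    }

    public static void main(String[] args) throws IOException {
        // Docs arrive as A, B, C; the index sort wants C, A, B.
        System.out.println(flushSorted(Arrays.asList("docA", "docB", "docC"), new int[] {2, 0, 1}));
    }
}
```

Using the same format for both passes, as the prototype does, keeps the copy step a straight record-by-record rewrite.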




[JENKINS] Lucene-Solr-Tests-master - Build # 1524 - Unstable

2016-12-05 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/1524/

1 tests failed.
FAILED:  
org.apache.solr.cloud.TestSolrCloudWithDelegationTokens.testDelegationTokenCancelFail

Error Message:
expected:<200> but was:<404>

Stack Trace:
java.lang.AssertionError: expected:<200> but was:<404>
at 
__randomizedtesting.SeedInfo.seed([C0B5C6A81A5268C7:A80AF382CAC87A2B]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.solr.cloud.TestSolrCloudWithDelegationTokens.cancelDelegationToken(TestSolrCloudWithDelegationTokens.java:140)
at 
org.apache.solr.cloud.TestSolrCloudWithDelegationTokens.testDelegationTokenCancelFail(TestSolrCloudWithDelegationTokens.java:304)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 

[jira] [Updated] (SOLR-9827) Make ConcurrentUpdateSolrClient create RemoteSolrException instead of just SolrException for remote errors

2016-12-05 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-9827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-9827:

Attachment: SOLR-9827.patch

> Make ConcurrentUpdateSolrClient create RemoteSolrException instead of just 
> SolrException for remote errors
> --
>
> Key: SOLR-9827
> URL: https://issues.apache.org/jira/browse/SOLR-9827
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tomás Fernández Löbbe
>Assignee: Tomás Fernández Löbbe
>Priority: Minor
> Attachments: SOLR-9827.patch
>
>
> Also, improve the exception message to include the remote error message when 
> present. Especially when Solr is logging these errors (e.g. 
> DistributedUpdateProcessor), this should make it easier to understand that 
> the error was in the remote host and not in the one logging this exception. 
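A standalone sketch of the idea: a dedicated exception type for errors that happened on a remote node, with the remote message folded into the local one so logs make the origin obvious. The class names mirror SolrJ's, but this is not the actual SolrJ implementation and the message wording is an assumption:

```java
// Not the real SolrJ classes: a minimal model of distinguishing remote
// errors from local ones and embedding the remote message in the local one.
public class RemoteErrorDemo {
    public static class SolrException extends RuntimeException {
        public final int code;
        public SolrException(int code, String msg) { super(msg); this.code = code; }
    }

    // Remote variant carries the origin URL and embeds the remote message,
    // so a log line immediately shows where the failure actually occurred.
    public static class RemoteSolrException extends SolrException {
        public RemoteSolrException(String remoteUrl, int code, String remoteMsg) {
            super(code, "Error from server at " + remoteUrl + ": " + remoteMsg);
        }
    }

    public static void main(String[] args) {
        SolrException e = new RemoteSolrException("http://host2:8983/solr", 404, "missing core");
        System.out.println(e.getMessage());
    }
}
```

A caller catching the base type still works unchanged, but `instanceof RemoteSolrException` now tells local and remote failures apart.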






[jira] [Resolved] (LUCENE-7575) UnifiedHighlighter: add requireFieldMatch=false support

2016-12-05 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved LUCENE-7575.
--
   Resolution: Fixed
Fix Version/s: 6.4

Thank _you_.

I was thinking a bit more about the wastefulness of re-creating SpanQueries 
with a different field that are otherwise identical. Some day we could refactor 
out of WSTE a Query -> SpanQuery conversion utility that furthermore allows 
you to re-target the field.  With that in place, we could avoid the waste for 
PhraseQuery and MultiPhraseQuery -- the most typical position-sensitive queries.

> UnifiedHighlighter: add requireFieldMatch=false support
> ---
>
> Key: LUCENE-7575
> URL: https://issues.apache.org/jira/browse/LUCENE-7575
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Fix For: 6.4
>
> Attachments: LUCENE-7575.patch, LUCENE-7575.patch, LUCENE-7575.patch
>
>
> The UnifiedHighlighter (like the PostingsHighlighter) only supports 
> highlighting queries for the same fields that are being highlighted.  The 
> original Highlighter and FVH support loosening this, AKA 
> requireFieldMatch=false.






[jira] [Commented] (LUCENE-7575) UnifiedHighlighter: add requireFieldMatch=false support

2016-12-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723408#comment-15723408
 ] 

ASF subversion and git services commented on LUCENE-7575:
-

Commit 4e7a7dbf9a56468f41e89f5289833081b27f1b14 in lucene-solr's branch 
refs/heads/branch_6x from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4e7a7db ]

LUCENE-7575: Add UnifiedHighlighter field matcher predicate (AKA 
requireFieldMatch=false)

(cherry picked from commit 2e948fe)


> UnifiedHighlighter: add requireFieldMatch=false support
> ---
>
> Key: LUCENE-7575
> URL: https://issues.apache.org/jira/browse/LUCENE-7575
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-7575.patch, LUCENE-7575.patch, LUCENE-7575.patch
>
>
> The UnifiedHighlighter (like the PostingsHighlighter) only supports 
> highlighting queries for the same fields that are being highlighted.  The 
> original Highlighter and FVH support loosening this, AKA 
> requireFieldMatch=false.






[jira] [Commented] (LUCENE-7575) UnifiedHighlighter: add requireFieldMatch=false support

2016-12-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723387#comment-15723387
 ] 

ASF subversion and git services commented on LUCENE-7575:
-

Commit 2e948fea300f883b7dfb586e303d5720d09b3210 in lucene-solr's branch 
refs/heads/master from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2e948fe ]

LUCENE-7575: Add UnifiedHighlighter field matcher predicate (AKA 
requireFieldMatch=false)


> UnifiedHighlighter: add requireFieldMatch=false support
> ---
>
> Key: LUCENE-7575
> URL: https://issues.apache.org/jira/browse/LUCENE-7575
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-7575.patch, LUCENE-7575.patch, LUCENE-7575.patch
>
>
> The UnifiedHighlighter (like the PostingsHighlighter) only supports 
> highlighting queries for the same fields that are being highlighted.  The 
> original Highlighter and FVH support loosening this, AKA 
> requireFieldMatch=false.






[jira] [Created] (SOLR-9827) Make ConcurrentUpdateSolrClient create RemoteSolrException instead of just SolrException for remote errors

2016-12-05 Thread JIRA
Tomás Fernández Löbbe created SOLR-9827:
---

 Summary: Make ConcurrentUpdateSolrClient create 
RemoteSolrException instead of just SolrException for remote errors
 Key: SOLR-9827
 URL: https://issues.apache.org/jira/browse/SOLR-9827
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Tomás Fernández Löbbe
Assignee: Tomás Fernández Löbbe
Priority: Minor


Also, improve the exception message to include the remote error message when 
present. Especially when Solr is logging these errors (e.g. 
DistributedUpdateProcessor), this should make it easier to understand that the 
error was in the remote host and not in the one logging this exception. 






[jira] [Commented] (LUCENE-7582) NIOFSDirectory sometime doesn't work on windows

2016-12-05 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723346#comment-15723346
 ] 

Shawn Heisey commented on LUCENE-7582:
--

Are you the one who mentioned pretty much this exact issue on the #lucene IRC 
channel several hours ago?

As I said on IRC, I think the correct solution is to use an MMap-based 
directory implementation and require that the users of your software run it 
on 64-bit hardware, OS, and Java.  With hardware and software that are common 
today, a 64-bit requirement is *not* difficult to satisfy, even with Windows.  
MMapDirectory typically has better performance than the other file-based 
directory implementations, which is why it's the default on 64-bit Java.

This javadoc talks about the differences between the filesystem-based Directory 
implementations:

http://lucene.apache.org/core/6_3_0/core/org/apache/lucene/store/FSDirectory.html
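For context, the JDK primitive that an MMap-based directory builds on is memory-mapping the file and reading from the mapped buffer, so reads do not go through a shared channel position at all. This is plain `java.nio`, not Lucene code, and the file name here is a throwaway temp file:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Memory-mapped read: the OS pages the file in, and each buffer read is a
// plain memory access with a per-buffer index, not a per-channel position.
public class MmapReadDemo {
    public static byte readAt(Path file, long offset) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return map.get((int) offset); // index is local to this mapping
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("mmap-demo", ".bin");
        Files.write(f, new byte[] {10, 20, 30, 40});
        System.out.println(readAt(f, 2)); // prints 30
        Files.delete(f);
    }
}
```

Because each reader indexes its own buffer (or a clone of it), concurrent searches never serialize on a shared file position, which is part of why MMapDirectory performs well.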


> NIOFSDirectory sometime doesn't work on windows
> ---
>
> Key: LUCENE-7582
> URL: https://issues.apache.org/jira/browse/LUCENE-7582
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/store
>Affects Versions: 5.3.1
> Environment: Windows 10, 32 bits JVM
>Reporter: Kevin Senechal
>
> Hi!
> I've hit an error using Lucene on Windows. I already posted a question on the 
> modeshape forum (https://developer.jboss.org/thread/273070) and it looks like 
> NIOFSDirectory is not working well on Windows, as described in the Java 
> documentation of this class.
> {quote}NOTE: NIOFSDirectory is not recommended on Windows because of a bug in 
> how FileChannel.read is implemented in Sun's JRE. Inside of the 
> implementation the position is apparently synchronized. See here for 
> details.{quote}
> After reading the linked java issue 
> (http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6265734), it seems that 
> there is a workaround to solve it, use an AsynchronousFileChannel.
> Is it a choice that has been made to not use AsynchronousFileChannel or will 
> it be a good fix?
> You'll find the complete stacktrace below:
> {code:java}
> Caused by: org.modeshape.jcr.index.lucene.LuceneIndexException: Cannot commit 
> index writer  
>   at org.modeshape.jcr.index.lucene.LuceneIndex.commit(LuceneIndex.java:155) 
> ~[dsdk-launcher.jar:na]  
>   at 
> org.modeshape.jcr.spi.index.provider.IndexChangeAdapter.completeWorkspaceChanges(IndexChangeAdapter.java:104)
>  ~[dsdk-launcher.jar:na]  
>   at 
> org.modeshape.jcr.cache.change.ChangeSetAdapter.notify(ChangeSetAdapter.java:157)
>  ~[dsdk-launcher.jar:na]  
>   at 
> org.modeshape.jcr.spi.index.provider.IndexProvider$AtomicIndex.notify(IndexProvider.java:1493)
>  ~[dsdk-launcher.jar:na]  
>   at 
> org.modeshape.jcr.bus.RepositoryChangeBus.notify(RepositoryChangeBus.java:190)
>  ~[dsdk-launcher.jar:na]  
>   at 
> org.modeshape.jcr.cache.document.WorkspaceCache.changed(WorkspaceCache.java:333)
>  ~[dsdk-launcher.jar:na]  
>   at 
> org.modeshape.jcr.txn.SynchronizedTransactions.updateCache(SynchronizedTransactions.java:223)
>  ~[dsdk-launcher.jar:na]  
>   at 
> org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:751)
>  ~[dsdk-launcher.jar:na]  
>   at org.modeshape.jcr.JcrSession.save(JcrSession.java:1171) 
> ~[dsdk-launcher.jar:na]  
>   ... 19 common frames omitted  
> Caused by: java.nio.file.FileSystemException: 
> C:\Users\Christopher\Infiltrea3CLOUDTEST8\christop...@dooapp.com\indexes\default\nodesByPath\_dc_Lucene50_0.doc:
>  The process cannot access the file because it is being used by another 
> process.  
>   at 
> sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) 
> ~[na:1.8.0_92]  
>   at 
> sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) 
> ~[na:1.8.0_92]  
>   at 
> sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) 
> ~[na:1.8.0_92]  
>   at 
> sun.nio.fs.WindowsFileSystemProvider.newFileChannel(WindowsFileSystemProvider.java:115)
>  ~[na:1.8.0_92]  
>   at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[na:1.8.0_92]  
>   at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[na:1.8.0_92]  
>   at org.apache.lucene.util.IOUtils.fsync(IOUtils.java:393) 
> ~[dsdk-launcher.jar:na]  
>   at org.apache.lucene.store.FSDirectory.fsync(FSDirectory.java:281) 
> ~[dsdk-launcher.jar:na]  
>   at org.apache.lucene.store.FSDirectory.sync(FSDirectory.java:226) 
> ~[dsdk-launcher.jar:na]  
>   at 
> org.apache.lucene.store.LockValidatingDirectoryWrapper.sync(LockValidatingDirectoryWrapper.java:62)
>  ~[dsdk-launcher.jar:na]  
>   at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4456) 
> ~[dsdk-launcher.jar:na]  
>   at 
> org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2874)
>  ~[dsdk-launcher.jar:na] 

[jira] [Commented] (SOLR-9251) Allow a tag role:!overseer in replica placement rules

2016-12-05 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723290#comment-15723290
 ] 

David Smiley commented on SOLR-9251:


Right; I should add I definitely did that and can confirm the overseer is 
running there.

> Allow a tag role:!overseer in replica placement rules
> -
>
> Key: SOLR-9251
> URL: https://issues.apache.org/jira/browse/SOLR-9251
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 6.2
>
> Attachments: SOLR-9251.patch
>
>
> The reason to assign an overseer role to a node is to ensure that the node 
> is exclusively used as the overseer. Replica placement should support a tag 
> called {{role}}.
> So if a collection is created with {{rule=role:!overseer}}, no replica should 
> be created on nodes designated as overseer






[jira] [Commented] (SOLR-9251) Allow a tag role:!overseer in replica placement rules

2016-12-05 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723282#comment-15723282
 ] 

Noble Paul commented on SOLR-9251:
--

It works only if you assign the role of overseer to a particular node. By 
default, such a role does not exist. Refer to the ADDROLE command.

> Allow a tag role:!overseer in replica placement rules
> -
>
> Key: SOLR-9251
> URL: https://issues.apache.org/jira/browse/SOLR-9251
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 6.2
>
> Attachments: SOLR-9251.patch
>
>
> The reason to assign an overseer role to a node is to ensure that the node 
> is exclusively used as the overseer. Replica placement should support a tag 
> called {{role}}.
> So if a collection is created with {{rule=role:!overseer}}, no replica should 
> be created on nodes designated as overseer






[JENKINS] Lucene-Solr-master-MacOSX (64bit/jdk1.8.0) - Build # 3692 - Unstable!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/3692/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseParallelGC

2 tests failed.
FAILED:  org.apache.solr.cloud.DocValuesNotIndexedTest.testGroupingSorting

Error Message:
Should have exactly 4 documents returned expected:<4> but was:<3>

Stack Trace:
java.lang.AssertionError: Should have exactly 4 documents returned expected:<4> 
but was:<3>
at 
__randomizedtesting.SeedInfo.seed([83573CA4F2A6D81B:9D6F34AC8E0D629B]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.solr.cloud.DocValuesNotIndexedTest.checkSortOrder(DocValuesNotIndexedTest.java:254)
at 
org.apache.solr.cloud.DocValuesNotIndexedTest.testGroupingSorting(DocValuesNotIndexedTest.java:239)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 

[jira] [Commented] (SOLR-9251) Allow a tag role:!overseer in replica placement rules

2016-12-05 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723258#comment-15723258
 ] 

David Smiley commented on SOLR-9251:


[~noble.paul] does the rule system allow one to have this rule but specify 
further that a particular shard may go on this overseer node (or any designated 
node for that matter)?  I hoped this would work:
{{rule=shard:RT,role:overseer}}
{{rule=shard:!RT,role:!overseer}}
This is a collection with the implicit router and thus named shards, one of 
them named "RT".  When I do this, Solr complains when attempting to create a 
replica that it could not identify nodes matching the rules.  Note that this 
notion isn't specific to the overseer... I have also tried with 
{{host:specificHostName}} and got the same result.  I've tried various ways to 
achieve this but all in vain :-(

(apologies if you'd rather me ask on the user list)

> Allow a tag role:!overseer in replica placement rules
> -
>
> Key: SOLR-9251
> URL: https://issues.apache.org/jira/browse/SOLR-9251
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 6.2
>
> Attachments: SOLR-9251.patch
>
>
> The reason to assign an overseer role to a node is to ensure that the node 
> is exclusively used as the overseer. Replica placement should support a tag 
> called {{role}}.
> So if a collection is created with {{rule=role:!overseer}}, no replica should 
> be created on nodes designated as overseer






[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_102) - Build # 18451 - Still Unstable!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18451/
Java: 64bit/jdk1.8.0_102 -XX:-UseCompressedOops -XX:+UseG1GC

1 tests failed.
FAILED:  
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCollectionsAPI

Error Message:
expected:<3> but was:<2>

Stack Trace:
java.lang.AssertionError: expected:<3> but was:<2>
at 
__randomizedtesting.SeedInfo.seed([4E2CA17107B9F86A:659D5C5018AD7FF]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at 
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCollectionsAPI(CollectionsAPIDistributedZkTest.java:517)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(Thread.java:745)




Build Log:
[...truncated 11456 lines...]
   [junit4] Suite: 

[jira] [Comment Edited] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs

2016-12-05 Thread Michael Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723002#comment-15723002
 ] 

Michael Sun edited comment on SOLR-9764 at 12/5/16 6:45 PM:


bq.  if the DocSet just produced has size==numDocs, then just use liveDocs
[~yo...@apache.org] Can you give me some more details on how to implement this 
check? Somehow I can't find a clean way to do it. Thanks.
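
For what it's worth, the check could look something like this minimal sketch (the class and method names here are hypothetical stand-ins, not Solr's actual DocSet API; it only illustrates the size comparison Yonik describes):

```java
// Hypothetical sketch: after materializing a DocSet, compare its cardinality
// with the searcher's numDocs (the count of live, non-deleted documents).
// If they match, the query matched every live doc, so the canonical
// live-docs set can be shared instead of caching a full bit set.
final class DocSetChooser {
    static <T> T choose(T produced, int producedSize, int numDocs, T liveDocsSet) {
        // producedSize == numDocs means the filter matched all live docs
        return producedSize == numDocs ? liveDocsSet : produced;
    }
}
```

The tricky part is presumably where in the DocSet-producing code path `numDocs` and a shared live-docs set are both available; the sketch assumes they are already at hand.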



was (Author: michael.sun):
bq.  if the DocSet just produced has size==numDocs, then just use liveDocs
[~yo...@apache.org] Can you give me some more details how to implement this 
check. Somehow I can't find an easy way to do it. Thanks.


> Design a memory efficient DocSet if a query returns all docs
> 
>
> Key: SOLR-9764
> URL: https://issues.apache.org/jira/browse/SOLR-9764
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Michael Sun
> Attachments: SOLR-9764.patch, SOLR-9764.patch, SOLR-9764.patch, 
> SOLR-9764.patch, SOLR-9764.patch, SOLR_9764_no_cloneMe.patch
>
>
> In some use cases, particularly with time series data partitioned into 
> multiple small collections by timestamp behind a collection alias, a filter 
> query can match all documents in a collection. Currently BitDocSet is used, 
> which contains a large array of long integers with every bit set to 1. After 
> querying, the resulting DocSet saved in the filter cache is large and becomes 
> one of the main memory consumers in these use cases.
> For example, suppose a Solr setup has 14 collections for data from the last 
> 14 days, each collection with one day of data. A filter query for the last 
> week of data would result in at least six DocSets in the filter cache, each 
> matching all documents in one of six collections.
> This issue is to design a new DocSet that is memory efficient for such a use 
> case. The new DocSet removes the large array, reducing memory usage and GC 
> pressure without losing the advantage of a large filter cache.
> In particular, for use cases with time series data partitioned into multiple 
> small collections by timestamp behind a collection alias, the gain can be 
> large.
> For further optimization, it may be helpful to design a DocSet with run 
> length encoding. Thanks [~mmokhtar] for the suggestion. 
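
A minimal sketch of the idea (hypothetical class, not Solr's real DocSet interface, which has many more methods): when every document matches, membership can be represented implicitly by a count instead of a bit array with all bits set.

```java
// Hypothetical "match all docs" DocSet: O(1) memory instead of a bit array.
final class MatchAllDocSet {
    private final int numDocs; // count of documents in the set

    MatchAllDocSet(int numDocs) { this.numDocs = numDocs; }

    int size() { return numDocs; }

    // Membership is implicit: every doc id below numDocs is in the set.
    boolean exists(int docId) { return docId >= 0 && docId < numDocs; }

    // For comparison: a bit-set representation needs one long per 64 docs.
    static long bitSetBytes(int maxDoc) { return ((maxDoc + 63L) / 64L) * 8L; }
}
```

At 100 million documents, for instance, the bit-array representation costs 12.5 MB per cached entry, while the implicit set stays constant size; that difference is the memory saving the description targets.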



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9764) Design a memory efficient DocSet if a query returns all docs

2016-12-05 Thread Michael Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723002#comment-15723002
 ] 

Michael Sun commented on SOLR-9764:
---

bq.  if the DocSet just produced has size==numDocs, then just use liveDocs
[~yo...@apache.org] Can you give me some more details on how to implement this 
check? Somehow I can't find an easy way to do it. Thanks.





[JENKINS] Lucene-Solr-6.x-Windows (32bit/jdk1.8.0_102) - Build # 604 - Unstable!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Windows/604/
Java: 32bit/jdk1.8.0_102 -server -XX:+UseG1GC

2 tests failed.
FAILED:  org.apache.solr.handler.TestReplicationHandler.doTestStressReplication

Error Message:
[index.20161206061100377, index.20161206061100699, index.properties, replication.properties, snapshot_metadata] expected:<1> but was:<2>

Stack Trace:
java.lang.AssertionError: [index.20161206061100377, index.20161206061100699, index.properties, replication.properties, snapshot_metadata] expected:<1> but was:<2>
at __randomizedtesting.SeedInfo.seed([3EB7EC91E54A0B27:E51CEC57E0626294]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.apache.solr.handler.TestReplicationHandler.checkForSingleIndex(TestReplicationHandler.java:909)
at org.apache.solr.handler.TestReplicationHandler.doTestStressReplication(TestReplicationHandler.java:876)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 

[JENKINS] Lucene-Solr-master-Linux (32bit/jdk1.8.0_102) - Build # 18450 - Unstable!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18450/
Java: 32bit/jdk1.8.0_102 -client -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.cloud.PeerSyncReplicationTest.test

Error Message:
expected:<152> but was:<139>

Stack Trace:
java.lang.AssertionError: expected:<152> but was:<139>
at __randomizedtesting.SeedInfo.seed([FF743FBAB683FE68:77200060187F9390]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at org.apache.solr.cloud.PeerSyncReplicationTest.bringUpDeadNodeAndEnsureNoReplication(PeerSyncReplicationTest.java:280)
at org.apache.solr.cloud.PeerSyncReplicationTest.forceNodeFailureAndDoPeerSync(PeerSyncReplicationTest.java:244)
at org.apache.solr.cloud.PeerSyncReplicationTest.test(PeerSyncReplicationTest.java:130)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:985)
at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:960)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)

[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-05 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722648#comment-15722648
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/5/16 4:17 PM:
---

I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.
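
The descent described can be sketched generically like this (the node classes here are illustrative, not Calcite's actual `RexNode` API; the real patch walks Calcite's expression tree):

```java
// Illustrative sketch of fully descending a nested AND/OR predicate tree
// and rewriting it into a Lucene/Solr query string. The node types are
// hypothetical stand-ins for the expression nodes the patch traverses.
interface Pred {}
record And(Pred left, Pred right) implements Pred {}
record Or(Pred left, Pred right) implements Pred {}
record Leaf(String field, String value) implements Pred {}

final class PredicateRewriter {
    // Recursing on both branches ensures arbitrarily deep AND/OR nesting
    // is translated, which is the behavior the earlier patch was missing.
    static String toSolrQuery(Pred p) {
        if (p instanceof And a) {
            return "(" + toSolrQuery(a.left()) + " AND " + toSolrQuery(a.right()) + ")";
        }
        if (p instanceof Or o) {
            return "(" + toSolrQuery(o.left()) + " OR " + toSolrQuery(o.right()) + ")";
        }
        Leaf l = (Leaf) p;
        return l.field() + ":" + l.value();
    }
}
```

For example, `a = 1 AND (b = 2 OR c = 3)` would come out as `(a:1 AND (b:2 OR c:3))` under this sketch.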

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves with our 
Calcite implementation. As [~julianhyde] mentioned, we should see DISTINCT 
queries as aggregate queries so it's possible we'll have all the code in place 
to push this to Solr already.




was (Author: joel.bernstein):
I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigation how SELECT distinct behaves in our 
implementation. As [~julianhyde] mentioned. we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.



> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch, SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work, though, will require an optimizer. This is where 
> Apache Calcite comes into play: it has a battle-tested, cost-based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-05 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722648#comment-15722648
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/5/16 4:16 PM:
---

I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigation how SELECT distinct behaves in our 
implementation. As [~julianhyde] mentioned. we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.




was (Author: joel.bernstein):
I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is implement the aggregationMode=facet logic for aggregate 
queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigation how SELECT distinct behaves in our 
implementation. As [~julianhyde] mentioned. we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.






[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-05 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722648#comment-15722648
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/5/16 4:15 PM:
---

I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is implement the aggregationMode=facet logic for aggregate 
queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigation how SELECT distinct behaves in our 
implementation. As [~julianhyde] mentioned. we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.




was (Author: joel.bernstein):
I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some how the query is 
re-written to a Lucene/Solr query so that it matches the current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is implement the aggregationMode=facet logic for aggregate 
queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigation how SELECT distinct behaves in our 
implementation. As [~julianhyde] mentioned. we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.






[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-05 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722648#comment-15722648
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/5/16 4:15 PM:
---

I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some how the query is 
re-written to a Lucene/Solr query so that it matches the current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is implement the aggregationMode=facet logic for aggregate 
queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigation how SELECT distinct behaves in our 
implementation. As [~julianhyde] mentioned. we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.




was (Author: joel.bernstein):
I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some how the query is 
re-written to a Lucene/Solr query so that it matches the current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for aggregate 
queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves in our 
implementation. As [~julianhyde] mentioned, we should treat DISTINCT queries as 
aggregate queries, so it's possible we already have all the code in place to push 
this to Solr.



> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch, SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-05 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722648#comment-15722648
 ] 

Joel Bernstein commented on SOLR-8593:
--

I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, making 
changes/fixes as I go.

The first change that I made was in how the predicate is traversed. The 
current patch doesn't descend through a fully nested AND/OR predicate, so I made 
a few changes to how the tree is walked. I also changed parts of how the query is 
rewritten to a Lucene/Solr query so that it matches the current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway, because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for aggregate 
queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves in our 
implementation. As [~julianhyde] mentioned, we should treat DISTINCT queries as 
aggregate queries, so it's possible we already have all the code in place to push 
this to Solr.



> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch, SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (LUCENE-7579) Sorting on flushed segment

2016-12-05 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722610#comment-15722610
 ] 

Michael McCandless commented on LUCENE-7579:


This is a nice approach!  Basically, the codec remains unaware that index
sorting is happening, which is the right way to do it.  Instead, the
indexing chain takes care of it.  And to build the doc comparators you take
advantage of the in-heap buffered doc values.

I like that to sort stored fields, you are still just using the codec
APIs, writing to temp files, then using the codec to read the stored
fields back for sorting.

I also like how you were able to re-use the {{SortingXXX}} from
{{SortingLeafReader}}.  Later on we can maybe optimize some of these;
e.g. {{SortingFields}} and {{CachedXXXDVs}} should be able to take
advantage of the fact that the things they are sorting are all already
in heap (the indexing buffer), the way you did with
{{MutableSortingPointValues}} (cool).

Can we rename {{freezed}} to {{frozen}} in {{BinaryDocValuesWriter}}?
But: why would {{freezed}} ever be true when we call {{flush}}?
Shouldn't it only be called once, even in the sorting case?

I think the 6.x back port here is going to be especially tricky :)

Can we block creating a {{SortingLeafReader}} now (make its
constructor private)?  We only now ever use its inner classes I think?
And it is a dangerous class in the first place...  if we can do that,
maybe we rename it {{SortingCodecUtils}} or something, just for its
inner classes.

Do any of the exceptions tests for IndexWriter get angry?  Seems like
if we hit an {{IOException}} e.g. during the renaming that
{{SortingStoredFieldsConsumer.flush}} does we may leave undeleted
files?  Hmm or perhaps IW takes care of that by wrapping the directory
itself...

Can't you just pass {{sortMap::newToOld}} directly (method reference)
instead of making the lambda here?:

{noformat}
  writer.sort(state.segmentInfo.maxDoc(), mergeReader, state.fieldInfos,
  (docID) -> (sortMap.newToOld(docID)));
{noformat}
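Mike's suggestion works because a bound instance method reference is interchangeable with a lambda that merely forwards its argument. A minimal stand-alone illustration, using a plain {{IntUnaryOperator}} and a toy {{SortMap}} class standing in for the actual IndexWriter internals (names here are illustrative):

```java
import java.util.function.IntUnaryOperator;

public class MethodRefDemo {
    // Stand-in for the doc map: maps new doc IDs back to old ones.
    static class SortMap {
        private final int[] newToOld;
        SortMap(int[] newToOld) { this.newToOld = newToOld; }
        int newToOld(int docID) { return newToOld[docID]; }
    }

    public static void main(String[] args) {
        SortMap sortMap = new SortMap(new int[] {2, 0, 1});

        // Explicit lambda, as in the patch:
        IntUnaryOperator viaLambda = (docID) -> sortMap.newToOld(docID);

        // Equivalent, shorter bound method reference:
        IntUnaryOperator viaMethodRef = sortMap::newToOld;

        System.out.println(viaLambda.applyAsInt(0));    // prints 2
        System.out.println(viaMethodRef.applyAsInt(0)); // prints 2
    }
}
```

Both forms capture {{sortMap}} the same way; the method reference just drops the redundant parameter plumbing.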


> Sorting on flushed segment
> --
>
> Key: LUCENE-7579
> URL: https://issues.apache.org/jira/browse/LUCENE-7579
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ferenczi Jim
>
> Today flushed segments built by an index writer with an index sort specified 
> are not sorted. The merge is responsible for sorting these segments, 
> potentially with others that are already sorted (resulting from another 
> merge). 
> I'd like to investigate the cost of sorting the segment directly during the 
> flush. This could make the merge faster, since there are some cheap 
> optimizations that can be done only if all segments to be merged are sorted.
>  For instance the merge of the points could use the bulk merge instead of 
> rebuilding the points from scratch.
> I made a small prototype which sorts the segment on flush here:
> https://github.com/apache/lucene-solr/compare/master...jimczi:flush_sort
> The idea is simple: for points, norms, docvalues and terms I use the 
> SortingLeafReader implementation to translate the values that we have in RAM 
> into a sorted enumeration for the writers.
> For stored fields I use a two pass scheme where the documents are first 
> written to disk unsorted and then copied to another file with the correct 
> sorting. I use the same stored field format for the two steps and just remove 
> the file produced by the first pass at the end of the process.
> This prototype has no implementation for index sorting that uses term vectors 
> yet. I'll add this later if the tests are good enough.
> Speaking of testing, I tried this branch on [~mikemccand]'s benchmark scripts 
> and compared master with index sorting against my branch with index sorting 
> on flush. I tried with sparsetaxis and wikipedia and the first results are 
> weird. When I use the SerialScheduler and only one thread to write the docs, 
> index sorting on flush is slower. But when I use two threads the sorting on 
> flush is much faster, even with the SerialScheduler. I'll continue to run the 
> tests in order to be able to share something more meaningful.
> The tests are passing except one about concurrent DV updates. I don't know 
> this part at all so I did not fix the test yet. I don't even know if we can 
> make it work with index sorting ;).
>  [~mikemccand] I would love to have your feedback about the prototype. Could 
> you please take a look? I am sure there are plenty of bugs, ... but I think 
> it's a good start to evaluate the feasibility of this feature.






[jira] [Commented] (SOLR-5894) Speed up high-cardinality facets with sparse counters

2016-12-05 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722603#comment-15722603
 ] 

Yago Riveiro commented on SOLR-5894:


Are facets with sparse counters faster than the current JSON facets?

> Speed up high-cardinality facets with sparse counters
> -
>
> Key: SOLR-5894
> URL: https://issues.apache.org/jira/browse/SOLR-5894
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Affects Versions: 4.7.1
>Reporter: Toke Eskildsen
>Priority: Minor
>  Labels: faceted-search, faceting, memory, performance
> Attachments: SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, 
> SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, SOLR-5894.patch, 
> SOLR-5894.patch, SOLR-5894.patch, SOLR-5894_test.zip, SOLR-5894_test.zip, 
> SOLR-5894_test.zip, SOLR-5894_test.zip, SOLR-5894_test.zip, 
> author_7M_tags_1852_logged_queries_warmed.png, 
> sparse_200docs_fc_cutoff_20140403-145412.png, 
> sparse_500docs_20140331-151918_multi.png, 
> sparse_500docs_20140331-151918_single.png, 
> sparse_5051docs_20140328-152807.png
>
>
> Multiple performance enhancements to Solr String faceting.
> * Sparse counters, switching the constant time overhead of extracting top-X 
> terms with time overhead linear to result set size
> * Counter re-use for reduced garbage collection and lower per-call overhead
> * Optional counter packing, trading speed for space
> * Improved distribution count logic, greatly improving the performance of 
> distributed faceting
> * In-segment threaded faceting
> * Regexp based white- and black-listing of facet terms
> * Heuristic faceting for large result sets
> Currently implemented for Solr 4.10. Source, detailed description and 
> directly usable WAR at http://tokee.github.io/lucene-solr/
> This project has grown beyond a simple patch and will require a fair amount 
> of co-operation with a committer to get into Solr. Splitting into smaller 
> issues is a possibility.






[jira] [Commented] (LUCENE-7575) UnifiedHighlighter: add requireFieldMatch=false support

2016-12-05 Thread Ferenczi Jim (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722580#comment-15722580
 ] 

Ferenczi Jim commented on LUCENE-7575:
--

Thanks David!

> UnifiedHighlighter: add requireFieldMatch=false support
> ---
>
> Key: LUCENE-7575
> URL: https://issues.apache.org/jira/browse/LUCENE-7575
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-7575.patch, LUCENE-7575.patch, LUCENE-7575.patch
>
>
> The UnifiedHighlighter (like the PostingsHighlighter) only supports 
> highlighting queries for the same fields that are being highlighted.  The 
> original Highlighter and FVH support loosening this, AKA 
> requireFieldMatch=false.






[jira] [Commented] (LUCENE-7575) UnifiedHighlighter: add requireFieldMatch=false support

2016-12-05 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722577#comment-15722577
 ] 

David Smiley commented on LUCENE-7575:
--

Oh I see; right.

I'll commit your patch later this evening.

> UnifiedHighlighter: add requireFieldMatch=false support
> ---
>
> Key: LUCENE-7575
> URL: https://issues.apache.org/jira/browse/LUCENE-7575
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-7575.patch, LUCENE-7575.patch, LUCENE-7575.patch
>
>
> The UnifiedHighlighter (like the PostingsHighlighter) only supports 
> highlighting queries for the same fields that are being highlighted.  The 
> original Highlighter and FVH support loosening this, AKA 
> requireFieldMatch=false.






[jira] [Commented] (LUCENE-7563) BKD index should compress unused leading bytes

2016-12-05 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722537#comment-15722537
 ] 

Michael McCandless commented on LUCENE-7563:


Ahh, OK; I think we should restrict {{TestBKD}} to the same dimension count / 
bytes-per-dimension limits that Lucene enforces?  As we tighten up how we 
compress it on disk and in the in-heap index, we should only test what we 
actually offer to the end user.

> BKD index should compress unused leading bytes
> --
>
> Key: LUCENE-7563
> URL: https://issues.apache.org/jira/browse/LUCENE-7563
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7563-prefixlen-unary.patch, LUCENE-7563.patch, 
> LUCENE-7563.patch, LUCENE-7563.patch, LUCENE-7563.patch
>
>
> Today the BKD (points) in-heap index always uses {{dimensionNumBytes}} per 
> dimension, but if e.g. you are indexing {{LongPoint}} yet only use the bottom 
> two bytes in a given segment, we shouldn't store all those leading 0s in the 
> index.






[jira] [Updated] (LUCENE-7575) UnifiedHighlighter: add requireFieldMatch=false support

2016-12-05 Thread Ferenczi Jim (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferenczi Jim updated LUCENE-7575:
-
Attachment: LUCENE-7575.patch

Thanks David!
Here is a new patch to address your last comments. Now we have a 
FieldFilteringTermSet and extractTerms uses a simple HashSet.

{quote}
couldn't defaultFieldMatcher be initialized to non-null to match the same 
field? Then getFieldMatcher() would simply return it.
{quote}

Not as a Predicate, since the predicate is only on the candidate field 
name. We could use a BiPredicate to always provide the current 
field name to the predicate, but I find it simpler this way. 
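To make the trade-off concrete, here is a minimal sketch of the two alternatives using plain java.util.function types; the actual UnifiedHighlighter field-matcher API may differ, and the names are illustrative:

```java
import java.util.function.BiPredicate;
import java.util.function.Predicate;

public class FieldMatcherDemo {
    // With a plain Predicate, each highlighted field needs its own matcher
    // instance, because the predicate only sees the candidate field name:
    static Predicate<String> sameFieldMatcher(String highlightedField) {
        return candidateField -> candidateField.equals(highlightedField);
    }

    // The BiPredicate alternative: a single instance is handed both the
    // current (highlighted) field and the candidate field on every call.
    static final BiPredicate<String, String> SAME_FIELD =
            (highlightedField, candidateField) -> highlightedField.equals(candidateField);

    public static void main(String[] args) {
        Predicate<String> matcher = sameFieldMatcher("body");
        System.out.println(matcher.test("body"));            // prints true
        System.out.println(matcher.test("title"));           // prints false
        System.out.println(SAME_FIELD.test("body", "body")); // prints true
    }
}
```

The Predicate form keeps the per-field capture out of the call site, which is the simplicity Jim is referring to; the BiPredicate form avoids allocating one matcher per field.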


> UnifiedHighlighter: add requireFieldMatch=false support
> ---
>
> Key: LUCENE-7575
> URL: https://issues.apache.org/jira/browse/LUCENE-7575
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-7575.patch, LUCENE-7575.patch, LUCENE-7575.patch
>
>
> The UnifiedHighlighter (like the PostingsHighlighter) only supports 
> highlighting queries for the same fields that are being highlighted.  The 
> original Highlighter and FVH support loosening this, AKA 
> requireFieldMatch=false.






[JENKINS] Lucene-Solr-6.x-Solaris (64bit/jdk1.8.0) - Build # 536 - Unstable!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Solaris/536/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

2 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.hdfs.HdfsRecoveryZkTest

Error Message:
ObjectTracker found 1 object(s) that were not released!!! [HdfsTransactionLog] 
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:43)
  at 
org.apache.solr.update.HdfsTransactionLog.(HdfsTransactionLog.java:130)  
at org.apache.solr.update.HdfsUpdateLog.init(HdfsUpdateLog.java:202)  at 
org.apache.solr.update.UpdateHandler.(UpdateHandler.java:137)  at 
org.apache.solr.update.UpdateHandler.(UpdateHandler.java:94)  at 
org.apache.solr.update.DirectUpdateHandler2.(DirectUpdateHandler2.java:102)
  at sun.reflect.GeneratedConstructorAccessor131.newInstance(Unknown Source)  
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)  at 
org.apache.solr.core.SolrCore.createInstance(SolrCore.java:705)  at 
org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:767)  at 
org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1006)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:871)  at 
org.apache.solr.core.SolrCore.(SolrCore.java:775)  at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:842)  at 
org.apache.solr.core.CoreContainer.lambda$load$0(CoreContainer.java:498)  at 
java.util.concurrent.FutureTask.run(FutureTask.java:266)  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
 at java.lang.Thread.run(Thread.java:745)  

Stack Trace:
java.lang.AssertionError: ObjectTracker found 1 object(s) that were not 
released!!! [HdfsTransactionLog]
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException
at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:43)
at 
org.apache.solr.update.HdfsTransactionLog.(HdfsTransactionLog.java:130)
at org.apache.solr.update.HdfsUpdateLog.init(HdfsUpdateLog.java:202)
at org.apache.solr.update.UpdateHandler.(UpdateHandler.java:137)
at org.apache.solr.update.UpdateHandler.(UpdateHandler.java:94)
at 
org.apache.solr.update.DirectUpdateHandler2.(DirectUpdateHandler2.java:102)
at sun.reflect.GeneratedConstructorAccessor131.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:705)
at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:767)
at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1006)
at org.apache.solr.core.SolrCore.(SolrCore.java:871)
at org.apache.solr.core.SolrCore.(SolrCore.java:775)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:842)
at 
org.apache.solr.core.CoreContainer.lambda$load$0(CoreContainer.java:498)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


at __randomizedtesting.SeedInfo.seed([35D1F768FD799503]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertNull(Assert.java:551)
at 
org.apache.solr.SolrTestCaseJ4.teardownTestCases(SolrTestCaseJ4.java:266)
at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:870)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 

[jira] [Commented] (LUCENE-7575) UnifiedHighlighter: add requireFieldMatch=false support

2016-12-05 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722479#comment-15722479
 ] 

David Smiley commented on LUCENE-7575:
--

Tim, by the way, the style of checking booleans with {{== false}} is common in 
Lucene deliberately... some folks (like Rob and perhaps others) feel this is 
actually more clear than a leading exclamation point.  I sorta agree but don't 
have a strong opinion.  So I tend to follow this now too.
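The two spellings are behaviorally identical; the argument is purely about readability of the negation. A trivial illustration:

```java
public class BooleanStyleDemo {
    public static void main(String[] args) {
        boolean requireFieldMatch = false;

        // Leading exclamation point; the '!' is a single character and
        // easy to miss when skimming:
        if (!requireFieldMatch) {
            System.out.println("not required");
        }

        // The explicit comparison style common in Lucene; same behavior,
        // but the negation is harder to overlook:
        if (requireFieldMatch == false) {
            System.out.println("not required");
        }
    }
}
```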

> UnifiedHighlighter: add requireFieldMatch=false support
> ---
>
> Key: LUCENE-7575
> URL: https://issues.apache.org/jira/browse/LUCENE-7575
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-7575.patch, LUCENE-7575.patch
>
>
> The UnifiedHighlighter (like the PostingsHighlighter) only supports 
> highlighting queries for the same fields that are being highlighted.  The 
> original Highlighter and FVH support loosening this, AKA 
> requireFieldMatch=false.






[jira] [Commented] (LUCENE-7581) IndexWriter#updateDocValues can break index sorting

2016-12-05 Thread Ferenczi Jim (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722476#comment-15722476
 ] 

Ferenczi Jim commented on LUCENE-7581:
--

[~mikemccand] I think so too. I'll work on a patch.

> IndexWriter#updateDocValues can break index sorting
> ---
>
> Key: LUCENE-7581
> URL: https://issues.apache.org/jira/browse/LUCENE-7581
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ferenczi Jim
> Attachments: LUCENE-7581.patch
>
>
> IndexWriter#updateDocValues can break index sorting if it is called on a 
> field that is used in the index sorting specification. 
> TestIndexSorting has a test for this case: #testConcurrentDVUpdates, 
> but only L1 merges are checked. Any LN merge would fail the test because the 
> inner sort of the segment is not re-computed during/after DV updates.






[jira] [Commented] (LUCENE-7575) UnifiedHighlighter: add requireFieldMatch=false support

2016-12-05 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722472#comment-15722472
 ] 

David Smiley commented on LUCENE-7575:
--

This is looking really good now Jim!  I like this Predicate approach.

UH:
* Maybe change  UH.extractTerms to simply be a Set (HashSet) since we needn't 
pay any sorting expense up front any longer.
* couldn't defaultFieldMatcher be initialized to non-null to match the same 
field?  Then getFieldMatcher() would simply return it.

PhraseHelper: 
* the comment on the fieldName field about being non-null isn't true anymore; 
in fact it's required.  Perhaps add Objects.requireNonNull(...) in c'tor if you 
want.
* I can see why you changed FieldFilteringTermHashSet to extend TreeSet.  But 
you now need to modify the javadocs & class name accordingly; perhaps removing 
the implementation detail like this

Nice tests.  That's it.

> UnifiedHighlighter: add requireFieldMatch=false support
> ---
>
> Key: LUCENE-7575
> URL: https://issues.apache.org/jira/browse/LUCENE-7575
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-7575.patch, LUCENE-7575.patch
>
>
> The UnifiedHighlighter (like the PostingsHighlighter) only supports 
> highlighting queries for the same fields that are being highlighted.  The 
> original Highlighter and FVH support loosening this, AKA 
> requireFieldMatch=false.






[jira] [Created] (LUCENE-7582) NIOFSDirectory sometime doesn't work on windows

2016-12-05 Thread Kevin Senechal (JIRA)
Kevin Senechal created LUCENE-7582:
--

 Summary: NIOFSDirectory sometime doesn't work on windows
 Key: LUCENE-7582
 URL: https://issues.apache.org/jira/browse/LUCENE-7582
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Affects Versions: 5.3.1
 Environment: Windows 10, 32 bits JVM
Reporter: Kevin Senechal


Hi!

I'm getting an error using Lucene on Windows. I already posted a question on the 
ModeShape forum (https://developer.jboss.org/thread/273070), and it looks like 
NIOFSDirectory is not working well on Windows, as described in the Java 
documentation of this class.

{quote}NOTE: NIOFSDirectory is not recommended on Windows because of a bug in 
how FileChannel.read is implemented in Sun's JRE. Inside of the implementation 
the position is apparently synchronized. See here for details.{quote}

After reading the linked Java issue 
(http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6265734), it seems that 
there is a workaround: use an AsynchronousFileChannel.

Was it a deliberate choice not to use AsynchronousFileChannel, or would that 
be a good fix?

You'll find the complete stacktrace below:
{code:java}
Caused by: org.modeshape.jcr.index.lucene.LuceneIndexException: Cannot commit 
index writer  
  at org.modeshape.jcr.index.lucene.LuceneIndex.commit(LuceneIndex.java:155) 
~[dsdk-launcher.jar:na]  
  at 
org.modeshape.jcr.spi.index.provider.IndexChangeAdapter.completeWorkspaceChanges(IndexChangeAdapter.java:104)
 ~[dsdk-launcher.jar:na]  
  at 
org.modeshape.jcr.cache.change.ChangeSetAdapter.notify(ChangeSetAdapter.java:157)
 ~[dsdk-launcher.jar:na]  
  at 
org.modeshape.jcr.spi.index.provider.IndexProvider$AtomicIndex.notify(IndexProvider.java:1493)
 ~[dsdk-launcher.jar:na]  
  at 
org.modeshape.jcr.bus.RepositoryChangeBus.notify(RepositoryChangeBus.java:190) 
~[dsdk-launcher.jar:na]  
  at 
org.modeshape.jcr.cache.document.WorkspaceCache.changed(WorkspaceCache.java:333)
 ~[dsdk-launcher.jar:na]  
  at 
org.modeshape.jcr.txn.SynchronizedTransactions.updateCache(SynchronizedTransactions.java:223)
 ~[dsdk-launcher.jar:na]  
  at 
org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:751)
 ~[dsdk-launcher.jar:na]  
  at org.modeshape.jcr.JcrSession.save(JcrSession.java:1171) 
~[dsdk-launcher.jar:na]  
  ... 19 common frames omitted  
Caused by: java.nio.file.FileSystemException: 
C:\Users\Christopher\Infiltrea3CLOUDTEST8\christop...@dooapp.com\indexes\default\nodesByPath\_dc_Lucene50_0.doc:
 The process cannot access the file because it is being used by another 
process.  
  at 
sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) 
~[na:1.8.0_92]  
  at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) 
~[na:1.8.0_92]  
  at 
sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) 
~[na:1.8.0_92]  
  at 
sun.nio.fs.WindowsFileSystemProvider.newFileChannel(WindowsFileSystemProvider.java:115)
 ~[na:1.8.0_92]  
  at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[na:1.8.0_92]  
  at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[na:1.8.0_92]  
  at org.apache.lucene.util.IOUtils.fsync(IOUtils.java:393) 
~[dsdk-launcher.jar:na]  
  at org.apache.lucene.store.FSDirectory.fsync(FSDirectory.java:281) 
~[dsdk-launcher.jar:na]  
  at org.apache.lucene.store.FSDirectory.sync(FSDirectory.java:226) 
~[dsdk-launcher.jar:na]  
  at 
org.apache.lucene.store.LockValidatingDirectoryWrapper.sync(LockValidatingDirectoryWrapper.java:62)
 ~[dsdk-launcher.jar:na]  
  at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4456) 
~[dsdk-launcher.jar:na]  
  at 
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2874)
 ~[dsdk-launcher.jar:na]  
  at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2977) 
~[dsdk-launcher.jar:na]  
  at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2944) 
~[dsdk-launcher.jar:na]  
  at org.modeshape.jcr.index.lucene.LuceneIndex.commit(LuceneIndex.java:152) 
~[dsdk-launcher.jar:na] 
{code}

Thank you in advance for your help






[jira] [Commented] (LUCENE-7563) BKD index should compress unused leading bytes

2016-12-05 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722349#comment-15722349
 ] 

Adrien Grand commented on LUCENE-7563:
--

I dug into it; the test failure may happen with large numbers of bytes per 
dimension. It could be fixed if we limited the number of bytes per value in 
BKDWriter to 16 (like we do in FieldInfos) and made {{code}} a long.

> BKD index should compress unused leading bytes
> --
>
> Key: LUCENE-7563
> URL: https://issues.apache.org/jira/browse/LUCENE-7563
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7563-prefixlen-unary.patch, LUCENE-7563.patch, 
> LUCENE-7563.patch, LUCENE-7563.patch, LUCENE-7563.patch
>
>
> Today the BKD (points) in-heap index always uses {{dimensionNumBytes}} per 
> dimension, but if e.g. you are indexing {{LongPoint}} yet only use the bottom 
> two bytes in a given segment, we shouldn't store all those leading 0s in the 
> index.






Re: question about QueryParser's method setAutoGeneratePhraseQueries()

2016-12-05 Thread Steve Rowe
Hi Gang,

The javadoc explanation isn't very clear, but the process is:

1. Split the query on whitespace ('term1 term2' is split into 'term1' and 'term2').
2. For each split term: if autoGeneratePhraseQueries=true, and analysis 
produces more than one term (for example a synonym 'term1'->'multiple word 
synonym'), then a phrase query is created.

In the example you give, after splitting and analysis there is only one term 
per split, so no phrase queries are produced.

A workaround: insert quotation marks at the start and end of the query.
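That workaround can be sketched in plain Python (the helper name is hypothetical; with PyLucene you would pass the quoted string on to parser.parse):

```python
def quote_as_phrase(raw_query):
    # Hypothetical helper: wrap a raw query in quotation marks so the
    # classic QueryParser treats the whole string as one phrase query.
    # Escape backslashes and embedded quotes first so the quoting
    # survives the parser's own escape handling.
    escaped = raw_query.replace('\\', '\\\\').replace('"', '\\"')
    return '"%s"' % escaped

# parser.parse(quote_as_phrase('term1 term2')) should then produce
# field:"term1 term2" rather than field:term1 field:term2
```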

--
Steve
www.lucidworks.com

> On Dec 4, 2016, at 4:39 PM, Gang Li  wrote:
> 
> Hi everyone,
> 
> I'm trying to make the QueryParser parse a raw query without quotes into a
> phrase query by default, and according to Lucene doc it seems I can use the
> method setAutoGeneratePhraseQueries(). (
> http://lucene.apache.org/core/6_2_0/queryparser/org/apache/lucene/queryparser/classic/QueryParserBase.html#setAutoGeneratePhraseQueries-boolean-
> )
> 
> But after I call parser.setAutoGeneratePhraseQueries(True), the parser
> still doesn't produce a phrase query. Please see the code example below.
> 
> I'm using Ubuntu 16.04, Java 1.8, and PyLucene (Lucene version 6.2.0). All 
> the tests pass when running "make test" under the pylucene folder.
> 
> import lucene
> from org.apache.lucene.analysis.standard import StandardAnalyzer
> from org.apache.lucene.queryparser.classic import QueryParser
> from org.apache.lucene.search import PhraseQuery
> from org.apache.lucene.index import Term
> 
> jcc_env = lucene.initVM(vmargs=[str('-Djava.awt.headless=true')])
> 
> # Parse raw query.
> analyzer = StandardAnalyzer()
> parser = QueryParser('field', analyzer)
> 
> # Auto-generate phrase queries over multiple terms.
> parser.setAutoGeneratePhraseQueries(True)
> 
> # This prints field:term1 field:term2, but it should be field:"term1 term2"
> print parser.parse('term1 term2')
> 
> # Build a phrase query.
> builder = PhraseQuery.Builder()
> builder.add(Term('field', 'term1'))
> builder.add(Term('field', 'term2'))
> 
> # This prints field:"term1 term2", which is correct.
> print builder.build()
> 
> Does anyone know how to make it work? Thank you!
> 
> Gang



[jira] [Commented] (LUCENE-7581) IndexWriter#updateDocValues can break index sorting

2016-12-05 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722337#comment-15722337
 ] 

Michael McCandless commented on LUCENE-7581:


Thanks [~jim.ferenczi]; I think we need to fix the DV update APIs to prevent 
changing any field involved in the index sort?

> IndexWriter#updateDocValues can break index sorting
> ---
>
> Key: LUCENE-7581
> URL: https://issues.apache.org/jira/browse/LUCENE-7581
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ferenczi Jim
> Attachments: LUCENE-7581.patch
>
>
> IndexWriter#updateDocValues can break index sorting if it is called on a 
> field that is used in the index sorting specification. 
> TestIndexSorting has a test for this case, #testConcurrentDVUpdates, 
> but only L1 merges are checked. Any LN merge would fail the test because the 
> inner sort of the segment is not re-computed during/after DV updates.






[jira] [Updated] (LUCENE-7581) IndexWriter#updateDocValues can break index sorting

2016-12-05 Thread Ferenczi Jim (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferenczi Jim updated LUCENE-7581:
-
Attachment: LUCENE-7581.patch

I attached a patch that makes the test fail when a second round of DV updates is run.

> IndexWriter#updateDocValues can break index sorting
> ---
>
> Key: LUCENE-7581
> URL: https://issues.apache.org/jira/browse/LUCENE-7581
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ferenczi Jim
> Attachments: LUCENE-7581.patch
>
>
> IndexWriter#updateDocValues can break index sorting if it is called on a 
> field that is used in the index sorting specification. 
> TestIndexSorting has a test for this case, #testConcurrentDVUpdates, 
> but only L1 merges are checked. Any LN merge would fail the test because the 
> inner sort of the segment is not re-computed during/after DV updates.






[jira] [Created] (LUCENE-7581) IndexWriter#updateDocValues can break index sorting

2016-12-05 Thread Ferenczi Jim (JIRA)
Ferenczi Jim created LUCENE-7581:


 Summary: IndexWriter#updateDocValues can break index sorting
 Key: LUCENE-7581
 URL: https://issues.apache.org/jira/browse/LUCENE-7581
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Ferenczi Jim


IndexWriter#updateDocValues can break index sorting if it is called on a field 
that is used in the index sorting specification. 
TestIndexSorting has a test for this case, #testConcurrentDVUpdates, 
but only L1 merges are checked. Any LN merge would fail the test because the 
inner sort of the segment is not re-computed during/after DV updates.
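The failure mode can be illustrated with a minimal sketch (illustrative Python, not Lucene code): a segment written sorted on a numeric field stops satisfying the sort invariant once an in-place docvalues update changes the sort field.

```python
def is_index_sorted(sort_values):
    # The index-sort invariant: sort-field docvalues are
    # non-decreasing in docid order within a segment.
    return all(a <= b for a, b in zip(sort_values, sort_values[1:]))

segment = [10, 20, 30, 40]       # segment written sorted on the sort field
assert is_index_sorted(segment)

segment[1] = 99                  # updateDocValues on the sort field
assert not is_index_sorted(segment)  # invariant silently broken
```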






[JENKINS] Lucene-Solr-master-Solaris (64bit/jdk1.8.0) - Build # 992 - Unstable!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Solaris/992/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseSerialGC

1 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.hdfs.HdfsRecoveryZkTest

Error Message:
ObjectTracker found 1 object(s) that were not released!!! [HdfsTransactionLog] 
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:43)
  at 
org.apache.solr.update.HdfsTransactionLog.<init>(HdfsTransactionLog.java:130)  
at org.apache.solr.update.HdfsUpdateLog.init(HdfsUpdateLog.java:202)  at 
org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:137)  at 
org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:94)  at 
org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:102)
  at sun.reflect.GeneratedConstructorAccessor84.newInstance(Unknown Source)  at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)  at 
org.apache.solr.core.SolrCore.createInstance(SolrCore.java:723)  at 
org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:785)  at 
org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1024)  at 
org.apache.solr.core.SolrCore.<init>(SolrCore.java:889)  at 
org.apache.solr.core.SolrCore.<init>(SolrCore.java:793)  at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:868)  at 
org.apache.solr.core.CoreContainer.lambda$load$0(CoreContainer.java:517)  at 
java.util.concurrent.FutureTask.run(FutureTask.java:266)  at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
 at java.lang.Thread.run(Thread.java:745)  

Stack Trace:
java.lang.AssertionError: ObjectTracker found 1 object(s) that were not 
released!!! [HdfsTransactionLog]
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException
at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:43)
at 
org.apache.solr.update.HdfsTransactionLog.<init>(HdfsTransactionLog.java:130)
at org.apache.solr.update.HdfsUpdateLog.init(HdfsUpdateLog.java:202)
at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:137)
at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:94)
at 
org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:102)
at sun.reflect.GeneratedConstructorAccessor84.newInstance(Unknown 
Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:723)
at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:785)
at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1024)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:889)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:793)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:868)
at 
org.apache.solr.core.CoreContainer.lambda$load$0(CoreContainer.java:517)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


at __randomizedtesting.SeedInfo.seed([27593A6632F4780]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertNull(Assert.java:551)
at 
org.apache.solr.SolrTestCaseJ4.teardownTestCases(SolrTestCaseJ4.java:266)
at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:870)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 

[jira] [Commented] (SOLR-8871) Classification Update Request Processor Improvements

2016-12-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722113#comment-15722113
 ] 

ASF subversion and git services commented on SOLR-8871:
---

Commit 7a44b8ed7b6b0c1214544f0c572433deb2f665f7 in lucene-solr's branch 
refs/heads/branch_6x from [~teofili]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7a44b8e ]

SOLR-8871 - removed suppress for forbidden API, added locale to toUpperCase
(cherry picked from commit c36ec0b)


> Classification Update Request Processor Improvements
> 
>
> Key: SOLR-8871
> URL: https://issues.apache.org/jira/browse/SOLR-8871
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 6.1
>Reporter: Alessandro Benedetti
>Assignee: Tommaso Teofili
>  Labels: classification, classifier, update, update.chain
> Fix For: 6.4
>
> Attachments: SOLR_8871.patch, SOLR_8871_UIMA_processor_test_fix.patch
>
>
> This task will group a set of modifications to the classification update 
> request processor (and the Lucene classification module), based on user 
> feedback (thanks [~teofili] and Александър Цветанов):
> - include boosting support for inputFields in solrconfig.xml for the 
> classification update request processor, e.g.
> field1^2, field2^5 ...
> - multi-class assignment (introduce a parameter, default 1, for the max 
> number of classes to assign)
> - split classField into:
> classTrainingField
> classOutputField
> The default, when classOutputField is not defined, is classTrainingField.
> - add support for a classification query, to use only a subset of the 
> entire index for classification.
> - improve related tests
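For illustration, the "field1^2, field2^5" boost syntax mentioned above could be parsed like this (a sketch in Python; the actual processor is Java, and this helper name is made up):

```python
def parse_field_boosts(spec):
    # Parse a spec like "field1^2, field2^5" into {field: boost};
    # fields without an explicit ^boost default to 1.0.
    boosts = {}
    for part in spec.split(','):
        part = part.strip()
        if not part:
            continue
        name, caret, boost = part.partition('^')
        boosts[name.strip()] = float(boost) if caret else 1.0
    return boosts
```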






[jira] [Resolved] (SOLR-8871) Classification Update Request Processor Improvements

2016-12-05 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved SOLR-8871.
---
   Resolution: Fixed
Fix Version/s: 6.4

> Classification Update Request Processor Improvements
> 
>
> Key: SOLR-8871
> URL: https://issues.apache.org/jira/browse/SOLR-8871
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 6.1
>Reporter: Alessandro Benedetti
>Assignee: Tommaso Teofili
>  Labels: classification, classifier, update, update.chain
> Fix For: 6.4
>
> Attachments: SOLR_8871.patch, SOLR_8871_UIMA_processor_test_fix.patch
>
>
> This task will group a set of modifications to the classification update 
> request processor (and the Lucene classification module), based on user 
> feedback (thanks [~teofili] and Александър Цветанов):
> - include boosting support for inputFields in solrconfig.xml for the 
> classification update request processor, e.g.
> field1^2, field2^5 ...
> - multi-class assignment (introduce a parameter, default 1, for the max 
> number of classes to assign)
> - split classField into:
> classTrainingField
> classOutputField
> The default, when classOutputField is not defined, is classTrainingField.
> - add support for a classification query, to use only a subset of the 
> entire index for classification.
> - improve related tests






[jira] [Commented] (SOLR-8871) Classification Update Request Processor Improvements

2016-12-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722114#comment-15722114
 ] 

ASF subversion and git services commented on SOLR-8871:
---

Commit cdce62108737dd8f35e588c6d6e5486469d416f7 in lucene-solr's branch 
refs/heads/branch_6x from [~teofili]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=cdce621 ]

SOLR-8871 - adjusted UIMA processor test, patch from Alessandro Benedetti
(cherry picked from commit 641294a)


> Classification Update Request Processor Improvements
> 
>
> Key: SOLR-8871
> URL: https://issues.apache.org/jira/browse/SOLR-8871
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 6.1
>Reporter: Alessandro Benedetti
>Assignee: Tommaso Teofili
>  Labels: classification, classifier, update, update.chain
> Fix For: 6.4
>
> Attachments: SOLR_8871.patch, SOLR_8871_UIMA_processor_test_fix.patch
>
>
> This task will group a set of modifications to the classification update 
> request processor (and the Lucene classification module), based on user 
> feedback (thanks [~teofili] and Александър Цветанов):
> - include boosting support for inputFields in solrconfig.xml for the 
> classification update request processor, e.g.
> field1^2, field2^5 ...
> - multi-class assignment (introduce a parameter, default 1, for the max 
> number of classes to assign)
> - split classField into:
> classTrainingField
> classOutputField
> The default, when classOutputField is not defined, is classTrainingField.
> - add support for a classification query, to use only a subset of the 
> entire index for classification.
> - improve related tests






[jira] [Commented] (SOLR-8871) Classification Update Request Processor Improvements

2016-12-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722110#comment-15722110
 ] 

ASF subversion and git services commented on SOLR-8871:
---

Commit f9ca890fc377a5699d612c00ee0bc7e90baf569e in lucene-solr's branch 
refs/heads/branch_6x from [~teofili]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f9ca890 ]

SOLR-8871 - various improvements to ClassificationURP
(cherry picked from commit 5ad741e)


> Classification Update Request Processor Improvements
> 
>
> Key: SOLR-8871
> URL: https://issues.apache.org/jira/browse/SOLR-8871
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 6.1
>Reporter: Alessandro Benedetti
>Assignee: Tommaso Teofili
>  Labels: classification, classifier, update, update.chain
> Fix For: 6.4
>
> Attachments: SOLR_8871.patch, SOLR_8871_UIMA_processor_test_fix.patch
>
>
> This task will group a set of modifications to the classification update 
> request processor (and the Lucene classification module), based on user 
> feedback (thanks [~teofili] and Александър Цветанов):
> - include boosting support for inputFields in solrconfig.xml for the 
> classification update request processor, e.g.
> field1^2, field2^5 ...
> - multi-class assignment (introduce a parameter, default 1, for the max 
> number of classes to assign)
> - split classField into:
> classTrainingField
> classOutputField
> The default, when classOutputField is not defined, is classTrainingField.
> - add support for a classification query, to use only a subset of the 
> entire index for classification.
> - improve related tests






[jira] [Commented] (SOLR-8871) Classification Update Request Processor Improvements

2016-12-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722111#comment-15722111
 ] 

ASF subversion and git services commented on SOLR-8871:
---

Commit 048d4370abf6337bcd8cb969f463d7dbe2dbb1a7 in lucene-solr's branch 
refs/heads/branch_6x from [~teofili]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=048d437 ]

SOLR-8871 - adjusted header positioning
(cherry picked from commit 96489d2)


> Classification Update Request Processor Improvements
> 
>
> Key: SOLR-8871
> URL: https://issues.apache.org/jira/browse/SOLR-8871
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 6.1
>Reporter: Alessandro Benedetti
>Assignee: Tommaso Teofili
>  Labels: classification, classifier, update, update.chain
> Fix For: 6.4
>
> Attachments: SOLR_8871.patch, SOLR_8871_UIMA_processor_test_fix.patch
>
>
> This task will group a set of modifications to the classification update 
> request processor (and the Lucene classification module), based on user 
> feedback (thanks [~teofili] and Александър Цветанов):
> - include boosting support for inputFields in solrconfig.xml for the 
> classification update request processor, e.g.
> field1^2, field2^5 ...
> - multi-class assignment (introduce a parameter, default 1, for the max 
> number of classes to assign)
> - split classField into:
> classTrainingField
> classOutputField
> The default, when classOutputField is not defined, is classTrainingField.
> - add support for a classification query, to use only a subset of the 
> entire index for classification.
> - improve related tests






[jira] [Commented] (LUCENE-7350) Let classifiers be constructed from IndexReaders

2016-12-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722083#comment-15722083
 ] 

ASF subversion and git services commented on LUCENE-7350:
-

Commit 406535a3a8fcb5c92e15854d89c1dc8407852f0e in lucene-solr's branch 
refs/heads/branch_6x from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=406535a ]

LUCENE-7350: Remove unused imports.
(cherry picked from commit 281af8b)


> Let classifiers be constructed from IndexReaders
> 
>
> Key: LUCENE-7350
> URL: https://issues.apache.org/jira/browse/LUCENE-7350
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/classification
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: master (7.0)
>
>
> Current {{Classifier}} implementations are built from {{LeafReaders}}; this 
> is a legacy of using certain Lucene 4.x {{AtomicReader}}-specific APIs. 
> This is no longer required, as current implementations only rely on 
> {{IndexReader}} APIs, so it makes more sense to take an {{IndexReader}} as 
> the constructor parameter: the narrower type gives no additional benefit, 
> while it requires client code to deal with classifiers that are tied to 
> segments (which doesn't make much sense).






[jira] [Commented] (LUCENE-7563) BKD index should compress unused leading bytes

2016-12-05 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722090#comment-15722090
 ] 

Michael McCandless commented on LUCENE-7563:


bq. I think there is just a redundant arraycopy in clone()?

Thanks, I pushed a fix!

bq. For the record, I played with another idea leveraging the fact that the 
prefix lengths on two consecutive levels are likely close to each other,

I like this idea!  But I hit this test failure ... doesn't reproduce on trunk:

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestBKD 
-Dtests.method=testWastedLeadingBytes -Dtests.seed=2E5F0E183BBA1098 
-Dtests.locale=es-PR -Dtests.timezone=CST -Dtests.asserts=true 
-Dtests.file.encoding=US-ASCII
   [junit4] ERROR   0.90s J1 | TestBKD.testWastedLeadingBytes <<<
   [junit4]> Throwable #1: java.lang.ArrayIndexOutOfBoundsException: -32
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([2E5F0E183BBA1098:ABD9D50B47794EFC]:0)
   [junit4]>at 
org.apache.lucene.util.bkd.BKDReader$PackedIndexTree.readNodeData(BKDReader.java:442)
   [junit4]>at 
org.apache.lucene.util.bkd.BKDReader$PackedIndexTree.<init>(BKDReader.java:343)
   [junit4]>at 
org.apache.lucene.util.bkd.BKDReader.getIntersectState(BKDReader.java:526)
   [junit4]>at 
org.apache.lucene.util.bkd.BKDReader.intersect(BKDReader.java:498)
   [junit4]>at 
org.apache.lucene.util.bkd.TestBKD.testWastedLeadingBytes(TestBKD.java:1042)
   [junit4]>at java.lang.Thread.run(Thread.java:745)
{noformat}

> BKD index should compress unused leading bytes
> --
>
> Key: LUCENE-7563
> URL: https://issues.apache.org/jira/browse/LUCENE-7563
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7563-prefixlen-unary.patch, LUCENE-7563.patch, 
> LUCENE-7563.patch, LUCENE-7563.patch, LUCENE-7563.patch
>
>
> Today the BKD (points) in-heap index always uses {{dimensionNumBytes}} per 
> dimension, but if e.g. you are indexing {{LongPoint}} yet only use the bottom 
> two bytes in a given segment, we shouldn't store all those leading 0s in the 
> index.






[JENKINS] Lucene-Solr-6.x-MacOSX (64bit/jdk1.8.0) - Build # 558 - Failure!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-MacOSX/558/
Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseG1GC

2 tests failed.
FAILED:  
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggesterTest.testRandomNRT

Error Message:
Captured an uncaught exception in thread: Thread[id=29, name=Thread-1, 
state=RUNNABLE, group=TGRP-AnalyzingInfixSuggesterTest]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=29, name=Thread-1, state=RUNNABLE, 
group=TGRP-AnalyzingInfixSuggesterTest]
at 
__randomizedtesting.SeedInfo.seed([570DDDB64D691936:F323D30B15B6C58A]:0)
Caused by: org.apache.lucene.store.AlreadyClosedException: this 
ReferenceManager is closed
at __randomizedtesting.SeedInfo.seed([570DDDB64D691936]:0)
at 
org.apache.lucene.search.ReferenceManager.acquire(ReferenceManager.java:98)
at 
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.lookup(AnalyzingInfixSuggester.java:645)
at 
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.lookup(AnalyzingInfixSuggester.java:457)
at 
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggesterTest$LookupThread.run(AnalyzingInfixSuggesterTest.java:533)


FAILED:  org.apache.solr.cloud.ShardSplitTest.test

Error Message:
Wrong doc count on shard1_0. See SOLR-5309 expected:<128> but was:<127>

Stack Trace:
java.lang.AssertionError: Wrong doc count on shard1_0. See SOLR-5309 
expected:<128> but was:<127>
at 
__randomizedtesting.SeedInfo.seed([582147CA013F4EB0:D0757810AFC32348]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.solr.cloud.ShardSplitTest.checkDocCountsAndShardStates(ShardSplitTest.java:886)
at 
org.apache.solr.cloud.ShardSplitTest.splitByUniqueKeyTest(ShardSplitTest.java:669)
at org.apache.solr.cloud.ShardSplitTest.test(ShardSplitTest.java:101)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:992)
at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:967)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
   

[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting

2016-12-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722085#comment-15722085
 ] 

ASF subversion and git services commented on SOLR-4735:
---

Commit 46c662fcab56906f3fa6fde09d3789d1d2fc6aed in lucene-solr's branch 
refs/heads/feature/metrics from [~ab]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=46c662f ]

SOLR-4735 WIP: move metric reporter config to CoreContainer level. Manage
reporters in SolrMetricManager.


> Improve Solr metrics reporting
> --
>
> Key: SOLR-4735
> URL: https://issues.apache.org/jira/browse/SOLR-4735
> Project: Solr
>  Issue Type: Improvement
>Reporter: Alan Woodward
>Assignee: Andrzej Bialecki 
>Priority: Minor
> Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, 
> SOLR-4735.patch
>
>
> Following on from a discussion on the mailing list:
> http://search-lucene.com/m/IO0EI1qdyJF1/codahale=Solr+metrics+in+Codahale+metrics+and+Graphite+
> It would be good to make Solr play more nicely with existing devops 
> monitoring systems, such as Graphite or Ganglia.  Stats monitoring at the 
> moment is poll-only, either via JMX or through the admin stats page.  I'd 
> like to refactor things a bit to make this more pluggable.
> This patch is a start.  It adds a new interface, InstrumentedBean, which 
> extends SolrInfoMBean to return a 
> [[Metrics|http://metrics.codahale.com/manual/core/]] MetricRegistry, and a 
> couple of MetricReporters (which basically just duplicate the JMX and admin 
> page reporting that's there at the moment, but which should be more 
> extensible).  The patch includes a change to RequestHandlerBase showing how 
> this could work.  The idea would be to eventually replace the getStatistics() 
> call on SolrInfoMBean with this instead.
> The next step would be to allow more MetricReporters to be defined in 
> solrconfig.xml.  The Metrics library comes with ganglia and graphite 
> reporting modules, and we can add contrib plugins for both of those.
> There's some more general cleanup that could be done around SolrInfoMBean 
> (we've got two plugin handlers at /mbeans and /plugins that basically do the 
> same thing, and the beans themselves have some weirdly inconsistent data on 
> them - getVersion() returns different things for different impls, and 
> getSource() seems pretty useless), but maybe that's for another issue.






[jira] [Updated] (LUCENE-7350) Let classifiers be constructed from IndexReaders

2016-12-05 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-7350:

Fix Version/s: 6.4

> Let classifiers be constructed from IndexReaders
> 
>
> Key: LUCENE-7350
> URL: https://issues.apache.org/jira/browse/LUCENE-7350
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/classification
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: master (7.0), 6.4
>
>
> Current {{Classifier}} implementations are built from {{LeafReaders}}; this 
> is a legacy of relying on {{AtomicReader}}-specific APIs in Lucene 4.x. That 
> is no longer required, as the current implementations only use 
> {{IndexReader}} APIs, so it makes more sense to take an {{IndexReader}} as 
> the constructor parameter: the {{LeafReader}} gives no additional benefit, 
> while it forces client code to deal with classifiers that are tied to 
> individual segments (which doesn't make much sense).
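As a rough illustration of the API change, here is a hedged Java sketch; {{ReaderSketch}} and {{SimpleClassifierSketch}} are hypothetical stand-ins for Lucene's actual reader and classifier types, not the module's real API:

```java
import java.util.List;

// Hypothetical stand-in for a reader handle; the real change swaps
// LeafReader for IndexReader in the classifier constructors.
interface ReaderSketch {
    List<String> terms(String field);
}

class SimpleClassifierSketch {
    private final ReaderSketch reader;

    // Before the change this constructor required a per-segment reader;
    // accepting the top-level reader frees callers from segment bookkeeping.
    SimpleClassifierSketch(ReaderSketch reader) {
        this.reader = reader;
    }

    boolean knowsTerm(String field, String term) {
        return reader.terms(field).contains(term);
    }
}
```

Client code no longer needs to iterate leaves and build one classifier per segment.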






[jira] [Commented] (LUCENE-7350) Let classifiers be constructed from IndexReaders

2016-12-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722081#comment-15722081
 ] 

ASF subversion and git services commented on LUCENE-7350:
-

Commit ea95c3cea7b2a13e64596b59da2ea0c3f2a9705f in lucene-solr's branch 
refs/heads/branch_6x from [~teofili]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ea95c3c ]

LUCENE-7350 - Let classifiers be constructed from IndexReaders


> Let classifiers be constructed from IndexReaders
> 
>
> Key: LUCENE-7350
> URL: https://issues.apache.org/jira/browse/LUCENE-7350
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/classification
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: master (7.0)
>
>
> Current {{Classifier}} implementations are built from {{LeafReaders}}; this 
> is a legacy of relying on {{AtomicReader}}-specific APIs in Lucene 4.x. That 
> is no longer required, as the current implementations only use 
> {{IndexReader}} APIs, so it makes more sense to take an {{IndexReader}} as 
> the constructor parameter: the {{LeafReader}} gives no additional benefit, 
> while it forces client code to deal with classifiers that are tied to 
> individual segments (which doesn't make much sense).






[jira] [Commented] (LUCENE-7563) BKD index should compress unused leading bytes

2016-12-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722054#comment-15722054
 ] 

ASF subversion and git services commented on LUCENE-7563:
-

Commit bd8b191505d92c89a483a6189497374238476a00 in lucene-solr's branch 
refs/heads/master from Mike McCandless
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=bd8b191 ]

LUCENE-7563: remove redundant array copy in PackedIndexTree.clone


> BKD index should compress unused leading bytes
> --
>
> Key: LUCENE-7563
> URL: https://issues.apache.org/jira/browse/LUCENE-7563
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7563-prefixlen-unary.patch, LUCENE-7563.patch, 
> LUCENE-7563.patch, LUCENE-7563.patch, LUCENE-7563.patch
>
>
> Today the BKD (points) in-heap index always uses {{dimensionNumBytes}} per 
> dimension, but if e.g. you are indexing {{LongPoint}} yet only use the bottom 
> two bytes in a given segment, we shouldn't store all those leading 0s in the 
> index.
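A minimal Java sketch of the idea (illustrative only, not the actual BKD code): {{LongPoint}}-style values are packed as big-endian bytes, so a segment whose values all fit in the bottom two bytes shares six leading zero bytes that need not be stored per dimension:

```java
import java.nio.ByteBuffer;

class LeadingBytesDemo {
    // LongPoint packs a value as 8 big-endian bytes, so small values share
    // leading zero bytes that the in-heap index need not store.
    static byte[] pack(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Number of leading bytes shared by every packed value in the segment.
    static int commonLeadingBytes(long[] values) {
        byte[] first = pack(values[0]);
        int prefix = first.length;
        for (long v : values) {
            byte[] b = pack(v);
            int i = 0;
            while (i < prefix && b[i] == first[i]) {
                i++;
            }
            prefix = i;
        }
        return prefix;
    }
}
```

For values up to 65535, only the trailing two of the eight packed bytes vary, so six bytes per dimension can be dropped from the index.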






[jira] [Updated] (LUCENE-7563) BKD index should compress unused leading bytes

2016-12-05 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-7563:
-
Attachment: LUCENE-7563-prefixlen-unary.patch

The change looks good, and the drop at 
http://people.apache.org/~mikemccand/lucenebench/sparseResults.html#searcher_heap
 is quite spectacular :-) I think there is just a redundant arraycopy in 
{{clone()}}?

For the record, I played with another idea leveraging the fact that the prefix 
lengths on two consecutive levels are likely close to each other, and that the 
most common values for the deltas are 0, then 1, then -1. So we might be able 
to save more by encoding the delta between consecutive prefix lengths using 
unary coding on top of zig-zag encoding, which would allow encoding 0 in 1 
bit, 1 in 2 bits, 2 in 3 bits, etc. However, it only saved 1% memory on 
IndexOSM and less than 1% on IndexTaxis. I'm attaching it here in case someone 
wants to take a look, but I don't think the gains are worth the complexity.
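A sketch of the encoding described above, using bit strings for readability (the attached patch would pack real bits); {{ZigZagUnaryDemo}} is a hypothetical name, not code from the patch:

```java
class ZigZagUnaryDemo {
    // Zig-zag maps small signed deltas to small unsigned values:
    // 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    static int zigZag(int n) {
        return (n << 1) ^ (n >> 31);
    }

    // Unary code: value v is written as v zero bits followed by a
    // terminating one bit, so v costs v + 1 bits.
    static String unary(int v) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < v; i++) {
            sb.append('0');
        }
        return sb.append('1').toString();
    }

    // The common deltas 0, -1, 1 cost 1, 2, and 3 bits respectively.
    static String encodeDelta(int delta) {
        return unary(zigZag(delta));
    }
}
```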

> BKD index should compress unused leading bytes
> --
>
> Key: LUCENE-7563
> URL: https://issues.apache.org/jira/browse/LUCENE-7563
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Fix For: master (7.0), 6.4
>
> Attachments: LUCENE-7563-prefixlen-unary.patch, LUCENE-7563.patch, 
> LUCENE-7563.patch, LUCENE-7563.patch, LUCENE-7563.patch
>
>
> Today the BKD (points) in-heap index always uses {{dimensionNumBytes}} per 
> dimension, but if e.g. you are indexing {{LongPoint}} yet only use the bottom 
> two bytes in a given segment, we shouldn't store all those leading 0s in the 
> index.






[JENKINS-EA] Lucene-Solr-master-Linux (32bit/jdk-9-ea+140) - Build # 18448 - Unstable!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/18448/
Java: 32bit/jdk-9-ea+140 -server -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.cloud.DocValuesNotIndexedTest.testGroupingDVOnly

Error Message:
Unexpected number of elements in the group for intGSF: 5

Stack Trace:
java.lang.AssertionError: Unexpected number of elements in the group for 
intGSF: 5
at 
__randomizedtesting.SeedInfo.seed([8D1C88BC628218EB:16A7E6E42FDA2AB5]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.DocValuesNotIndexedTest.testGroupingDVOnly(DocValuesNotIndexedTest.java:371)
at 
jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@9-ea/Native 
Method)
at 
jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@9-ea/NativeMethodAccessorImpl.java:62)
at 
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@9-ea/DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(java.base@9-ea/Method.java:535)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at java.lang.Thread.run(java.base@9-ea/Thread.java:843)




Build Log:
[...truncated 12173 lines...]
   [junit4] Suite: org.apache.solr.cloud.DocValuesNotIndexedTest
   [junit4]   2> Creating dataDir: 

[jira] [Updated] (LUCENE-7575) UnifiedHighlighter: add requireFieldMatch=false support

2016-12-05 Thread Ferenczi Jim (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferenczi Jim updated LUCENE-7575:
-
Attachment: LUCENE-7575.patch

Thanks [~dsmiley] and [~Timothy055]!

I pushed a new patch to address your comments. 

{quote}
 it'd be interesting if instead of a simple boolean toggle, if it were a 
Predicate fieldMatchPredicate so that only some fields could be 
collected in the query but not all. Just an idea.
{quote}

I agree, and this is why I changed the patch to include your idea. By default 
nothing changes: queries are extracted based on the field name to highlight. 
With this change, though, the user can now define which queries (based on the 
field name) should be highlighted. I think it's better like this, but I can 
revert if you think this should not be implemented in the first iteration.

I fixed the bugs that David spotted (terms from different fields not sorted 
after filteredExtractTerms and redundant initialization of the filter leaf 
reader for the span queries) and split the tests based on the type of query 
that is tested.
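The {{Predicate}}-based extension can be sketched as follows; {{FieldMatchPredicateDemo}} and {{extractFields}} are hypothetical stand-ins for the highlighter's term extraction, not the actual patch code:

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class FieldMatchPredicateDemo {
    // Keep only the query fields whose name passes the predicate, mirroring
    // the idea of replacing a boolean toggle with a Predicate.
    static List<String> extractFields(List<String> queryFields,
                                      Predicate<String> fieldMatchPredicate) {
        return queryFields.stream()
                .filter(fieldMatchPredicate)
                .collect(Collectors.toList());
    }
}
```

The default behavior corresponds to {{field::equals}} (extract only the highlighted field's queries), requireFieldMatch=false to {{f -> true}}, and any custom subset of fields to a user-supplied predicate.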


> UnifiedHighlighter: add requireFieldMatch=false support
> ---
>
> Key: LUCENE-7575
> URL: https://issues.apache.org/jira/browse/LUCENE-7575
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/highlighter
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: LUCENE-7575.patch, LUCENE-7575.patch
>
>
> The UnifiedHighlighter (like the PostingsHighlighter) only supports 
> highlighting queries for the same fields that are being highlighted.  The 
> original Highlighter and FVH support loosening this, AKA 
> requireFieldMatch=false.






[jira] [Commented] (SOLR-9826) Shutting down leader when it's sending updates makes another active node go into recovery

2016-12-05 Thread Ere Maijala (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721813#comment-15721813
 ] 

Ere Maijala commented on SOLR-9826:
---

The attached log is from node finna-index-2.

> Shutting down leader when it's sending updates makes another active node go 
> into recovery
> -
>
> Key: SOLR-9826
> URL: https://issues.apache.org/jira/browse/SOLR-9826
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.3
>Reporter: Ere Maijala
>  Labels: solrcloud
> Attachments: failure.log
>
>
> If the leader in SolrCloud is sending updates to a follower when it's shut 
> down, it forces the replica it can't communicate with (due to being shut 
> down, I assume) to go into recovery. I'll attach a log excerpt that shows the 
> related messages.






[jira] [Comment Edited] (SOLR-9826) Shutting down leader when it's sending updates makes another active node go into recovery

2016-12-05 Thread Ere Maijala (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721808#comment-15721808
 ] 

Ere Maijala edited comment on SOLR-9826 at 12/5/16 9:58 AM:


Added a Solr log showing how the leader gets confused and sends another node 
into recovery.


was (Author: emaijala):
Solr log showing how the leader gets confused and sends another node into 
recovery.

> Shutting down leader when it's sending updates makes another active node go 
> into recovery
> -
>
> Key: SOLR-9826
> URL: https://issues.apache.org/jira/browse/SOLR-9826
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.3
>Reporter: Ere Maijala
>  Labels: solrcloud
> Attachments: failure.log
>
>
> If the leader in SolrCloud is sending updates to a follower when it's shut 
> down, it forces the replica it can't communicate with (due to being shut 
> down, I assume) to go into recovery. I'll attach a log excerpt that shows the 
> related messages.






[jira] [Updated] (SOLR-9826) Shutting down leader when it's sending updates makes another active node go into recovery

2016-12-05 Thread Ere Maijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ere Maijala updated SOLR-9826:
--
Attachment: failure.log

Solr log showing how the leader gets confused and sends another node into 
recovery.

> Shutting down leader when it's sending updates makes another active node go 
> into recovery
> -
>
> Key: SOLR-9826
> URL: https://issues.apache.org/jira/browse/SOLR-9826
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.3
>Reporter: Ere Maijala
>  Labels: solrcloud
> Attachments: failure.log
>
>
> If the leader in SolrCloud is sending updates to a follower when it's shut 
> down, it forces the replica it can't communicate with (due to being shut 
> down, I assume) to go into recovery. I'll attach a log excerpt that shows the 
> related messages.






[jira] [Created] (SOLR-9826) Shutting down leader when it's sending updates makes another active node go into recovery

2016-12-05 Thread Ere Maijala (JIRA)
Ere Maijala created SOLR-9826:
-

 Summary: Shutting down leader when it's sending updates makes 
another active node go into recovery
 Key: SOLR-9826
 URL: https://issues.apache.org/jira/browse/SOLR-9826
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 6.3
Reporter: Ere Maijala


If the leader in SolrCloud is sending updates to a follower when it's shut 
down, it forces the replica it can't communicate with (due to being shut down, 
I assume) to go into recovery. I'll attach a log excerpt that shows the related 
messages.






[JENKINS] Lucene-Solr-6.x-Linux (64bit/jdk1.8.0_102) - Build # 2344 - Still Unstable!

2016-12-05 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/2344/
Java: 64bit/jdk1.8.0_102 -XX:-UseCompressedOops -XX:+UseG1GC

1 tests failed.
FAILED:  org.apache.solr.cloud.CloudExitableDirectoryReaderTest.test

Error Message:
No live SolrServers available to handle this request

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request
at 
__randomizedtesting.SeedInfo.seed([8E80D05C10B47740:6D4EF86BE481AB8]:0)
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:412)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1344)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1095)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1037)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
at 
org.apache.solr.cloud.CloudExitableDirectoryReaderTest.assertPartialResults(CloudExitableDirectoryReaderTest.java:106)
at 
org.apache.solr.cloud.CloudExitableDirectoryReaderTest.doTimeoutTests(CloudExitableDirectoryReaderTest.java:78)
at 
org.apache.solr.cloud.CloudExitableDirectoryReaderTest.test(CloudExitableDirectoryReaderTest.java:56)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:367)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:811)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:462)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at