[jira] [Commented] (LUCENE-9036) ExitableDirectoryReader to interrupt DocValues as well

2019-11-08 Thread Lucene/Solr QA (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970744#comment-16970744
 ] 

Lucene/Solr QA commented on LUCENE-9036:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 20s{color} | {color:red} core in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 20s{color} | {color:red} core in the patch failed. {color} |
| {color:red}-1{color} | {color:red} Release audit (RAT) {color} | {color:red} 0m 20s{color} | {color:red} core in the patch failed. {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 0m 58s{color} | {color:green} Release audit (RAT) rat-sources passed {color} |
| {color:red}-1{color} | {color:red} Check forbidden APIs {color} | {color:red} 0m 20s{color} | {color:red} core in the patch failed. {color} |
| {color:red}-1{color} | {color:red} Validate source patterns {color} | {color:red} 0m 20s{color} | {color:red} core in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 12s{color} | {color:red} core in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 45m 38s{color} | {color:red} core in the patch failed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m 40s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | solr.core.ExitableDirectoryReaderTest |
|   | solr.cloud.CloudExitableDirectoryReaderTest |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-9036 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985326/LUCENE-9036.patch |
| Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
| uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / 7a207a93537 |
| ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 |
| Default Java | LTS |
| compile | https://builds.apache.org/job/PreCommit-LUCENE-Build/224/artifact/out/patch-compile-lucene_core.txt |
| javac | https://builds.apache.org/job/PreCommit-LUCENE-Build/224/artifact/out/patch-compile-lucene_core.txt |
| Release audit (RAT) | https://builds.apache.org/job/PreCommit-LUCENE-Build/224/artifact/out/patch-compile-lucene_core.txt |
| Check forbidden APIs | https://builds.apache.org/job/PreCommit-LUCENE-Build/224/artifact/out/patch-compile-lucene_core.txt |
| Validate source patterns | https://builds.apache.org/job/PreCommit-LUCENE-Build/224/artifact/out/patch-compile-lucene_core.txt |
| unit | https://builds.apache.org/job/PreCommit-LUCENE-Build/224/artifact/out/patch-unit-lucene_core.txt |
| unit | https://builds.apache.org/job/PreCommit-LUCENE-Build/224/artifact/out/patch-unit-solr_core.txt |
| Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/224/testReport/ |
| modules | C: lucene/core solr/core U: . |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/224/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |


This message was automatically generated.



> ExitableDirectoryReader to interrupt DocValues as well
> --
>
> Key: LUCENE-9036
> URL: https://issues.apache.org/jira/browse/LUCENE-9036
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Mikhail Khludnev
>Priority: Major
> Attachments: LUCENE-9036.patch, LUCENE-9036.patch, LUCENE-9036.patch
>
>
> This allows making AnalyticsComponent and json.facet sensitive to time 
> allowed. 
> Does it make sense? Is it enough to check on DV creation, i.e. per field/segment, 
> or is it worth checking every Nth doc? 
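
A minimal sketch of the "check every Nth doc" idea (illustrative only, not the attached 
patch; the wrapper name and sampling interval are assumptions, and it leans on the stock 
{{FilterNumericDocValues}} and {{QueryTimeout}} APIs):

{code:java}
import java.io.IOException;
import org.apache.lucene.index.FilterNumericDocValues;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.index.QueryTimeout;

// Wraps a per-field NumericDocValues and consults the QueryTimeout every
// CHECK_EVERY calls to nextDoc(), instead of only once at DV creation time.
class ExitableNumericDocValues extends FilterNumericDocValues {
  private static final int CHECK_EVERY = 1000; // assumed sampling interval
  private final QueryTimeout timeout;
  private int calls;

  ExitableNumericDocValues(NumericDocValues in, QueryTimeout timeout) {
    super(in);
    this.timeout = timeout;
  }

  @Override
  public int nextDoc() throws IOException {
    if (++calls % CHECK_EVERY == 0 && timeout.shouldExit()) {
      // A real implementation would throw ExitableDirectoryReader.ExitingReaderException;
      // a plain RuntimeException keeps this sketch self-contained.
      throw new RuntimeException("The request took too long to iterate over doc values");
    }
    return super.nextDoc();
  }
}
{code}

Checking only at DV creation covers the per-field/per-segment case; a counter like the 
one above is roughly what the per-doc variant would look like.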



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: 

[jira] [Commented] (SOLR-13888) SolrCloud 2

2019-11-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970709#comment-16970709
 ] 

Mark Robert Miller commented on SOLR-13888:
---

Now I am going to have a great vacation actually. You can start a pool on 
whether you qualitatively think it's 100, 1000, or 1 times better when it 
doesn't behave like a drunk Frankenstein with a pot on its head. I'd make it 
use cores to cause, like servers, but I'm multi-core biased.

> SolrCloud 2
> ---
>
> Key: SOLR-13888
> URL: https://issues.apache.org/jira/browse/SOLR-13888
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Attachments: solrscreen.png
>
>
> As devs discuss dropping the SolrCloud name on the dev list, here is an issue 
> titled SolrCloud 2.
> A couple times now I've pulled on the sweater thread that is our broken 
> tests. It leads to one place - SolrCloud is sick and devs are adding spotty 
> code on top of it at a rate that will lead to the system falling in on 
> itself. As it is, it's a very slow, very inefficient, very unreliable, very 
> buggy system.
> This is not why I am here. This is the opposite of why I am here.
> So please, let's stop. We can't build on that thing as it is.
>  
> I need some time, I lost a lot of work at one point, the scope has expanded 
> since I realized how problematic some things really are, but I have an 
> alternative path that is not so duct tape and straw. As the building climbs, 
> that foundation is going to kill us all.
>  
> This is not about an architecture change - the architecture is fine. The 
> implementation is broken and getting worse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9038) Evaluate Caffeine for LruQueryCache

2019-11-08 Thread Ben Manes (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970707#comment-16970707
 ] 

Ben Manes commented on LUCENE-9038:
---

Hmm, okay, so the behavior is certainly too complex for my initial runs at the 
problem.

A few ideas to consider, based on evolving the cache, that might help alleviate 
some of the challenges:
1. Avoid the {{tryLock}} on read by borrowing the trick to record the access 
into a striped, lossy ring buffer and replay the events under the lock. This 
lets you use a {{ConcurrentHashMap}} for lock-free reads. Instead of contending 
on a lock to perform tiny LRU reordering work, you schedule and perform a batch 
under the lock to modify the non-threadsafe data structures. The striping and 
lossy behavior mitigates hot spots under high load.
2. Perform the computations under a striped lock in double-checked locking 
fashion. This would avoid redundant work due to dogpiling, while also allowing 
mostly parallel loading of independent keys (e.g. {{var lock = 
locks[key.hashCode() % locks.length]}}).

This would mitigate some of the problems but also add complexity, whereas I 
originally hoped to reduce that burden.
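
A rough sketch of idea 2, just to make it concrete (illustrative only; the class name, 
stripe count, and loader shape are assumptions rather than existing Lucene or Caffeine 
code):

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class StripedComputeCache<K, V> {
  private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
  private final Object[] locks = new Object[64]; // stripe count is an assumption

  StripedComputeCache() {
    for (int i = 0; i < locks.length; i++) {
      locks[i] = new Object();
    }
  }

  V get(K key, Function<K, V> loader) { // loader is assumed to return non-null
    V value = map.get(key);             // lock-free fast path for hits
    if (value != null) {
      return value;
    }
    Object lock = locks[Math.floorMod(key.hashCode(), locks.length)];
    synchronized (lock) {
      value = map.get(key);             // double-check under the stripe lock
      if (value == null) {
        value = loader.apply(key);      // expensive computation, performed once per key
        map.put(key, value);
      }
      return value;
    }
  }
}
{code}

Hits stay on the lock-free {{ConcurrentHashMap}} path; only concurrent misses that land 
on the same stripe serialize, so dogpiling on one key computes the value once.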

> Evaluate Caffeine for LruQueryCache
> ---
>
> Key: LUCENE-9038
> URL: https://issues.apache.org/jira/browse/LUCENE-9038
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ben Manes
>Priority: Major
> Attachments: CaffeineQueryCache.java
>
>
> [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java]
>  appears to play a central role in Lucene's performance. There are many 
> issues discussing its performance, such as LUCENE-7235, LUCENE-7237, 
> LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's 
> overhead can be just as much of a benefit as a liability, causing various 
> workarounds and complexity.
> When reviewing the discussions and code, the following issues are concerning:
> # The cache is guarded by a single lock for all reads and writes.
> # All computations are performed outside of any locking to avoid 
> penalizing other callers. This doesn't handle cache stampedes, meaning 
> that multiple threads may cache miss, compute the value, and try to store it. 
> That redundant work becomes expensive under load and can be mitigated with ~ 
> per-key locks.
> # The cache queries the entry to see if it's even worth caching. At first 
> glance one assumes that is so that inexpensive entries don't bang on the lock 
> or thrash the LRU. However, this is also used to indicate data dependencies 
> for uncachable items (per JIRA), which perhaps shouldn't be invoking the 
> cache.
> # The cache lookup is skipped if the global lock is held and the value is 
> computed, but not stored. This means a busy lock reduces performance across 
> all usages and the cache's effectiveness degrades. This is not counted in the 
> miss rate, giving a false impression.
> # An attempt was made to perform computations asynchronously, due to their 
> heavy cost on tail latencies. That work was reverted due to test failures and 
> is being worked on.
> # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] 
> tries to avoid LRU thrashing due to large, infrequently used items being 
> cached.
> # The cache is tightly intertwined with business logic, making it hard to 
> tease apart core algorithms and data structures from the usage scenarios.
> It seems that more and more items skip being cached because of concurrency 
> and hit rate performance, causing special case fixes based on knowledge of 
> the external code flows. Since the developers are experts on search, not 
> caching, it seems justified to evaluate if an off-the-shelf library would be 
> more helpful in terms of developer time, code complexity, and performance. 
> Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] 
> in SOLR-8241 and SOLR-13817.
> The proposal is to replace the internals of {{LruQueryCache}} so that external 
> usages are not affected in terms of the API. However, like in {{SolrCache}}, 
> a difference is that Caffeine only bounds by either the number of entries or 
> an accumulated size (e.g. bytes), but not both constraints. This likely is an 
> acceptable divergence in how the configuration is honored.
> cc [~ab], [~dsmiley]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-11-08 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970650#comment-16970650
 ] 

Adrien Grand commented on LUCENE-8920:
--

I quickly skimmed through the patch; the approach looks good to me.

With this new approach to compute labels from the bitset, maybe we should set a 
less aggressive default oversizing factor, e.g. 1? For instance if you index 
lowercase text, the range would most of the time be less than 24, which takes 3 
bytes, plus one byte that we need for the first label. So this new encoding 
would already be used if you have 4 arcs or more, which sounds good to me 
already?

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Minor
> Attachments: TestTermsDictRamBytesUsed.java
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-13872) Backup can fail to read index files w/NoSuchFileException during merges (SOLR-11616 regression)

2019-11-08 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-13872:
--
Attachment: SOLR-13872.patch
Status: Open  (was: Open)

updated patch working away at some of the remaining nocommits...

* removed nocommits related to things spun off into new (linked) jiras
* test additions
* ref-guide note about backups when using softCommit


>  Backup can fail to read index files w/NoSuchFileException during merges 
> (SOLR-11616 regression)
> 
>
> Key: SOLR-13872
> URL: https://issues.apache.org/jira/browse/SOLR-13872
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-13872.patch, SOLR-13872.patch, SOLR-13872.patch, 
> index_churn.pl
>
>
> SOLR-11616 purports to fix a bug in Solr's backup functionality that causes 
> 'NoSuchFileException' errors when attempting to backup an index while it is 
> undergoing indexing (and segment merging)
> Although SOLR-11616 is marked with "Fix Version: 7.2" it's pretty easy to 
> demonstrate that this bug still exists on master, branch_8x, and even in 7.2 
> - so it seems less like the current problem is a "regression" and more that 
> the original fix didn't work.
> 
> The crux of the problem seems to be concurrency bugs in if/how a commit is 
> "reserved" before attempting to copy the files in that commit to the backup 
> location.  
> A possible work around discussed in more depth in the comments below is to 
> update {{solrconfig.xml}} to explicitly configure the {{SolrDeletionPolicy}} 
> with either the {{maxCommitsToKeep}} or {{maxCommitAge}} options to ensure 
> the commits are kept around long enough for the backup to be created.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9015) Configure branches, auto build and auto stage/publish

2019-11-08 Thread Adam Walz (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970632#comment-16970632
 ] 

Adam Walz commented on LUCENE-9015:
---

[~janhoy] please see [https://github.com/apache/lucene-site/pull/5]

> Configure branches, auto build and auto stage/publish
> -
>
> Key: LUCENE-9015
> URL: https://issues.apache.org/jira/browse/LUCENE-9015
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Jan Høydahl
>Priority: Major
>
> Commit to master should build and publish the staging site
> Find a simple way to trigger publishing of main site from staging



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-site] adamwalz opened a new pull request #5: LUCENE-9013 Add .asf.yaml to automatically build Pelican

2019-11-08 Thread GitBox
adamwalz opened a new pull request #5: LUCENE-9013 Add .asf.yaml to 
automatically build Pelican
URL: https://github.com/apache/lucene-site/pull/5
 
 
   Build Pelican site upon merging to master branch. Commit output/ directory 
to asf-staging branch.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-site] adamwalz commented on issue #4: SOLR-13910 Add security pages news feeds for solr, core, and tlp

2019-11-08 Thread GitBox
adamwalz commented on issue #4: SOLR-13910 Add security pages news feeds for 
solr, core, and tlp
URL: https://github.com/apache/lucene-site/pull/4#issuecomment-552021486
 
 
   Only 3237b49 is new. I'll update this PR when 
https://github.com/apache/lucene-site/pull/3 is merged


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-site] adamwalz opened a new pull request #4: SOLR-13910 Add security pages news feeds for solr, core, and tlp

2019-11-08 Thread GitBox
adamwalz opened a new pull request #4: SOLR-13910 Add security pages news feeds 
for solr, core, and tlp
URL: https://github.com/apache/lucene-site/pull/4
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-13910) Create security news feed on website with RSS/Atom feed

2019-11-08 Thread Adam Walz (Jira)
Adam Walz created SOLR-13910:


 Summary: Create security news feed on website with RSS/Atom feed
 Key: SOLR-13910
 URL: https://issues.apache.org/jira/browse/SOLR-13910
 Project: Solr
  Issue Type: Task
  Security Level: Public (Default Security Level. Issues are Public)
  Components: website
Reporter: Adam Walz


From [~janhoy]:

We're in the process of migrating our web site to Git and in that same
process we are also changing the CMS from the ASF one to Pelican. The new site has
built-in support for news posts as individual files and also RSS feeds of
those. So I propose to add [https://lucene.apache.org/solr/security.html]
to the site, including a list of newest CVEs and an RSS/Atom feed to go
along with it. This way users have ONE place to visit to check security
announcements and they can monitor RSS to be alerted once we post a new
announcement.

We could also add RSS feeds for Lucene-core news and Solr-news sections
of course.

At the same time I propose that the news on the front-page 
[lucene.apache.org|http://lucene.apache.org/]
is replaced with widgets that show the title only of the last 3 announcements
from Lucene, Solr and PyLucene sub projects. That front page is waaay
too long :)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-9037) ArrayIndexOutOfBoundsException due to repeated IOException during indexing

2019-11-08 Thread Ilan Ginzburg (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970622#comment-16970622
 ] 

Ilan Ginzburg edited comment on LUCENE-9037 at 11/8/19 10:47 PM:
-

Thanks [~mikemccand].

What about moving up the call to 
{{DocumentsWriterFlushControl.doAfterDocument()}} into the {{finally}} of the 
block calling {{DocumentsWriterPerThread.updateDocument/s()}} in 
{{DocumentsWriter.updateDocument/s()}}?
 Basically consider {{DocumentsWriterFlushControl.doAfterDocument()}} as a "do 
after _successful or failed_ document".

Exploring that path to see if I can make it work (and existing tests pass).

Your suggestion of throwing a meaningful exception upon reaching the limit 
would not help my use case if there's no flush happening as a consequence.


was (Author: murblanc):
Thanks [~mikemccand].

What about moving up the call to 
{{DocumentsWriterFlushControl.doAfterDocument()}} into the {{finally}} of the 
block calling {{DocumentsWriterPerThread.updateDocument/s()}} in 
{{DocumentsWriter.updateDocument/s()}}?
Basically consider {{DocumentsWriterFlushControl.doAfterDocument()}} as a "do 
after _successful or failed_ document".

Exploring that path to see if I can make it work (and existing tests pass).

> ArrayIndexOutOfBoundsException due to repeated IOException during indexing
> --
>
> Key: LUCENE-9037
> URL: https://issues.apache.org/jira/browse/LUCENE-9037
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 7.1
>Reporter: Ilan Ginzburg
>Priority: Minor
> Attachments: TestIndexWriterTermsHashOverflow.java
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a limit to the number of tokens that can be held in memory by Lucene 
> when docs are indexed using DocumentsWriter; when it is exceeded, bad things happen. The 
> limit can be reached by submitting a really large document, by submitting a 
> large number of documents without doing a commit (see LUCENE-8118) or by 
> repeatedly submitting documents that fail to get indexed in some specific 
> ways, leading to Lucene not cleaning up the in memory data structures that 
> eventually overflow.
> The overflow is due to a 32 bit (signed) integer wrapping around to negative 
> territory, then causing an ArrayIndexOutOfBoundsException. 
> The failure path that we are reliably hitting is due to an IOException during 
> doc tokenization. A tokenizer implementing TokenStream throws an exception 
> from incrementToken() which causes indexing of that doc to fail. 
> The IOException bubbles back up to DocumentsWriter.updateDocument() (or 
> DocumentsWriter.updateDocuments() in some other cases) where it is not 
> treated as an AbortingException therefore it is not causing a reset of the 
> DocumentsWriterPerThread. On repeated failures (without any successful 
> indexing in between) if the upper layer (client via Solr) resubmits the doc 
> that fails again, DocumentsWriterPerThread will eventually cause 
> TermsHashPerField data structures to grow and overflow, leading to an 
> exception stack similar to the one in LUCENE-8118 (below stack trace copied 
> from a test run repro on 7.1):
> java.lang.ArrayIndexOutOfBoundsException: 
> -65536java.lang.ArrayIndexOutOfBoundsException: -65536
>  at __randomizedtesting.SeedInfo.seed([394FAB2B91B1D90A:C86FB3F3CE001AA8]:0) 
> at 
> org.apache.lucene.index.TermsHashPerField.writeByte(TermsHashPerField.java:198)
>  at 
> org.apache.lucene.index.TermsHashPerField.writeVInt(TermsHashPerField.java:221)
>  at 
> org.apache.lucene.index.FreqProxTermsWriterPerField.writeProx(FreqProxTermsWriterPerField.java:80)
>  at 
> org.apache.lucene.index.FreqProxTermsWriterPerField.addTerm(FreqProxTermsWriterPerField.java:171)
>  at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:185) 
> at 
> org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:792)
>  at 
> org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:430)
>  at 
> org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:392)
>  at 
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:239)
>  at 
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:481)
>  at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1717) 
> at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1462)
> Using tokens composed only of lowercase letters, it takes less than 
> 130,000,000 different tokens (the shortest ones) to overflow 
> TermsHashPerField.
> Using a single document (composed of the 20,000 shortest lowercase tokens) 
> submitted repeatedly for indexing requires 6352 

[jira] [Commented] (LUCENE-9037) ArrayIndexOutOfBoundsException due to repeated IOException during indexing

2019-11-08 Thread Ilan Ginzburg (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970622#comment-16970622
 ] 

Ilan Ginzburg commented on LUCENE-9037:
---

Thanks [~mikemccand].

What about moving up the call to 
{{DocumentsWriterFlushControl.doAfterDocument()}} into the {{finally}} of the 
block calling {{DocumentsWriterPerThread.updateDocument/s()}} in 
{{DocumentsWriter.updateDocument/s()}}?
Basically consider {{DocumentsWriterFlushControl.doAfterDocument()}} as a "do 
after _successful or failed_ document".

Exploring that path to see if I can make it work (and existing tests pass).

> ArrayIndexOutOfBoundsException due to repeated IOException during indexing
> --
>
> Key: LUCENE-9037
> URL: https://issues.apache.org/jira/browse/LUCENE-9037
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 7.1
>Reporter: Ilan Ginzburg
>Priority: Minor
> Attachments: TestIndexWriterTermsHashOverflow.java
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a limit to the number of tokens that can be held in memory by Lucene 
> when docs are indexed using DocumentsWriter; when it is exceeded, bad things happen. The 
> limit can be reached by submitting a really large document, by submitting a 
> large number of documents without doing a commit (see LUCENE-8118) or by 
> repeatedly submitting documents that fail to get indexed in some specific 
> ways, leading to Lucene not cleaning up the in memory data structures that 
> eventually overflow.
> The overflow is due to a 32 bit (signed) integer wrapping around to negative 
> territory, then causing an ArrayIndexOutOfBoundsException. 
> The failure path that we are reliably hitting is due to an IOException during 
> doc tokenization. A tokenizer implementing TokenStream throws an exception 
> from incrementToken() which causes indexing of that doc to fail. 
> The IOException bubbles back up to DocumentsWriter.updateDocument() (or 
> DocumentsWriter.updateDocuments() in some other cases) where it is not 
> treated as an AbortingException therefore it is not causing a reset of the 
> DocumentsWriterPerThread. On repeated failures (without any successful 
> indexing in between) if the upper layer (client via Solr) resubmits the doc 
> that fails again, DocumentsWriterPerThread will eventually cause 
> TermsHashPerField data structures to grow and overflow, leading to an 
> exception stack similar to the one in LUCENE-8118 (below stack trace copied 
> from a test run repro on 7.1):
> java.lang.ArrayIndexOutOfBoundsException: 
> -65536java.lang.ArrayIndexOutOfBoundsException: -65536
>  at __randomizedtesting.SeedInfo.seed([394FAB2B91B1D90A:C86FB3F3CE001AA8]:0) 
> at 
> org.apache.lucene.index.TermsHashPerField.writeByte(TermsHashPerField.java:198)
>  at 
> org.apache.lucene.index.TermsHashPerField.writeVInt(TermsHashPerField.java:221)
>  at 
> org.apache.lucene.index.FreqProxTermsWriterPerField.writeProx(FreqProxTermsWriterPerField.java:80)
>  at 
> org.apache.lucene.index.FreqProxTermsWriterPerField.addTerm(FreqProxTermsWriterPerField.java:171)
>  at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:185) 
> at 
> org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:792)
>  at 
> org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:430)
>  at 
> org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:392)
>  at 
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:239)
>  at 
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:481)
>  at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1717) 
> at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1462)
> Using tokens composed only of lowercase letters, it takes less than 
> 130,000,000 different tokens (the shortest ones) to overflow 
> TermsHashPerField.
> Using a single document (composed of the 20,000 shortest lowercase tokens) 
> submitted repeatedly for indexing requires 6352 submissions all failing with 
> an IOException on incrementToken() to trigger the 
> ArrayIndexOutOfBoundsException.
> A proposed fix is to treat in DocumentsWriter.updateDocument() and 
> DocumentsWriter.updateDocuments() an IOException in the same way we treat an 
> AbortingException.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9027) SIMD-based decoding of postings lists

2019-11-08 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970621#comment-16970621
 ] 

Michael McCandless commented on LUCENE-9027:


Thanks [~jpountz] – I forgot that Codec impacts make our {{Term}} query tasks 
super fast!  Your explanation makes sense.

I'll look at the PR around endianness – as ugly as it sounds, I think it is the 
right tradeoff.  We should do things at indexing time to make search time 
faster, and if one of those things is writing multi-byte values in the "right" 
order for searching, that's fair game!
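
For readers skimming the thread, the JIT-friendly pattern the issue description relies 
on is simply applying one shared shift to a whole array (a toy sketch, not the linked 
benchmark code):

{code:java}
// A loop that applies the same shift to every element can be auto-vectorized by C2
// into SIMD shift instructions; the decoders in the linked benchmark are built out
// of loops of this shape.
static void shiftRightAll(long[] values, int shift) {
  for (int i = 0; i < values.length; ++i) {
    values[i] >>>= shift; // same shift count for every element -> vectorizable
  }
}
{code}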

> SIMD-based decoding of postings lists
> -
>
> Key: LUCENE-9027
> URL: https://issues.apache.org/jira/browse/LUCENE-9027
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [~rcmuir] has been mentioning the idea for quite some time that we might be 
> able to write the decoding logic in such a way that Java would use SIMD 
> instructions. More recently [~paul.masurel] wrote a [blog 
> post|https://fulmicoton.com/posts/bitpacking/] that raises the point that 
> Lucene could still do decode multiple ints at once in a single instruction by 
> packing two ints in a long and we've had some discussions about what we could 
> try in Lucene to speed up the decoding of postings. This made me want to look 
> a bit deeper at what we could do.
> Our current decoding logic reads data in a byte[] and decodes packed integers 
> from it. Unfortunately it doesn't make use of SIMD instructions and looks 
> like 
> [this|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/NaiveByteDecoder.java].
> I confirmed by looking at the generated assembly that if I take an array of 
> integers and shift them all by the same number of bits then Java will use 
> SIMD instructions to shift multiple integers at once. This led me to writing 
> this 
> [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SimpleSIMDDecoder.java]
>  that tries as much as possible to shift long sequences of ints by the same 
> number of bits to speed up decoding. It is indeed faster than the current 
> logic we have, up to about 2x faster for some numbers of bits per value.
> Currently the best 
> [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SIMDDecoder.java]
>  I've been able to come up with combines the above idea with the idea that 
> Paul mentioned in his blog that consists of emulating SIMD from Java by 
> packing multiple integers into a long: 2 ints, 4 shorts or 8 bytes. It is a 
> bit harder to read but gives another speedup on top of the above 
> implementation.
> I have a [JMH 
> benchmark|https://github.com/jpountz/decode-128-ints-benchmark/] available in 
> case someone would like to play with this and maybe even come up with an even 
> faster implementation. It is 2-2.5x faster than our current implementation 
> for most numbers of bits per value. I'm copying results here:
> {noformat}
>  * `readLongs` just reads 2*bitsPerValue longs from the ByteBuffer, it serves 
> as
>a baseline.
>  * `decodeNaiveFromBytes` reads a byte[] and decodes from it. This is what the
>current Lucene codec does.
>  * `decodeNaiveFromLongs` decodes from longs on the fly.
>  * `decodeSimpleSIMD` is a simple implementation that relies on how Java
>recognizes some patterns and uses SIMD instructions.
>  * `decodeSIMD` is a more complex implementation that both relies on the C2
>compiler to generate SIMD instructions and encodes 8 bytes, 4 shorts or
>2 ints in a long in order to decompress multiple values at once.
> Benchmark   (bitsPerValue)  (byteOrder)   
> Mode  Cnt   Score   Error   Units
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   1   LE  
> thrpt5  12.912 ± 0.393  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   1   BE  
> thrpt5  12.862 ± 0.395  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   2   LE  
> thrpt5  13.040 ± 1.162  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   2   BE  
> thrpt5  13.027 ± 0.270  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   3   LE  
> thrpt5  12.409 ± 0.637  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   3   BE  
> thrpt5  12.268 ± 0.947  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   4   LE  
> thrpt5  14.177 ± 2.263  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes   4   BE  
> thrpt5  11.457 ± 0.150  ops/us
> 

[jira] [Commented] (LUCENE-9038) Evaluate Caffeine for LruQueryCache

2019-11-08 Thread Ben Manes (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970517#comment-16970517
 ] 

Ben Manes commented on LUCENE-9038:
---

Why would it be {{O(QxS^2)}} and not {{O(Q)}} entries to remove?

My initial impression would be to also have the {{key = (Query,CacheKey)}} pair 
and let the eviction policy drop the individual entries. When a new entry is 
added, we would maintain a separate index, {{CacheKey => Set}}, to remove 
entries when invalidating the {{CacheKey}}. This secondary index would be 
maintained using computations and be a weakKey reference cache, which lets us 
ignore a few subtle races.

The caching policy already includes a frequency-based admission filter on top 
of LRU, similar in spirit to your scheme. However in a 2 layer model it would 
track frequency of the leaf cache rather than the query entries, which negates 
its usefulness. I think due to richer computations and better write concurrency 
we could make a simpler 1-layer cache work efficiently.

Do you have benchmark scenarios that I could run? If so, I might create both 
versions so that we could benchmark to see which is preferable.
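
To make the two-index idea concrete, here is a minimal sketch on top of Caffeine 
(illustrative only; the class name, generic parameters, size bound, and loader shape 
are assumptions, not the proposed LRUQueryCache replacement):

{code:java}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

final class LeafQueryCacheSketch<Q, K, V> {
  // Primary cache, keyed by the (Query, CacheKey) pair; the eviction policy
  // drops individual entries.
  private final Cache<Map.Entry<Q, K>, V> cache =
      Caffeine.newBuilder().maximumSize(10_000).build(); // assumed bound

  // Secondary index: CacheKey -> queries cached for that leaf, with weak keys so
  // entries for readers that were closed and collected fall away on their own.
  private final Cache<K, Set<Q>> keysPerLeaf =
      Caffeine.newBuilder().weakKeys().build();

  V get(Q query, K leafKey, BiFunction<Q, K, V> loader) {
    keysPerLeaf.get(leafKey, k -> ConcurrentHashMap.<Q>newKeySet()).add(query);
    return cache.get(Map.entry(query, leafKey), e -> loader.apply(query, leafKey));
  }

  void invalidateLeaf(K leafKey) {
    Set<Q> queries = keysPerLeaf.asMap().remove(leafKey);
    if (queries != null) {
      for (Q q : queries) {
        cache.invalidate(Map.entry(q, leafKey));
      }
    }
  }
}
{code}

Invalidation is best-effort here (an entry evicted from the primary cache may linger 
briefly in the secondary index), which is the kind of subtle race the weak-key cache 
makes tolerable.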

> Evaluate Caffeine for LruQueryCache
> ---
>
> Key: LUCENE-9038
> URL: https://issues.apache.org/jira/browse/LUCENE-9038
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ben Manes
>Priority: Major
> Attachments: CaffeineQueryCache.java
>
>
> [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java]
>  appears to play a central role in Lucene's performance. There are many 
> issues discussing its performance, such as LUCENE-7235, LUCENE-7237, 
> LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's 
> overhead can be just as much of a benefit as a liability, causing various 
> workarounds and complexity.
> When reviewing the discussions and code, the following issues are concerning:
> # The cache is guarded by a single lock for all reads and writes.
> # All computations are performed outside of any locking to avoid 
> penalizing other callers. This doesn't handle cache stampedes, meaning 
> that multiple threads may cache miss, compute the value, and try to store it. 
> That redundant work becomes expensive under load and can be mitigated with ~ 
> per-key locks.
> # The cache queries the entry to see if it's even worth caching. At first 
> glance one assumes that is so that inexpensive entries don't bang on the lock 
> or thrash the LRU. However, this is also used to indicate data dependencies 
> for uncachable items (per JIRA), which perhaps shouldn't be invoking the 
> cache.
> # The cache lookup is skipped if the global lock is held and the value is 
> computed, but not stored. This means a busy lock reduces performance across 
> all usages and the cache's effectiveness degrades. This is not counted in the 
> miss rate, giving a false impression.
> # An attempt was made to perform computations asynchronously, due to their 
> heavy cost on tail latencies. That work was reverted due to test failures and 
> is being worked on.
> # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] 
> tries to avoid LRU thrashing due to large, infrequently used items being 
> cached.
> # The cache is tightly intertwined with business logic, making it hard to 
> tease apart core algorithms and data structures from the usage scenarios.
> It seems that more and more items skip being cached because of concurrency 
> and hit rate performance, causing special case fixes based on knowledge of 
> the external code flows. Since the developers are experts on search, not 
> caching, it seems justified to evaluate if an off-the-shelf library would be 
> more helpful in terms of developer time, code complexity, and performance. 
> Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] 
> in SOLR-8241 and SOLR-13817.
> The proposal is to replace the internals of {{LruQueryCache}} so that external 
> usages are not affected in terms of the API. However, like in {{SolrCache}}, 
> a difference is that Caffeine only bounds by either the number of entries or 
> an accumulated size (e.g. bytes), but not both constraints. This likely is an 
> acceptable divergence in how the configuration is honored.
> cc [~ab], [~dsmiley]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13909) Everything about CheckBackupStatus is broken

2019-11-08 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970510#comment-16970510
 ] 

Chris M. Hostetter commented on SOLR-13909:
---

I've got some simpler/saner code in the new {{TestStressThreadBackup}} I'm 
adding as part of SOLR-13872 that I'll refactor and re-use in all places that 
currently deal with CheckBackupStatus once I'm done with that jira.

> Everything about CheckBackupStatus is broken
> 
>
> Key: SOLR-13909
> URL: https://issues.apache.org/jira/browse/SOLR-13909
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
>
> While working on SOLR-13872 I tried to take advantage of the existing 
> {{CheckBackupStatus}} helper class and discovered that just about every 
> aspect of this class is broken and needs fixed:
>  * doesn't use SolrClient, pulls out its URL to do a bare HTTP request
>  * hardcoded assumption of xml - but doesn't parse it just tries to match 
> regexes against it
>  * almost every usage of this class follows the same broken "loop" pattern 
> that guarantees the test will sleep more than it needs to even after 
> {{CheckBackupStatus}} thinks the backup is a success...
> {code:java}
> CheckBackupStatus checkBackupStatus = new CheckBackupStatus(...);
> while (!checkBackupStatus.success) {
>   checkBackupStatus.fetchStatus();
>   Thread.sleep(1000);
> }
> {code}
>  * the 3 arg constructor is broken both in design and in implementation:
>  ** it appears to be useful for checking that a _new_ backup has succeeded 
> after a {{lastBackupTimestamp}} from some previously successful check
>  ** in reality it only ever reports {{success}} if its status check 
> indicates the most recent backup has the exact {{.equals()}} time stamp as 
> {{lastBackupTimestamp}}
>  ** *AND THESE TIMESTAMPS ONLY HAVE MINUTE PRECISION*
>  ** As far as i can tell, the only reason the tests using the 3 arg version ever 
> pass is because of the broken loop pattern:
>  *** they ask for the status so quick, it's either already done (during the 
> same wall clock minute) or it's not done yet and they re-read the "old" 
> status (with the old matching timestamp)
>  *** either way, the test then sleeps for a second giving the "new" backup 
> enough time to actually finish
>  ** AFAICT if the System clock ticks over to a new minute in between these 
> sleep calls, the test is guaranteed to loop forever!
> 
> Everything about this class needs to die and be replaced with something 
> better.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-13909) Everything about CheckBackupStatus is broken

2019-11-08 Thread Chris M. Hostetter (Jira)
Chris M. Hostetter created SOLR-13909:
-

 Summary: Everything about CheckBackupStatus is broken
 Key: SOLR-13909
 URL: https://issues.apache.org/jira/browse/SOLR-13909
 Project: Solr
  Issue Type: Test
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Chris M. Hostetter
Assignee: Chris M. Hostetter


While working on SOLR-13872 I tried to take advantage of the existing 
{{CheckBackupStatus}} helper class and discovered that just about every aspect 
of this class is broken and needs fixed:
 * doesn't use SolrClient, pulls out its URL to do a bare HTTP request
 * hardcoded assumption of xml - but doesn't parse it just tries to match 
regexes against it
 * almost every usage of this class follows the same broken "loop" pattern that 
guarantees the test will sleep more than it needs to even after 
{{CheckBackupStatus}} thinks the backup is a success...
{code:java}
CheckBackupStatus checkBackupStatus = new CheckBackupStatus(...);
while (!checkBackupStatus.success) {
  checkBackupStatus.fetchStatus();
  Thread.sleep(1000);
}
{code}

 * the 3 arg constructor is broken both in design and in implementation:
 ** it appears to be useful for checking that a _new_ backup has succeeded 
after a {{lastBackupTimestamp}} from some previously successful check
 ** in reality it only ever reports {{success}} if its status check indicates 
the most recent backup has the exact {{.equals()}} time stamp as 
{{lastBackupTimestamp}}
 ** *AND THESE TIMESTAMPS ONLY HAVE MINUTE PRECISION*
 ** As far as i can tell, the only reason the tests using the 3 arg version ever pass 
is because of the broken loop pattern:
 *** they ask for the status so quick, it's either already done (during the 
same wall clock minute) or it's not done yet and they re-read the "old" status 
(with the old matching timestamp)
 *** either way, the test then sleeps for a second giving the "new" backup 
enough time to actually finish
 ** AFAICT if the System clock ticks over to a new minute in between these 
sleep calls, the test is guaranteed to loop forever!


Everything about this class needs to die and be replaced with something better.
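
For contrast, a saner polling loop checks first, stops as soon as the status reports 
success, and gives up at an overall deadline (a hedged sketch reusing the same 
hypothetical helper shown above; {{fail()}} is JUnit's):

{code:java}
CheckBackupStatus checkBackupStatus = new CheckBackupStatus(...);
final long deadlineNanos = System.nanoTime() + 60L * 1_000_000_000L; // assumed 60s cap
while (true) {
  checkBackupStatus.fetchStatus();
  if (checkBackupStatus.success) {
    break;                              // done - no pointless sleep after success
  }
  if (System.nanoTime() - deadlineNanos > 0) {
    fail("backup did not complete within the timeout");
  }
  Thread.sleep(250);                    // sleep only between polls
}
{code}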



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-13908) Possible bugs when using HdfsDirectoryFactory w/ softCommit=true + openSearcher=true

2019-11-08 Thread Chris M. Hostetter (Jira)
Chris M. Hostetter created SOLR-13908:
-

 Summary: Possible bugs when using HdfsDirectoryFactory w/ 
softCommit=true + openSearcher=true
 Key: SOLR-13908
 URL: https://issues.apache.org/jira/browse/SOLR-13908
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: hdfs
Reporter: Chris M. Hostetter


While working on SOLR-13872 something caught my eye that seems fishy

*Background:*

SOLR-4916 introduced the API 
{{DirectoryFactory.searchersReserveCommitPoints()}} -- a method that 
{{SolrIndexSearcher}} uses to decide if it needs to explicitly save/release the 
{{IndexCommit}} point of its {{DirectoryReader}} with the 
{{IndexDeletionPolicyWrapper}}, for use on Filesystems that don't in some way 
"protect" open files...

{code:title=SolrIndexSearcher}
if (directoryFactory.searchersReserveCommitPoints()) {
  // reserve commit point for life of searcher
  
core.getDeletionPolicy().saveCommitPoint(reader.getIndexCommit().getGeneration());
}
{code}

{code:title=DirectoryFactory}
  /**
   * If your implementation can count on delete-on-last-close semantics
   * or throws an exception when trying to remove a file in use, return
   * false (eg NFS). Otherwise, return true. Defaults to returning false.
   * 
   * @return true if factory impl requires that Searcher's explicitly
   * reserve commit points.
   */
  public boolean searchersReserveCommitPoints() {
return false;
  }
{code}

{{HdfsDirectoryFactory}} is (still) the only {{DirectoryFactory}} Impl that 
returns {{true}}.



*Concern:*

As noted in LUCENE-9040, the behavior of {{DirectoryReader.getIndexCommit()}} 
is a little weird / underspecified when dealing with an "NRT" {{IndexReader}} 
(opened directly off of an {{IndexWriter}} using "un-committed" changes) ... 
which is exactly what {{SolrIndexSearcher}} is using in solr setups that use 
{{softCommit=true=false}}.

In particular the {{IndexCommit.getGeneration()}} value that will be used when 
{{SolrIndexSearcher}} executes 
{{core.getDeletionPolicy().saveCommitPoint(reader.getIndexCommit().getGeneration());}}
 will be (as of the current code) the {{generation}} of the last _hard_ commit 
-- meaning that new segment/data files since the last "hard commit" will not be 
protected from deletion if additional commits/merges happen on the index 
during the life of the {{SolrIndexSearcher}} -- either via concurrent rapid 
commits, or via {{commit=true=false=false}}.

I have not investigated this in depth, but I believe there is risk here of 
unpredictable bugs when using HDFS in conjunction with 
{{softCommit=true + openSearcher=true}}.






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9040) Questionable/Underspecified behavior of DirectoryReader.getIndexCommit() on NRT Readers

2019-11-08 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated LUCENE-9040:
---
Attachment: LUCENE-9040.patch
Status: Open  (was: Open)


The attached patch includes additions to TestIndexWriterReader that will pass 
reliably with all seeds I've tried -- but the behavior demonstrated/asserted in 
that test seems very weird to me, and potentially trappy to end users.

Note in particular the {{nocommit}} comments and the assertions that follow 
them -- these are very diff readers, exposing very diff views of the index, yet 
they claim to have the same {{IndexCommit}} & generation underpinning them, 
even though some of the details of those IndexCommits differ.

I'm wondering if this is a fixable bug, or an underspecified behavior (and we 
should beef up the docs to clarify what to expect), or if it represents some 
"feature" whose value I'm not understanding?




> Questionable/Underspecified behavior of DirectoryReader.getIndexCommit() on 
> NRT Readers
> ---
>
> Key: LUCENE-9040
> URL: https://issues.apache.org/jira/browse/LUCENE-9040
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: LUCENE-9040.patch
>
>
> It seems like DirectoryReader.getIndexCommit() returns weird 
> results when using a "reopened" reader off of uncommited IW changes.
> Even though 2 diff readers will expose diff views of the index, they will 
> claim to refer to the same IndexCommit (generation).
> Original email thread (no replies): 
> http://mail-archives.apache.org/mod_mbox/lucene-dev/201910.mbox/%3Calpine.DEB.2.21.1910301611190.8171%40slate%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9040) Questionable/Underspecified behavior of DirectoryReader.getIndexCommit() on NRT Readers

2019-11-08 Thread Chris M. Hostetter (Jira)
Chris M. Hostetter created LUCENE-9040:
--

 Summary: Questionable/Underspecified behavior of 
DirectoryReader.getIndexCommit() on NRT Readers
 Key: LUCENE-9040
 URL: https://issues.apache.org/jira/browse/LUCENE-9040
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Chris M. Hostetter


It seems like DirectoryReader.getIndexCommit() returns weird 
results when using a "reopened" reader off of uncommited IW changes.

Even though 2 diff readers will expose diff views of the index, they will 
claim to refer to the same IndexCommit (generation).

Original email thread (no replies): 
http://mail-archives.apache.org/mod_mbox/lucene-dev/201910.mbox/%3Calpine.DEB.2.21.1910301611190.8171%40slate%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-11-08 Thread Bruno Roustant (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970444#comment-16970444
 ] 

Bruno Roustant commented on LUCENE-8920:


It works. I removed the labels for direct-addressing, except the first label of 
the arc.
It makes direct-addressing encoding even more compact. Now we have 48% of the 
nodes with fixed length arcs encoded with direct-encoding by using only +12% 
memory (instead of +23% with my previous patch).

I also did several things:
The presence bits table is now a reused structure (does not create long[] 
anymore each time we read an arc).
I touched more code, but I think the code is now a bit clearer with more 
comments.
I moved the bit twiddling methods to BitUtil.

Thanks reviewers!

 

I noticed the direct-addressing limitation: it is less performant than binary 
search for reverse lookup (lookup by output).

Is there a periodic automated monitoring of the performance / memory of 
commits? Could you give me the link? I would like to watch the change with this 
PR.

 

 I definitely would like to clean more FST. But that will be the subject of 
another Jira issue later. Too much duplicated code that makes reasoning and 
modifications hard. Also missing docs and method contracts (e.g. expected 
position in the byte input).
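
For anyone skimming, the reusable presence-bits structure boils down to a bit set over 
the label range plus a rank operation that yields the dense arc index (a simplified 
sketch with assumed names, not the PR's actual code):

{code:java}
// Marks which labels in [firstLabel, firstLabel + range) actually have an arc.
// rank(offset) counts the present arcs before a label, i.e. the dense arc index,
// which is what lets the per-arc label be dropped except for the node's first one.
final class PresenceBits {
  private final long[] bits;

  PresenceBits(int range) {
    bits = new long[(range + 63) >>> 6];
  }

  void set(int offset) {
    bits[offset >>> 6] |= 1L << (offset & 63);
  }

  boolean contains(int offset) {
    return (bits[offset >>> 6] & (1L << (offset & 63))) != 0;
  }

  int rank(int offset) {
    int r = 0;
    for (int i = 0; i < (offset >>> 6); i++) {
      r += Long.bitCount(bits[i]);
    }
    return r + Long.bitCount(bits[offset >>> 6] & ((1L << (offset & 63)) - 1));
  }
}
{code}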

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Minor
> Attachments: TestTermsDictRamBytesUsed.java
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-13872) Backup can fail to read index files w/NoSuchFileException during merges (SOLR-11616 regression)

2019-11-08 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-13872:
--
Attachment: SOLR-13872.patch
Status: Open  (was: Open)

Ok, here's a patch with the API/Usage changes I think we should make.

There are still a lot of nocommits, but mostly just related to additional test 
coverage I want to add (and most of that is around named snapshots since i 
didn't dig into that very deep and i want to make sure taking backups that way 
is as solid as the "simple" path) but I think the new API & synchornization 
logic is pretty solid.

There are also a few places where I have nocommits related to needing to spin out 
loosely related jiras.  Once I've done that I'll slim down this patch a bit and 
link those jiras and then finish up the tests.


>  Backup can fail to read index files w/NoSuchFileException during merges 
> (SOLR-11616 regression)
> 
>
> Key: SOLR-13872
> URL: https://issues.apache.org/jira/browse/SOLR-13872
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-13872.patch, SOLR-13872.patch, index_churn.pl
>
>
> SOLR-11616 purports to fix a bug in Solr's backup functionality that causes 
> 'NoSuchFileException' errors when attempting to backup an index while it is 
> undergoing indexing (and segment merging)
> Although SOLR-11616 is marked with "Fix Version: 7.2" it's pretty easy to 
> demonstrate that this bug still exists on master, branch_8x, and even in 7.2 
> - so it seems less like the current problem is a "regression" and more that 
> the original fix didn't work.
> 
> The crux of the problem seems to be concurrency bugs in if/how a commit is 
> "reserved" before attempting to copy the files in that commit to the backup 
> location.  
> A possible work around discussed in more depth in the comments below is to 
> update {{solrconfig.xml}} to explicitly configure the {{SolrDeletionPolicy}} 
> with either the {{maxCommitsToKeep}} or {{maxCommitAge}} options to ensure 
> the commits are kept around long enough for the backup to be created.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-11-08 Thread Bruno Roustant (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970444#comment-16970444
 ] 

Bruno Roustant edited comment on LUCENE-8920 at 11/8/19 5:36 PM:
-

It works. PR#980 updated. I removed the labels for direct-addressing, except 
the first label of the arc.
 It makes direct-addressing encoding even more compact. Now we have 48% of the 
nodes with fixed length arcs encoded with direct-encoding by using only +12% 
memory (instead of +23% with my previous patch).

I also did several things:
 The presence bits table is now a reused structure (does not create long[] 
anymore each time we read an arc).
 I touched more code, but I think the code is now a bit clearer with more 
comments.
 I moved the bit twiddling methods to BitUtil.

Thanks reviewers!

 

I noticed the direct-addressing limitation: it is less performant than binary 
search for reverse lookup (lookup by output).

Is there a periodic automated monitoring of the performance / memory of 
commits? Could you give me the link? I would like to watch the change with this 
PR.

 

 I definitely would like to clean more FST. But that will be the subject of 
another Jira issue later. Too much duplicated code that makes reasoning and 
modifications hard. Also missing docs and method contracts (e.g. expected 
position in the byte input).
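
For readers following along, a toy sketch of the presence-bits addressing (this is 
only an illustration, not the PR's code): a node stores its arcs compactly plus one 
bit per label in its range; looking up a label checks that label's bit and then uses 
the rank of the bit (the count of set bits before it) as the arc's index.

{code:java}
/** Toy illustration of presence-bits addressing; not the Lucene FST implementation. */
public class PresenceBitsSketch {
  private final int firstLabel;
  private final long[] bitTable; // bit i set <=> an arc exists for label (firstLabel + i)

  public PresenceBitsSketch(int firstLabel, long[] bitTable) {
    this.firstLabel = firstLabel;
    this.bitTable = bitTable;
  }

  /** Returns the dense index of the arc for {@code label}, or -1 if this node has no such arc. */
  public int arcIndex(int label) {
    int bit = label - firstLabel;
    if (bit < 0 || bit >= bitTable.length * Long.SIZE) {
      return -1;
    }
    int word = bit / Long.SIZE;
    long mask = 1L << bit; // Java shifts are mod 64, so this addresses the bit within its word
    if ((bitTable[word] & mask) == 0) {
      return -1; // gap: no arc for this label
    }
    int rank = 0; // number of set bits strictly before this one = the arc's position
    for (int w = 0; w < word; w++) {
      rank += Long.bitCount(bitTable[w]);
    }
    rank += Long.bitCount(bitTable[word] & (mask - 1));
    return rank;
  }

  public static void main(String[] args) {
    // Labels 'a', 'c' and 'd' are present; 'b' is a gap.
    PresenceBitsSketch node = new PresenceBitsSketch('a', new long[] {0b1101L});
    System.out.println(node.arcIndex('a')); // 0
    System.out.println(node.arcIndex('b')); // -1
    System.out.println(node.arcIndex('d')); // 2
  }
}
{code}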


was (Author: bruno.roustant):
It works. I removed the labels for direct-addressing, except the first label of 
the arc.
 It makes direct-addressing encoding even more compact. Now we have 48% of the 
nodes with fixed length arcs encoded with direct-encoding by using only +12% 
memory (instead of +23% with my previous patch).

I also did several things:
 The presence bits table is now a reused structure (does not create long[] 
anymore each time we read an arc).
 I touched more code, but I think the code is now a bit clearer with more 
comments.
 I moved the bit twiddling methods to BitUtil.

Thanks reviewers!

 

I noticed the direct-addressing limitation: it is less performant than binary 
search for reverse lookup (lookup by output).

Is there a periodic automated monitoring of the performance / memory of 
commits? Could you give me the link? I would like to watch the change with this 
PR.

 

 I definitely would like to clean more FST. But that will be the subject of 
another Jira issue later. Too much duplicated code that makes reasoning and 
modifications hard. Also missing docs and method contracts (e.g. expected 
position in the byte input).

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Minor
> Attachments: TestTermsDictRamBytesUsed.java
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-11-08 Thread Bruno Roustant (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970444#comment-16970444
 ] 

Bruno Roustant edited comment on LUCENE-8920 at 11/8/19 5:35 PM:
-

It works. I removed the labels for direct-addressing, except the first label of 
the arc.
 It makes direct-addressing encoding even more compact. Now we have 48% of the 
nodes with fixed length arcs encoded with direct-encoding by using only +12% 
memory (instead of +23% with my previous patch).

I also did several things:
 The presence bits table is now a reused structure (does not create long[] 
anymore each time we read an arc).
 I touched more code, but I think the code is now a bit clearer with more 
comments.
 I moved the bit twiddling methods to BitUtil.

Thanks reviewers!

 

I noticed the direct-addressing limitation: it is less performant than binary 
search for reverse lookup (lookup by output).

Is there a periodic automated monitoring of the performance / memory of 
commits? Could you give me the link? I would like to watch the change with this 
PR.

 

 I definitely would like to clean more FST. But that will be the subject of 
another Jira issue later. Too much duplicated code that makes reasoning and 
modifications hard. Also missing docs and method contracts (e.g. expected 
position in the byte input).


was (Author: bruno.roustant):
It works. I removed the labels for direct-addressing, except the first label of 
the arc.
It makes direct-addressing encoding even more compact. Now we have 48% of the 
nodes with fixed length arcs encoded with direct-encoding by using only +12% 
memory (instead of +23% with my previous patch).

I also did several things:
The presence bits table is now a reused structure (does not create long[] 
anymore each time we read an arc).
I touched more code, but I think the code is now a bit clearer with more 
comments.
I moved the bit twiddling methods to BitUtil.

Thanks reviewers!

 

I noticed the direct-addressing limitation: it is less performant than binary 
search for reverse lookup (lookup by output).

Is there a periodic automated monitoring of the performance / memory of 
commits? Could you give me the link? I would like to watch the change with this 
PR.

 

 I definitely would like to clean more FST. But that will be the subject of 
another Jira issue later. Too much duplicated code that makes reasoning and 
modifications hard. Also missing docs and method contracts (e.g. expected 
position in the byte input).

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Minor
> Attachments: TestTermsDictRamBytesUsed.java
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #980: LUCENE-8920: Reduce the memory used by direct addressing of arcs

2019-11-08 Thread GitBox
bruno-roustant commented on a change in pull request #980: LUCENE-8920: Reduce 
the memory used by direct addressing of arcs
URL: https://github.com/apache/lucene-solr/pull/980#discussion_r344284382
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/util/fst/FST.java
 ##
 @@ -726,6 +759,57 @@ private void writeArrayPacked(Builder builder, Builder.UnCompiledNode node
 }
   }
 
+  private void writeArrayDirectAddressing(Builder builder, Builder.UnCompiledNode nodeIn, long fixedArrayStart, int maxBytesPerArc, int labelRange) {
+    int numPresenceBytes = getNumPresenceBytes(labelRange);
+    // expand the arcs in place, backwards
+    long srcPos = builder.bytes.getPosition();
+    long destPos = fixedArrayStart + numPresenceBytes + nodeIn.numArcs * maxBytesPerArc;
+    // if destPos == srcPos it means all the arcs were the same length, and the array of them is *already* direct
+    assert destPos >= srcPos;
+    if (destPos > srcPos) {
+      builder.bytes.skipBytes((int) (destPos - srcPos));
+      assert builder.bytes.getPosition() == destPos;
+      for (int arcIdx = nodeIn.numArcs - 1; arcIdx >= 0; arcIdx--) {
+        destPos -= maxBytesPerArc;
+        int arcLen = builder.reusedBytesPerArc[arcIdx];
+        srcPos -= arcLen;
+        if (srcPos != destPos) {
+          assert destPos > srcPos: "destPos=" + destPos + " srcPos=" + srcPos + " arcIdx=" + arcIdx + " maxBytesPerArc=" + maxBytesPerArc + " reusedBytesPerArc[arcIdx]=" + builder.reusedBytesPerArc[arcIdx] + " nodeIn.numArcs=" + nodeIn.numArcs;
+          builder.bytes.copyBytes(srcPos, destPos, arcLen);
+        }
+      }
+    }
+    assert destPos - numPresenceBytes == fixedArrayStart;
+    writePresenceBits(builder, nodeIn, labelRange, fixedArrayStart);
+  }
+
+  private void writePresenceBits(Builder builder, Builder.UnCompiledNode nodeIn, int labelRange, long dest) {
+    long bytePos = dest;
+    byte presenceBits = 1; // The first arc is always present.
+    int presenceIndex = 0;
+    int previousLabel = nodeIn.arcs[0].label;
+    for (int arcIdx = 1; arcIdx < nodeIn.numArcs; arcIdx++) {
+      int label = nodeIn.arcs[arcIdx].label;
+      presenceIndex += label - previousLabel;
+      while (presenceIndex >= 8) {
 
 Review comment:
   Ok


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #980: LUCENE-8920: Reduce the memory used by direct addressing of arcs

2019-11-08 Thread GitBox
bruno-roustant commented on a change in pull request #980: LUCENE-8920: Reduce 
the memory used by direct addressing of arcs
URL: https://github.com/apache/lucene-solr/pull/980#discussion_r344284258
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/util/fst/FST.java
 ##
 @@ -676,26 +697,38 @@ long addNode(Builder builder, Builder.UnCompiledNode nodeIn) throws IOExce
 */
 
 if (doFixedArray) {
-  final int MAX_HEADER_SIZE = 11; // header(byte) + numArcs(vint) + numBytes(vint)
   assert maxBytesPerArc > 0;
   // 2nd pass just "expands" all arcs to take up a fixed byte size
 
+  // If more than (1 / DIRECT_ARC_LOAD_FACTOR) of the "slots" would be occupied, write an arc
+  // array that may have holes in it so that we can address the arcs directly by label without
 
 Review comment:
   Yes


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] bruno-roustant commented on a change in pull request #980: LUCENE-8920: Reduce the memory used by direct addressing of arcs

2019-11-08 Thread GitBox
bruno-roustant commented on a change in pull request #980: LUCENE-8920: Reduce 
the memory used by direct addressing of arcs
URL: https://github.com/apache/lucene-solr/pull/980#discussion_r344284113
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/util/fst/FST.java
 ##
 @@ -864,6 +958,64 @@ private long readUnpackedNodeTarget(BytesReader in) throws IOException {
 return readNextRealArc(arc, in);
   }
 
+  /**
+   * Reads the presence bits of a direct-addressing node, store them in the provided arc {@link Arc#bitTable()}
+   * and returns the number of presence bytes.
+   */
+  private int readPresenceBytes(Arc arc, BytesReader in) throws IOException {
+    int numPresenceBytes = getNumPresenceBytes(arc.numArcs());
+    long[] presenceBits = new long[(numPresenceBytes + 7) / Long.BYTES];
+    for (int i = 0; i < numPresenceBytes; i++) {
+      // Read the next unsigned byte, shift it to the left, and appends it to the current long.
+      presenceBits[i / Long.BYTES] |= (in.readByte() & 0xFFL) << (i * Byte.SIZE);
+    }
+    arc.bitTable = presenceBits;
+    assert checkPresenceBytesAreValid(arc);
+    return numPresenceBytes;
+  }
+
+  static boolean checkPresenceBytesAreValid(Arc arc) {
+    assert (arc.bitTable()[0] & 1L) != 0; // First bit must be set.
+    assert (arc.bitTable()[arc.bitTable().length - 1] & (1L << (arc.numArcs() - 1))) != 0; // Last bit must be set.
+    assert countBits(arc.bitTable()) <= arc.numArcs(); // Total bit set (real num arcs) must be <= label range (stored in arc.numArcs()).
+    return true;
+  }
+
+  /**
+   * Counts all bits set in the provided longs.
+   */
+  static int countBits(long[] bits) {
 
 Review comment:
   Ok


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13888) SolrCloud 2

2019-11-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970391#comment-16970391
 ] 

Mark Robert Miller commented on SOLR-13888:
---

10 years and my prototype code is holding - only a handful of the bugs fixed. I 
just keep coming back to that over and over. It's like I know what happened, I 
was here, but WTF happened.

> SolrCloud 2
> ---
>
> Key: SOLR-13888
> URL: https://issues.apache.org/jira/browse/SOLR-13888
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Attachments: solrscreen.png
>
>
> As devs discuss dropping the SolrCloud name on the dev list, here is an issue 
> titled SolrCloud 2.
> A couple times now I've pulled on the sweater thread that is our broken 
> tests. It leads to one place - SolrCloud is sick and devs are adding spotty 
> code on top of it at a rate that will lead to the system falling in on 
> itself. As it is, it's a very slow, very inefficient, very unreliable, very 
> buggy system.
> This is not why I am here. This is the opposite of why I am here.
> So please, let's stop. We can't build on that thing as it is.
>  
> I need some time, I lost a lot of work at one point, the scope has expanded 
> since I realized how problematic some things really are, but I have an 
> alternative path that is not so duct tape and straw. As the building climbs, 
> that foundation is going to kill us all.
>  
> This is not about an architecture change - the architecture is fine. The 
> implementation is broken and getting worse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13888) SolrCloud 2

2019-11-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970387#comment-16970387
 ] 

Mark Robert Miller commented on SOLR-13888:
---

And I had an exciting branch for me, an exciting direction for me. It won't excite 
you; we don't agree on how to develop.

You can look at it to find fixes and test improvements.

If I'm showing off something, it is, again, for me and not you. So I couldn't 
eat this week with that work (seems like I'm doing nothing, eh? but I tend to do 
3 things at once) and I wasn't going to get anything out of it anymore.

> SolrCloud 2
> ---
>
> Key: SOLR-13888
> URL: https://issues.apache.org/jira/browse/SOLR-13888
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Attachments: solrscreen.png
>
>
> As devs discuss dropping the SolrCloud name on the dev list, here is an issue 
> titled SolrCloud 2.
> A couple times now I've pulled on the sweater thread that is our broken 
> tests. It leads to one place - SolrCloud is sick and devs are adding spotty 
> code on top of it at a rate that will lead to the system falling in on 
> itself. As it is, it's a very slow, very inefficient, very unreliable, very 
> buggy system.
> This is not why I am here. This is the opposite of why I am here.
> So please, let's stop. We can't build on that thing as it is.
>  
> I need some time, I lost a lot of work at one point, the scope has expanded 
> since I realized how problematic some things really are, but I have an 
> alternative path that is not so duct tape and straw. As the building climbs, 
> that foundation is going to kill us all.
>  
> This is not about an architecture change - the architecture is fine. The 
> implementation is broken and getting worse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-13907) Cloud view tree - fixed placement

2019-11-08 Thread Richard Goodman (Jira)
Richard Goodman created SOLR-13907:
--

 Summary: Cloud view tree - fixed placement
 Key: SOLR-13907
 URL: https://issues.apache.org/jira/browse/SOLR-13907
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Admin UI
Reporter: Richard Goodman
 Attachments: SOLR-13907.patch, clipping-of-tree-on-x-axis.png, 
fixed-metadata-panel.png

The tree view on the admin UI is really helpful to get a view of the znodes in 
zookeeper. We've found sometimes when troubleshooting issues that this can 
become quite fickle to use when you scroll down the collections list, select a 
shard leader znode, and have to scroll back up again to look at the metadata.

This small patch forces the metadata panel to be fixed to the UI, and also 
forces a horizontal scrollbar for the tree.

Screenshots show the patch applied.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-13907) Cloud view tree - fixed placement

2019-11-08 Thread Richard Goodman (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Goodman updated SOLR-13907:
---
Attachment: SOLR-13907.patch

> Cloud view tree - fixed placement
> -
>
> Key: SOLR-13907
> URL: https://issues.apache.org/jira/browse/SOLR-13907
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Admin UI
>Reporter: Richard Goodman
>Priority: Minor
> Attachments: SOLR-13907.patch, clipping-of-tree-on-x-axis.png, 
> fixed-metadata-panel.png
>
>
> The tree view on the admin UI is really helpful to get a view of the znodes 
> in zookeeper. We've found sometimes when troubleshooting issues that this can 
> become quite fickle to use when you scroll down the collections list, select 
> a shard leader znode, and have to scroll back up again to look at the 
> metadata.
> This small patch forces the metadata panel to be fixed to the UI, and also 
> forces a horizontal scrollbar for the tree.
> Screenshots show the patch applied.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13888) SolrCloud 2

2019-11-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970339#comment-16970339
 ] 

Mark Robert Miller commented on SOLR-13888:
---

And I'm super easy and have super understanding and lax standards. SO THIS IS A 
MESS!

> SolrCloud 2
> ---
>
> Key: SOLR-13888
> URL: https://issues.apache.org/jira/browse/SOLR-13888
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Attachments: solrscreen.png
>
>
> As devs discuss dropping the SolrCloud name on the dev list, here is an issue 
> titled SolrCloud 2.
> A couple times now I've pulled on the sweater thread that is our broken 
> tests. It leads to one place - SolrCloud is sick and devs are adding spotty 
> code on top of it at a rate that will lead to the system falling in on 
> itself. As it is, it's a very slow, very inefficient, very unreliable, very 
> buggy system.
> This is not why I am here. This is the opposite of why I am here.
> So please, let's stop. We can't build on that thing as it is.
>  
> I need some time, I lost a lot of work at one point, the scope has expanded 
> since I realized how problematic some things really are, but I have an 
> alternative path that is not so duct tape and straw. As the building climbs, 
> that foundation is going to kill us all.
>  
> This is not about an architecture change - the architecture is fine. The 
> implementation is broken and getting worse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13888) SolrCloud 2

2019-11-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970262#comment-16970262
 ] 

Mark Robert Miller commented on SOLR-13888:
---

Our ZK recipes suck, and that's not our business; stop using them. Just an 
opinion.

> SolrCloud 2
> ---
>
> Key: SOLR-13888
> URL: https://issues.apache.org/jira/browse/SOLR-13888
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Attachments: solrscreen.png
>
>
> As devs discuss dropping the SolrCloud name on the dev list, here is an issue 
> titled SolrCloud 2.
> A couple times now I've pulled on the sweater thread that is our broken 
> tests. It leads to one place - SolrCloud is sick and devs are adding spotty 
> code on top of it at a rate that will lead to the system falling in on 
> itself. As it is, it's a very slow, very inefficient, very unreliable, very 
> buggy system.
> This is not why I am here. This is the opposite of why I am here.
> So please, let's stop. We can't build on that thing as it is.
>  
> I need some time, I lost a lot of work at one point, the scope has expanded 
> since I realized how problematic some things really are, but I have an 
> alternative path that is not so duct tape and straw. As the building climbs, 
> that foundation is going to kill us all.
>  
> This is not about an architecture change - the architecture is fine. The 
> implementation is broken and getting worse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13888) SolrCloud 2

2019-11-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970259#comment-16970259
 ] 

Mark Robert Miller commented on SOLR-13888:
---

You make the overseer super slow to make it fast. I don't know how to tell you 
how to fix the overseer - maybe someone thinks it should work like it does - but 
as it is, you can speed it up HUUUGE. And you can make it not insane and speed it 
up even more.

And then you can skip doing, for small clusters, the silly slow stuff that you 
might do for huge clusters. It's just all common sense and fixing.

> SolrCloud 2
> ---
>
> Key: SOLR-13888
> URL: https://issues.apache.org/jira/browse/SOLR-13888
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Attachments: solrscreen.png
>
>
> As devs discuss dropping the SolrCloud name on the dev list, here is an issue 
> titled SolrCloud 2.
> A couple times now I've pulled on the sweater thread that is our broken 
> tests. It leads to one place - SolrCloud is sick and devs are adding spotty 
> code on top of it at a rate that will lead to the system falling in on 
> itself. As it is, it's a very slow, very inefficient, very unreliable, very 
> buggy system.
> This is not why I am here. This is the opposite of why I am here.
> So please, let's stop. We can't build on that thing as it is.
>  
> I need some time, I lost a lot of work at one point, the scope has expanded 
> since I realized how problematic some things really are, but I have an 
> alternative path that is not so duct tape and straw. As the building climbs, 
> that foundation is going to kill us all.
>  
> This is not about an architecture change - the architecture is fine. The 
> implementation is broken and getting worse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13888) SolrCloud 2

2019-11-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970257#comment-16970257
 ] 

Mark Robert Miller commented on SOLR-13888:
---

I'll even give you more. I wish just naming the problems *WAS* all I had to do:

Some alias stuff makes cluster state updates fire twice in a row.

There is bad concurrency in a lot of spots - just spot check.

There is a lot of bad exception handling and interruption handling, which are 
actually very important.

There is a lot of bad behavior in failure cases; if you make it better, failures 
get easier to look at.

There is a lot of weird and off stuff in the collections API, because it's 
trying to balance two worlds, it's not well tested, and other things.

Our base-layer ZK stuff, like the mkpaths and retry stuff, can do weird things.

Lots of times we don't handle session expiration.

 

I'll give you more too - there are a lot more hours there. When will you start?

 

> SolrCloud 2
> ---
>
> Key: SOLR-13888
> URL: https://issues.apache.org/jira/browse/SOLR-13888
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Attachments: solrscreen.png
>
>
> As devs discuss dropping the SolrCloud name on the dev list, here is an issue 
> titled SolrCloud 2.
> A couple times now I've pulled on the sweater thread that is our broken 
> tests. It leads to one place - SolrCloud is sick and devs are adding spotty 
> code on top of it at a rate that will lead to the system falling in on 
> itself. As it is, it's a very slow, very inefficient, very unreliable, very 
> buggy system.
> This is not why I am here. This is the opposite of why I am here.
> So please, let's stop. We can't build on that thing as it is.
>  
> I need some time, I lost a lot of work at one point, the scope has expanded 
> since I realized how problematic some things really are, but I have an 
> alternative path that is not so duct tape and straw. As the building climbs, 
> that foundation is going to kill us all.
>  
> This is not about an architecture change - the architecture is fine. The 
> implementation is broken and getting worse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] KoenDG commented on issue #990: [SOLR-13885] Typo corrections.

2019-11-08 Thread GitBox
KoenDG commented on issue #990: [SOLR-13885] Typo corrections.
URL: https://github.com/apache/lucene-solr/pull/990#issuecomment-551851168
 
 
   How very weird.
   
   From the logs, it seems this is the only indication of what went wrong:
   
   ```
   WARNING: An illegal reflective access operation has occurred
   WARNING: Illegal reflective access by 
org.apache.ivy.util.url.IvyAuthenticator 
(file:/home/runner/.ant/lib/ivy-2.4.0.jar) to field 
java.net.Authenticator.theAuthenticator
   WARNING: Please consider reporting this to the maintainers of 
org.apache.ivy.util.url.IvyAuthenticator
   WARNING: Use --illegal-access=warn to enable warnings of further illegal 
reflective access operations
   WARNING: All illegal access operations will be denied in a future release
   ```
   
   I don't see anything else in the logs indicating why it is that the 
precommit failed.
   
   And it's literally only text files. Nothing that should be compiled...
   
   Is there a place where we can read the steps of this precommit check?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13888) SolrCloud 2

2019-11-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970233#comment-16970233
 ] 

Mark Robert Miller commented on SOLR-13888:
---

And all you guys obsessed with timing and surprises and design: I don't care. I 
just don't care. I just want this garbage fixed. I couldn't care less if I do it, 
if you do it, if god does it. Somebody please fix this or replace my code and I 
can at least not feel any responsibility for it.

> SolrCloud 2
> ---
>
> Key: SOLR-13888
> URL: https://issues.apache.org/jira/browse/SOLR-13888
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Attachments: solrscreen.png
>
>
> As devs discuss dropping the SolrCloud name on the dev list, here is an issue 
> titled SolrCloud 2.
> A couple times now I've pulled on the sweater thread that is our broken 
> tests. It leads to one place - SolrCloud is sick and devs are adding spotty 
> code on top of it at a rate that will lead to the system falling in on 
> itself. As it is, it's a very slow, very inefficient, very unreliable, very 
> buggy system.
> This is not why I am here. This is the opposite of why I am here.
> So please, let's stop. We can't build on that thing as it is.
>  
> I need some time, I lost a lot of work at one point, the scope has expanded 
> since I realized how problematic some things really are, but I have an 
> alternative path that is not so duct tape and straw. As the building climbs, 
> that foundation is going to kill us all.
>  
> This is not about an architecture change - the architecture is fine. The 
> implementation is broken and getting worse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13888) SolrCloud 2

2019-11-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970227#comment-16970227
 ] 

Mark Robert Miller commented on SOLR-13888:
---

This is why I can fix it without you. And here is a guess at why you won't fix 
it: it's simply hard, boring work.

> SolrCloud 2
> ---
>
> Key: SOLR-13888
> URL: https://issues.apache.org/jira/browse/SOLR-13888
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Attachments: solrscreen.png
>
>
> As devs discuss dropping the SolrCloud name on the dev list, here is an issue 
> titled SolrCloud 2.
> A couple times now I've pulled on the sweater thread that is our broken 
> tests. It leads to one place - SolrCloud is sick and devs are adding spotty 
> code on top of it at a rate that will lead to the system falling in on 
> itself. As it is, it's a very slow, very inefficient, very unreliable, very 
> buggy system.
> This is not why I am here. This is the opposite of why I am here.
> So please, let's stop. We can't build on that thing as it is.
>  
> I need some time, I lost a lot of work at one point, the scope has expanded 
> since I realized how problematic some things really are, but I have an 
> alternative path that is not so duct tape and straw. As the building climbs, 
> that foundation is going to kill us all.
>  
> This is not about an architecture change - the architecture is fine. The 
> implementation is broken and getting worse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13888) SolrCloud 2

2019-11-08 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970219#comment-16970219
 ] 

Mark Robert Miller commented on SOLR-13888:
---

And yeah, I've realized I've hated what I've been working on for 10 years, so 
I'm either going to stop hating it or do something else.

> SolrCloud 2
> ---
>
> Key: SOLR-13888
> URL: https://issues.apache.org/jira/browse/SOLR-13888
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Major
> Attachments: solrscreen.png
>
>
> As devs discuss dropping the SolrCloud name on the dev list, here is an issue 
> titled SolrCloud 2.
> A couple times now I've pulled on the sweater thread that is our broken 
> tests. It leads to one place - SolrCloud is sick and devs are adding spotty 
> code on top of it at a rate that will lead to the system falling in on 
> itself. As it is, it's a very slow, very inefficient, very unreliable, very 
> buggy system.
> This is not why I am here. This is the opposite of why I am here.
> So please, let's stop. We can't build on that thing as it is.
>  
> I need some time, I lost a lot of work at one point, the scope has expanded 
> since I realized how problematic some things really are, but I have an 
> alternative path that is not so duct tape and straw. As the building climbs, 
> that foundation is going to kill us all.
>  
> This is not about an architecture change - the architecture is fine. The 
> implementation is broken and getting worse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8836) Optimize DocValues TermsDict to continue scanning from the last position when possible

2019-11-08 Thread juan camilo rodriguez duran (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970220#comment-16970220
 ] 

juan camilo rodriguez duran commented on LUCENE-8836:
-

They both use the same terms enum class

> Optimize DocValues TermsDict to continue scanning from the last position when 
> possible
> --
>
> Key: LUCENE-8836
> URL: https://issues.apache.org/jira/browse/LUCENE-8836
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Bruno Roustant
>Priority: Major
>  Labels: docValues, optimization
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Lucene80DocValuesProducer.TermsDict is used to lookup for either a term or a 
> term ordinal.
> Currently it does not have the optimization the FSTEnum has: to be able to 
> continue a sequential scan from where the last lookup was in the IndexInput. 
> For sparse lookups (when searching only a few terms or ordinals) it is not an 
> issue. But for multiple lookups in a row this optimization could save 
> re-scanning all the terms from the block start (since they are delta encoded).
> This patch proposes the optimization.
> To estimate the gain, we ran 3 Lucene tests while counting the seeks and the 
> term reads in the IndexInput, with and without the optimization:
> TestLucene70DocValuesFormat - the optimization saves 24% seeks and 15% term 
> reads.
> TestDocValuesQueries - the optimization adds 0.7% seeks and 0.003% term reads.
> TestDocValuesRewriteMethod.testRegexps - the optimization saves 71% seeks and 
> 82% term reads.
> In some cases, when scanning many terms in lexicographical order, the 
> optimization saves a lot. In other cases, when only looking for a few sparse 
> terms, the optimization does not bring improvement, but it does not penalize 
> either. It seems worth having it always.
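
For illustration, the core of the optimization can be sketched like this (a toy 
model, not the actual TermsDict code): the scanner remembers the last ordinal it 
decoded and only rewinds to the block start when the target is behind it.

{code:java}
import java.util.List;

/** Toy model of "continue scanning from the last position"; not the Lucene TermsDict code. */
public class ResumableBlockScanner {
  private final List<String> block; // stand-in for a block of delta-encoded terms
  private int lastOrd = -1;
  private String lastTerm = null;
  private int decodes = 0;          // instrumentation: how many terms we "decoded"

  public ResumableBlockScanner(List<String> block) {
    this.block = block;
  }

  /** Returns the term at {@code ord}, resuming the forward scan when possible. */
  public String seekOrd(int ord) {
    if (ord == lastOrd) {
      return lastTerm;
    }
    // Resume right after the last decoded term, or rewind to the block start if the target is behind us.
    int i = (lastOrd >= 0 && ord > lastOrd) ? lastOrd + 1 : 0;
    String term = (i == 0) ? null : lastTerm;
    for (; i <= ord; i++) {
      term = block.get(i); // the real code would decode the next delta-encoded term here
      decodes++;
    }
    lastOrd = ord;
    lastTerm = term;
    return term;
  }

  public int decodes() {
    return decodes;
  }

  public static void main(String[] args) {
    ResumableBlockScanner scanner =
        new ResumableBlockScanner(List.of("apple", "banana", "cherry", "date", "fig"));
    scanner.seekOrd(1);
    scanner.seekOrd(3);                    // continues from ord 2 instead of re-scanning from 0
    System.out.println(scanner.decodes()); // 4 decodes instead of 6 without the optimization
  }
}
{code}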



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-11492) More Modern cloud dev script

2019-11-08 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970215#comment-16970215
 ] 

Gus Heck commented on SOLR-11492:
-

Yup, the content of this ticket that was checked in is in 8.3, so it's time to 
raise an enhancement/bug if you have a patch for further changes :).

> More Modern cloud dev script
> 
>
> Key: SOLR-11492
> URL: https://issues.apache.org/jira/browse/SOLR-11492
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 8.0
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Minor
> Fix For: 8.3
>
> Attachments: SOLR-11492.patch, cloud.sh, cloud.sh, cloud.sh, 
> cloud.sh, cloud.sh, cloud.sh, cloud.sh
>
>
> Most of the scripts in solr/cloud-dev do things like start using java -jar 
> and other similarly ancient techniques. I recently decided I really didn't 
> like that it was a pain to setup a cloud to test a patch/feature and that 
> often one winds up needing to blow away existing testing so working on more 
> than one thing at a time is irritating... so here's a script I wrote, if 
> folks like it I'd be happy for it to be included in solr/cloud-dev 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-13906) ClassCastException in Group Sort when 'score' field in fl

2019-11-08 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-13906.
---
Resolution: Duplicate

Please search Jira or ask on the users' list before opening a JIRA.

> ClassCastException in Group Sort when 'score' field in fl
> -
>
> Key: SOLR-13906
> URL: https://issues.apache.org/jira/browse/SOLR-13906
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 8.2
> Environment: Solr Version: 8.2.0
> JDK 11
> Cluster Setup: 2 Shards with 2 nodes each
>Reporter: Arpit Jain
>Priority: Critical
>
> When running a group query with group sort, and passing 'score' field in fl 
> list, I am getting the following exception. 
> Sample Query which is failing :  
> [http://localhost:8983/solr/collection1/select?fl=field1,score=field2:(]"ABC")=*:*=field3+desc=true&_route_=shard1
> Removing score from fl fixes the error. 
> {code:java}
> 2019-11-05 10:46:50.604 ERROR (qtp1527007086-84) [c:sprod s:2297 
> r:core_node110 x:sprod_2297_replica_n109] o.a.s.h.RequestHandlerBase 
> java.lang.ClassCastException: class org.apache.lucene.search.MultiCollector 
> cannot be cast to class org.apache.lucene.search.TopDocsCollector 
> (org.apache.lucene.search.MultiCollector and 
> org.apache.lucene.search.TopDocsCollector are in unnamed module of loader 
> org.eclipse.jetty.webapp.WebAppClassLoader @757f675c)
> at org.apache.solr.search.Grouping$CommandQuery.finish(Grouping.java:890)
> at org.apache.solr.search.Grouping.execute(Grouping.java:407)
> at 
> org.apache.solr.handler.component.QueryComponent.doProcessGroupedSearch(QueryComponent.java:1458)
> at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:305)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2578)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:780)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:566)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:423)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:350)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1347)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1678)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1249)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
> at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:152)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> at org.eclipse.jetty.server.Server.handle(Server.java:505)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
> at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
> at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
> at 
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
> at 
> 

[jira] [Created] (SOLR-13906) ClassCastException in Group Sort when 'score' field in fl

2019-11-08 Thread Arpit Jain (Jira)
Arpit Jain created SOLR-13906:
-

 Summary: ClassCastException in Group Sort when 'score' field in fl
 Key: SOLR-13906
 URL: https://issues.apache.org/jira/browse/SOLR-13906
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrCloud
Affects Versions: 8.2
 Environment: Solr Version: 8.2.0
JDK 11
Cluster Setup: 2 Shards with 2 nodes each
Reporter: Arpit Jain


When running a group query with group sort, and passing 'score' field in fl 
list, I am getting the following exception. 

Sample Query which is failing :  
[http://localhost:8983/solr/collection1/select?fl=field1,score=field2:(]"ABC")=*:*=field3+desc=true&_route_=shard1

Removing score from fl fixes the error. 


{code:java}
2019-11-05 10:46:50.604 ERROR (qtp1527007086-84) [c:sprod s:2297 r:core_node110 
x:sprod_2297_replica_n109] o.a.s.h.RequestHandlerBase 
java.lang.ClassCastException: class org.apache.lucene.search.MultiCollector 
cannot be cast to class org.apache.lucene.search.TopDocsCollector 
(org.apache.lucene.search.MultiCollector and 
org.apache.lucene.search.TopDocsCollector are in unnamed module of loader 
org.eclipse.jetty.webapp.WebAppClassLoader @757f675c)
at org.apache.solr.search.Grouping$CommandQuery.finish(Grouping.java:890)
at org.apache.solr.search.Grouping.execute(Grouping.java:407)
at 
org.apache.solr.handler.component.QueryComponent.doProcessGroupedSearch(QueryComponent.java:1458)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:390)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:305)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2578)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:780)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:566)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:423)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:350)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1347)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1678)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1249)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:152)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:505)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
at 

[jira] [Commented] (SOLR-13905) Nullpointer exception in AuditEvent

2019-11-08 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970054#comment-16970054
 ] 

Jan Høydahl commented on SOLR-13905:


I plan to make the following changes:
 * resource should never be null. This may be a regression introduced in 8.3, when 
we switched from {{getContextPath}} to {{getPathInfo}} to get the resource.
 * Check for null explicitly in {{findRequestType}} and return UNKNOWN, as a 
safeguard.
 * Pre-compile all the regexes so we don't need to compile them on every request 
but instead loop through the precompiled {{List}} - for performance reasons.

So far I have found no way to reproduce the exact null scenario in a test, only 
in a live test environment.
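
A minimal sketch of the planned safeguard (simplified names; the pattern lists and 
request types below are illustrative, not Solr's actual ones):

{code:java}
import java.util.List;
import java.util.regex.Pattern;

/** Sketch of a null-safe findRequestType with pre-compiled patterns; not the actual AuditEvent code. */
public class RequestTypeSketch {
  enum RequestType { ADMIN, SEARCH, UNKNOWN }

  // Compiled once, instead of compiling the regexes on every request.
  private static final List<Pattern> ADMIN_PATTERNS = List.of(
      Pattern.compile("^/admin/.*"));
  private static final List<Pattern> SEARCH_PATTERNS = List.of(
      Pattern.compile("^/\\w+/(select|query)$"));

  static RequestType findRequestType(String resource) {
    if (resource == null) {
      return RequestType.UNKNOWN; // safeguard: getPathInfo() may return null
    }
    for (Pattern p : ADMIN_PATTERNS) {
      if (p.matcher(resource).matches()) {
        return RequestType.ADMIN;
      }
    }
    for (Pattern p : SEARCH_PATTERNS) {
      if (p.matcher(resource).matches()) {
        return RequestType.SEARCH;
      }
    }
    return RequestType.UNKNOWN;
  }
}
{code}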

> Nullpointer exception in AuditEvent
> ---
>
> Key: SOLR-13905
> URL: https://issues.apache.org/jira/browse/SOLR-13905
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Auditlogging
>Affects Versions: 8.3
>Reporter: Jan Høydahl
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: 8.4, 8.3.1
>
>
> Nullpointer exception in AuditEvent for events with HttpServletRequest as 
> input. Happens when {{getPathInfo()}} returns null, which was not caught by 
> current tests. This causes the whole request to fail, rendering the audit 
> service unusable.
> The nullpointer is experienced in the {{findRequestType()}} method when 
> performing pattern match on the resource (path).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-13905) Nullpointer exception in AuditEvent

2019-11-08 Thread Jira
Jan Høydahl created SOLR-13905:
--

 Summary: Nullpointer exception in AuditEvent
 Key: SOLR-13905
 URL: https://issues.apache.org/jira/browse/SOLR-13905
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Auditlogging
Affects Versions: 8.3
Reporter: Jan Høydahl
Assignee: Jan Høydahl
 Fix For: 8.3.1, 8.4


Nullpointer exception in AuditEvent for events with HttpServletRequest as 
input. Happens when {{getPathInfo()}} returns null, which was not caught by 
current tests. This causes the whole request to fail, rendering the audit 
service unusable.

The nullpointer is experienced in the {{findRequestType()}} method when 
performing pattern match on the resource (path).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13902) "ant precommit" is inconsistent in branch_8x vs. master

2019-11-08 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970041#comment-16970041
 ] 

Andrzej Bialecki commented on SOLR-13902:
-

It should fail on both branches. I also encountered a case where there were 2 
files in the same package - one in core and one in test sources. This wasn't 
caught either, but it later caused a weird error from the javadoc build.

> "ant precommit" is inconsistent in branch_8x vs. master
> ---
>
> Key: SOLR-13902
> URL: https://issues.apache.org/jira/browse/SOLR-13902
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>
> Missing {{package-info.java}} does not fail in master, but it fails in 
> {{branch_8x}}. If it's required, it should fail everywhere.
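
For reference, the check in question complains about a missing 
{{package-info.java}}, which is where package-level javadoc lives; a minimal file 
that satisfies it looks roughly like this (package name illustrative):

{code:java}
/**
 * Package-level javadoc describing what the classes in this package do.
 */
package org.apache.solr.example;
{code}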



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13894) Solr 8.3 streaming expreessions do not return all fields (select)

2019-11-08 Thread Christian Spitzlay (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969928#comment-16969928
 ] 

Christian Spitzlay commented on SOLR-13894:
---

No, I didn't test on 8.3.  I'm mostly just a user of solr and we only just 
upgraded to 8.2 :)

I just wanted to provide the fixed version of the streaming expression for 
convenience, after I had found out why the original didn't parse on my machine.

 

> Solr 8.3 streaming expreessions do not return all fields (select)
> -
>
> Key: SOLR-13894
> URL: https://issues.apache.org/jira/browse/SOLR-13894
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud, streaming expressions
>Affects Versions: 8.3
>Reporter: Jörn Franke
>Priority: Major
>
> I use streaming expressions, e.g.
> sort(select(search(...),id,if(eq(1,1),Y,N) as found), by=“field A asc”)
> (Using export handler, sort is not really mandatory, I will remove it later 
> anyway)
> This works perfectly fine if I use Solr 8.2.0 (server + client). It returns 
> Tuples in the form \{ “id”:”12345”, “found”:”Y”}
> However, if I use Solr 8.2.0 as server and Solr 8.3.0 as client then the above 
> statement only returns the id field, but not the "found" field.
> Questions:
> 1) is this expected behavior, ie Solr client 8.3.0 is in this case not 
> compatible with Solr 8.2.0 and server upgrade to Solr 8.3.0 will fix this?
> 2) has the syntax for the above expression changed? If so how?
> 3) is this not expected behavior and I should create a Jira for it?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org