[jira] [Commented] (SOLR-13933) Cluster mode Stress test suite
[ https://issues.apache.org/jira/browse/SOLR-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980991#comment-16980991 ]

Ishan Chattopadhyaya commented on SOLR-13933:
---------------------------------------------

If someone has ideas on an existing Java tool that can take in a configuration (JSON) of tasks to run in separate threads and then execute them while collecting metrics, please let me know. I've evaluated Sundial, Argo, etc. JMeter's embedded mode also came close. I ditched all of them, as the configurations were very ugly. Consequently, I'm building this from scratch (based on the configuration in my previous comment). I will update the configuration as necessary, to keep it as simple and yet as expressive as possible.

> Cluster mode Stress test suite
> ------------------------------
>
>                 Key: SOLR-13933
>                 URL: https://issues.apache.org/jira/browse/SOLR-13933
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Ishan Chattopadhyaya
>            Priority: Major
>
> We need a stress test harness based on 10s or 100s of nodes, 1000s of collection API operations, overseer operations etc. This suite should run nightly and publish results publicly, so as to help with:
> # Uncover stability problems
> # Benchmarking (timings, resource metrics etc.) on collection operations
> # Indexing/querying performance
> # Validate the accuracy of potential improvements
>
> References:
> SOLR-10317
> https://github.com/lucidworks/solr-scale-tk
> https://github.com/shalinmangar/solr-perf-tools
> Lucene benchmarks

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13933) Cluster mode Stress test suite
[ https://issues.apache.org/jira/browse/SOLR-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980989#comment-16980989 ]

Ishan Chattopadhyaya commented on SOLR-13933:
---------------------------------------------

I've managed to spin up and down instances on GCP, build the Solr jar from a commit, and scp it to the instances. Here's a tentative format for defining the tasks to be executed.
{code}
{
  "task-types": [
    {
      "name": "indexing-wikipedia",
      "indexing-benchmark": {
        "name": "wikipedia-small",
        "description": "Wikipedia dataset on SolrCloud",
        "dataset-file": "small-data/small-enwiki.tsv.gz",
        "setups": [
          {
            "setup-name": "wiki_2x2",
            "collection": "wiki_2x2",
            "replication-factor": 2,
            "shards": 2,
            "min-threads": 4,
            "max-threads": 12,
            "thread-step": 4
          }
        ]
      }
    },
    {
      "name": "collection-creation",
      "command": "http://${HOST}:${PORT}/solr/collections/admin?action=CREATE&name=collection${INDEX}&numShards=${SHARDS}",
      "defaults": {
        "INDEX": 0,
        "SHARDS": 1
      }
    },
    {
      "name": "shard-splitting",
      "command": "http://${HOST}:${PORT}/solr/collections/admin?action=SPLITSHARD&collection=${COLLECTION}&shard=${SHARD}",
      "defaults": {}
    }
  ],
  "global-variables": {
    "collection-counter": 1
  },
  "tasks": [
    {
      "task": "task1",
      "type": "indexing-wikipedia",
      "mode": "async"
    },
    {
      "description": "Create 100 collections in parallel using 4 threads",
      "task": "task2",
      "type": "collection-creation",
      "instances": 100,
      "concurrency": 4,
      "parameters": {
        "INDEX": "${collection-counter}"
      },
      "pre-task-evals": [
        "inc(collection-counter,1)"
      ],
      "mode": "async"
    },
    {
      "description": "Once all collections are created, split a shard in collection1",
      "task": "task3",
      "type": "shard-splitting",
      "parameters": {
        "COLLECTION": "collection1",
        "SHARD": "shard1"
      },
      "waitFor": "task2",
      "mode": "sync"
    }
  ]
}
{code}
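The "instances", "concurrency", and "mode" fields in the format above imply a dispatcher that fans each task out over a bounded thread pool. A minimal sketch of that dispatch loop, assuming a hypothetical `Task` type whose field names are borrowed from the JSON (this is not the actual suite's code, just an illustration of the execution model):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TaskRunner {
    // Hypothetical stand-in for one parsed "tasks" entry.
    static class Task {
        final String name;
        final int instances;    // "instances" in the JSON
        final int concurrency;  // "concurrency" in the JSON
        final Runnable action;
        Task(String name, int instances, int concurrency, Runnable action) {
            this.name = name; this.instances = instances;
            this.concurrency = concurrency; this.action = action;
        }
    }

    // Runs every instance of a task on a bounded pool and waits for
    // completion (the "sync" mode); an "async" mode would skip the await.
    static void run(Task task) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(task.concurrency);
        CountDownLatch done = new CountDownLatch(task.instances);
        for (int i = 0; i < task.instances; i++) {
            pool.submit(() -> {
                try { task.action.run(); } finally { done.countDown(); }
            });
        }
        done.await();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger created = new AtomicInteger();
        // Analogue of "Create 100 collections in parallel using 4 threads".
        run(new Task("collection-creation", 100, 4, created::incrementAndGet));
        System.out.println(created.get()); // prints 100
    }
}
```

In the real suite each `action` would issue the templated HTTP command after substituting `${...}` variables; here it only bumps a counter so the dispatch logic can be shown in isolation.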
[jira] [Commented] (SOLR-13842) Remove wt=json from Implicit API definition's defaults
[ https://issues.apache.org/jira/browse/SOLR-13842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980956#comment-16980956 ]

Munendra S N commented on SOLR-13842:
-------------------------------------

Please go ahead. Leave a comment when you pick this up.

> Remove wt=json from Implicit API definition's defaults
> ------------------------------------------------------
>
>                 Key: SOLR-13842
>                 URL: https://issues.apache.org/jira/browse/SOLR-13842
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Munendra S N
>            Priority: Minor
>              Labels: newdev
>
> From Solr 7, {{json}} is the default response writer, so {{wt=json}} can be removed from implicit API definitions.
[jira] [Commented] (SOLR-13963) JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980913#comment-16980913 ]

Noble Paul commented on SOLR-13963:
-----------------------------------

I have attached a patch where the main {{_readStr(DataInputInputStream dis, StringCache stringCache, int sz)}} does not pay the price of synchronization. I'm still thinking about how to make a test case that can localize the problem.

> JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates
> ------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13963
>                 URL: https://issues.apache.org/jira/browse/SOLR-13963
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>    Affects Versions: 8.3
>            Reporter: Colvin Cowie
>            Assignee: Noble Paul
>            Priority: Major
>         Attachments: JavaBinCodec.java, SOLR-13963.patch, SOLR-13963.patch
>
> Discussed on the mailing list "Possible data corruption in JavaBinCodec in Solr 8.3 during distributed update?"
>
> In summary, after moving to 8.3 we had a consistent (but non-deterministic) set of failing tests where the data being sent in intranode requests was _sometimes_ corrupted. For example, if the well-formed data was _'fieldName':"this is a long string"_, the error we saw from Solr might be that of an unknown field _+'fieldNamis a long string"+_.
>
> The change that indirectly caused this issue to materialize was SOLR-13682, which meant that org.apache.solr.common.SolrInputDocument.writeMap(EntryWriter) would call org.apache.solr.common.SolrInputField.getValue() rather than org.apache.solr.common.SolrInputField.getRawValue() as it had before.
>
> getRawValue for a string calls org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr(), which in this context calls org.apache.solr.common.util.JavaBinCodec.getStringProvider().
>
> JavaBinCodec has a CharArr, _arr_, which is modified in two different locations, but only one of which is protected with a synchronized block.
>
> getStringProvider() synchronizes on _arr_:
> [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L966]
>
> but _readStr() doesn't:
> [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L930]
>
> The two methods are called concurrently, but weren't prior to SOLR-13682.
>
> Adding a synchronized block into _readStr() around the modification of _arr_ fixes the problem as far as I can see.
>
> Also, the problem does not seem to occur when using the dynamic schema mode of autoCreateFields=true in the updateRequestProcessorChain.
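The fix described in the report, guarding every mutation of the shared CharArr rather than only one call site, follows a standard pattern. A stripped-down illustration, using a plain StringBuilder as a stand-in for CharArr and hypothetical method names that echo readStr/getStringProvider (this is not the actual JavaBinCodec code):

```java
public class SharedBufferDecoder {
    // Shared scratch buffer, analogous to JavaBinCodec's CharArr "arr".
    private final StringBuilder arr = new StringBuilder();

    // Both entry points mutate "arr". Before the fix, only one of the two
    // was synchronized, so concurrent callers could interleave their
    // setLength/append calls and produce a spliced, corrupt string.
    public String readStr(char[] chars) {
        synchronized (arr) {   // the fix: guard this mutation of arr too
            arr.setLength(0);
            arr.append(chars);
            return arr.toString();
        }
    }

    public String getStringProvider(char[] chars) {
        synchronized (arr) {   // this path was already synchronized
            arr.setLength(0);
            arr.append(chars);
            return arr.toString();
        }
    }
}
```

With both blocks in place, two threads hammering the two methods always read back exactly what they wrote; remove either `synchronized (arr)` and the interleaving described in the issue ("this is a long string" spliced into another field's bytes) becomes possible.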
[jira] [Resolved] (SOLR-12193) Move some log messages to TRACE level
[ https://issues.apache.org/jira/browse/SOLR-12193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl resolved SOLR-12193.
--------------------------------
    Fix Version/s:     (was: 8.1)
                       (was: master (9.0))
                   8.4
       Resolution: Fixed

Thanks [~gezapeti], finally got around to this one.

> Move some log messages to TRACE level
> -------------------------------------
>
>                 Key: SOLR-12193
>                 URL: https://issues.apache.org/jira/browse/SOLR-12193
>             Project: Solr
>          Issue Type: Improvement
>          Components: logging
>            Reporter: Jan Høydahl
>            Assignee: Jan Høydahl
>            Priority: Major
>              Labels: newbie, newdev
>             Fix For: 8.4
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> One example of a wasteful DEBUG log which could be moved to TRACE level is:
> {noformat}
> $ solr start -f -v
> 2018-04-05 22:46:14.488 INFO  (main) [   ] o.a.s.c.SolrXmlConfig Loading container configuration from /opt/solr/server/solr/solr.xml
> 2018-04-05 22:46:14.574 DEBUG (main) [   ] o.a.s.c.Config null missing optional solr/@coreLoadThreads
> 2018-04-05 22:46:14.577 DEBUG (main) [   ] o.a.s.c.Config null missing optional solr/@persistent
> 2018-04-05 22:46:14.579 DEBUG (main) [   ] o.a.s.c.Config null missing optional solr/@sharedLib
> 2018-04-05 22:46:14.581 DEBUG (main) [   ] o.a.s.c.Config null missing optional solr/@zkHost
> 2018-04-05 22:46:14.583 DEBUG (main) [   ] o.a.s.c.Config null missing optional solr/cores
> 2018-04-05 22:46:14.605 DEBUG (main) [   ] o.a.s.c.Config null missing optional solr/transientCoreCacheFactory
> 2018-04-05 22:46:14.609 DEBUG (main) [   ] o.a.s.c.Config null missing optional solr/metrics/suppliers/counter
> 2018-04-05 22:46:14.609 DEBUG (main) [   ] o.a.s.c.Config null missing optional solr/metrics/suppliers/meter
> 2018-04-05 22:46:14.611 DEBUG (main) [   ] o.a.s.c.Config null missing optional solr/metrics/suppliers/timer
> 2018-04-05 22:46:14.612 DEBUG (main) [   ] o.a.s.c.Config null missing optional solr/metrics/suppliers/histogram
> 201
> {noformat}
> There are probably other examples as well.
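For readers unfamiliar with the DEBUG/TRACE distinction being applied here: the point is that messages useful only when tracing config parsing should sit below the level that `-v` (verbose) enables. A small sketch with java.util.logging, where FINE and FINEST stand in for DEBUG and TRACE (Solr itself logs via SLF4J/Log4j, so this is an analogy, not Solr's code):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LevelDemo {
    public static void main(String[] args) {
        Logger log = Logger.getLogger("o.a.s.c.Config");
        log.setLevel(Level.FINE); // analogous to running with -v (DEBUG)

        // Before SOLR-12193: "null missing optional ..." chatter at DEBUG
        // floods every -v run. After, equivalent messages move down a level
        // and are suppressed unless tracing is explicitly enabled:
        log.finest("null missing optional solr/@coreLoadThreads");

        // Cheap guard pattern: skip building the message when disabled.
        if (log.isLoggable(Level.FINEST)) {
            log.finest("expensive detail: " + System.nanoTime());
        }
        System.out.println(log.isLoggable(Level.FINEST)); // prints false
    }
}
```

The guard matters because string concatenation in a log call costs CPU even when the message is ultimately dropped; demoting noisy messages and guarding expensive ones are the two halves of the same cleanup.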
[jira] [Commented] (SOLR-12193) Move some log messages to TRACE level
[ https://issues.apache.org/jira/browse/SOLR-12193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980910#comment-16980910 ]

ASF subversion and git services commented on SOLR-12193:
--------------------------------------------------------

Commit 5f11efb2d51ce7ebc28012db059553f83ba4fdff in lucene-solr's branch refs/heads/branch_8x from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5f11efb ]

SOLR-12193: Move some log messages to TRACE level, remove some dead code

(cherry picked from commit d809bc27f1b5cd6d97e0bfe688c99d481bc42d39)
[jira] [Updated] (SOLR-13963) JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-13963:
------------------------------
    Attachment: SOLR-13963.patch
[jira] [Commented] (SOLR-12193) Move some log messages to TRACE level
[ https://issues.apache.org/jira/browse/SOLR-12193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980911#comment-16980911 ]

ASF subversion and git services commented on SOLR-12193:
--------------------------------------------------------

Commit 340b238f1c15e4c5facc58990fbb653064a0b121 in lucene-solr's branch refs/heads/branch_8x from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=340b238 ]

SOLR-12193: reverting one line back to trace

(cherry picked from commit 592ea19eff0a0d4225f92d0b96bfb3c9559c077e)
[jira] [Commented] (SOLR-12193) Move some log messages to TRACE level
[ https://issues.apache.org/jira/browse/SOLR-12193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980909#comment-16980909 ]

ASF subversion and git services commented on SOLR-12193:
--------------------------------------------------------

Commit 592ea19eff0a0d4225f92d0b96bfb3c9559c077e in lucene-solr's branch refs/heads/master from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=592ea19 ]

SOLR-12193: reverting one line back to trace
[jira] [Commented] (SOLR-12193) Move some log messages to TRACE level
[ https://issues.apache.org/jira/browse/SOLR-12193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980904#comment-16980904 ]

ASF subversion and git services commented on SOLR-12193:
--------------------------------------------------------

Commit d809bc27f1b5cd6d97e0bfe688c99d481bc42d39 in lucene-solr's branch refs/heads/master from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d809bc2 ]

SOLR-12193: Move some log messages to TRACE level, remove some dead code
[jira] [Commented] (LUCENE-8674) UnsupportedOperationException due to call to o.a.l.q.f.FunctionValues.floatVal
[ https://issues.apache.org/jira/browse/LUCENE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980901#comment-16980901 ]

Rahul Yadav commented on LUCENE-8674:
-------------------------------------

I am looking at this.

> UnsupportedOperationException due to call to o.a.l.q.f.FunctionValues.floatVal
> ------------------------------------------------------------------------------
>
>                 Key: LUCENE-8674
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8674
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/query/scoring
>    Affects Versions: master (9.0)
>         Environment: h1. Steps to reproduce
> * Use a Linux machine.
> * Build commit {{ea2c8ba}} of Solr as described in the section below.
> * Build the films collection as described below.
> * Start the server using the command {{./bin/solr start -f -p 8983 -s /tmp/home}}
> * Request the URL given in the bug description.
> h1. Compiling the server
> {noformat}
> git clone https://github.com/apache/lucene-solr
> cd lucene-solr
> git checkout ea2c8ba
> ant compile
> cd solr
> ant server
> {noformat}
> h1. Building the collection and reproducing the bug
> We followed [Exercise 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from the [Solr Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html].
> {noformat}
> mkdir -p /tmp/home
> echo '<solr></solr>' > /tmp/home/solr.xml
> {noformat}
> In one terminal start a Solr instance in foreground:
> {noformat}
> ./bin/solr start -f -p 8983 -s /tmp/home
> {noformat}
> In another terminal, create a collection of movies, with no shards and no replication, and initialize it:
> {noformat}
> bin/solr create -c films
> curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' http://localhost:8983/solr/films/schema
> curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field" : {"source":"*","dest":"_text_"}}' http://localhost:8983/solr/films/schema
> ./bin/post -c films example/films/films.json
> curl -v "URL_BUG"
> {noformat}
> Please check the issue description below to find the "URL_BUG" that will allow you to reproduce the issue reported.
>            Reporter: Johannes Kloos
>            Priority: Minor
>              Labels: diffblue, newdev
>
> Requesting the following URL causes Solr to return an HTTP 500 error response:
> {noformat}
> http://localhost:8983/solr/films/select?fq={!frange%20l=10%20u=100}or_version_s,directed_by
> {noformat}
> The error response seems to be caused by the following uncaught exception:
> {noformat}
> java.lang.UnsupportedOperationException
> at org.apache.lucene.queries.function.FunctionValues.floatVal(FunctionValues.java:47)
> at org.apache.lucene.queries.function.FunctionValues$3.matches(FunctionValues.java:188)
> at org.apache.lucene.queries.function.ValueSourceScorer$1.matches(ValueSourceScorer.java:53)
> at org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.doNext(TwoPhaseIterator.java:89)
> at org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.nextDoc(TwoPhaseIterator.java:77)
> at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:261)
> at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:214)
> at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:652)
> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
> at org.apache.solr.search.DocSetUtil.createDocSetGeneric(DocSetUtil.java:151)
> at org.apache.solr.search.DocSetUtil.createDocSet(DocSetUtil.java:140)
> at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:1177)
> at org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:817)
> at org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1025)
> at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1540)
> at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1420)
> at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:567)
> at org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1434)
> {noformat}
> Sadly, I can't understand the logic of this code well enough to give any insights.
> To set up an environment to reproduce this bug, follow the description in the 'Environment' field.
> We found this issue and ~70 more like this using [Diffblue Microservices Testing|https://www.diffblue.com/labs/?utm_source=solr-br]. Find more information on this [fuzz testing >
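The top of the stack trace above is FunctionValues.floatVal, whose base implementation in Lucene throws UnsupportedOperationException and is overridden only by value sources that actually produce numbers. A toy model of that shape (hypothetical classes, not Lucene's actual hierarchy) shows why a non-numeric source reached by {!frange} fails exactly this way:

```java
public class FloatValDemo {
    // Mirrors the shape of o.a.l.q.f.FunctionValues: floatVal is optional,
    // and the base-class default simply throws.
    abstract static class Values {
        float floatVal(int doc) {
            throw new UnsupportedOperationException();
        }
        abstract String strVal(int doc);
    }

    // A string-backed source never overrides floatVal...
    static class StringValues extends Values {
        @Override String strVal(int doc) { return "doc-" + doc; }
    }

    public static void main(String[] args) {
        Values v = new StringValues();
        // ...so a range filter asking for a float (as {!frange} does via
        // its matches() check) hits the unimplemented default:
        try {
            v.floatVal(0);
            System.out.println("no exception");
        } catch (UnsupportedOperationException e) {
            System.out.println("UnsupportedOperationException, as in the report");
        }
    }
}
```

This is why the bug surfaces only for particular queries: the scorer path is fine until a filter demands a numeric view of a value source that never promised one.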
[jira] [Resolved] (SOLR-13345) Admin UI login page doesn't accept empty passwords
[ https://issues.apache.org/jira/browse/SOLR-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl resolved SOLR-13345.
--------------------------------
    Resolution: Won't Fix

Closing this. I checked a few code paths, and there are some checks that won't allow you to enter an empty password. I have not checked everywhere, though. Feel free to re-open if you want to block empty passwords in other code paths.

> Admin UI login page doesn't accept empty passwords
> --------------------------------------------------
>
>                 Key: SOLR-13345
>                 URL: https://issues.apache.org/jira/browse/SOLR-13345
>             Project: Solr
>          Issue Type: Bug
>          Components: Admin UI
>    Affects Versions: 7.7, 8.0
>            Reporter: Märt
>            Assignee: Jan Høydahl
>            Priority: Minor
>
> In Solr 7.6 and older, it was possible to log in with an empty password using basic auth. The new Admin UI login page implemented in SOLR-7896 no longer accepts empty passwords.
> This issue was discussed in the solr-user mailing list:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201903.mbox/%3C7629BDDD-3D22-4203-9188-0E0A8DCF2FEE%40cominvent.com%3E
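For context on what an empty-password Basic auth login looks like on the wire: the Authorization header is just the base64 encoding of "user:password", so an empty password encodes perfectly well as "user:". A sketch using java.util.Base64 (whether any given server accepts the result is the separate policy question discussed above):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthDemo {
    // Builds the value of the "Authorization: Basic ..." request header.
    static String basicAuth(String user, String password) {
        String token = user + ":" + password; // empty password -> "user:"
        return "Basic " + Base64.getEncoder()
                .encodeToString(token.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(basicAuth("solr", ""));          // prints Basic c29scjo=
        System.out.println(basicAuth("solr", "SolrRocks")); // non-empty for comparison
    }
}
```

Nothing in the encoding itself forbids an empty password; any rejection happens in client-side form validation or server-side checks, which is why the Admin UI and raw curl requests could behave differently.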
[jira] [Commented] (SOLR-13842) Remove wt=json from Implicit API definition's defaults
[ https://issues.apache.org/jira/browse/SOLR-13842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980888#comment-16980888 ]

Rahul Yadav commented on SOLR-13842:
------------------------------------

If no one is working on this, can I start looking at this?
[jira] [Updated] (LUCENE-9031) UnsupportedOperationException on highlighting Interval Query
[ https://issues.apache.org/jira/browse/LUCENE-9031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Khludnev updated LUCENE-9031:
-------------------------------------
    Attachment: LUCENE-9031.patch
        Status: Patch Available  (was: Patch Available)

Moved the passing (intervals-only) tests to {{TestUnifiedHighlighterTermIntervals}}. Refreshed https://github.com/apache/lucene-solr/pull/1011 as well. So far, there are two open questions:
* fixField() highlighting LUCENE-9058
* Multiterm intervals highlighting.

Given that everything requested has been resolved, I propose to push it after a +1 from precommit. However, more feedback, concerns, and even vetoes are highly appreciated. Thanks, [~romseygeek]!

> UnsupportedOperationException on highlighting Interval Query
> ------------------------------------------------------------
>
>                 Key: LUCENE-9031
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9031
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/queries
>            Reporter: Mikhail Khludnev
>            Assignee: Mikhail Khludnev
>            Priority: Major
>             Fix For: 8.4
>
>         Attachments: LUCENE-9031.patch, LUCENE-9031.patch, LUCENE-9031.patch, LUCENE-9031.patch, LUCENE-9031.patch, LUCENE-9031.patch, LUCENE-9031.patch, LUCENE-9031.patch, LUCENE-9031.patch
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> When the UnifiedHighlighter highlights an Interval Query, it encounters an UnsupportedOperationException.
[jira] [Commented] (SOLR-13963) JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980879#comment-16980879 ]

Colvin Cowie commented on SOLR-13963:
-------------------------------------

[~noble.paul] Eclipse is playing up and I need to head off. If you run the test in the patch without the extra synchronized block, you (should) see it fail every time. I've attached another version of the JavaBinCodec where I've replaced the synchronized block with a ReentrantLock and thrown an exception when the lock for the instance is held by another thread, which shows that the concurrent modification is happening. If there's more detail you need after you've tried running the test, let me know and I can get back to you tomorrow.

When I was debugging it earlier, I grabbed this stack trace where two threads were modifying the same instance. The line numbers won't match here because the code was formatted differently.
{noformat}
Thread [qtp1047503754-123] (Suspended)
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec)._readStr(DataInputInputStream, JavaBinCodec$StringCache, int) line: 931
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readStr(DataInputInputStream, JavaBinCodec$StringCache, boolean) line: 920
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readExternString(DataInputInputStream) line: 1190
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readObject(DataInputInputStream) line: 302
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readVal(DataInputInputStream) line: 280
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readSolrInputDocument(DataInputInputStream) line: 626
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readObject(DataInputInputStream) line: 339
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readVal(DataInputInputStream) line: 280
JavaBinUpdateRequestCodec$StreamingCodec.readOuterMostDocIterator(DataInputInputStream) line: 321
JavaBinUpdateRequestCodec$StreamingCodec.readIterator(DataInputInputStream) line: 280
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readObject(DataInputInputStream) line: 335
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readVal(DataInputInputStream) line: 280
JavaBinUpdateRequestCodec$StreamingCodec.readNamedList(DataInputInputStream) line: 235
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readObject(DataInputInputStream) line: 300
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).readVal(DataInputInputStream) line: 280
JavaBinUpdateRequestCodec$StreamingCodec(JavaBinCodec).unmarshal(InputStream) line: 189
JavaBinUpdateRequestCodec.unmarshal(InputStream, JavaBinUpdateRequestCodec$StreamingUpdateHandler) line: 126
JavabinLoader.parseAndLoadDocs(SolrQueryRequest, SolrQueryResponse, InputStream, UpdateRequestProcessor) line: 123
JavabinLoader.load(SolrQueryRequest, SolrQueryResponse, ContentStream, UpdateRequestProcessor) line: 70
UpdateRequestHandler$1.load(SolrQueryRequest, SolrQueryResponse, ContentStream, UpdateRequestProcessor) line: 97
UpdateRequestHandler(ContentStreamHandlerBase).handleRequestBody(SolrQueryRequest, SolrQueryResponse) line: 68
UpdateRequestHandler(RequestHandlerBase).handleRequest(SolrQueryRequest, SolrQueryResponse) line: 198
SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse) line: 2576
HttpSolrCall.execute(SolrQueryResponse) line: 803
HttpSolrCall.call() line: 582
RobustSolrDispatchFilter(SolrDispatchFilter).doFilter(ServletRequest, ServletResponse, FilterChain, boolean) line: 424
RobustSolrDispatchFilter(SolrDispatchFilter).doFilter(ServletRequest, ServletResponse, FilterChain) line: 351
ServletHandler$CachedChain.doFilter(ServletRequest, ServletResponse) line: 1602
ServletHandler.doHandle(String, Request, HttpServletRequest, HttpServletResponse) line: 540
ServletHandler(ScopedHandler).handle(String, Request, HttpServletRequest, HttpServletResponse) line: 146
ConstraintSecurityHandler(SecurityHandler).handle(String, Request, HttpServletRequest, HttpServletResponse) line: 
548 SessionHandler(HandlerWrapper).handle(String, Request, HttpServletRequest, HttpServletResponse) line: 132 SessionHandler(ScopedHandler).nextHandle(String, Request, HttpServletRequest, HttpServletResponse) line: 257 SessionHandler.doHandle(String, Request, HttpServletRequest, HttpServletResponse) line: 1711 WebAppContext(ScopedHandler).nextHandle(String, Request, HttpServletRequest, HttpServletResponse) line: 255 WebAppContext(ContextHandler).doHandle(String, Request, HttpServletRequest, HttpServletResponse) line: 1347 ServletHandler(ScopedHandler).nextScope(String, Request, HttpServletRequest, HttpServletResponse) line: 203 ServletHandler.doScope(String, Request, HttpServletRequest, HttpServletResponse) line: 480 SessionHandler.doScope(String, Request, HttpServletRequest, HttpServletResponse) line: 1678
{noformat}
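The diagnostic described above, replacing the synchronized block with a ReentrantLock that fails fast when another thread already holds it, can be sketched as follows. The class and method names are illustrative, not taken from the attached JavaBinCodec.java:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class CrossThreadDetector {
    private final ReentrantLock lock = new ReentrantLock();

    // wraps a critical section; throws if a *different* thread is already inside
    void guarded(Runnable work) {
        if (!lock.tryLock()) {
            // ReentrantLock is reentrant, so tryLock() only fails when
            // another thread holds the lock: concurrent use detected
            throw new IllegalStateException("instance used by another thread");
        }
        try {
            work.run();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        CrossThreadDetector codec = new CrossThreadDetector();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        // first thread enters the guarded section and parks inside it
        Thread t = new Thread(() -> codec.guarded(() -> {
            held.countDown();
            try { release.await(); } catch (InterruptedException ignored) {}
        }));
        t.start();
        held.await(); // the other thread now holds the lock
        boolean detected = false;
        try {
            codec.guarded(() -> {}); // second thread: must be rejected
        } catch (IllegalStateException e) {
            detected = true;
        }
        release.countDown();
        t.join();
        System.out.println("detected=" + detected);
    }
}
```

tryLock() fails here only when a different thread holds the lock, because the owning thread could always re-enter; that makes it a cheap way to prove two threads are inside the codec at once.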
[jira] [Updated] (SOLR-13963) JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colvin Cowie updated SOLR-13963: Attachment: JavaBinCodec.java > JavaBinCodec has concurrent modification of CharArr resulting in corrupt > intranode updates > -- > > Key: SOLR-13963 > URL: https://issues.apache.org/jira/browse/SOLR-13963 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.3 >Reporter: Colvin Cowie >Assignee: Noble Paul >Priority: Major > Attachments: JavaBinCodec.java, SOLR-13963.patch > > > Discussed on the mailing list "Possible data corruption in JavaBinCodec in > Solr 8.3 during distributed update?" > > In summary, after moving to 8.3 we had a consistent (but non-deterministic) > set of failing tests where the data being sent in intranode requests was > _sometimes_ corrupted. For example, if the well-formed data was > _'fieldName':"this is a long string"_ > the error we saw from Solr might be > unknown field _+'fieldNamis a long string"+_ > > The change that indirectly caused this issue to materialize was from > SOLR-13682, which meant that > org.apache.solr.common.SolrInputDocument.writeMap(EntryWriter) would call > org.apache.solr.common.SolrInputField.getValue() rather than > org.apache.solr.common.SolrInputField.getRawValue() as it had before. 
> > getRawValue() for a string calls > org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr(), which in this > context calls > org.apache.solr.common.util.JavaBinCodec.getStringProvider() > > JavaBinCodec has a CharArr, _arr_, which is modified in two different > locations, but only one of which is protected with a synchronized block > > getStringProvider() synchronizes on _arr_: > > [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L966] > > but _readStr() doesn't: > > [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L930] > > The two methods are called concurrently, but weren't prior to SOLR-13682. > > Adding a synchronized block into _readStr() around the modification of _arr_ > fixes the problem as far as I can see. > > Also, the problem does not seem to occur when using the dynamic schema mode > of autoCreateFields=true in the updateRequestProcessorChain. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
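The fix described in the report, guarding every mutation of the shared CharArr with the same monitor that getStringProvider() already uses, can be sketched with a toy stand-in (a plain StringBuilder rather than the real JavaBinCodec/CharArr):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedBufferDemo {
    // stands in for JavaBinCodec's reusable CharArr field "arr"
    private final StringBuilder arr = new StringBuilder();

    // analogous to the patched readStr(): reset and refill the shared buffer
    // only while holding the monitor that the other reader path also uses
    String decode(String input) {
        synchronized (arr) {
            arr.setLength(0);
            arr.append(input);
            return arr.toString(); // copy out before releasing the lock
        }
    }

    public static void main(String[] args) throws Exception {
        SharedBufferDemo codec = new SharedBufferDemo();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Boolean>> results = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final String payload = "payload-" + i;
            results.add(pool.submit(() -> {
                for (int j = 0; j < 10_000; j++) {
                    if (!codec.decode(payload).equals(payload)) {
                        return false; // a torn read would show up here
                    }
                }
                return true;
            }));
        }
        for (Future<Boolean> r : results) {
            if (!r.get()) throw new AssertionError("corrupted read");
        }
        pool.shutdown();
        System.out.println("no corruption observed");
    }
}
```

Without the synchronized block, two threads interleave setLength(0) and append() on the same buffer, producing exactly the kind of spliced string ('fieldNamis a long string") seen in the issue.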
[jira] [Commented] (SOLR-13963) JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980876#comment-16980876 ] Noble Paul commented on SOLR-13963: --- Thanks, I'll try to reduce the test to a smaller, easily reproducible one.
[jira] [Assigned] (SOLR-13963) JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul reassigned SOLR-13963: - Assignee: Noble Paul
[jira] [Updated] (SOLR-13963) JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colvin Cowie updated SOLR-13963: Summary: JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates (was: JavaBinCodec has concurrent modification of CharrArr resulting in corrupt intranode updates)
[jira] [Commented] (LUCENE-8674) UnsupportedOperationException due to call to o.a.l.q.f.FunctionValues.floatVal
[ https://issues.apache.org/jira/browse/LUCENE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980868#comment-16980868 ] Rahul Yadav commented on LUCENE-8674: - Hi , NewDev here , can i take up this issue? > UnsupportedOperationException due to call to o.a.l.q.f.FunctionValues.floatVal > -- > > Key: LUCENE-8674 > URL: https://issues.apache.org/jira/browse/LUCENE-8674 > Project: Lucene - Core > Issue Type: Bug > Components: core/query/scoring >Affects Versions: master (9.0) > Environment: h1. Steps to reproduce > * Use a Linux machine. > * Build commit {{ea2c8ba}} of Solr as described in the section below. > * Build the films collection as described below. > * Start the server using the command {{./bin/solr start -f -p 8983 -s > /tmp/home}} > * Request the URL given in the bug description. > h1. Compiling the server > {noformat} > git clone https://github.com/apache/lucene-solr > cd lucene-solr > git checkout ea2c8ba > ant compile > cd solr > ant server > {noformat} > h1. Building the collection and reproducing the bug > We followed [Exercise > 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from > the [Solr > Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html]. 
> {noformat} > mkdir -p /tmp/home > echo '' > > /tmp/home/solr.xml > {noformat} > In one terminal start a Solr instance in foreground: > {noformat} > ./bin/solr start -f -p 8983 -s /tmp/home > {noformat} > In another terminal, create a collection of movies, with no shards and no > replication, and initialize it: > {noformat} > bin/solr create -c films > curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": > {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' > http://localhost:8983/solr/films/schema > curl -X POST -H 'Content-type:application/json' --data-binary > '{"add-copy-field" : {"source":"*","dest":"_text_"}}' > http://localhost:8983/solr/films/schema > ./bin/post -c films example/films/films.json > curl -v “URL_BUG” > {noformat} > Please check the issue description below to find the “URL_BUG” that will > allow you to reproduce the issue reported. >Reporter: Johannes Kloos >Priority: Minor > Labels: diffblue, newdev > > Requesting the following URL causes Solr to return an HTTP 500 error response: > {noformat} > http://localhost:8983/solr/films/select?fq={!frange%20l=10%20u=100}or_version_s,directed_by > {noformat} > The error response seems to be caused by the following uncaught exception: > {noformat} > java.lang.UnsupportedOperationException > at > org.apache.lucene.queries.function.FunctionValues.floatVal(FunctionValues.java:47) > at > org.apache.lucene.queries.function.FunctionValues$3.matches(FunctionValues.java:188) > at > org.apache.lucene.queries.function.ValueSourceScorer$1.matches(ValueSourceScorer.java:53) > at > org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.doNext(TwoPhaseIterator.java:89) > at > org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.nextDoc(TwoPhaseIterator.java:77) > at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:261) > at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:214) > at 
org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:652) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443) > at org.apache.solr.search.DocSetUtil.createDocSetGeneric(DocSetUtil.java:151) > at org.apache.solr.search.DocSetUtil.createDocSet(DocSetUtil.java:140) > at > org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:1177) > at > org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:817) > at > org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1025) > at > org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1540) > at > org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1420) > at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:567) > at > org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1434) > {noformat} > Sadly, I can't understand the logic of this code well enough to give any > insights. > To set up an environment to reproduce this bug, follow the description in the > ‘Environment’ field. > We found this issue and ~70 more like this using [Diffblue Microservices > Testing|https://www.diffblue.com/labs/?utm_source=solr-br]. Find more > information on this [fuzz testing >
[jira] [Commented] (LUCENE-6744) equals methods should compare classes directly, not use instanceof
[ https://issues.apache.org/jira/browse/LUCENE-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980862#comment-16980862 ] Rahul Yadav commented on LUCENE-6744: - Hi , NewDev here , is this issue resolved/abandoned? , if not, can i start looking at this? > equals methods should compare classes directly, not use instanceof > -- > > Key: LUCENE-6744 > URL: https://issues.apache.org/jira/browse/LUCENE-6744 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chris M. Hostetter >Priority: Major > Labels: newdev > Attachments: LUCENE-6744.patch, LUCENE-6744.patch > > > from a 2015-07-12 email to the dev list from Fuxiang Chen... > {noformat} > We have found some inconsistencies in the overriding of the equals() method > in some files with respect to the conforming to the contract structure > based on the Java Specification. > Affected files: > 1) ConstValueSource.java > 2) DoubleConstValueSource.java > 3) FixedBitSet.java > 4) GeohashFunction.java > 5) LongBitSet.java > 6) SpanNearQuery.java > 7) StringDistanceFunction.java > 8) ValueSourceRangeFilter.java > 9) VectorDistanceFunction.java > The above files all uses instanceof in the overridden equals() method in > comparing two objects. > According to the Java Specification, the equals() method must be reflexive, > symmetric, transitive and consistent. In the case of symmetric, it is > stated that x.equals(y) should return true if and only if y.equals(x) > returns true. Using instanceof is asymmetric and is not a valid symmetric > contract. > A more preferred way will be to compare the classes instead. i.e. if > (this.getClass() != o.getClass()). > However, if compiling the source code using JDK 7 and above, and if > developers still prefer to use instanceof, you can make use of the static > methods of Objects such as Objects.equals(this.id, that.id). (Making use of > the static methods of Objects is currently absent in the methods.) 
It will > be easier to override the equals() method and will ensure that the > overridden equals() method will fulfill the contract rules. > {noformat}
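The asymmetry from the mail above can be shown with a hypothetical pair of classes (not from the affected Lucene files): with an instanceof-based equals(), base.equals(subclass) is true while subclass.equals(base) is false once the subclass adds state.

```java
import java.util.Objects;

// equals() written with instanceof: fine until subclassing adds state
class P {
    final int x;
    P(int x) { this.x = x; }
    @Override public boolean equals(Object o) {
        return o instanceof P && ((P) o).x == x;
    }
    @Override public int hashCode() { return x; }
}

class ColoredP extends P {
    final String color;
    ColoredP(int x, String color) { super(x); this.color = color; }
    @Override public boolean equals(Object o) {
        return o instanceof ColoredP && super.equals(o)
            && ((ColoredP) o).color.equals(color);
    }
    @Override public int hashCode() { return Objects.hash(x, color); }
}

public class EqualsSymmetry {
    public static void main(String[] args) {
        P p = new P(1);
        ColoredP c = new ColoredP(1, "red");
        // p.equals(c) ignores color and returns true;
        // c.equals(p) fails the instanceof ColoredP check and returns false
        System.out.println(p.equals(c) + " " + c.equals(p));
    }
}
```

The getClass()-based form suggested in the mail (`if (this.getClass() != o.getClass()) return false;`) restores symmetry, because both directions of the comparison then apply the same class test.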
[jira] [Commented] (SOLR-13963) JavaBinCodec has concurrent modification of CharrArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980860#comment-16980860 ] Colvin Cowie commented on SOLR-13963: - I've attached a patch that fixes it, and I've included a new test that reproduces the problem without the fix... I don't know enough about how Solr's tests are structured to know the best way to write a test for this, so I've just done something that works. But if there is a better way to do it / a different coding style etc., then obviously I'm open to it being done differently.
[jira] [Updated] (SOLR-13963) JavaBinCodec has concurrent modification of CharrArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colvin Cowie updated SOLR-13963: Status: Patch Available (was: Open)
[jira] [Updated] (SOLR-13963) JavaBinCodec has concurrent modification of CharrArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colvin Cowie updated SOLR-13963: Attachment: SOLR-13963.patch
[jira] [Resolved] (LUCENE-9056) Simplify BlockImpactsDocsEnum#advance
[ https://issues.apache.org/jira/browse/LUCENE-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-9056. -- Fix Version/s: 8.4 Resolution: Fixed > Simplify BlockImpactsDocsEnum#advance > - > > Key: LUCENE-9056 > URL: https://issues.apache.org/jira/browse/LUCENE-9056 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Fix For: 8.4 > > Time Spent: 20m > Remaining Estimate: 0h > > This is a follow-up to LUCENE-9027. Now that we compute the prefix sum in > #refillDocs, we can remove the check that we are on the last document of the > postings list so that we should return NO_MORE_DOCS.
[jira] [Commented] (LUCENE-9060) Fix the files generated python scripts in lucene/util/packed to not use RamUsageEstimator.NUM_BYTES_INT
[ https://issues.apache.org/jira/browse/LUCENE-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980838#comment-16980838 ] Adrien Grand commented on LUCENE-9060: -- +1 Does the script then recreate the file exactly as it is today? > Fix the files generated python scripts in lucene/util/packed to not use > RamUsageEstimator.NUM_BYTES_INT > --- > > Key: LUCENE-9060 > URL: https://issues.apache.org/jira/browse/LUCENE-9060 > Project: Lucene - Core > Issue Type: Bug >Reporter: Erick Erickson >Priority: Major > Attachments: LUCENE-9060.patch > > > RamUsageEstimator.NUM_BYTES_INT has been removed. But the Python code still > puts it in the generated code. Once you run "ant regenerate" (and I had to > run it with 24G!) you can no longer build. > We should verify that warnings against hand-editing end up in the generated > code, although they weren't hand-edited in this case. > It looks like the constants were removed as part of LUCENE-8745. > I think it's just a straightforward substitution of "Integer.BYTES".
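A minimal sketch of the proposed substitution; the JDK's own size constants are drop-in replacements for the removed RamUsageEstimator fields:

```java
public class ByteConstants {
    public static void main(String[] args) {
        // Where generated code used RamUsageEstimator.NUM_BYTES_INT (removed
        // in LUCENE-8745), the JDK boxed-type constants carry the same values:
        System.out.println(Integer.BYTES + " " + Long.BYTES); // prints: 4 8
    }
}
```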
[jira] [Commented] (LUCENE-8985) SynonymGraphFilter cannot handle input stream with tokens filtered.
[ https://issues.apache.org/jira/browse/LUCENE-8985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980835#comment-16980835 ] Jan Høydahl commented on LUCENE-8985: - I was hoping the new tests would describe the bug. The current SGF code does not handle holes in the token stream caused by removing tokens in e.g. StopFilter. When the holes are removed, a phrase query may get a wrong match. Simplified example:
Document: Please clean the screen
After stopfilter: Please clean * screen
Query: “clean the monitor”
After stopfilter: “clean * monitor”
After sgf: “clean screen|monitor”
No match
> SynonymGraphFilter cannot handle input stream with tokens filtered. > --- > > Key: LUCENE-8985 > URL: https://issues.apache.org/jira/browse/LUCENE-8985 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chongchen Chen >Assignee: Jan Høydahl >Priority: Major > Fix For: 8.3 > > Attachments: SGF_SF_interaction.patch.txt > > Time Spent: 2h 40m > Remaining Estimate: 0h > > [~janhoy] found the bug. > In an analyzer with e.g. stopFilter where tokens are removed from the stream > and replaced with a “hole”, synonymgraphfilter will not preserve these holes > but remove them, resulting in certain phrase queries failing.
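Jan's example can be modeled without the Lucene API. In a token stream each token carries a position increment; a stop filter preserves a hole by bumping the next token's increment to 2, and a phrase match requires the query's position gaps to line up with the indexed ones. A sketch (toy model, not SynonymGraphFilter itself):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HoleDemo {
    // turn (term, posInc) pairs into absolute positions, Lucene-style
    static Map<String, Integer> positions(String[][] tokens) {
        Map<String, Integer> pos = new LinkedHashMap<>();
        int p = -1;
        for (String[] t : tokens) {
            p += Integer.parseInt(t[1]);
            pos.put(t[0], p);
        }
        return pos;
    }

    public static void main(String[] args) {
        // indexed doc "please clean the screen": StopFilter removed "the"
        // but preserved the hole, so "screen" sits two positions after "clean"
        Map<String, Integer> doc = positions(new String[][] {
            {"please", "1"}, {"clean", "1"}, {"screen", "2"}});
        int indexedGap = doc.get("screen") - doc.get("clean"); // 2

        // query "clean the monitor": with the hole preserved, the phrase
        // expects the synonym at gap 2; if a filter collapses the hole,
        // the expected gap shrinks to 1
        int queryGapWithHole = 2;
        int queryGapCollapsed = 1;

        System.out.println((indexedGap == queryGapWithHole) + " "
            + (indexedGap == queryGapCollapsed)); // prints: true false
    }
}
```

With the hole preserved, the query's expected gap matches the indexed gap and the phrase matches; once the hole is collapsed the gaps disagree and the phrase misses, which is the "No match" outcome in the example above.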
[jira] [Commented] (SOLR-13963) JavaBinCodec has concurrent modification of CharrArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980830#comment-16980830 ] Ishan Chattopadhyaya commented on SOLR-13963: - Thanks Colvin! [~noble], FYI. > JavaBinCodec has concurrent modification of CharrArr resulting in corrupt > intranode updates > --- > > Key: SOLR-13963 > URL: https://issues.apache.org/jira/browse/SOLR-13963 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.3 >Reporter: Colvin Cowie >Priority: Major > > Discussed on the mailing list "Possible data corruption in JavaBinCodec in > Solr 8.3 during distributed update?" > > In summary, after moving to 8.3 we had a consistent (but non-deterministic) > set of failing tests where the data being sent in intranode requests was > _sometimes_ corrupted. For example if the well formed data was > _'fieldName':"this is a long string"_ > The error we saw from Solr might be that > unknown field _+'fieldNamis a long string"+_ > > The change that indirectly caused this issue to materialize was from > SOLR-13682 which meant that > org.apache.solr.common.SolrInputDocument.writeMap(EntryWriter) would call > org.apache.solr.common.SolrInputField.getValue() rather than > org.apache.solr.common.SolrInputField.getRawValue() as it had before. > > getRawValue for a string calls > org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr() which in this > context calls > org.apache.solr.common.util.JavaBinCodec.getStringProvider() > > JavaBinCodec has a CharArr, _arr_, which is modified in two different > locations, but only one of which is protected with a synchronized block > > getStringProvider() synchronizes on _arr_: > > [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L966] > > but _readStr() doesn't: > > [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L930] > > The two methods are called concurrently, but weren't prior to SOLR-13682. > > Adding a synchronized block into _readStr() around the modification of _arr_ > fixes the problem as far as I can see. > > Also, the problem does not seem to occur when using the dynamic schema mode > of autoCreateFields=true in the updateRequestProcessorChain.
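The fix described in the report — guarding readStr()'s mutation of the shared CharArr with the same monitor that getStringProvider() already uses — can be sketched with a stdlib stand-in. The class, method names, and StringBuilder-for-CharArr substitution below are illustrative, not the actual JavaBinCodec code:

```java
// Stdlib stand-in for the described fix (not the real JavaBinCodec): both
// methods that touch the shared buffer synchronize on it, so a concurrent
// caller can never observe a half-reset, half-filled buffer.
class SharedBufferDecoder {
    private final StringBuilder arr = new StringBuilder(); // stands in for CharArr

    // Analogous to readStr(): resets and refills the shared buffer.
    // Without the synchronized block, a concurrent reader in the style of
    // getStringProvider() could copy the buffer mid-mutation and emit a
    // corrupted string like the 'fieldNam...' example in the report.
    String readStr(String decoded) {
        synchronized (arr) {
            arr.setLength(0);
            arr.append(decoded);
            return arr.toString();
        }
    }

    // Analogous to getStringProvider(): already synchronized in the report.
    String currentValue() {
        synchronized (arr) {
            return arr.toString();
        }
    }
}
```

With both accessors holding the buffer's monitor, each readStr() call is atomic and returns exactly its input even under concurrent use.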
[jira] [Commented] (SOLR-13963) JavaBinCodec has concurrent modification of CharrArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980829#comment-16980829 ] Colvin Cowie commented on SOLR-13963: - SOLR-12983 is in 7.x and is when the getStringProvider() was added, so there is a potential bug in 7 as well. But maybe there's nothing hitting it. [https://github.com/apache/lucene-solr/commit/507a96e4181d4151d36332d46dd51e7ca5a09f90] Probably worth applying the fix to both anyway
[jira] [Updated] (SOLR-13963) JavaBinCodec has concurrent modification of CharrArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colvin Cowie updated SOLR-13963: Description: Discussed on the mailing list "Possible data corruption in JavaBinCodec in Solr 8.3 during distributed update?" In summary, after moving to 8.3 we had a consistent (but non-deterministic) set of failing tests where the data being sent in intranode requests was _sometimes_ corrupted. For example if the well formed data was _'fieldName':"this is a long string"_ The error we saw from Solr might be that unknown field _+'fieldNamis a long string"+_ The change that indirectly caused this issue to materialize was from SOLR-13682 which meant that org.apache.solr.common.SolrInputDocument.writeMap(EntryWriter) would call org.apache.solr.common.SolrInputField.getValue() rather than org.apache.solr.common.SolrInputField.getRawValue() as it had before. getRawValue for a string calls org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr() which in this context calls org.apache.solr.common.util.JavaBinCodec.getStringProvider() JavaBinCodec has a CharArr, _arr_, which is modified in two different locations, but only one of which is protected with a synchronized block getStringProvider() synchronizes on _arr_: [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L966] but _readStr() doesn't: [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L930] The two methods are called concurrently, but weren't prior to SOLR-13682. Adding a synchronized block into _readStr() around the modification of _arr_ fixes the problem as far as I can see. Also, the problem does not seem to occur when using the dynamic schema mode of autoCreateFields=true in the updateRequestProcessorChain.
[jira] [Created] (SOLR-13963) JavaBinCodec has concurrent modification of CharrArr resulting in corrupt intranode updates
Colvin Cowie created SOLR-13963: --- Summary: JavaBinCodec has concurrent modification of CharrArr resulting in corrupt intranode updates Key: SOLR-13963 URL: https://issues.apache.org/jira/browse/SOLR-13963 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Affects Versions: 8.3 Reporter: Colvin Cowie
[jira] [Comment Edited] (LUCENE-9004) Approximate nearest vector search
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980772#comment-16980772 ] Tomoko Uchida edited comment on LUCENE-9004 at 11/23/19 4:23 PM: - Just for status update: [my PoC branch|https://github.com/mocobeta/lucene-solr-mirror/tree/jira/LUCENE-9004-aknn] is still at a pretty early stage and works only on one segment, but now it can index and query arbitrary vectors by [this example code|https://gist.github.com/mocobeta/5c174ee9fc6408470057a9e7d2020c45]. The newly added KnnGraphQuery is an extension of the Query class, so it should be combinable with other queries, with some limitations, because a knn query by nature cannot score the entire dataset. Indexing performance is terrible for now (it takes a few minutes for hundreds of thousands of vectors w/ 100 dims on commodity PCs), but searching doesn't look too bad (~30 msec for the same dataset) thanks to the skip-list-like graph structure.

On my current branch I wrapped {{BinaryDocValues}} to store vector values. However, exposing random access capability for doc values (or their extensions) can be controversial, so I'd like to propose a new codec which combines 1. the HNSW graph and 2. the vectors (float arrays). The new format for each vector field would have three parts (in other words, three files in a segment). They would look like:

{code:java}
Meta data and index part:
+--------+-----------------------------------------+
| meta data                                        |
+--------+-----------------------------------------+
| doc id | offset to first friend list for the doc |
+--------+-----------------------------------------+
| doc id | offset to first friend list for the doc |
+--------+-----------------------------------------+
| ...                                              |
+--------------------------------------------------+

Graph data part:
+-------------------------+---------------------------+-----+-------------------------+
| friends list at layer N | friends list at layer N-1 | ... | friends list at layer 0 | <- friends lists for doc 0
+-------------------------+---------------------------+-----+-------------------------+
| friends list at layer N | friends list at layer N-1 | ... | friends list at layer 0 | <- friends lists for doc 1
+-------------------------+---------------------------+-----+-------------------------+
| ...                                                                                 | <- and so on
+-------------------------------------------------------------------------------------+

Vector data part:
+----------------------+
| encoded vector value | <- vector value for doc 0
+----------------------+
| encoded vector value | <- vector value for doc 1
+----------------------+
| ...                  | <- and so on
+----------------------+
{code}

- "meta data" includes: the number of dimensions, the distance function for similarity calculation, and other field-level meta data
- "doc id" is: the doc ids having a vector value on this field
- "friends list at layer N" is: a delta-encoded list of the target doc ids the doc is connected to at the Nth layer
- "encoded vector value" is: a fixed-length byte array; the offset of the value can be calculated on the fly (limitation: each document can have only one vector value for each vector field)

The graph data (friends lists) is relatively small, so we could keep all of it on the Java heap for fast retrieval (though some off-heap strategy might be required for very large graphs). The vector data (vector values) is large, and only a small fraction of it is needed when searching, so it should be kept on disk and accessed in some on-demand style. Feedback is welcomed. And I have a question about introducing new formats - is there a way to inject an XXXFormat into the indexing chain so that we can add this feature in without any change to {{lucene-core}}?
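The proposed format leans on two cheap on-the-fly computations: with fixed-length encoded vectors, each document's byte offset in the vector data part is a single multiplication, and the delta-encoded friends lists decode back to absolute doc ids with a running sum. A stdlib-only sketch (all names hypothetical; the real codec would read these values from the segment files):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the two on-the-fly computations the proposed format relies on
// (hypothetical names; not actual Lucene codec code).
class VectorFormatMath {
    // Fixed-length encoding: the byte offset of the ord-th vector value in
    // the vector data part is a simple multiplication, so values can stay
    // on disk and be fetched on demand without a per-doc index entry.
    static long vectorOffset(int ord, int numDims) {
        return (long) ord * numDims * Float.BYTES;
    }

    // Delta-encoded friends list: each stored value is the gap from the
    // previous target doc id; decoding is a running sum.
    static List<Integer> decodeFriends(int[] deltas) {
        List<Integer> docIds = new ArrayList<>();
        int doc = 0;
        for (int delta : deltas) {
            doc += delta;
            docIds.add(doc);
        }
        return docIds;
    }
}
```

For example, with 100-dimensional float vectors the fourth value (ord 3) starts at byte 3 * 100 * 4 = 1200, which is what makes the disk-resident, on-demand access pattern cheap.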
[GitHub] [lucene-solr] freedev commented on a change in pull request #996: SOLR-13863: Added spayload query function to read and sort string pay…
freedev commented on a change in pull request #996: SOLR-13863: Added spayload query function to read and sort string pay… URL: https://github.com/apache/lucene-solr/pull/996#discussion_r349876782 ## File path: solr/core/src/java/org/apache/solr/search/StringPayloadValueSource.java ## @@ -0,0 +1,306 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.search; + +import java.io.IOException; +import java.util.Map; + +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.index.PostingsEnum; +import org.apache.lucene.index.Terms; +import org.apache.lucene.index.TermsEnum; +import org.apache.lucene.queries.function.FunctionValues; +import org.apache.lucene.queries.function.ValueSource; +import org.apache.lucene.queries.function.docvalues.StrDocValues; +import org.apache.lucene.search.DocIdSetIterator; +import org.apache.lucene.search.FieldComparator; +import org.apache.lucene.search.FieldComparatorSource; +import org.apache.lucene.search.IndexSearcher; +import org.apache.lucene.search.SimpleFieldComparator; +import org.apache.lucene.search.SortField; +import org.apache.lucene.util.BytesRef; + +public class StringPayloadValueSource extends ValueSource { Review comment: Hi erik, thanks for the suggestion. Surprisingly, even if I've subscribed for notification I haven't received any message for your comment. I'll try to refactor my solution adding this new behaviour to payload() function. Thanks again for your time. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13952) Separate out Gradle-specific code from other (mostly test) changes and commit separately
[ https://issues.apache.org/jira/browse/SOLR-13952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980773#comment-16980773 ] David Smiley commented on SOLR-13952: - Then file an issue RE SuppressWarnings with 200 files and commit that. That's a valid subject/theme that could have a commit message that makes sense. I just don't want a commit/issue that's basically "Bunch of random stuff Mark did; he knows best" > Separate out Gradle-specific code from other (mostly test) changes and commit > separately > > > Key: SOLR-13952 > URL: https://issues.apache.org/jira/browse/SOLR-13952 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: fordavid.patch > > > The gradle_8 branch has many changes unrelated to gradle. It would be much > easier to work on the gradle parts if these were separated. So here's my plan: > - establish a branch to use for the non-gradle parts of the gradle_8 branch > and commit separately. For a first cut, I'll make all the changes I'm > confident of, and mark the others with nocommits so we can iterate and decide > when to merge to master and 8x. > - create a "gradle_9" branch that hosts only the gradle changes for us all to > iterate on. > I hope to have a preliminary cut at this over the weekend.
[jira] [Commented] (LUCENE-9004) Approximate nearest vector search
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980772#comment-16980772 ] Tomoko Uchida commented on LUCENE-9004: --- Just for status update: [my PoC branch|https://github.com/mocobeta/lucene-solr-mirror/tree/jira/LUCENE-9004-aknn] is still at a pretty early stage and works only on one segment, but now it can index and query arbitrary vectors by [this example code|https://gist.github.com/mocobeta/5c174ee9fc6408470057a9e7d2020c45]. > Approximate nearest vector search > - > > Key: LUCENE-9004 > URL: https://issues.apache.org/jira/browse/LUCENE-9004 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael Sokolov >Priority: Major > Attachments: hnsw_layered_graph.png > > > "Semantic" search based on machine-learned vector "embeddings" representing > terms, queries and documents is becoming a must-have feature for a modern > search engine. SOLR-12890 is exploring various approaches to this, including > providing vector-based scoring functions. This is a spinoff issue from that. > The idea here is to explore approximate nearest-neighbor search. Researchers > have found an approach based on navigating a graph that partially encodes the > nearest neighbor relation at multiple scales can provide accuracy > 95% (as > compared to exact nearest neighbor calculations) at a reasonable cost. This > issue will explore implementing HNSW (hierarchical navigable
[jira] [Commented] (SOLR-13952) Separate out Gradle-specific code from other (mostly test) changes and commit separately
[ https://issues.apache.org/jira/browse/SOLR-13952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980771#comment-16980771 ] Erick Erickson commented on SOLR-13952: --- Thanks for letting me know about XmlOffsetCorrector. re: one big commit. There are over 200 (?) separate files affected, the vast majority of them follow the pattern: {code} @SuppressWarning blah blah blah {code} Mostly for deprecations and the like and another group for thread leaks from other packages that we don't control and shouldn't fail suites because of thread leaks in them. There are maybe 3-4 about matters like this. And this one (XmlOffsetCorrector) doesn't count since I'm going to revert it on your advice. I'm not willing to create a huge number of tickets for this. Look at the bright side, at least the gradle branch won't have them (soon I hope). The goal here is to peel this out of the Gradle build exactly to introduce _some_ separation of changes in the gradle branch without losing Mark's efforts at test improvement (or, in many cases compiler warnings). So what do you suggest here? I'm not going to do a lot of busy work to address this.
[jira] [Commented] (LUCENE-9061) Async channel tests may leak internal java threads
[ https://issues.apache.org/jira/browse/LUCENE-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980766#comment-16980766 ] ASF subversion and git services commented on LUCENE-9061: - Commit fad75cf98dc0e3a24fad259f9cea18b3d8bf9a05 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fad75cf ] LUCENE-9061: Use an explicit executor service in async channel tests, otherwise they leak internal JVM threads. > Async channel tests may leak internal java threads > -- > > Key: LUCENE-9061 > URL: https://issues.apache.org/jira/browse/LUCENE-9061 > Project: Lucene - Core > Issue Type: Test >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial >
[jira] [Updated] (LUCENE-9061) Async channel tests may leak internal java threads
[ https://issues.apache.org/jira/browse/LUCENE-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-9061: Fix Version/s: master (9.0) > Async channel tests may leak internal java threads > -- > > Key: LUCENE-9061 > URL: https://issues.apache.org/jira/browse/LUCENE-9061 > Project: Lucene - Core > Issue Type: Test >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: master (9.0) > >
[jira] [Resolved] (LUCENE-9061) Async channel tests may leak internal java threads
[ https://issues.apache.org/jira/browse/LUCENE-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-9061. - Resolution: Fixed > Async channel tests may leak internal java threads > -- > > Key: LUCENE-9061 > URL: https://issues.apache.org/jira/browse/LUCENE-9061 > Project: Lucene - Core > Issue Type: Test >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: master (9.0) > >
[jira] [Created] (LUCENE-9061) Async channel tests may leak internal java threads
Dawid Weiss created LUCENE-9061: --- Summary: Async channel tests may leak internal java threads Key: LUCENE-9061 URL: https://issues.apache.org/jira/browse/LUCENE-9061 Project: Lucene - Core Issue Type: Test Reporter: Dawid Weiss Assignee: Dawid Weiss
[jira] [Commented] (LUCENE-8985) SynonymGraphFilter cannot handle input stream with tokens filtered.
[ https://issues.apache.org/jira/browse/LUCENE-8985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980750#comment-16980750 ] Michael Sokolov commented on LUCENE-8985: - I looked at the patch - it seems quite a significant change: maybe a good one?! I'm not entirely clear what the goal is though - the bug description in this issue is pretty light on details. Can we explain here what the current behavior of PhraseQuery and other positional queries is w.r.t holes more generally, ignoring synonyms, and then how it currently works (is broken) in the presence of synonyms? I don't have enough context to review this. I'm willing to help out, but I'd need more to go on, and I can't really commit to the 8.4 release schedule, sorry [~janhoy] > SynonymGraphFilter cannot handle input stream with tokens filtered. > --- > > Key: LUCENE-8985 > URL: https://issues.apache.org/jira/browse/LUCENE-8985 > Project: Lucene - Core > Issue Type: Bug >Reporter: Chongchen Chen >Assignee: Jan Høydahl >Priority: Major > Fix For: 8.3 > > Attachments: SGF_SF_interaction.patch.txt > > Time Spent: 2h 40m > Remaining Estimate: 0h > > [~janhoy] find the bug. > In an analyzer with e.g. stopFilter where tokens are removed from the stream > and replaced with a “hole”, synonymgraphfilter will not preserve these holes > but remove them, resulting in certain phrase queries failing.
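To make the "holes" question concrete without dragging in Lucene APIs: a hole is simply a token arriving with a position increment greater than one. A dependency-free sketch (the Token type and method names here are invented for illustration, not Lucene's TokenStream API) of a stop filter that preserves holes by folding each removal into the next token's increment:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class HoleSketch {

    // Minimal stand-in for an analysis token: term text plus the
    // position increment relative to the previously emitted token.
    record Token(String term, int posInc) {}

    // Removes stopwords but remembers each removal as an extra increment
    // on the next surviving token, leaving a "hole" in the positions.
    static List<Token> removeWithHoles(List<String> terms, Set<String> stop) {
        List<Token> out = new ArrayList<>();
        int pending = 1;
        for (String t : terms) {
            if (stop.contains(t)) {
                pending++;           // remember the gap
            } else {
                out.add(new Token(t, pending));
                pending = 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "over the lazy dog" with "the" removed: "lazy" arrives with
        // posInc=2, so an exact phrase query "over lazy" will not match.
        // Losing that increment downstream is the behavior this bug describes.
        System.out.println(removeWithHoles(
                List.of("over", "the", "lazy", "dog"), Set.of("the")));
    }
}
```

Positional queries see indexed positions 1, 3, 4 for over/lazy/dog; if a later filter collapses the increment back to 1, phrase matching silently changes, which is the failure mode reported here.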
[jira] [Commented] (SOLR-13961) Unsetting Nested Documents using Atomic Update leads to SolrException: undefined field
[ https://issues.apache.org/jira/browse/SOLR-13961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980749#comment-16980749 ] Thomas Wöckinger commented on SOLR-13961: - [~dsmiley]: Changed it already, and resolved the discussion. All tests are passing. > Unsetting Nested Documents using Atomic Update leads to SolrException: > undefined field > -- > > Key: SOLR-13961 > URL: https://issues.apache.org/jira/browse/SOLR-13961 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Tests, UpdateRequestProcessors >Affects Versions: master (9.0), 8.3, 8.4 >Reporter: Thomas Wöckinger >Assignee: David Smiley >Priority: Critical > Labels: easyfix > Time Spent: 0.5h > Remaining Estimate: 0h > > Using null or empty collection to unset nested documents (as suggested by > documentation) leads to SolrException: undefined field ... .
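For context, the request shape that triggers the reported "undefined field" error, reconstructed from the test in the linked PR (the field names {{id}} and {{child1}} come from that test), is an atomic update that sets the nested-document field to null:

```json
[
  {
    "id": "1",
    "child1": { "set": null }
  }
]
```

Posted to the collection's stock /update handler with Content-Type application/json, this is the documented way to unset a nested child field; per the issue, it currently raises a SolrException instead of removing the children of document 1.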
[GitHub] [lucene-solr] thomaswoeckinger commented on a change in pull request #1030: SOLR-13961: Fix Atomic Update unset nested documents
thomaswoeckinger commented on a change in pull request #1030: SOLR-13961: Fix Atomic Update unset nested documents URL: https://github.com/apache/lucene-solr/pull/1030#discussion_r349867533 ## File path: solr/core/src/test/org/apache/solr/update/processor/NestedAtomicUpdateTest.java ## @@ -642,6 +642,118 @@ public void testBlockAtomicRemove() throws Exception { ); } + @Test + public void testBlockAtomicSetToNull() throws Exception { +SolrInputDocument doc = sdoc("id", "1", +"cat_ss", new String[] {"aaa", "ccc"}, +"child1", sdocs(sdoc("id", "2", "cat_ss", "child"), sdoc("id", "3", "cat_ss", "child"))); +assertU(adoc(doc)); + +BytesRef rootDocId = new BytesRef("1"); +SolrCore core = h.getCore(); +SolrInputDocument block = RealTimeGetComponent.getInputDocument(core, rootDocId, +RealTimeGetComponent.Resolution.ROOT_WITH_CHILDREN); +// assert block doc has child docs +assertTrue(block.containsKey("child1")); + +assertJQ(req("q", "id:1"), "/response/numFound==0"); + +// commit the changes +assertU(commit()); + +SolrInputDocument committedBlock = RealTimeGetComponent.getInputDocument(core, rootDocId, +RealTimeGetComponent.Resolution.ROOT_WITH_CHILDREN); +BytesRef childDocId = new BytesRef("2"); +// ensure the whole block is returned when resolveBlock is true and id of a child doc is provided +assertEquals(committedBlock.toString(), RealTimeGetComponent +.getInputDocument(core, childDocId, RealTimeGetComponent.Resolution.ROOT_WITH_CHILDREN).toString()); + +assertJQ(req("q", "id:1"), "/response/numFound==1"); + +assertJQ(req("qt", "/get", "id", "1", "fl", "id, cat_ss, child1, [child]"), "=={\"doc\":{'id':\"1\"" + +", cat_ss:[\"aaa\",\"ccc\"], child1:[{\"id\":\"2\",\"cat_ss\":[\"child\"]}, {\"id\":\"3\",\"cat_ss\":[\"child\"]}]}}"); + +assertU(commit()); + +assertJQ(req("qt", "/get", "id", "1", "fl", "id, cat_ss, child1, [child]"), "=={\"doc\":{'id':\"1\"" + +", cat_ss:[\"aaa\",\"ccc\"], child1:[{\"id\":\"2\",\"cat_ss\":[\"child\"]}, 
{\"id\":\"3\",\"cat_ss\":[\"child\"]}]}}"); + +doc = sdoc("id", "1", "child1", Collections.singletonMap("set", null)); +addAndGetVersion(doc, params("wt", "json")); + +assertJQ(req("qt", "/get", "id", "1", "fl", "id, cat_ss, child1, [child]"), "=={\"doc\":{'id':\"1\", cat_ss:[\"aaa\",\"ccc\"]}}"); + +assertU(commit()); + +// a cut-n-paste of the first big query, but this time it will be retrieved from the index rather than the +// transaction log +// this requires ChildDocTransformer to get the whole block, since the document is retrieved using an index lookup +assertJQ(req("qt", "/get", "id", "1", "fl", "id, cat_ss, child1, [child]"), "=={'doc':{'id':'1', cat_ss:[\"aaa\",\"ccc\"]}}"); + +// ensure the whole block has been committed correctly to the index. +assertJQ(req("q", "id:1", "fl", "*, [child]"), +"/response/numFound==1", +"/response/docs/[0]/id=='1'", +"/response/docs/[0]/cat_ss/[0]==\"aaa\"", +"/response/docs/[0]/cat_ss/[1]==\"ccc\""); + } + + @Test + public void testBlockAtomicSetToEmpty() throws Exception { Review comment: You are right, was a long day, but yes changed it . This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (LUCENE-9042) Refactor TopGroups.merge tests
[ https://issues.apache.org/jira/browse/LUCENE-9042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980725#comment-16980725 ] Lucene/Solr QA commented on LUCENE-9042: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 20s{color} | {color:red} lucene_grouping generated 3 new + 108 unchanged - 0 fixed = 111 total (was 108) {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s{color} | {color:green} grouping in the patch passed. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 4m 42s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | LUCENE-9042 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12986537/LUCENE-9042.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / 312431b | | ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 | | Default Java | LTS | | javac | https://builds.apache.org/job/PreCommit-LUCENE-Build/237/artifact/out/diff-compile-javac-lucene_grouping.txt | | Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/237/testReport/ | | modules | C: lucene/grouping U: lucene/grouping | | Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/237/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > Refactor TopGroups.merge tests > -- > > Key: LUCENE-9042 > URL: https://issues.apache.org/jira/browse/LUCENE-9042 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Diego Ceccarelli >Priority: Minor > Attachments: LUCENE-9042.patch > > > This task proposes a refactoring of the test coverage for the > {{TopGroups.merge}} method implemented in LUCENE-9010. For now it will cover > only 3 main cases. > 1. Merging to empty TopGroups > 2. Merging a TopGroups with scores and a TopGroups without scores (currently > broken because of LUCENE-8996 bug) > 3. Merging two TopGroups with scores. > I'm planning to increase the coverage testing also invalid inputs but I would > do that in a separate PR to keep the code readable. 