Re: [VOTE] Release Lucene/Solr 8.8.0 RC2

2021-01-27 Thread Anshum Gupta
Thanks for handling the release, Noble!

+1 (binding)

SUCCESS! [0:56:12.016387]

Ran the smoke tester, a demo app, and checked the change log. All of that
looks good.

On Mon, Jan 25, 2021 at 2:22 AM Noble Paul  wrote:

> Please vote for release candidate 2 for Lucene/Solr 8.8.0
>
> The artifacts can be downloaded from:
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.8.0-RC2-revb10659f0fc18b58b90929cfdadde94544d202c4a/
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.8.0-RC2-revb10659f0fc18b58b90929cfdadde94544d202c4a/
>
> The vote will be open for at least 72 hours
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
> --
> -
> Noble Paul
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-- 
Anshum Gupta


Re: [VOTE] Release Lucene/Solr 8.8.0 RC2

2021-01-27 Thread Atri Sharma
+1 (binding)

SUCCESS! [1:24:26.38423]

On Mon, Jan 25, 2021 at 3:52 PM Noble Paul  wrote:
>
> Please vote for release candidate 2 for Lucene/Solr 8.8.0
>
> The artifacts can be downloaded from:
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.8.0-RC2-revb10659f0fc18b58b90929cfdadde94544d202c4a/
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.8.0-RC2-revb10659f0fc18b58b90929cfdadde94544d202c4a/
>
>
>
> The vote will be open for at least 72 hours
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
> --
> -
> Noble Paul
>
>


-- 
Regards,

Atri
Apache Concerted




Re: [VOTE] Release Lucene/Solr 8.8.0 RC2

2021-01-27 Thread Haoyu Zhai
+1 (non-binding)

I tested the Lucene part of RC1 on our service; since that part has not
changed, I am still +1.

Patrick

On Wed, Jan 27, 2021 at 10:26 AM Namgyu Kim wrote:

> +1 (binding)
>
> SUCCESS! [1:30:27.376324]
>
> On Tue, Jan 26, 2021 at 10:19 PM Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> +1 (binding)
>>
>>
>> SUCCESS! [0:43:40.201461]
>>
>>
>> However, the first time I ran smoke tester, it failed with this:
>>
>>    [junit4] Tests with failures [seed: D3F97A1F3602195A]:
>>    [junit4]   - org.apache.solr.cloud.LeaderTragicEventTest.testLeaderFailsOver
>>
>>    [junit4]   2> NOTE: reproduce with: ant test -Dtestcase=LeaderTragicEventTest -Dtests.method=testLeaderFailsOver -Dtests.seed=D3F97A1F3602195A -Dtests.locale=ar-LB -Dtests.timezone=SystemV/MST7MDT -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
>>    [junit4] ERROR   10.9s J1 | LeaderTragicEventTest.testLeaderFailsOver <<<
>>    [junit4]    > Throwable #1: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at https://127.0.0.1:33003/solr: Underlying core creation failed while creating collection: testLeaderFailsOver
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:681)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:369)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:297)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1171)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:934)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:866)
>>    [junit4]    >    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:214)
>>    [junit4]    >    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:231)
>>    [junit4]    >    at org.apache.solr.cloud.LeaderTragicEventTest.testLeaderFailsOver(LeaderTragicEventTest.java:80)
>>    [junit4]    >    at java.lang.Thread.run(Thread.java:748)
>>    [junit4]    > Throwable #2: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at https://127.0.0.1:33003/solr: Could not find collection : testLeaderFailsOver
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:681)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:369)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:297)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1171)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:934)
>>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:866)
>>    [junit4]    >    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:214)
>>    [junit4]    >    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:231)
>>    [junit4]    >    at org.apache.solr.cloud.LeaderTragicEventTest.tearDown(LeaderTragicEventTest.java:73)
>>    [junit4]    >    at java.lang.Thread.run(Thread.java:748)
>>
>>
>> I guess it was a transient failure -- I re-ran smoke tester and it passed
>> the 2nd time.  Is this a known Bad Apple test?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Tue, Jan 26, 2021 at 4:53 AM Ignacio Vera  wrote:
>>
>>> +1 (binding)
>>>
>>> SUCCESS! [0:53:01.546134]
>>>
>>> On Tue, Jan 26, 2021 at 1:51 AM Tomás Fernández Löbbe <
>>> tomasflo...@gmail.com> wrote:
>>>
>>>> Thanks Noble! And thanks for fixing that concurrency issue, I'd hit it
>>>> but didn't have time to investigate it.
>>>>
>>>> +1
>>>> SUCCESS! [0:58:32.036482]
>>>>
>>>> On Mon, Jan 25, 2021 at 10:19 AM Timothy Potter
>>>> wrote:
>>>>
>>>>> Thanks Noble!
>>>>>
>>>>> +1 SUCCESS! [1:24:28.212370] (my internet is super slow today)
>>>>>
>>>>> Re-ran all the Solr

Re: Is it Time to Deprecate the Legacy Facets API

2021-01-27 Thread Joel Bernstein
It's worth investigating deprecating the stats component also. I believe
JSON facets covers that functionality as well. It will be painful for users
though to switch over unfortunately.
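
As a rough illustration of what such a switch might look like for a user, here is a minimal sketch that builds a stats component request and a roughly comparable JSON Facet request body. The class and method names are hypothetical, the field name is whatever the caller passes in, and only the min, max, sum and avg aggregations are shown; this is not a parity claim for everything the stats component returns.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: names are made up for illustration only.
public class StatsToJsonFacetSketch {

    // Classic stats component parameters for one numeric field.
    static String statsParams(String field) {
        return "stats=true&stats.field=" + field;
    }

    // A JSON Request API body asking for a few of the same numbers
    // through JSON Facet aggregation functions.
    static String jsonFacetBody(String field) {
        Map<String, String> aggs = new LinkedHashMap<>();
        for (String fn : new String[] {"min", "max", "sum", "avg"}) {
            aggs.put(fn + "_" + field, fn + "(" + field + ")");
        }
        StringJoiner body = new StringJoiner(",", "{\"query\":\"*:*\",\"facet\":{", "}}");
        for (Map.Entry<String, String> e : aggs.entrySet()) {
            body.add("\"" + e.getKey() + "\":\"" + e.getValue() + "\"");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        System.out.println(statsParams("price"));
        System.out.println(jsonFacetBody("price"));
    }
}
```

The second form is more verbose, which is part of why the switch would be painful, but each aggregation is named and requested explicitly.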


Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Jan 22, 2021 at 1:14 PM Jason Gerlowski 
wrote:

> Personally I'd love to see us stop maintaining the duplicated code of
> the underlying implementations.  I wouldn't mind losing the legacy
> syntax as well - I'll take a clear, verbose API over a less-clear,
> concise one any day.  But I'm probably a minority there.
>
> Either way I agree with Michael when he said above that the first step
> would have to be a parity investigation for features and performance.
>
> Best,
>
> Jason
>
> On Fri, Jan 22, 2021 at 10:05 AM Michael Gibney
>  wrote:
> >
> > I agree it would make long-term sense to consolidate the backend
> implementation. I think leaving the "classic" user-facing facet API (with
> JSON Facet module as a backend) would be a good idea. Either way, I think a
> first step would be checking for parity between existing backend
> implementations -- possibly in terms of features [1], but certainly in
> terms of performance for common use cases [2].
> >
> > I think removal of the "classic" user-facing API would cause a lot of
> consternation in the user community. I can even see a
> non-backward-compatibility argument for preserving the "classic"
> user-facing API: it's simpler for simple use cases. _If_ the ultimate goal
> is removal of the "classic" user-facing API (not presuming that it is),
> that approach could be facilitated in the short term by enticing users
> towards "JSON Facet" API ... basically with a "feature freeze" on the
> legacy implementation. No new features [3], no new optimizations [4] for
> "classic"; concentrate such efforts on JSON Facet. This seems to already be
> the de facto case, but it could be a more intentional decision -- e.g. in
> [3] it's straightforward to extend the proposed "facet cache" to the
> "classic" impl ... but I could see an argument for intentionally not doing
> so.
> >
> > Robert, I think your concerns about UninvertedField could be addressed
> by the `uninvertible="false"` property (currently defaults to "true" for
> backward compatibility iiuc; but could default to "false", or at least
> provide the ability to set the default for all fields to "false" at node
> level solr.xml? -- I know I've wished for the latter!). Also fwiw I'm not
> aware of any JSON Facet processors that work with string values in RAM ...
> I do think all JSON Facet processors use OrdinalMap now, where relevant.
> >
> > [1] https://issues.apache.org/jira/browse/SOLR-14921
> > [2] https://issues.apache.org/jira/browse/SOLR-14764
> > [3] https://issues.apache.org/jira/browse/SOLR-13807
> > [4] https://issues.apache.org/jira/browse/SOLR-10732
> >
> > On Fri, Jan 22, 2021 at 12:46 AM Robert Muir  wrote:
> >>
> >> Do these two options conflate concerns of input format vs. actual
> >> algorithm? That was always my disappointment.
> >>
> >> I feel like the java apis are off here at the lower level, and it
> >> hurts the user.
> >> I don't talk about the input format from the user, instead I mean the
> >> execution of the faceting query.
> >>
> >> IMO: building top-level caches (e.g. uninvertedfield) or
> >> on-the-fly-caches (e.g. fieldcache) is totally trappy already.
> >> But with the uninvertedfield of json facets it does its own thing,
> >> even if you went thru the trouble to enable docvalues at index time:
> >> that's sad.
> >>
> >> the code by default should not give the user jvm
> >> heap/garbage-collector hell. If you want to do that to yourself, for a
> >> totally static index, IMO that should be opt-in.
> >>
> >> But for the record, it is no longer just two shitty choices like
> >> "top-level vs per-segment". There are different field types, e.g.
> >> numeric types where the per-segment approach works efficiently.
> >> Then you have the strings, but there is a newish middle ground for
> >> Strings: OrdinalMap (lucene Multi* interfaces do it) which builds
> >> top-level integers structures to speed up string-faceting, but doesnt
> >> need *string values* in ram.
> >> It is just integers and mostly compresses as deltas. Adrien compresses
> >> the shit out of it.
> >>
> >> So I'd hate for the user to lose the option here of using docvalues to
> >> keep faceting out of heap memory, which should not be hassling them
> >> already in 2021.
> >> Maybe better to refactor the code such that all these concerns aren't
> >> unexpectedly tied together.
> >>
> >> On Thu, Jan 21, 2021 at 10:08 PM David Smiley 
> wrote:
> >> >
> >> > There's a JIRA issue about this from 5 years ago:
> https://issues.apache.org/jira/browse/SOLR-7296
> >> > I don't recall seeing any resistance to the idea of having the JSON
> Faceting module act as a back-end to the front-end (API surface) of Solr's
> common/classic/original/whatever faceting API.  I don't think that simple
> 

Re: [VOTE] Release Lucene/Solr 8.8.0 RC2

2021-01-27 Thread Namgyu Kim
+1 (binding)

SUCCESS! [1:30:27.376324]

On Tue, Jan 26, 2021 at 10:19 PM Michael McCandless <
luc...@mikemccandless.com> wrote:

> +1 (binding)
>
>
> SUCCESS! [0:43:40.201461]
>
>
> However, the first time I ran smoke tester, it failed with this:
>
>    [junit4] Tests with failures [seed: D3F97A1F3602195A]:
>    [junit4]   - org.apache.solr.cloud.LeaderTragicEventTest.testLeaderFailsOver
>
>    [junit4]   2> NOTE: reproduce with: ant test -Dtestcase=LeaderTragicEventTest -Dtests.method=testLeaderFailsOver -Dtests.seed=D3F97A1F3602195A -Dtests.locale=ar-LB -Dtests.timezone=SystemV/MST7MDT -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
>    [junit4] ERROR   10.9s J1 | LeaderTragicEventTest.testLeaderFailsOver <<<
>    [junit4]    > Throwable #1: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at https://127.0.0.1:33003/solr: Underlying core creation failed while creating collection: testLeaderFailsOver
>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:681)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:369)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:297)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1171)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:934)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:866)
>    [junit4]    >    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:214)
>    [junit4]    >    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:231)
>    [junit4]    >    at org.apache.solr.cloud.LeaderTragicEventTest.testLeaderFailsOver(LeaderTragicEventTest.java:80)
>    [junit4]    >    at java.lang.Thread.run(Thread.java:748)
>    [junit4]    > Throwable #2: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at https://127.0.0.1:33003/solr: Could not find collection : testLeaderFailsOver
>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:681)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:369)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:297)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1171)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:934)
>    [junit4]    >    at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:866)
>    [junit4]    >    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:214)
>    [junit4]    >    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:231)
>    [junit4]    >    at org.apache.solr.cloud.LeaderTragicEventTest.tearDown(LeaderTragicEventTest.java:73)
>    [junit4]    >    at java.lang.Thread.run(Thread.java:748)
>
>
> I guess it was a transient failure -- I re-ran smoke tester and it passed
> the 2nd time.  Is this a known Bad Apple test?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Jan 26, 2021 at 4:53 AM Ignacio Vera  wrote:
>
>> +1 (binding)
>>
>> SUCCESS! [0:53:01.546134]
>>
>> On Tue, Jan 26, 2021 at 1:51 AM Tomás Fernández Löbbe <
>> tomasflo...@gmail.com> wrote:
>>
>>> Thanks Noble! And thanks for fixing that concurrency issue, I'd hit it
>>> but didn't have time to investigate it.
>>>
>>> +1
>>> SUCCESS! [0:58:32.036482]
>>>
>>> On Mon, Jan 25, 2021 at 10:19 AM Timothy Potter 
>>> wrote:
>>>
>>>> Thanks Noble!
>>>>
>>>> +1 SUCCESS! [1:24:28.212370] (my internet is super slow today)
>>>>
>>>> Re-ran all the Solr operator tests and verified the Cloud graph UI
>>>> renders correctly now.
>>>>
>>>> On Mon, Jan 25, 2021 at 3:22 AM Noble Paul
>>>> wrote:
>>>>
>>>>> Please vote for release candidate 2 for Lucene/Solr 8.8.0
>>>>>
>>>>> The artifacts can be downloaded from:
>>>>>

Solr 7.7.2 IndexWriter is closed as a result of NullPointerException at SchemaSimilarityFactory.SchemaSimilarity.get

2021-01-27 Thread evgeniak
Issue:
The IndexWriter of a specific core is closed as a result of a
NullPointerException at "SchemaSimilarityFactory.SchemaSimilarity.get" when
updating one of the documents.
After this exception, Solr stops indexing further documents into this
core, and we have to restart the Solr process in order to reopen
the IndexWriter of this core.

Cause:
A race condition between multiple threads that read and write the same
object.
In this case, the object is *volatile SolrCore core* in SchemaSimilarityFactory.
One thread calls inform() of SchemaSimilarityFactory with a new SolrCore
object that is still being initialized (write operation).
At the same time, another thread calls core.getLatestSchema() (in
SchemaSimilarityFactory.SchemaSimilarity.get) on this new SolrCore object,
which has not been fully initialized yet (read operation).

When inform() of SchemaSimilarityFactory is called (write operation):
•   During the creation of a new core
•   During uploading of a transient core to the transient cache
•   During loading of a non-transient core into memory (during Solr startup by
the coreLoadExecutor thread)

When SchemaSimilarityFactory.SchemaSimilarity.get() is called
(read operation):
•   During indexing of a document

This looks like a bug in SolrCore.setLatestSchema():
inform() is called before the schema is initialized.
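
The ordering problem described above can be sketched in isolation. The classes below are hypothetical stand-ins (Core, Factory, Schema), not the actual Solr classes, and the "concurrent reader" is run inline at the point where another thread could be scheduled, so the outcome is deterministic. The sketch assumes the reader dereferences the schema of whatever core the factory currently points at.

```java
// Hypothetical model of the publication order; not Solr code.
public class PublishOrderSketch {

    static final class Schema {
        final String name;
        Schema(String name) { this.name = name; }
    }

    static final class Factory {
        private volatile Core core;
        void inform(Core c) { this.core = c; }
        // Throws NullPointerException if the core's schema is not set yet.
        String get() { return core.getLatestSchema().name; }
    }

    static final class Core {
        private volatile Schema schema;
        Schema getLatestSchema() { return schema; }

        // Order reported in the post: the factory sees the core first.
        void setSchemaInformFirst(Schema s, Factory f, Runnable reader) {
            f.inform(this);
            reader.run(); // stands in for an indexing thread running here
            this.schema = s;
        }

        // Alternative order: the schema is visible before the core is published.
        void setSchemaAssignFirst(Schema s, Factory f, Runnable reader) {
            this.schema = s;
            f.inform(this);
            reader.run();
        }
    }

    // Returns true if the reader hit a NullPointerException.
    static boolean readerHitsNpe(boolean informFirst) {
        Factory factory = new Factory();
        Core core = new Core();
        boolean[] failed = {false};
        Runnable reader = () -> {
            try {
                factory.get();
            } catch (NullPointerException e) {
                failed[0] = true;
            }
        };
        if (informFirst) {
            core.setSchemaInformFirst(new Schema("s1"), factory, reader);
        } else {
            core.setSchemaAssignFirst(new Schema("s1"), factory, reader);
        }
        return failed[0];
    }

    public static void main(String[] args) {
        System.out.println("inform first: NPE=" + readerHitsNpe(true));
        System.out.println("assign first: NPE=" + readerHitsNpe(false));
    }
}
```

Whether simply swapping the two statements is an acceptable fix in Solr itself is a separate question, since other code may rely on the current order.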

Stack of thread that performs inform:
at org.apache.solr.search.similarities.SchemaSimilarityFactory.inform(SchemaSimilarityFactory.java:97)
at org.apache.solr.core.SolrCore.setLatestSchema(SolrCore.java:319)
at org.apache.solr.core.SolrCore.initSchema(SolrCore.java:1139)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:947)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:870)
at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1189)
at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:1721)
at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:249)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:469)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:395)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:341)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)


Stack of thread that performs get:
org.apache.solr.common.SolrException: Exception writing document id *** to the index; possible analysis error.
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:250)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:1002)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:1233)
at org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$versionAdd$2(DistributedUpdateProcessor.java:1082)
at org.apache.solr.update.processor.DistributedUpdateProcessor$$Lambda$344/A4008F60.apply(Unknown Source)
at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:50)
at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1082)
at

Re: Merging segment parts concurrently (SegmentMerger)

2021-01-27 Thread Michael McCandless
LOL Mike did use http://jirasearch.mikemccandless.com, our dog food Lucene
search application demonstrating many of Lucene's features (
http://blog.mikemccandless.com/2016/10/jiraseseach-20-dog-food-using-lucene-to.html),
but it was NOT easy to find!

I think I had one lonely brain cell still insisting we had indeed talked
about this somewhat recently :)

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jan 27, 2021 at 6:43 AM Michael Sokolov  wrote:

> I thought I remembered the discussion, searched for the issue in jira, but
> could not find. Probably Mike used his souped up search?
>
> On Wed, Jan 27, 2021, 3:07 AM Dawid Weiss  wrote:
>
>> Darn... I swear sometimes, when I try hard enough, I can hear my brain
>> cells giving up to atrophy... Sigh.
>>
>>
>> D.
>>
>> On Wed, Jan 27, 2021 at 4:44 AM David Smiley  wrote:
>> >
>> > LOL and it was Dawid :-)  Having amnesia Dawid?
>> > I think I've re-explored my own ideas before too.
>> >
>> > ~ David Smiley
>> > Apache Lucene/Solr Search Developer
>> > http://www.linkedin.com/in/davidwsmiley
>> >
>> >
>> > On Tue, Jan 26, 2021 at 5:39 PM Michael McCandless <
>> luc...@mikemccandless.com> wrote:
>> >>
>> >> Oh I found this long ago (well, ~2 years) issue exploring this:
>> https://issues.apache.org/jira/browse/LUCENE-8580
>> >>
>> >> Mike McCandless
>> >>
>> >> http://blog.mikemccandless.com
>> >>
>> >>
>> >> On Tue, Jan 26, 2021 at 3:38 PM Dawid Weiss 
>> wrote:
>> >>>
>> >>> > +1 to make a single merge concurrent!  It is horribly frustrating
>> to watch that last merge running on a single core :)  I have lost many
>> hours of my life to this frustration.
>> >>>
>> >>> > Yeah... it is, isn't it? Especially on new machines where you have
>> super-fast SSDs, countless cores, etc... That last merge consumes so few
>> resources that the computer feels practically idle... it's hard to explain
>> to people using our software (who invested in hardware) why we're basically
>> doing nothing... :)
>> >>>
>> >>> > I do think we need to explore concurrency within terms/postings
>> across fields in one segment to really see gains in the common case where
>> merge time is dominated by postings.
>> >>>
>> >>> Yeah, probably.
>> >>>
>> >>> > if you want to experiment with something like that, you can
>> hackishly simulate it today to quickly see the overhead, correct? its a
>> small hack to PerFieldPostingsFormat to force it to emit "files-per-field"
>> and then CFS will combine it all together.
>> >>>
>> >>> Good idea, Robert. I'll try this.
>> >>>
>> >>> > By default merging stored fields is super fast because Lucene can
>> copy compressed data directly, but if there are deletes or index sorting is
>> enabled this optimization is not applicable anymore and I wouldn't be
>> surprised if stored fields started taking non negligible time.
>> >>>
>> >>> In this case these segments are essentially made from scratch but with
>> >>> lots and lots of term vectors and postings... But the more parallel
>> >>> stages we can introduce, the better.
>> >>>
>> >>> I have some other stuff on my plate before I can dive deep into this
>> >>> but I eventually will. Thanks for the pointers, everyone - helpful!
>> >>>
>> >>> D.
>>
>>
>>


Re: Merging segment parts concurrently (SegmentMerger)

2021-01-27 Thread Michael Sokolov
I thought I remembered the discussion and searched for the issue in Jira, but
could not find it. Probably Mike used his souped-up search?

On Wed, Jan 27, 2021, 3:07 AM Dawid Weiss  wrote:

> Darn... I swear sometimes, when I try hard enough, I can hear my brain
> cells giving up to atrophy... Sigh.
>
>
> D.
>
> On Wed, Jan 27, 2021 at 4:44 AM David Smiley  wrote:
> >
> > LOL and it was Dawid :-)  Having amnesia Dawid?
> > I think I've re-explored my own ideas before too.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
> > On Tue, Jan 26, 2021 at 5:39 PM Michael McCandless <
> luc...@mikemccandless.com> wrote:
> >>
> >> Oh I found this long ago (well, ~2 years) issue exploring this:
> https://issues.apache.org/jira/browse/LUCENE-8580
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com
> >>
> >>
> >> On Tue, Jan 26, 2021 at 3:38 PM Dawid Weiss 
> wrote:
> >>>
> >>> > +1 to make a single merge concurrent!  It is horribly frustrating to
> watch that last merge running on a single core :)  I have lost many hours
> of my life to this frustration.
> >>>
> >>> > Yeah... it is, isn't it? Especially on new machines where you have
> super-fast SSDs, countless cores, etc... That last merge consumes so few
> resources that the computer feels practically idle... it's hard to explain
> to people using our software (who invested in hardware) why we're basically
> doing nothing... :)
> >>>
> >>> > I do think we need to explore concurrency within terms/postings
> across fields in one segment to really see gains in the common case where
> merge time is dominated by postings.
> >>>
> >>> Yeah, probably.
> >>>
> >>> > if you want to experiment with something like that, you can
> hackishly simulate it today to quickly see the overhead, correct? its a
> small hack to PerFieldPostingsFormat to force it to emit "files-per-field"
> and then CFS will combine it all together.
> >>>
> >>> Good idea, Robert. I'll try this.
> >>>
> >>> > By default merging stored fields is super fast because Lucene can
> copy compressed data directly, but if there are deletes or index sorting is
> enabled this optimization is not applicable anymore and I wouldn't be
> surprised if stored fields started taking non negligible time.
> >>>
> >>> In this case these segments are essentially made from scratch but with
> >>> lots and lots of term vectors and postings... But the more parallel
> >>> stages we can introduce, the better.
> >>>
> >>> I have some other stuff on my plate before I can dive deep into this
> >>> but I eventually will. Thanks for the pointers, everyone - helpful!
> >>>
> >>> D.
>
>
>


Re: Merging segment parts concurrently (SegmentMerger)

2021-01-27 Thread Dawid Weiss
Darn... I swear sometimes, when I try hard enough, I can hear my brain
cells giving up to atrophy... Sigh.


D.

On Wed, Jan 27, 2021 at 4:44 AM David Smiley  wrote:
>
> LOL and it was Dawid :-)  Having amnesia Dawid?
> I think I've re-explored my own ideas before too.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, Jan 26, 2021 at 5:39 PM Michael McCandless 
>  wrote:
>>
>> Oh I found this long ago (well, ~2 years) issue exploring this: 
>> https://issues.apache.org/jira/browse/LUCENE-8580
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Tue, Jan 26, 2021 at 3:38 PM Dawid Weiss  wrote:
>>>
>>> > +1 to make a single merge concurrent!  It is horribly frustrating to 
>>> > watch that last merge running on a single core :)  I have lost many hours 
>>> > of my life to this frustration.
>>>
>>> > Yeah... it is, isn't it? Especially on new machines where you have 
>>> > super-fast SSDs, countless cores, etc... That last merge consumes so few 
>>> > resources that the computer feels practically idle... it's hard to 
>>> > explain to people using our software (who invested in hardware) why we're 
>>> > basically doing nothing... :)
>>>
>>> > I do think we need to explore concurrency within terms/postings across 
>>> > fields in one segment to really see gains in the common case where merge 
>>> > time is dominated by postings.
>>>
>>> Yeah, probably.
>>>
>>> > if you want to experiment with something like that, you can hackishly 
>>> > simulate it today to quickly see the overhead, correct? its a small hack 
>>> > to PerFieldPostingsFormat to force it to emit "files-per-field" and then 
>>> > CFS will combine it all together.
>>>
>>> Good idea, Robert. I'll try this.
>>>
>>> > By default merging stored fields is super fast because Lucene can copy 
>>> > compressed data directly, but if there are deletes or index sorting is 
>>> > enabled this optimization is not applicable anymore and I wouldn't be 
>>> > surprised if stored fields started taking non negligible time.
>>>
>>> In this case these segments are essentially made from scratch but with
>>> lots and lots of term vectors and postings... But the more parallel
>>> stages we can introduce, the better.
>>>
>>> I have some other stuff on my plate before I can dive deep into this
>>> but I eventually will. Thanks for the pointers, everyone - helpful!
>>>
>>> D.
