[
https://issues.apache.org/jira/browse/SOLR-16712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Patson Luk updated SOLR-16712:
------------------------------
Description:
The current implementation of PRS requires an extra param to the DocCollection,
the `PrsSupplier`, which fetches the PRS states from ZK when `get` is called.
The implementation of this supplier, `LazyPrsSupplier`, only fetches the
states on the first call.
While this flow does work properly, it might introduce some unnecessary
complexity:
# PRS entries are fetched from ZK either during or after `DocCollection`
construction. This could be a bit inconsistent with the existing non-PRS
`DocCollection` design, in which `DocCollection` is simply an immutable
container that does not fetch data after its instantiation
# The lazy fetching could introduce some uncertainty as to when exactly the
fetching happens (and whether any ZooKeeper IO exceptions arise)
My guess is that the lazy loading was introduced in
https://issues.apache.org/jira/browse/SOLR-16580 to avoid fetching the PRS
states multiple times in the ctor of `DocCollection`. However, if we fetch the
`PerReplicaStates` only once on update, before calling the `DocCollection`
ctor, and pass the `PerReplicaStates` object to the `DocCollection` instead, we
can probably achieve a similar result with reduced uncertainty after
`DocCollection` construction.
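
To illustrate the difference between the two flows, here is a rough sketch.
It is not the actual Solr code: `FakeZk`, `LazySupplierSketch`,
`LazyCollectionSketch` and `EagerCollectionSketch` are hypothetical stand-ins,
and a plain `Map<String, String>` takes the place of `PerReplicaStates`.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for a ZK read; counts how many times it is invoked.
final class FakeZk {
  final AtomicInteger reads = new AtomicInteger();

  Map<String, String> fetchStates() {
    reads.incrementAndGet();
    return Map.of("core_node1", "ACTIVE");
  }
}

// Lazy flow (assumed shape): the fetch is deferred until the first get().
final class LazySupplierSketch implements Supplier<Map<String, String>> {
  private final Supplier<Map<String, String>> delegate;
  private Map<String, String> cached;

  LazySupplierSketch(Supplier<Map<String, String>> delegate) {
    this.delegate = delegate;
  }

  @Override
  public synchronized Map<String, String> get() {
    if (cached == null) {
      cached = delegate.get(); // the IO happens here, whenever the first caller arrives
    }
    return cached;
  }
}

// Collection holding a supplier: reading a replica state may trigger IO.
final class LazyCollectionSketch {
  private final Supplier<Map<String, String>> states;

  LazyCollectionSketch(Supplier<Map<String, String>> states) {
    this.states = states;
  }

  String stateOf(String replica) {
    return states.get().get(replica);
  }
}

// Proposed flow: states are fetched once by the caller and passed in,
// so the collection is a plain immutable container with no IO after construction.
final class EagerCollectionSketch {
  private final Map<String, String> states;

  EagerCollectionSketch(Map<String, String> states) {
    this.states = Map.copyOf(states);
  }

  String stateOf(String replica) {
    return states.get(replica);
  }
}
```

In the eager variant any exception from the fetch would surface at the call
site that builds the collection, for example
`new EagerCollectionSketch(zk.fetchStates())`, rather than at some later read.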
There's another branch which experimented with making DocCollection, Slice and
Replica immutable as well for PRS-enabled collections
[https://github.com/cowpaths/fullstory-solr/pull/84], but that is beyond the
scope of this Jira ticket.
was:
The current implementation of PRS requires an extra param to the DocCollection,
the `PrsSupplier`, when `get` is called, would fetch the PRS states from ZK.
The implementation of such supplier `LazyPrsSupplier` would only fetch the
state on first call.
While this flow does work properly, this flow might introduce some unnecessary
complexity:
# PRS entry fetching from ZK is done either during or after the
`DocCollection` construction, this could be a bit inconsistent with existing
non PRS `DocCollection` design which `DocCollection` is simply a immutable
container that does not fetch data after its instantiation
# The lazy fetching could introduce some uncertainties as to when exactly the
fetching happens (and if any Zookeeper IO exceptions arises)
My guess was that the lazy loading was introduced in
https://issues.apache.org/jira/browse/SOLR-16580 as to avoid fetching the PRS
states multiple times in the ctor of `DocCollection`, however, if we only fetch
the `PerReplicaStates` once on update before calling the `DocCollection` ctor,
and pass the `PerReplicaStates` object to the `DocCollection` instead, it can
probably achieve similar result but with reduced uncertainty after
`DocCollection` construction.
> Simplify PerReplicaStates (PRS) logic in DocCollection, replace PrsSupplier
> with actual PerReplicaStates param
> --------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-16712
> URL: https://issues.apache.org/jira/browse/SOLR-16712
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: main (10.0), 9.1.1
> Reporter: Patson Luk
> Priority: Major