[ https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755007#comment-16755007 ]
Andrzej Bialecki edited comment on SOLR-11127 at 1/29/19 1:25 PM:
-------------------------------------------------------------------
My plan of attack is to implement a collection command that orchestrates the
following steps:
* create a temporary collection with a unique name, e.g.
{{.tmpCollection_123}}, using the updated {{.system}} schema
* define an alias that points {{.system -> .tmpCollection_123}}. This should
redirect all updates and queries to the temp collection.
* copy the documents from {{.system}} to the temp collection, avoiding
overwriting updated docs (incremental updates won't work during this process,
but AFAIK no Solr component uses incremental updates when indexing to
{{.system}})
* delete the original {{.system}} and create it again using the updated schema.
* remove the alias
* copy over the documents from the temporary collection to {{.system}}, again
avoiding overwrites.
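The "avoiding overwrites" rule in the two copy steps can be illustrated with a
minimal in-memory sketch. It models each collection as a dict keyed by document
id, which is purely an assumption for illustration; the function name and the
sample data are hypothetical. In a real cluster, Solr's optimistic concurrency
(a negative {{_version_}} on the incoming document) is one mechanism that might
enforce the same rule, though the plan above does not say which one is intended.

```python
from typing import Dict

Doc = Dict[str, object]


def copy_without_overwrite(source: Dict[str, Doc], target: Dict[str, Doc]) -> int:
    """Copy docs from source into target, skipping ids already in target.

    Returns the number of documents actually copied.
    """
    copied = 0
    for doc_id, doc in source.items():
        if doc_id in target:
            # The target already holds this id (e.g. written through the
            # alias while the copy was running), so keep the target's copy.
            continue
        target[doc_id] = dict(doc)
        copied += 1
    return copied


# Hypothetical data: "b" was updated in the temp collection during the copy.
old_system = {"a": {"id": "a", "v": 1}, "b": {"id": "b", "v": 1}}
temp = {"b": {"id": "b", "v": 2}}
copied = copy_without_overwrite(old_system, temp)
```

Here only "a" is copied; the newer "b" in the temporary collection is left
untouched.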
The collection command will take care of async processing, resuming the
operation on Overseer restarts, etc.
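As a rough sketch of what such a command would orchestrate, the steps map onto
existing Collections API actions (CREATE, CREATEALIAS, DELETE, DELETEALIAS) as
shown below. The code only builds request URLs and sends nothing. The base URL,
the configset name, the shard count and the request-id scheme are assumptions,
and the copy steps are left as placeholders because the plan does not name an
API call for them.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlencode

# Assumed location of a local Solr node.
BASE = "http://localhost:8983/solr/admin/collections"

Step = Tuple[str, Optional[str]]


def migration_plan(tmp: str, config: str, request_id: str) -> List[Step]:
    """Return (description, url) pairs; url is None for the copy steps."""

    def call(step: int, params: Dict[str, str]) -> str:
        # "async" makes the Collections API return at once with a request id.
        query = {**params, "async": f"{request_id}-{step}"}
        return BASE + "?" + urlencode(query)

    create_tmp = {"action": "CREATE", "name": tmp, "numShards": "1",
                  "collection.configName": config}
    create_sys = {"action": "CREATE", "name": ".system", "numShards": "1",
                  "collection.configName": config}
    return [
        ("create temp collection", call(1, create_tmp)),
        ("alias .system -> temp", call(2, {"action": "CREATEALIAS",
                                           "name": ".system",
                                           "collections": tmp})),
        ("copy .system -> temp, no overwrites", None),
        ("delete original .system", call(4, {"action": "DELETE",
                                             "name": ".system"})),
        ("re-create .system", call(5, create_sys)),
        ("remove alias", call(6, {"action": "DELETEALIAS",
                                  "name": ".system"})),
        ("copy temp -> .system, no overwrites", None),
    ]


# Hypothetical names for the temp collection, configset and request id.
plan = migration_plan(".tmpCollection_123", "system_points", "migrate")
```

Giving every request its own id is one way a caller could poll each step with
REQUESTSTATUS and pick up at the first unfinished step after an interruption.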
I considered doing this as a sort of rolling in-place update, but that wouldn't
be any less expensive, and I think it would be impossible to get right: the
updated schema uses Points instead of Trie field types for the same fields.
Comments and feedback are welcome (thanks [~janhoy] for useful suggestions).
(Also, given that the 8.0 release is imminent, I'm not sure I can fix this in
time.)
> Add a Collections API command to migrate the .system collection schema from
> Trie-based (pre-7.0) to Points-based (7.0+)
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-11127
> URL: https://issues.apache.org/jira/browse/SOLR-11127
> Project: Solr
> Issue Type: Task
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Steve Rowe
> Assignee: Andrzej Bialecki
> Priority: Blocker
> Labels: numeric-tries-to-points
> Fix For: 8.0
>
>
> SOLR-11119 will switch the Trie fieldtypes in the .system collection's schema
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to
> automatically convert a Trie-based .system collection to a Points-based one.