[jira] [Comment Edited] (SOLR-11127) Add a Collections API command to migrate the .system collection schema from Trie-based (pre-7.0) to Points-based (7.0+)

Andrzej Bialecki (JIRA) Tue, 29 Jan 2019 05:26:34 -0800


    [ https://issues.apache.org/jira/browse/SOLR-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755007#comment-16755007 ]


Andrzej Bialecki  edited comment on SOLR-11127 at 1/29/19 1:25 PM:
-------------------------------------------------------------------

My plan of attack is to implement a collection command that orchestrates the 
following steps:
 * create a temporary collection with a unique name, e.g. 
{{.tmpCollection_123}}, using the updated {{.system}} schema
 * define an alias that points {{.system -> .tmpCollection_123}}. This should 
redirect all updates and queries to the temp collection.
 * copy the documents from {{.system}} to the temp collection, avoiding 
overwriting updated docs (incremental updates won't work during this process, 
but AFAIK no Solr component uses incremental updates when indexing to 
{{.system}})
 * delete the original {{.system}} and create it again using the updated schema.
 * remove the alias
 * copy over the documents from the temporary collection to {{.system}}, 
again avoiding overwrites.
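
A rough in-memory sketch of the steps above, only to illustrate the intended 
ordering and the "avoid overwrites" rule. Nothing here is Solr code: the 
{{Cluster}} class, {{copy_without_overwrite}}, {{migrate}} and the {{on_step}} 
hook are hypothetical names, and the sketch assumes that an alias takes 
precedence over a collection with the same name.

```python
"""Toy model of the proposed migration; not Solr code."""


class Cluster:
    """Collections are dicts of id -> doc; aliases map a name to a target."""

    def __init__(self):
        self.collections = {}
        self.aliases = {}

    def resolve(self, name):
        # Assumption for this sketch: an alias, if defined, wins over a
        # collection of the same name.
        return self.aliases.get(name, name)

    def add(self, name, doc):
        # Regular update path: resolves aliases, overwrites by id.
        self.collections[self.resolve(name)][doc["id"]] = doc

    def get(self, name, doc_id):
        return self.collections[self.resolve(name)].get(doc_id)


def copy_without_overwrite(cluster, source, target):
    """Copy docs between concrete collections, keeping ids already in target."""
    src = cluster.collections[source]
    dst = cluster.collections[target]
    for doc_id, doc in src.items():
        dst.setdefault(doc_id, doc)


def migrate(cluster, name=".system", tmp=".tmpCollection_123", on_step=None):
    """Run the six steps in order; on_step lets a caller inject updates."""
    hook = on_step or (lambda step: None)
    cluster.collections[tmp] = {}                 # 1. temp collection (new schema)
    cluster.aliases[name] = tmp                   # 2. alias .system -> temp
    hook("aliased")
    copy_without_overwrite(cluster, name, tmp)    # 3. old docs -> temp
    cluster.collections[name] = {}                # 4. delete and re-create .system
    del cluster.aliases[name]                     # 5. remove the alias
    hook("alias_removed")
    copy_without_overwrite(cluster, tmp, name)    # 6. temp docs -> new .system
```

In this model, a document updated through {{.system}} while the alias is in 
place lands in the temp collection, survives step 3 because the older copy is 
skipped, and reaches the re-created {{.system}} in step 6.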

The collection command will take care of async processing, resuming the 
operation on Overseer restarts, etc.
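
For comparison, the non-copy steps map roughly onto existing Collections API 
calls that can already be submitted asynchronously and polled via 
{{REQUESTSTATUS}}. The sketch below only builds the request URLs. The base 
URL, the request id prefix, {{numShards=1}} and the helper names are 
assumptions for illustration; which configset the collections are created 
with, and the document copying itself, are left out.

```python
from urllib.parse import urlencode

# Assumed address of a local Solr node; adjust as needed.
BASE = "http://localhost:8983/solr/admin/collections"


def collections_api_url(action, request_id=None, **params):
    """Build a Collections API URL; request_id, if given, is sent as 'async'."""
    query = {"action": action}
    query.update(params)
    if request_id is not None:
        query["async"] = request_id
    return BASE + "?" + urlencode(query)


def migration_requests(tmp=".tmpCollection_123", prefix="migrate-1"):
    """URLs for the create/alias/delete steps, in the order they would run."""
    return [
        collections_api_url("CREATE", prefix + "-create-tmp",
                            name=tmp, numShards=1),
        collections_api_url("CREATEALIAS", prefix + "-alias",
                            name=".system", collections=tmp),
        # (documents would be copied from .system to the temp collection here)
        collections_api_url("DELETE", prefix + "-delete",
                            name=".system"),
        collections_api_url("CREATE", prefix + "-recreate",
                            name=".system", numShards=1),
        collections_api_url("DELETEALIAS", prefix + "-unalias",
                            name=".system"),
        # (documents would be copied from the temp collection to .system here)
    ]


def status_url(request_id):
    """URL for polling the state of a previously submitted async request."""
    return collections_api_url("REQUESTSTATUS", requestid=request_id)
```

A single dedicated command would hide this sequence from the user and could 
keep track of which step it had reached, which is what makes resuming after 
an Overseer restart feasible.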

I considered doing this as a sort of rolling in-place update, but it wouldn't 
be any less expensive, and I think it would be impossible to get right: the 
updated schema uses Points instead of Trie field types for the same fields, 
so the existing index data cannot simply be reinterpreted under the new 
schema.

Comments and feedback are welcome (thanks [~janhoy] for useful suggestions).

(Also, given that the 8.0 release is imminent, I'm not sure I can fix this in 
time.)


was (Author: ab):
My plan of attack is to implement a collection command that orchestrates the 
following steps:
 * create a temporary collection with a unique name, eg. {{tmpCollection_123}}, 
using the updated {{.system}} schema
 * define an alias that points {{.system -> tmpCollection_123}}. This should 
redirect all updates and queries to the temp collection.
 * copy the documents from {{.system}} to the temp collection, avoiding 
overwriting updated docs (incremental updates won't work during this process, 
but AFAIK no Solr component uses incremental updates when indexing to 
{{.system}})
 * delete the original {{.system}} and create it again using the updated schema.
 * remove the alias
 * copy over the documents from temporary collection to {{.system}}, again 
avoiding overwrites.

The collection command will take care of async processing, resuming the 
operation on Overseer restarts, etc.

I considered doing this as a sort of rolling in-place update but this wouldn't 
be any less expensive and I think it would have been impossible to do (and to 
get it right) - updated schema uses points instead of trie fields for the same 
fields.

Comments and feedback are welcome (thanks [~janhoy] for useful suggestions).

(Also, given that the 8.0 release is imminent I'm not sure I can fix this in 
time for the 8.0 release.)

> Add a Collections API command to migrate the .system collection schema from 
> Trie-based (pre-7.0) to Points-based (7.0+)
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11127
>                 URL: https://issues.apache.org/jira/browse/SOLR-11127
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Steve Rowe
>            Assignee: Andrzej Bialecki 
>            Priority: Blocker
>              Labels: numeric-tries-to-points
>             Fix For: 8.0
>
>
> SOLR-11119 will switch the Trie fieldtypes in the .system collection's schema 
> to Points.
> Users with pre-7.0 .system collections will no longer be able to use them 
> once Trie fields have been removed (8.0).
> Solr should provide a Collections API command MIGRATESYSTEMCOLLECTION to 
> automatically convert a Trie-based .system collection to a Points-based one.



