gerlowskija edited a comment on issue #301:
URL: https://github.com/apache/solr-operator/issues/301#issuecomment-896118995
> The only thing the operator should do for these "native" backup options, is to call the Solr API right?

That's what I'm proposing, yep - for GCS, the operator wouldn't do any of the
compression or relocation that it currently does for 'local' backups. It's
"just" calling the Solr API. (Which, I'd contend, isn't "nothing". That still
saves Ops folks from crafting their own solr.xml, from needing to learn Solr's
backup and async-polling APIs, etc.)
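
For illustration, here is a minimal sketch of the two Collections API calls the operator would be wrapping: an asynchronous `BACKUP` request and the `REQUESTSTATUS` poll that follows it. The helper names, the `fetch_json` callable (anything that GETs a URL and returns parsed JSON), the `"timeout"` return value, and the example base URL are assumptions made for this sketch, not the operator's actual code.

```python
import time
from urllib.parse import urlencode


def backup_url(base_url, collection, backup_name, repository, location, request_id):
    """Build a Collections API BACKUP call that runs asynchronously."""
    params = {
        "action": "BACKUP",
        "name": backup_name,
        "collection": collection,
        "repository": repository,  # repository name as configured in solr.xml
        "location": location,
        "async": request_id,       # caller-chosen id, used later for polling
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def status_url(base_url, request_id):
    """Build the REQUESTSTATUS call for a previously submitted async request."""
    params = {"action": "REQUESTSTATUS", "requestid": request_id}
    return f"{base_url}/admin/collections?{urlencode(params)}"


def wait_for_backup(fetch_json, base_url, request_id,
                    interval=5.0, max_polls=120, sleep=time.sleep):
    """Poll until Solr reports a terminal state for the async request.

    `fetch_json` is a hypothetical caller-supplied function: URL -> dict.
    Returns the terminal state, or "timeout" (a sentinel invented here)
    if no terminal state is seen within `max_polls` attempts.
    """
    for _ in range(max_polls):
        state = fetch_json(status_url(base_url, request_id))["status"]["state"]
        if state in ("completed", "failed", "notfound"):
            return state
        sleep(interval)
    return "timeout"
```

A caller would issue a GET on `backup_url("http://localhost:8983/solr", "techproducts", "nightly", "my-gcs", "logs_alt", "req-1")` and then call `wait_for_backup` with the same request id. Keeping the HTTP client injectable is only a convenience for testing the polling logic.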
> the only real benefit would be to have the operator be able to do this on a schedule

Definitely agree. As I said above, I think there's value in this ticket
alone. But GCS support gets much more appealing as the operator's backup
feature set generally gets more robust. I love the idea of a "backupschedule"
entity that creates individual solrbackup objects in turn. I'll file an issue
for that as a placeholder for discussion.
> I think we could change the SolrBackup to do either "managed" or "remote" backups

I think I agree with your suggestions here, but let me restate a few of them
to make sure I'm understanding you correctly. There's a point or two I'm
unclear on.
1. I see what you're getting at with the "managed" vs "remote" distinction,
but I'm not sure whether and where you imagine that appearing in the yaml
configs. Are you suggesting an explicit setting on 'solrbackup'? Or that it
be implicit based on the value of the 'repository' setting you mention?
2. Letting users specify a "backupRepositoryName" on their 'solrbackup'
makes sense to me. It further implies that a user should be able to
configure multiple sets of backup configs in their solrcloud's
backupRestoreOptions setting. (i.e. local backup settings and gcs backup
settings aren't mutually exclusive - we'd support use of both within the
same solrcloud.)
3. It seems like you're on the fence about having the operator bootstrap
required buckets/locations, and don't have a strong opinion there. I lean
towards skipping that because of the complexity of bringing S3, GCS, etc.
clients into the operator, at least until we get feedback from users that it'd
be worth it. That said, I have mixed feelings about it too.
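
To make points 1 and 2 concrete, here is a rough sketch of one possible reading: the solrbackup names a repository, the name is looked up in the solrcloud's backupRestoreOptions list, and "managed" vs "remote" is implied by the repository's type rather than set explicitly. The function names, the `MANAGED_TYPES` set, and the rule that exactly one entry may be marked `default` are all assumptions for illustration; none of this is settled in the discussion.

```python
# Assumption: only 'local' repositories get "managed" handling
# (compression/relocation by the operator); everything else is "remote".
MANAGED_TYPES = {"local"}


def resolve_repository(options, name=None):
    """Pick the repository a backup should use.

    `options` mirrors the proposed backupRestoreOptions list (a list of dicts).
    If `name` is given, the entry with that name is returned; otherwise the
    single entry flagged `default: true` is used.
    """
    if name is not None:
        for repo in options:
            if repo["name"] == name:
                return repo
        raise KeyError(f"no backup repository named {name!r}")
    defaults = [repo for repo in options if repo.get("default")]
    if len(defaults) != 1:
        raise ValueError("expected exactly one default backup repository")
    return defaults[0]


def is_managed(repo):
    """Implicit managed/remote distinction, derived from the repository type."""
    return repo["type"] in MANAGED_TYPES


# The two repositories from the proposed solrcloud config, as plain dicts.
options = [
    {"name": "my-gcs", "type": "gcs", "bucket": "solr-log-test",
     "gcsCredentialSecret": "my-gcs-secret", "baseLocation": "logs"},
    {"name": "my-local", "type": "local", "default": True},
]
```

With these inputs, `resolve_repository(options, "my-gcs")` selects the GCS entry and `is_managed` reports it as remote, while omitting the name falls back to `my-local`. An explicit "managed"/"remote" field on the solrbackup would replace `is_managed` with a direct lookup.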
So taking those suggestions, our new example CRDs would look something like:
**solrcloud**
```
...
dataStorage:
  ...
  backupRestoreOptions:
    - name: "my-gcs"
      type: "gcs"  # A new enum-ish field - either 'local', or 'gcs'
      bucket: "solr-log-test"
      gcsCredentialSecret: "my-gcs-secret"
      baseLocation: "logs"
    - name: "my-local"
      type: "local"
      default: true
...
```
**solrbackup (gcs)**
```
apiVersion: solr.apache.org/v1beta1
kind: SolrBackup
metadata:
  name: gcs_techproducts_backup
  namespace: default
spec:
  solrCloud: jasons_cluster
  repository: "my-gcs"
  location: "logs_alt"
  collections:
    - techproducts
```
**solrbackup (local)**
```
apiVersion: solr.apache.org/v1beta1
kind: SolrBackup
metadata:
  name: local_techproducts_backup
  namespace: default
spec:
  solrCloud: jasons_cluster
  repository: "my-local"
  location: "logs_alt"
  # Ignored if 'my-local' repository isn't of a type that
  # supports "managed" backups. (i.e. type=local)
  persistence:
    volume:
      source:
        persistentVolumeClaim:
          claimName: "pvc-test"
  collections:
    - techproducts
```
Those could be off a bit based on what you meant regarding the "managed" v.
"remote" flag. Does this look closer to what you were thinking? @HoustonPutman