mchennupati opened a new issue, #726: URL: https://github.com/apache/solr-operator/issues/726
I am restoring a large index (655G), currently stored in Google Cloud Storage, to a new SolrCloud-on-Kubernetes instance, and I am trying to understand how much space I need to allocate to each of my node PVCs. I am using the Collections API with `async` to restore a collection that was backed up to GCS. When I check the disk usage of `/var/solr/data` on each of the nodes, it looks like the output below, so each node appears to be downloading the entire index. I initially allocated 500G to each PVC, but that turned out to be too little; I am now retrying with 700G.

Is this expected behaviour, or am I doing something wrong? I would have expected the backup metadata to contain enough information for each node to download only its own part of the index, rather than pulling 655G x 3. It has already cost me a fair bit in network egress as I retry :)

In general, how would one restore a large index? I did not find a SolrRestore CRD analogous to SolrBackup among the Solr operator CRDs, so I ran an async job using the Solr Collections API (roughly as sketched below the `du` output). Thanks!

```
solr@mycoll-solrcloud-0:/var/solr/data$ du
4         ./userfiles
4         ./backup-restore/gcs-backups/gcscredential/..2024_10_11_06_16_24.1266852566
4         ./backup-restore/gcs-backups/gcscredential
8         ./backup-restore/gcs-backups
12        ./backup-restore
4         ./filestore
4         ./mycoll_shard3_replica_n3/data/tlog
4         ./mycoll_shard3_replica_n3/data/snapshot_metadata
8         ./mycoll_shard3_replica_n3/data/index
85744132  ./mycoll_shard3_replica_n3/data/restore.20241011062904489
85744152  ./mycoll_shard3_replica_n3/data
85744160  ./mycoll_shard3_replica_n3
85744192  .
solr@mycoll-solrcloud-0:/var/solr/data$ du -sh
```
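For context, this is roughly how the async restore was issued. It is a minimal sketch: the Solr base URL, backup name, location, and async request id are placeholders, while the repository name matches the `gcs-backups` directory visible in the `du` output and the collection name matches the replica directories.

```bash
# Placeholder: base URL of the SolrCloud's common service inside the cluster.
SOLR_URL="http://mycoll-solrcloud-common/solr"

# Kick off the restore asynchronously via the Collections API.
# "my-backup", "my-backup-location" and the async id are placeholders;
# "gcs-backups" is the GCS backup repository configured for this SolrCloud.
curl -G "${SOLR_URL}/admin/collections" \
  --data-urlencode "action=RESTORE" \
  --data-urlencode "repository=gcs-backups" \
  --data-urlencode "location=my-backup-location" \
  --data-urlencode "name=my-backup" \
  --data-urlencode "collection=mycoll" \
  --data-urlencode "async=restore-mycoll-1"

# Poll the async request until it reports completed (or failed).
curl -G "${SOLR_URL}/admin/collections" \
  --data-urlencode "action=REQUESTSTATUS" \
  --data-urlencode "requestid=restore-mycoll-1"
```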
