[ https://issues.apache.org/jira/browse/SOLR-15500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396750#comment-17396750 ]

Jason Gerlowski commented on SOLR-15500:
----------------------------------------

bq. Do you compare only index file names or also timestamp and size during 
incremental backup?

Yes, all of the above.  Each index file has this information (plus a checksum) 
stored in a shard-level metadata file.  We could leave these metadata files 
unzipped and compress the rest, and I _think_ that'd help at backup-time, but 
it still leaves some challenges at restore-time.  The first "restore" would be 
able to read the metadata files to learn which specific index files it then 
needs to fetch, but if those files all live in a single tarball somewhere then 
retrieval becomes difficult.  Would Solr fetch the whole tarball and unpack it 
just to get what it needs?
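To make the incremental comparison concrete, here's a rough sketch (not Solr's actual implementation; function and field names are illustrative) of deciding whether an index file needs re-uploading based on the name, size, and checksum recorded in a shard-level metadata file:

```python
import hashlib
import os


def index_file_entry(path):
    """Build a metadata entry (name, size, checksum) for one index file,
    mirroring what a shard-level metadata file might record."""
    with open(path, "rb") as f:
        checksum = hashlib.sha256(f.read()).hexdigest()
    return {
        "name": os.path.basename(path),
        "size": os.path.getsize(path),
        "checksum": checksum,
    }


def needs_upload(local_entry, backed_up):
    """Re-upload an index file only if it's new, or its size/checksum
    differ from the entry recorded in the previous backup."""
    prev = backed_up.get(local_entry["name"])
    if prev is None:
        return True
    return (prev["size"] != local_entry["size"]
            or prev["checksum"] != local_entry["checksum"])
```

Because Lucene index files are effectively write-once, a name+size+checksum match is enough to skip the upload entirely.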

Compressing each file individually seems like it'd avoid some of these 
pitfalls, but I'd have to think it through a bit more.
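A minimal sketch of that per-file approach (purely illustrative, not Solr code): each index file is gzipped on its own, so a restore can fetch and decompress exactly the files named in the shard metadata, with no tarball to download and unpack first.

```python
import gzip
import shutil
from pathlib import Path


def backup_file_compressed(src: Path, dest_dir: Path) -> Path:
    """Compress one index file individually as <name>.gz in the backup
    location, keeping it independently retrievable."""
    dest = dest_dir / (src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


def restore_file(compressed: Path, restore_dir: Path) -> Path:
    """Selectively restore a single compressed index file; only this one
    file needs to be fetched from the backup repository."""
    out = restore_dir / compressed.name.removesuffix(".gz")
    with gzip.open(compressed, "rb") as fin, open(out, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return out
```

The trade-off versus a single tarball is more objects in the backup repository, in exchange for random access at restore-time.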

> Compressed Backup
> -----------------
>
>                 Key: SOLR-15500
>                 URL: https://issues.apache.org/jira/browse/SOLR-15500
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Sayan Das
>            Priority: Major
>
> Right now at BliBli, we use dirty hacks to compress backups from the backup 
> scheduler VMs. It would be great if the collection BACKUP command could be 
> improved with an expert flag that compresses the backup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
