[ 
https://issues.apache.org/jira/browse/SOLR-17949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prateek Singhal updated SOLR-17949:
-----------------------------------
    Description: 
Currently, Solr lacks native support for backing up collections to, and 
restoring them from, Azure Blob Storage. Organizations running Solr on Azure 
infrastructure have no built-in way to use Azure Blob Storage for backup and 
restore operations, which forces them to rely on either local filesystem 
repositories (which don't scale well in cloud environments) or third-party 
solutions.

This is problematic for Azure-based deployments because:
- Azure Blob Storage is the natural, cost-effective storage solution in Azure 
environments
- Lack of native Azure support creates operational complexity
- Users cannot take advantage of Azure's built-in durability, geo-replication, 
and lifecycle management

This contribution adds a BlobBackupRepository module that implements Solr's 
BackupRepository interface for Azure Blob Storage, following the same patterns 
as the existing GCS and S3 backup repositories.

Implementation approach:
- New blob-repository module under solr/modules/
- Support for 4 authentication methods (Connection String, Account Name + Key, 
SAS Token, Azure Identity)
- Compatible with Azurite emulator for local development
- Follows Solr's established backup repository patterns
- 76 unit tests covering all authentication methods and backup/restore 
operations
- All dependencies use Apache 2.0 compatible licenses
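
As a rough illustration of how a repository could choose among the four 
authentication methods listed above, here is a minimal sketch using only the 
Java standard library. The property names (connectionString, accountName, 
accountKey, sasToken), the precedence order, and the fallback to Azure 
Identity are assumptions made for this example; they are not taken from the 
module's actual configuration.

```java
import java.util.Map;

public class BlobAuthSelector {

    /** The four authentication methods named in the description. */
    public enum AuthMethod { CONNECTION_STRING, ACCOUNT_KEY, SAS_TOKEN, AZURE_IDENTITY }

    // Hypothetical property names, chosen for illustration only.
    static final String CONNECTION_STRING = "connectionString";
    static final String ACCOUNT_NAME = "accountName";
    static final String ACCOUNT_KEY = "accountKey";
    static final String SAS_TOKEN = "sasToken";

    /**
     * Picks an authentication method from the configured properties.
     * Assumed precedence: connection string, then account name + key,
     * then SAS token; with no explicit credential, fall back to Azure Identity.
     */
    public static AuthMethod select(Map<String, String> config) {
        if (has(config, CONNECTION_STRING)) {
            return AuthMethod.CONNECTION_STRING;
        }
        if (has(config, ACCOUNT_NAME) && has(config, ACCOUNT_KEY)) {
            return AuthMethod.ACCOUNT_KEY;
        }
        if (has(config, SAS_TOKEN)) {
            return AuthMethod.SAS_TOKEN;
        }
        return AuthMethod.AZURE_IDENTITY;
    }

    // True when the key is present with a non-blank value.
    private static boolean has(Map<String, String> config, String key) {
        String value = config.get(key);
        return value != null && !value.isBlank();
    }

    public static void main(String[] args) {
        System.out.println(select(Map.of(SAS_TOKEN, "placeholder-token")));
        System.out.println(select(Map.of(ACCOUNT_NAME, "myaccount")));
    }
}
```

In this sketch an account name without a key is not treated as a usable 
credential, so selection falls through to Azure Identity. A real 
implementation might instead reject such a partial configuration.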

This enables Solr users on Azure to perform native backup and restore 
operations using Azure Blob Storage, with the same ease of use as S3 and GCS 
repositories.

  was:
This contribution adds a new backup repository implementation for Azure Blob 
Storage, allowing Solr to back up collections to, and restore them from, 
Microsoft Azure.

*Features*
 - Full backup/restore functionality to Azure Blob Storage
 - Support for 4 authentication methods:
  * Connection String (for development)
  * Account Name + Key (for simple production)
  * SAS Token (recommended for production)
  * Azure Identity (Managed Identity, Service Principal, Azure CLI)
 - Compatible with local testing using Azurite emulator
 - Comprehensive documentation and tests
 - Incremental backup support with versioning
 - Data integrity verification (checksum validation)
 - Tested with collections of 1GB+ containing 100K+ documents
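
The checksum validation mentioned in the list above can be pictured with a 
small sketch. The description does not say which checksum the module uses, so 
CRC32 from the Java standard library is an assumption here, and the class and 
method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ChecksumCheck {

    /** Computes the CRC32 checksum of a byte array. */
    public static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /**
     * Returns true when the restored bytes match the checksum recorded at
     * backup time (hypothetical helper, for illustration only).
     */
    public static boolean verify(long expectedChecksum, byte[] restored) {
        return crc32(restored) == expectedChecksum;
    }

    public static void main(String[] args) {
        byte[] original = "segment data".getBytes(StandardCharsets.UTF_8);
        long recorded = crc32(original);  // stored alongside the backup
        byte[] restored = original.clone();  // stands in for a downloaded blob
        System.out.println("intact: " + verify(recorded, restored));
    }
}
```

The idea is to record a checksum when a file is uploaded and compare it 
against the bytes read back during restore, so corruption in transit or at 
rest is detected before the file is used.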

*Implementation Details*
 - 8 implementation files
 - 8 test files
 - 76/76 passing unit tests
 - All authentication methods verified
 - Integration tested with real Azure Blob Storage

*Dependencies*

All dependencies use Apache-compatible licenses:
 - Azure SDK for Java (Storage Blobs) 12.25.0 - Apache 2.0 license
 - Azure SDK for Java (Identity) 1.11.0 - Apache 2.0 license

*Testing*

 Local Testing with Azurite:
```bash
# Install the Azurite emulator
npm install -g azurite

# Start Azurite (it runs in the foreground, so use a separate terminal)
azurite --silent --location /tmp/azurite

# Run the module's tests
./gradlew :solr:modules:blob-repository:test
```

*Integration Testing:*
 - 76/76 unit tests passing
 - Tested with real Azure Blob Storage
 - Large collection testing (1GB, 100K documents)
 - All 4 authentication methods verified
 - Data integrity verified


> Add Azure Blob Storage backup repository module
> -----------------------------------------------
>
>                 Key: SOLR-17949
>                 URL: https://issues.apache.org/jira/browse/SOLR-17949
>             Project: Solr
>          Issue Type: New Feature
>          Components: Backup/Restore
>         Environment: * Tested with Java 17+
>  * Compatible with Solr 10.x
>  * Works with Azurite (local) and Azure Blob Storage (production)
>  * All major operating systems (macOS, Linux, Windows)
>            Reporter: Prateek Singhal
>            Priority: Major
>              Labels: azure, azureblob, backup, pull-request-available, restore
>          Time Spent: 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.20.10#820010)