[
https://issues.apache.org/jira/browse/SOLR-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arda updated SOLR-17460:
------------------------
Description:
I was attempting to migrate a collection with 3 shards from a Solr 7.0 cluster
to a Solr 8.4 cluster. The data is stored in HDFS. I followed the
backup-restore process but encountered issues with two of the shards during the
restoration.
h1. *Migration Process:*
*1- Backup Command:* To avoid timeouts, I initiated the backup with an async
parameter:
curl -k --negotiate -u : 'https://<solrNode>:<solrPort>/solr/admin/collections?action=BACKUP&name=<backupName>&collection=<solrCollectionName>x&location=<hdfsPath>&async=12346'
*2- Copy Backup to Local:* After the backup, I copied the data from HDFS to the
local filesystem:
hdfs dfs -copyToLocal <backupPath> <localPath>
*3- Transfer Backup to New Cluster:* I then copied the backup files from the
older Solr node to the newer one:
scp -pr <localPath> <username>@<ip>:<localPathDestination>
*4- Prepare New HDFS Path:* On the new Solr cluster, I created a new directory
in HDFS and adjusted ownership:
hdfs dfs -mkdir <pathName2>
hdfs dfs -chown solr:solr <pathName2>
*5- Copy Backup to New HDFS Location:* I transferred the backup data from local
to the new HDFS path. Before that, I removed the
"<str>queryDocAuthorization</str>" entries from solrconfig.xml to make the
configuration compatible with the newer version.
hdfs dfs -copyFromLocal <localPathDestination> <pathName2>
*6- Restore Collection:* Finally, I ran the restore command:
curl -k --negotiate -u : 'https://<solrNode>:<solrPort>/solr/admin/collections?action=RESTORE&name=<backupName>&collection=<solrCollectionName>x&location=<hdfs_path>&async=12345'
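Because both the BACKUP and RESTORE calls above are asynchronous, their outcome
is only visible through the Collections API REQUESTSTATUS action. A minimal
sketch of building that status URL follows; the helper name status_url and any
concrete host or port values are hypothetical, and the curl options are simply
copied from the commands above.

```shell
# Hypothetical helper: build the REQUESTSTATUS URL for an async request id.
# $1 = Solr node, $2 = Solr port, $3 = id passed as "async" to BACKUP/RESTORE
status_url() {
  printf 'https://%s:%s/solr/admin/collections?action=REQUESTSTATUS&requestid=%s\n' \
    "$1" "$2" "$3"
}

# Example call (not executed here), checking the restore request:
# curl -k --negotiate -u : "$(status_url '<solrNode>' '<solrPort>' 12345)"
```

The response reports a state for the request, which would show whether the
restore as a whole was considered failed or completed.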
h1. *Issue:*
After the restore process completed, I found that two of the shards could not
be restored. The logs displayed the following errors:
*Error During Shard Restoration:*
ERROR [c:<solrCollectionName> s:shard2 r:core_node5
x:<solrCollectionName>_shard2_replica_n4] o.a.s.h.RequestHandlerBase
org.apache.solr.common.SolrException: Error CREATEing SolrCore
'<solrCollectionName>_shard2_replica_n4': Unable to create core
[<solrCollectionName>_shard2_replica_n4] Caused by:
org.apache.solr.handler.component.QueryDocAuthorizationComponent.....
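The component named in this error is the one whose references were removed
from solrconfig.xml in step 5, so it may be worth confirming that no reference
survived in the configuration the restore actually used. A minimal sketch,
assuming a local copy of that solrconfig.xml is available; the helper name
count_auth_refs is hypothetical.

```shell
# Hypothetical helper: count lines in a config file that still mention the
# authorization component (matches both "queryDocAuthorization" and
# "QueryDocAuthorizationComponent", case-insensitively).
# $1 = path to a local copy of solrconfig.xml
count_auth_refs() {
  # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
  # keeps the helper usable in scripts that run with "set -e".
  grep -c -i 'QueryDocAuthorization' "$1" || true
}

# Example call (not executed here):
# count_auth_refs ./solrconfig.xml
```

A non-zero count would suggest that a component definition or handler entry
still refers to the class.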
*FileNotFoundException and Index Corruption:*
WARN
(parallelCoreAdminExecutor-6-thread-7-processing-n:<solrNode>:<solrPort>_solr
x:<solrCollectionName>_shard2_replica_n1 <numbers> RESTORECORE)
[x:<solrCollectionName>_shard2_replica_n1] o.a.s.h.RestoreCore Could not switch
to restored index. Rolling back to the current index =>
org.apache.lucene.index.CorruptIndexException: Unexpected file read error while
reading index. (resource=BufferedChecksumIndexInput(segments_1g9dk))
Caused by: java.io.FileNotFoundException: File does not exist:
hdfs://<hdfsPath>/core_node2/data/restore/<fileName>
It appears that Solr is looking for a file in HDFS that does not exist, even
though nothing was deleted manually. I cannot determine why these specific
shards failed to restore, or why the system is unable to locate the required
files.
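Since the backup passed through two intermediate copies (HDFS to local, then
local to the new HDFS), one possible explanation is that some files were lost
in transit. A minimal sketch for comparing file listings from both sides,
assuming each listing has been saved to a text file with one relative path per
line; the helper name missing_files and the listing file names are
hypothetical.

```shell
# Hypothetical helper: print paths that appear in the first listing but not
# in the second.
# $1 = listing taken from the source backup, $2 = listing from the new HDFS path
missing_files() {
  mf_src=$(mktemp) || return 1
  mf_dst=$(mktemp) || return 1
  # comm needs sorted input, so sort both listings into temporary files.
  sort "$1" > "$mf_src"
  sort "$2" > "$mf_dst"
  # -23 suppresses lines unique to the second file and lines common to both.
  comm -23 "$mf_src" "$mf_dst"
  rm -f "$mf_src" "$mf_dst"
}

# Example call (not executed here):
# missing_files source_listing.txt destination_listing.txt
```

An empty result would suggest that the copy itself was complete and that the
missing file has some other cause.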
Expected Behavior:
The backup and restore process should complete without errors, and all shards
should be restored successfully to the new cluster.
Actual Behavior:
Two shards failed to restore, with errors related to missing files and index
corruption.
> Error During Collection Migration from Solr 7.0 to Solr 8.4: Missing Files
> and Shard Restoration Failures
> ---------------------------------------------------------------------------------------------------------
>
> Key: SOLR-17460
> URL: https://issues.apache.org/jira/browse/SOLR-17460
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: hdfs, SolrCloud
> Affects Versions: 7.0, 8.4
> Reporter: Arda
> Priority: Minor
> Labels: backup, restore
>