A regular backup creates the files in this order:
drwxr-xr-x 2 root root 63 Jun 27 09:46 snapshot.shard7
drwxr-xr-x 2 root root 159 Jun 27 09:46 snapshot.shard8
drwxr-xr-x 2 root root 135 Jun 27 09:46 snapshot.shard1
drwxr-xr-x 2 root root 178 Jun 27 09:46 snapshot.shard3
drwxr-xr-x 2 root root 210 Jun 27 09:46 snapshot.shard11
drwxr-xr-x 2 root root 218 Jun 27 09:46 snapshot.shard9
drwxr-xr-x 2 root root 180 Jun 27 09:46 snapshot.shard2
drwxr-xr-x 2 root root 164 Jun 27 09:47 snapshot.shard5
drwxr-xr-x 2 root root 252 Jun 27 09:47 snapshot.shard6
drwxr-xr-x 2 root root 103 Jun 27 09:47 snapshot.shard12
drwxr-xr-x 2 root root 135 Jun 27 09:47 snapshot.shard4
drwxr-xr-x 2 root root 119 Jun 27 09:47 snapshot.shard10
drwxr-xr-x 3 root root 4 Jun 27 09:47 zk_backup
-rw-r--r-- 1 root root 185 Jun 27 09:47 backup.properties
An async backup, by contrast, creates the files in this order:
drwxr-xr-x 2 root root 15 Jun 27 09:49 snapshot.shard3
drwxr-xr-x 2 root root 15 Jun 27 09:49 snapshot.shard9
drwxr-xr-x 2 root root 62 Jun 27 09:49 snapshot.shard6
drwxr-xr-x 2 root root 37 Jun 27 09:49 snapshot.shard2
drwxr-xr-x 2 root root 67 Jun 27 09:49 snapshot.shard7
drwxr-xr-x 2 root root 75 Jun 27 09:49 snapshot.shard5
drwxr-xr-x 2 root root 70 Jun 27 09:49 snapshot.shard8
drwxr-xr-x 2 root root 15 Jun 27 09:49 snapshot.shard4
drwxr-xr-x 2 root root 15 Jun 27 09:50 snapshot.shard11
drwxr-xr-x 2 root root 127 Jun 27 09:50 snapshot.shard1
drwxr-xr-x 2 root root 116 Jun 27 09:50 snapshot.shard12
drwxr-xr-x 3 root root 4 Jun 27 09:50 zk_backup
-rw-r--r-- 1 root root 185 Jun 27 09:50 backup.properties
drwxr-xr-x 2 root root 25 Jun 27 09:51 snapshot.shard10
shard10 is much larger than the other shards.
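
The ordering above can be checked mechanically. This is a minimal sketch, not part of Solr: the helper name late_snapshots is hypothetical, and it assumes that backup.properties is written last in a complete backup, as in the regular listing. It reports any snapshot.* directory whose modification time is later than that of backup.properties. Note that a directory's mtime only changes when entries are added or removed, so this catches late file creation, not late writes to existing files.

```python
import os


def late_snapshots(backup_dir):
    """Return snapshot.* entries modified after backup.properties.

    Assumption: in a complete backup, backup.properties is the last
    entry written, so anything newer was still being produced.
    """
    marker = os.path.getmtime(os.path.join(backup_dir, "backup.properties"))
    late = []
    for name in sorted(os.listdir(backup_dir)):
        path = os.path.join(backup_dir, name)
        if name.startswith("snapshot.") and os.path.getmtime(path) > marker:
            late.append(name)
    return late
```

Run against the backup location shown in the logs (for example late_snapshots("/online/backup/collection1")), an empty list would match the regular-backup ordering, while the async listing above would be expected to flag snapshot.shard10.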
From the logs:
INFO - 2017-06-27 09:50:33.832; [ ] org.apache.solr.cloud.BackupCmd; Completed backing up ZK data for backupName=collection1
INFO - 2017-06-27 09:50:33.800; [ ] org.apache.solr.handler.admin.CoreAdminOperation; Checking request status for : backup1103459705035055
INFO - 2017-06-27 09:50:33.800; [ ] org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null path=/admin/cores params={qt=/admin/cores&requestid=backup1103459705035055&action=REQUESTSTATUS&wt=javabin&version=2} status=0 QTime=0
INFO - 2017-06-27 09:51:33.405; [ ] org.apache.solr.handler.SnapShooter; Done creating backup snapshot: shard10 at file:///online/backup/collection1
Has anyone seen this bug, or does anyone know of a workaround?
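
For reference, the polling I do is roughly equivalent to the sketch below. It is an illustration only: the function names, the base_url value, and the intervals are hypothetical, and it assumes the Collections API REQUESTSTATUS response, requested with wt=json, carries the state under status.state with values such as submitted, running, completed, failed and notfound. It does not work around the problem, since the state can be reported as completed before the last shard snapshot has been written.

```python
import json
import time
import urllib.parse
import urllib.request


def request_state(base_url, request_id):
    """Fetch the state of an async Collections API request.

    Assumption: base_url looks like "http://host:8983/solr" and the
    JSON response has the shape {"status": {"state": "..."}}.
    """
    query = urllib.parse.urlencode(
        {"action": "REQUESTSTATUS", "requestid": request_id, "wt": "json"}
    )
    with urllib.request.urlopen(f"{base_url}/admin/collections?{query}") as resp:
        return json.load(resp)["status"]["state"]


def wait_for_backup(request_id, get_state, interval=5.0, timeout=600.0,
                    sleep=time.sleep):
    """Poll get_state(request_id) until a terminal state or timeout."""
    waited = 0.0
    while True:
        state = get_state(request_id)
        if state in ("completed", "failed", "notfound"):
            return state
        if waited >= timeout:
            raise TimeoutError(f"request {request_id} still {state} after {waited}s")
        sleep(interval)
        waited += interval
```

A call would look like wait_for_backup("my-request-id", lambda rid: request_state("http://localhost:8983/solr", rid)), where both the request id and the URL are placeholders.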
On 27 June 2017 at 09:47, Damien Kamerman <[email protected]> wrote:
> Yes, the async command returns, and then I poll with REQUESTSTATUS.
>
> On 27 June 2017 at 01:24, Varun Thacker <[email protected]> wrote:
>
>> Hi Damien,
>>
>> A backup command with async is supposed to return early. It starts the
>> backup process and returns.
>>
>> Are you using the REQUESTSTATUS (
>> http://lucene.apache.org/solr/guide/6_6/collections-api.html#collections-api
>> ) API to validate if the backup is complete?
>>
>> On Sun, Jun 25, 2017 at 10:28 PM, Damien Kamerman <[email protected]>
>> wrote:
>>
>> > I've noticed an issue with the Solr 6.5.1 Collections API BACKUP async
>> > command returning early. The request state is reported as finished well
>> > before one shard's backup has finished.
>> >
>> > The collection I'm backing up has 12 shards across 6 nodes and I suspect
>> > the issue is that it is not waiting for all backups on the node to
>> > finish.
>> >
>> > Alternatively, if I change the request to not be async it works OK but
>> > sometimes I get the exception "backup the collection time out:180s".
>> >
>> > Has anyone seen this, or does anyone know of a workaround?
>> >
>> > Cheers,
>> > Damien.
>> >
>>
>
>