[jira] [Commented] (SOLR-9952) S3BackupRepository

Richard (JIRA) Thu, 30 May 2019 12:43:13 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16852277#comment-16852277
 ]


Richard commented on SOLR-9952:
-------------------------------

Is anyone using this more recently?
Some background:
I'm currently running Solr v7.4 on bare metal, and we currently store our
backups on an HDFS cluster that is also on bare metal. We want to store our
backups in AWS on S3. I tried applying these patches and following the
attached PDF, and have run into a pile of problems.

I was getting many errors trying to connect via {{S3N}}, which makes sense given
how things have changed since this ticket was created. I changed this to
{{S3A}} and started to get somewhere. I also had to add the following
property to the XML file to get rid of some connection errors I was getting:

{code:xml}
  <property>
    <name>fs.s3a.endpoint</name>
    <value>s3.eu-west-2.amazonaws.com</value>
  </property>
{code}

I was then getting issues with the filesystem used, so I changed this to use
the {{S3AFileSystem}} rather than the {{NativeS3FileSystem}}, because the latter
is also soon to be deprecated. This started to provide some positive results.
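
As a rough illustration of the {{S3N}} to {{S3A}} switch described above, here is a small sketch that rewrites an {{s3n://}} backup location to {{s3a://}} using only the Java standard library. The class name, method name, and example bucket path are hypothetical and are not part of the attached patches; the sketch only changes the URI scheme and does not touch any Hadoop configuration.

```java
import java.net.URI;

/** Hypothetical helper, for illustration only; not part of Solr or Hadoop. */
final class S3SchemeRewrite {

    /**
     * Returns the location with an s3n scheme replaced by s3a.
     * Locations with any other scheme (or no scheme) are returned unchanged.
     */
    static String toS3a(String location) {
        URI uri = URI.create(location); // throws IllegalArgumentException if malformed
        String scheme = uri.getScheme();
        if (scheme == null || !"s3n".equalsIgnoreCase(scheme)) {
            return location;
        }
        // Keep everything after the scheme (authority, path, query) as-is.
        return "s3a" + location.substring(scheme.length());
    }
}
```

For example, {{S3SchemeRewrite.toS3a("s3n://my-bucket/backups/coll1")}} yields {{s3a://my-bucket/backups/coll1}}, while an {{hdfs://}} location passes through untouched.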

I can now back up onto S3, though funnily enough I first hit the classic
{noformat}Caused by: java.lang.IllegalStateException: Connection pool shut down{noformat}

It's really hacky, but I somehow got around this by swapping the order of what
is backed up. Originally it backs up the index files and then the ZooKeeper
information; to get backups working, I swapped it round so that it backs up the
ZooKeeper information first and then the index files. This can be seen below:
{code}
--- a/solr/core/src/java/org/apache/solr/cloud/api/collections/BackupCmd.java
+++ b/solr/core/src/java/org/apache/solr/cloud/api/collections/BackupCmd.java
@@ -89,17 +89,6 @@ public class BackupCmd implements OverseerCollectionMessageHandler.Cmd {
     // Create a directory to store backup details.
     repository.createDirectory(backupPath);

-    String strategy = message.getStr(CollectionAdminParams.INDEX_BACKUP_STRATEGY, CollectionAdminParams.COPY_FILES_STRATEGY);
-    switch (strategy) {
-      case CollectionAdminParams.COPY_FILES_STRATEGY: {
-        copyIndexFiles(backupPath, message, results);
-        break;
-      }
-      case CollectionAdminParams.NO_INDEX_BACKUP_STRATEGY: {
-        break;
-      }
-    }
-
     log.info("Starting to backup ZK data for backupName={}", backupName);

     //Download the configs
@@ -127,6 +116,18 @@ public class BackupCmd implements OverseerCollectionMessageHandler.Cmd {
     backupMgr.downloadCollectionProperties(location, backupName, collectionName);

     log.info("Completed backing up ZK data for backupName={}", backupName);
+
+    String strategy = message.getStr(CollectionAdminParams.INDEX_BACKUP_STRATEGY, CollectionAdminParams.COPY_FILES_STRATEGY);
+    switch (strategy) {
+      case CollectionAdminParams.COPY_FILES_STRATEGY: {
+        copyIndexFiles(backupPath, message, results);
+        break;
+      }
+      case CollectionAdminParams.NO_INDEX_BACKUP_STRATEGY: {
+        break;
+      }
+    }
+
   }
{code}

So with the above, I am able to successfully back up a collection to an S3
bucket. My next problem is restoring.

I am getting the same
{noformat}Caused by: java.lang.IllegalStateException: Connection pool shut down{noformat}

It appears to get the list of files it needs from the S3 bucket and restores
the first file successfully. But when it comes to restoring the second file,
something appears to have closed the connection.

I have tried countless different versions of many different packages, including
the {{aws}} package and the {{http lib}} package, and even tried upgrading
Hadoop, and I am still getting the same issue.

I saw a lot of posts suggesting adding the following to the connection:
{code:java}
.setConnectionManagerShared(true)
{code}
I forced this in all instances and still hit the same problem.
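
For context on what that setting is meant to address: in Apache HttpClient, a client that owns its connection manager shuts it down when the client is closed, and marking the manager as shared tells the client to leave it alone. The toy model below uses only the Java standard library to mimic that behaviour. All class names are hypothetical stand-ins, not HttpClient, AWS SDK, Hadoop, or Solr classes, and whether this is the actual cause of the restore failure is an assumption, not something confirmed here.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy stand-in for a pooled connection manager (hypothetical, illustration only). */
final class ToyConnectionPool {
    private final AtomicBoolean shutDown = new AtomicBoolean(false);

    String lease(String resource) {
        if (shutDown.get()) {
            // Same message as the exception quoted in this comment.
            throw new IllegalStateException("Connection pool shut down");
        }
        return "connection for " + resource;
    }

    void shutdown() {
        shutDown.set(true);
    }
}

/** Toy client that may or may not treat its pool as shared. */
final class ToyClient implements AutoCloseable {
    private final ToyConnectionPool pool;
    private final boolean poolShared;

    ToyClient(ToyConnectionPool pool, boolean poolShared) {
        this.pool = pool;
        this.poolShared = poolShared;
    }

    String fetch(String resource) {
        return pool.lease(resource);
    }

    @Override
    public void close() {
        // A client that believes it owns the pool shuts it down on close.
        if (!poolShared) {
            pool.shutdown();
        }
    }
}

final class PoolShutdownDemo {
    /** Fetches each file with a fresh client on one pool; returns how many succeeded. */
    static int restoreAll(String[] files, boolean poolShared) {
        ToyConnectionPool pool = new ToyConnectionPool();
        int restored = 0;
        for (String file : files) {
            try (ToyClient client = new ToyClient(pool, poolShared)) {
                client.fetch(file);
                restored++;
            } catch (IllegalStateException e) {
                break; // pool was shut down by an earlier client's close()
            }
        }
        return restored;
    }
}
```

In this model, calling {{PoolShutdownDemo.restoreAll}} with three files and {{poolShared}} set to false restores only the first file, which resembles the symptom described above; with {{poolShared}} set to true, all three succeed. If the real pool is closed by some other path, the flag alone would not help, which would be consistent with it making no difference here.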

So my question is: has anyone got this working successfully more recently?

> S3BackupRepository
> ------------------
>
>                 Key: SOLR-9952
>                 URL: https://issues.apache.org/jira/browse/SOLR-9952
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Backup/Restore
>            Reporter: Mikhail Khludnev
>            Priority: Major
>         Attachments: 
> 0001-SOLR-9952-Added-dependencies-for-hadoop-amazon-integ.patch, 
> 0002-SOLR-9952-Added-integration-test-for-checking-backup.patch, Running Solr 
> on S3.pdf, core-site.xml.template
>
>
> I'd like to have a backup repository implementation that allows snapshotting
> to AWS S3



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
