[ 
https://issues.apache.org/jira/browse/SOLR-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526389#comment-16526389
 ] 

David Smiley commented on SOLR-12523:
-------------------------------------

bq. why does it need a shared filesystem?

Can you explain how you thought or hoped this mechanism worked?  Perhaps you 
thought of it more as an in-place snapshot mechanism -- SOLR-9038.  This 
feature is conceived of as a way to back up everything to one place, and that 
one place needs to be accessible to all nodes in the cluster -- hence the 
shared file system requirement.  It could be interesting if just one node had 
access to the backup destination and you could somehow indicate which node 
that is.
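
For illustration, a minimal sketch of what the current design expects -- the 
NFS server, mount point, host/port, and request id below are all hypothetical; 
any shared filesystem visible at the same path on every node would do:

{code}
# On every node in the cluster, mount the same shared export at the same path
# (hypothetical NFS server and paths):
mount -t nfs backup-server:/exports/solr-backups /mnt/solr-backups

# Any node can then write its shards' data, since every node resolves the
# backup location to the same shared storage:
curl "http://localhost:8983/solr/admin/collections?action=BACKUP&name=sigs&collection=foo_signals&location=/mnt/solr-backups&async=42"

# Poll the async request status:
curl "http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=42"
{code}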

Also, FYI, Jeff Wartes / Whitepages.com has some cool utilities here: 
https://github.com/whitepages/solrcloud_manager#cluster-commands -- see the 
"backupindex" and "restoreindex" cluster commands.

Thanks, Jan, for clarifying that this issue is about the need for better error 
messages / documentation.

> Confusing error reporting if backup attempted on non-shared FS
> --------------------------------------------------------------
>
>                 Key: SOLR-12523
>                 URL: https://issues.apache.org/jira/browse/SOLR-12523
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Backup/Restore
>    Affects Versions: 7.3.1
>            Reporter: Timothy Potter
>            Assignee: Jan Høydahl
>            Priority: Minor
>             Fix For: master (8.0), 7.5
>
>         Attachments: SOLR-12523.patch
>
>
> So I have a large collection with 4 shards across 2 nodes. When I try to back 
> it up with:
> {code}
> curl "http://localhost:8984/solr/admin/collections?action=BACKUP&name=sigs&collection=foo_signals&async=5&location=backups"
> {code}
> I either get:
> {code}
> "5170256188349065":{
>     "responseHeader":{
>       "status":0,
>       "QTime":0},
>     "STATUS":"failed",
>     "Response":"Failed to backup core=foo_signals_shard1_replica_n2 because org.apache.solr.common.SolrException: Directory to contain snapshots doesn't exist: file:///vol1/cloud84/backups/sigs"},
>   "5170256187999044":{
>     "responseHeader":{
>       "status":0,
>       "QTime":0},
>     "STATUS":"failed",
>     "Response":"Failed to backup core=foo_signals_shard3_replica_n10 because org.apache.solr.common.SolrException: Directory to contain snapshots doesn't exist: file:///vol1/cloud84/backups/sigs"},
> {code}
> or if I create the directory, then I get:
> {code}
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":2},
>   "Operation backup caused exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: The backup directory already exists: file:///vol1/cloud84/backups/sigs/",
>   "exception":{
>     "msg":"The backup directory already exists: file:///vol1/cloud84/backups/sigs/",
>     "rspCode":400},
>   "status":{
>     "state":"failed",
>     "msg":"found [2] in failed tasks"}}
> {code}
> I'm thinking this has to do with having 2 cores from the same collection on 
> the same node, but I can't get a collection with 1 shard on each node to 
> work either:
> {code}
> "ec2-52-90-245-38.compute-1.amazonaws.com:8984_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://ec2-52-90-245-38.compute-1.amazonaws.com:8984/solr: Failed to backup core=system_jobs_history_shard2_replica_n6 because org.apache.solr.common.SolrException: Directory to contain snapshots doesn't exist: file:///vol1/cloud84/backups/ugh1"}
> {code}
> What's weird is that replica (system_jobs_history_shard2_replica_n6) is not 
> even on the ec2-52-90-245-38.compute-1.amazonaws.com node! It lives on a 
> different node.


