Re: SolrCloud shows cluster still healthy even the node data directory is deleted

2020-12-06 Thread Amy Bai
Hi community,

I created a Solr Jira issue to track this:
https://issues.apache.org/jira/browse/SOLR-15028


Regards,
Amy



Re: SolrCloud shows cluster still healthy even the node data directory is deleted

2020-11-20 Thread Radar Lei
Hi Erick,

I understand this is how file handles work.

But SolrCloud users do not see the expected replica failover happen, so we 
cannot say SolrCloud is fully HA. Is there a plan to handle HA for disk 
failures? Thanks.

Regards,
Radar



Re: SolrCloud shows cluster still healthy even the node data directory is deleted

2020-11-11 Thread Amy Bai
Hi Erick,

Thanks for your kind reply.
There are two things that confuse me:

1. Index/search queries keep failing because the data directory of one node is 
gone, but the node is not marked as down.

2. The replicas on the failed node are not working, but the index/search 
queries did not fail over to other healthy replicas.

Regards,
Amy




Re: SolrCloud shows cluster still healthy even the node data directory is deleted

2020-11-09 Thread Erick Erickson
Depends. *nix systems have delete-on-close semantics; that is, as
long as there’s a single file handle open, the file will still be
available to the process using it. Only when the last file handle is
closed will the file actually be deleted.

Solr (Lucene actually) has a file handle open to every file in the index
all the time.

These files aren’t visible when you do a directory listing. So if you
stop Solr, are the files gone? NOTE: When you start Solr again, if
there are existing replicas that are healthy then the entire index
should be copied from another replica….

Best,
Erick
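
The delete-on-close behavior described above can be reproduced outside Solr. The following is a minimal sketch, assuming a POSIX system (on Windows the unlink would typically fail while the file is open). The function name `read_after_unlink` and the file name `segment.bin` are hypothetical and only illustrate the idea; this is not how Lucene manages its files.

```python
import os
import tempfile
from typing import Tuple


def read_after_unlink(payload: bytes = b"segment data") -> Tuple[bool, bytes]:
    """Delete a file while a handle is open; report listing visibility and content."""
    with tempfile.TemporaryDirectory() as data_dir:
        path = os.path.join(data_dir, "segment.bin")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(payload)

        # Stands in for the handle a long-running process keeps open.
        handle = open(path, "rb")
        try:
            os.unlink(path)  # removes the directory entry only
            still_listed = "segment.bin" in os.listdir(data_dir)
            content = handle.read()  # data is still reachable via the open handle
        finally:
            handle.close()  # last handle closed; the OS may now reclaim the space
    return still_listed, content


if __name__ == "__main__":
    listed, content = read_after_unlink()
    print("listed after unlink:", listed)
    print("read through open handle:", content)
```

The file disappears from the directory listing as soon as it is unlinked, yet the process holding the handle can keep reading it, which matches the symptom reported in this thread.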




SolrCloud shows cluster still healthy even the node data directory is deleted

2020-11-09 Thread Amy Bai
Hi community,

I found that SolrCloud does not check the I/O status as long as the SolrCloud 
process is alive.
For example, if I delete the SolrCloud data directory, no errors are reported, 
and I can still log in to the SolrCloud Admin UI to create and query collections.
Is this reasonable?
Can someone explain why Solr handles it like this?
Thanks so much.


Regards,
Amy
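
Since a live process does not by itself reveal a missing data directory, an operator could probe the directory from outside. The sketch below is one possible external check, not something Solr provides; the function name `data_dir_problems` and the probe file prefix are assumptions made for illustration, and the directory path must be supplied by the caller.

```python
import os
import tempfile
from typing import List


def data_dir_problems(data_dir: str) -> List[str]:
    """Return human-readable problems found with data_dir (empty list if none)."""
    problems = []
    if not os.path.isdir(data_dir):
        problems.append("missing: " + data_dir)
        return problems
    try:
        # Creating and removing a small file exercises the directory itself,
        # which file handles opened before a deletion would not do.
        with tempfile.NamedTemporaryFile(dir=data_dir, prefix=".probe-"):
            pass
    except OSError as exc:
        problems.append("not writable: {} ({})".format(data_dir, exc))
    return problems
```

A monitoring job could call this periodically with the node's data directory and raise an alert when the returned list is not empty.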