[ 
https://issues.apache.org/jira/browse/SOLR-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-13102:
--------------------------------
    Description: 
We need a general strategy (and probably a general base class) that can work 
with shared storage and not corrupt indexes from multiple writers.

One strategy that is used on local disk is to use locks.  This doesn't extend 
well to remote / shared filesystems when the locking is not tied into the 
object store itself since a process can lose the lock (a long GC or whatever) 
and then immediately try to write a file and there is no way to stop it.

An alternate strategy ditches the use of locks and simply avoids overwriting 
files by some algorithmic mechanism.
One of my colleagues outlined one way to do this: 
https://www.youtube.com/watch?v=UeTFpNeJ1Fo
That strategy uses random looking filenames and then writes a "core.metadata" 
file that maps between the random names and the original names.  The problem is 
then reduced to overwriting "core.metadata" when you lose the lock.  One way to 
fix this is to version "core.metadata".  Since the new leader election code was 
implemented, each shard has a monotonically increasing "leader term", and we can 
use that as part of the filename.  When a reader goes to open an index, it can 
use the latest file from the directory listing, or even use the term obtained 
from ZK if we can't trust the directory listing to be up to date.  
Additionally, we don't need random filenames to avoid collisions... a simple 
unique prefix or suffix would work fine (such as the leader term again)
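
As a rough illustration of the naming scheme described above (not the actual 
Solr implementation), here is a minimal sketch. The "core.metadata.<term>" 
naming pattern and the function names metadata_name, unique_file_name and 
latest_metadata are assumptions made up for this example; the term would come 
from leader election, and zk_term stands in for a term read from ZK.

```python
import re

# Assumed naming convention for this sketch: "core.metadata.<leader term>".
METADATA_PATTERN = re.compile(r"^core\.metadata\.(\d+)$")


def metadata_name(leader_term: int) -> str:
    """Name of the versioned metadata file written under a given leader term."""
    return f"core.metadata.{leader_term}"


def unique_file_name(original_name: str, leader_term: int) -> str:
    """Avoid collisions with a simple unique suffix instead of a random name."""
    return f"{original_name}.{leader_term}"


def latest_metadata(listing, zk_term=None):
    """Pick the metadata file a reader should open.

    If a term obtained from ZK is supplied, trust it over the directory
    listing (which may be stale on some shared stores). Otherwise take the
    highest term found in the listing, comparing terms numerically.
    """
    if zk_term is not None:
        return metadata_name(zk_term)
    best = None
    for name in listing:
        match = METADATA_PATTERN.match(name)
        if match:
            term = int(match.group(1))
            if best is None or term > best:
                best = term
    return None if best is None else metadata_name(best)
```

A writer that has lost leadership still writes under its old term, so it can 
only touch files carrying that older term and never the current leader's 
"core.metadata". Note the numeric comparison: sorting the names as strings 
would rank term 7 above term 12.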



  was:
We need a general strategy (and probably a general base class) that can work 
with shared storage and not corrupt indexes from multiple writers.

One strategy that is used on local disk is to use locks.  This doesn't extend 
well to remote / shared filesystems when the locking is not tied into the 
object store itself since a process can lose the lock (a long GC or whatever) 
and then immediately try to write a file and there is no way to stop it.

An alternate strategy ditches the use of locks and simply avoids overwriting 
files by some algorithmic mechanism.
One of my colleagues outlined one way to do this: 
https://www.youtube.com/watch?v=UeTFpNeJ1Fo
That strategy uses random looking filenames and then writes a "core.metadata" 
file that maps between the random names and the original names.  The problem is 
then reduced to overwriting "core.metadata" when you lose the lock.  One way to 
fix this is to version "core.metadata".  Since the new leader election code was 
implemented, each shard has a monotonically increasing "leader term", and we can 
use that as part of the filename.  When a reader goes to open an index, it can 
use the latest file from the directory listing, or even use the term obtained 
from ZK if we can't trust the directory listing to be up to date.




> Shared storage Directory implementation
> ---------------------------------------
>
>                 Key: SOLR-13102
>                 URL: https://issues.apache.org/jira/browse/SOLR-13102
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public (Default Security Level. Issues are Public) 
>            Reporter: Yonik Seeley
>            Priority: Major
>
> We need a general strategy (and probably a general base class) that can work 
> with shared storage and not corrupt indexes from multiple writers.
> One strategy that is used on local disk is to use locks.  This doesn't extend 
> well to remote / shared filesystems when the locking is not tied into the 
> object store itself since a process can lose the lock (a long GC or whatever) 
> and then immediately try to write a file and there is no way to stop it.
> An alternate strategy ditches the use of locks and simply avoids overwriting 
> files by some algorithmic mechanism.
> One of my colleagues outlined one way to do this: 
> https://www.youtube.com/watch?v=UeTFpNeJ1Fo
> That strategy uses random looking filenames and then writes a "core.metadata" 
> file that maps between the random names and the original names.  The problem 
> is then reduced to overwriting "core.metadata" when you lose the lock.  One 
> way to fix this is to version "core.metadata".  Since the new leader election 
> code was implemented, each shard has a monotonically increasing "leader term", 
> and we can use that as part of the filename.  When a reader goes to open an 
> index, it can use the latest file from the directory listing, or even use the 
> term obtained from ZK if we can't trust the directory listing to be up to 
> date.  Additionally, we don't need random filenames to avoid collisions... a 
> simple unique prefix or suffix would work fine (such as the leader term again)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

