Re: Split shard and stream sub-shards to remote nodes?
The splitting process is nothing but the creation of a bitset with which a LiveDocsReader is created. These readers are then added to a new index via the IW.addIndexes(IndexReader[] readers) method. All this is performed below the IR/IW API and no documents are actually ever read or written directly by Solr. This is why it isn't feasible to stream docs to a remote node.

On Fri, Nov 22, 2013 at 5:59 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:
Hm, what about taking the source shard, splitting it, and sending docs that come out of each sub-shard to a remote node at Solr level, as if these documents are just being added (i.e. nothing at Lucene level)?

--
Regards,
Shalin Shekhar Mangar.
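To make the bitset step in Shalin's explanation more concrete, here is a toy model in Python. It is only a sketch and not Lucene or Solr code: the function names, the hash values, and the hash ranges are all hypothetical, and the list filtering in `add_indexes` merely stands in for what IW.addIndexes does internally at the segment level.

```python
from typing import List, Sequence, Tuple


def split_live_docs(doc_hashes: Sequence[int],
                    ranges: Sequence[Tuple[int, int]]) -> List[List[bool]]:
    """Build one live-docs bitset per sub-shard.

    bitsets[i][doc_id] is True when the routing hash of doc_id falls
    inside ranges[i] (both bounds inclusive). Hypothetical model only.
    """
    bitsets = [[False] * len(doc_hashes) for _ in ranges]
    for doc_id, doc_hash in enumerate(doc_hashes):
        for i, (low, high) in enumerate(ranges):
            if low <= doc_hash <= high:
                bitsets[i][doc_id] = True
                break  # a document belongs to at most one sub-shard
    return bitsets


def add_indexes(segment: Sequence[str], live_docs: Sequence[bool]) -> List[str]:
    """Toy stand-in for adding a filtered reader to a new index:
    only documents marked live in the bitset are carried over."""
    return [doc for doc, live in zip(segment, live_docs) if live]


# Hypothetical parent shard with four documents and their routing hashes.
segment = ["doc-a", "doc-b", "doc-c", "doc-d"]
hashes = [5, 120, 64, 200]
sub_shard_ranges = [(0, 127), (128, 255)]

bitsets = split_live_docs(hashes, sub_shard_ranges)
sub_shards = [add_indexes(segment, bits) for bits in bitsets]
print(sub_shards)
```

The point of the model is that the split only decides which existing documents are visible through each reader. Nothing is re-parsed or re-indexed as a document, which is why there is no document stream that Solr could redirect to another node.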
Re: Split shard and stream sub-shards to remote nodes?
Ouch :( I guess it's as efficient as it can be, but too bad, because writing to a remote node sounds awesomely cool, to me at least. :) Thanks for explaining the key bits, Shalin.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/

On Fri, Nov 22, 2013 at 7:54 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:
The splitting process is nothing but the creation of a bitset with which a LiveDocsReader is created. These readers are then added to a new index via the IW.addIndexes(IndexReader[] readers) method. All this is performed below the IR/IW API and no documents are actually ever read or written directly by Solr. This is why it isn't feasible to stream docs to a remote node.
Re: Split shard and stream sub-shards to remote nodes?
Hi,

On Wed, Nov 20, 2013 at 12:53 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:
At the Lucene level, I think it would require a directory implementation which writes to a remote node directly. Otherwise, on the Solr side, we must move the leader itself to another node which has enough disk space and then split the index.

Hm, what about taking the source shard, splitting it, and sending docs that come out of each sub-shard to a remote node at Solr level, as if these documents are just being added (i.e. nothing at Lucene level)?

Otis
--
Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/
Re: Split shard and stream sub-shards to remote nodes?
No, it is not supported yet. We can't split to a remote node directly. The best bet is to trigger a new leader election by unloading the leader node once all replicas are active.

On Wed, Nov 20, 2013 at 1:32 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:
Hi, Is it possible to perform a shard split and stream data for the new/sub-shards to remote nodes, avoiding persistence of new/sub-shards on the local/source node first?

--
Regards,
Shalin Shekhar Mangar.
Re: Split shard and stream sub-shards to remote nodes?
Do you think this is something that is actually implementable? If so, I'll open an issue.

One use-case where this may come in handy is when the disk space is tight. If a shard is using 50% of the disk space on some node X, you can't really split that shard because the 2 new sub-shards will not fit on the local disk. Or is there some trick one could use in this situation?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/

On Wed, Nov 20, 2013 at 6:48 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:
No, it is not supported yet. We can't split to a remote node directly. The best bet is to trigger a new leader election by unloading the leader node once all replicas are active.
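The disk-space concern can be put into rough numbers. The following is only a back-of-the-envelope sketch: it assumes that the parent shard stays on disk while the sub-shards are written, that the sub-shards together need about as much space as the parent, and that some extra headroom is wanted. The 10% headroom default and the function name are assumptions made for illustration, not figures from Solr.

```python
def can_split_locally(disk_total_gb: float,
                      disk_used_gb: float,
                      shard_size_gb: float,
                      headroom: float = 0.10) -> bool:
    """Return True if the sub-shards are expected to fit on the local disk.

    Assumption: the sub-shards need roughly shard_size_gb in total, plus
    a safety margin of `headroom` (a fraction of the shard size), while
    the parent shard is still present on disk.
    """
    free_gb = disk_total_gb - disk_used_gb
    needed_gb = shard_size_gb * (1.0 + headroom)
    return needed_gb <= free_gb


# The case from the message: one shard occupies half of a 1000 GB disk.
print(can_split_locally(1000, 500, 500))

# A smaller shard on the same disk leaves enough room.
print(can_split_locally(1000, 300, 300))
```

With the shard at 50% of the disk, the free half is consumed entirely by the new sub-shards before any margin is counted, which matches the situation described in the message.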
Re: Split shard and stream sub-shards to remote nodes?
At the Lucene level, I think it would require a directory implementation which writes to a remote node directly. Otherwise, on the Solr side, we must move the leader itself to another node which has enough disk space and then split the index.

On Wed, Nov 20, 2013 at 8:37 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:
Do you think this is something that is actually implementable? If so, I'll open an issue. One use-case where this may come in handy is when the disk space is tight. If a shard is using 50% of the disk space on some node X, you can't really split that shard because the 2 new sub-shards will not fit on the local disk. Or is there some trick one could use in this situation?

--
Regards,
Shalin Shekhar Mangar.
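The workaround of moving the leader and then splitting could be scripted roughly as below. This is a sketch under assumptions: the host names, collection, shard, and core names are hypothetical, and the code only builds the request URLs without sending them. The actions used (CoreAdmin CREATE and UNLOAD, Collections API SPLITSHARD) do exist in Solr, but the exact parameters should be checked against the documentation for the Solr version in use.

```python
from typing import List
from urllib.parse import urlencode


def solr_admin_url(base_url: str, handler: str, **params: str) -> str:
    """Build a Solr admin URL; handler is 'cores' or 'collections'."""
    return f"{base_url}/admin/{handler}?{urlencode(params)}"


def move_leader_then_split(old_node: str, new_node: str,
                           collection: str, shard: str,
                           old_core: str, new_core: str) -> List[str]:
    """Return the requests, in order, for the workaround from the thread."""
    return [
        # 1. Add a replica of the shard on the node with free disk space.
        solr_admin_url(new_node, "cores", action="CREATE", name=new_core,
                       collection=collection, shard=shard),
        # 2. (Not shown) wait until the new replica is active.
        # 3. Unload the old leader core to trigger a leader election.
        solr_admin_url(old_node, "cores", action="UNLOAD", core=old_core),
        # 4. Split the shard; the sub-shards are written where the leader lives.
        solr_admin_url(new_node, "collections", action="SPLITSHARD",
                       collection=collection, shard=shard),
    ]


# Hypothetical cluster: node-a is short on disk, node-b has room.
steps = move_leader_then_split(
    old_node="http://node-a.example:8983/solr",
    new_node="http://node-b.example:8983/solr",
    collection="logs", shard="shard1",
    old_core="logs_shard1_replica1", new_core="logs_shard1_replica2",
)
for url in steps:
    print(url)
```

Step 2 matters: unloading the old leader before the new replica has finished recovering would leave the shard without an up-to-date copy.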
Split shard and stream sub-shards to remote nodes?
Hi,

Is it possible to perform a shard split and stream data for the new/sub-shards to remote nodes, avoiding persistence of new/sub-shards on the local/source node first?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/