Re: Split shard and stream sub-shards to remote nodes?

2013-11-22 Thread Shalin Shekhar Mangar
The splitting process is nothing but the creation of a bitset for each
sub-shard, with which a LiveDocsReader is created. These readers are
then added to a new index via the IW.addIndexes(IndexReader[] readers)
method. All of this happens at the IR/IW (IndexReader/IndexWriter) API
level and below, and no documents are ever actually read or written
directly by Solr. This is why it isn't feasible to stream docs to a
remote node.
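
To make the bitset step concrete, here is a minimal sketch in plain Java. It is a toy model only: it uses java.util.BitSet and no Lucene classes, and the class name, the method name and the assumption that each sub-shard owns a contiguous slice of a 32-bit hash space are all illustrative rather than taken from the Solr source. In the real split, each bitset would presumably back the live-docs view of a reader handed to IW.addIndexes.

```java
import java.util.BitSet;

/** Toy model of the bitset step of a split; no Lucene classes are involved. */
final class SplitBitsetSketch {

    /**
     * Builds one BitSet per sub-shard. Bit i is set in partition p when the
     * hash of document i falls into p's contiguous slice of the int range.
     * (Contiguous hash slices are an assumption made for this sketch.)
     */
    static BitSet[] partition(int[] docHashes, int numPartitions) {
        BitSet[] live = new BitSet[numPartitions];
        for (int p = 0; p < numPartitions; p++) {
            live[p] = new BitSet(docHashes.length);
        }
        long sliceWidth = (1L << 32) / numPartitions;
        for (int docId = 0; docId < docHashes.length; docId++) {
            // Shift the signed hash into the range [0, 2^32).
            long offset = (long) docHashes[docId] - Integer.MIN_VALUE;
            int p = (int) Math.min(offset / sliceWidth, numPartitions - 1);
            live[p].set(docId);
        }
        return live;
    }

    public static void main(String[] args) {
        int[] hashes = {-5, 7, Integer.MIN_VALUE, Integer.MAX_VALUE, 0};
        BitSet[] live = partition(hashes, 2);
        for (int p = 0; p < live.length; p++) {
            System.out.println("sub-shard " + p + " keeps doc ids " + live[p]);
        }
    }
}
```

Note that nothing in this step touches document contents: each document is only marked as belonging to one sub-shard or the other, which matches the point above that there is no stream of documents to redirect.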

-- 
Regards,
Shalin Shekhar Mangar.


Re: Split shard and stream sub-shards to remote nodes?

2013-11-22 Thread Otis Gospodnetic
Ouch :(
I guess it's as efficient as it can be, but too bad, because writing to
a remote node sounds awesomely cool, to me at least. :)

Thanks for explaining the key bits, Shalin.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



Re: Split shard and stream sub-shards to remote nodes?

2013-11-21 Thread Otis Gospodnetic
Hi,

On Wed, Nov 20, 2013 at 12:53 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 At the Lucene level, I think it would require a directory
 implementation which writes to a remote node directly. Otherwise, on
 the Solr side, we must move the leader itself to another node which
 has enough disk space and then split the index.


Hm, what about taking the source shard, splitting it, and sending the
docs that come out of each sub-shard to a remote node at the Solr level,
as if these documents were just being added (i.e. nothing at the Lucene
level)?

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/






Re: Split shard and stream sub-shards to remote nodes?

2013-11-20 Thread Shalin Shekhar Mangar
No, it is not supported yet. We can't split to a remote node directly.
Your best bet is to trigger a new leader election by unloading the
leader node once all replicas are active.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Split shard and stream sub-shards to remote nodes?

2013-11-20 Thread Otis Gospodnetic
Do you think this is something that is actually implementable?  If so,
I'll open an issue.

One use-case where this may come in handy is when disk space is
tight.  If a shard is using > 50% of the disk space on some node X,
you can't really split that shard because the 2 new sub-shards will
not fit on the local disk.  Or is there some trick one could use in
this situation?
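
The arithmetic behind that 50% threshold can be sketched in a few lines of Java. This is a back-of-the-envelope check under two stated assumptions: the two sub-shards together need roughly as much space as the parent shard, and the parent stays on disk until the split completes. The class and method names are hypothetical, and real index sizes will differ somewhat because of merges and deleted documents.

```java
/** Rough disk-space check for a local shard split; not part of Solr. */
final class SplitSpaceCheck {

    /**
     * Returns true when the free space left beside the parent shard is at
     * least the parent's own size (assumed combined size of the sub-shards).
     */
    static boolean splitFitsLocally(long diskCapacityBytes, long shardBytes) {
        long freeBytes = diskCapacityBytes - shardBytes;
        return freeBytes >= shardBytes;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024L * 1024L;
        // A shard using 60% of a 100 GB disk leaves 40 GB, less than the 60 GB needed.
        System.out.println(splitFitsLocally(100 * gb, 60 * gb));
        // A shard using 40% leaves 60 GB, enough for roughly 40 GB of sub-shards.
        System.out.println(splitFitsLocally(100 * gb, 40 * gb));
    }
}
```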

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



Re: Split shard and stream sub-shards to remote nodes?

2013-11-20 Thread Shalin Shekhar Mangar
At the Lucene level, I think it would require a directory
implementation which writes to a remote node directly. Otherwise, on
the Solr side, we must move the leader itself to another node which
has enough disk space and then split the index.
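
As a rough illustration of that Solr-side workaround, the sketch below only builds the two HTTP request URLs involved and sends nothing: a CoreAdmin UNLOAD for the current leader's core (so that a replica on a roomier node can be elected leader), followed by a Collections API SPLITSHARD. The host names, core name, collection name and shard name are placeholders, and the sketch assumes an active replica already exists on the target node before the leader is unloaded.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Builds the admin URLs for "move the leader, then split"; sends no requests. */
final class MoveLeaderThenSplitSketch {

    /** CoreAdmin UNLOAD for the core that currently hosts the shard leader. */
    static String unloadCoreUrl(String solrBaseUrl, String coreName) {
        return solrBaseUrl + "/admin/cores?action=UNLOAD&core=" + enc(coreName);
    }

    /** Collections API SPLITSHARD, issued once the new leader is in place. */
    static String splitShardUrl(String solrBaseUrl, String collection, String shard) {
        return solrBaseUrl + "/admin/collections?action=SPLITSHARD"
                + "&collection=" + enc(collection)
                + "&shard=" + enc(shard);
    }

    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder hosts and names, chosen for illustration only.
        System.out.println(unloadCoreUrl("http://nodeX:8983/solr", "collection1_shard1_replica1"));
        System.out.println(splitShardUrl("http://nodeY:8983/solr", "collection1", "shard1"));
    }
}
```

The split still writes both sub-shards to the new leader's local disk, so the target node needs enough free space for them.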

-- 
Regards,
Shalin Shekhar Mangar.


Split shard and stream sub-shards to remote nodes?

2013-11-19 Thread Otis Gospodnetic
Hi,

Is it possible to perform a shard split and stream data for the
new/sub-shards to remote nodes, avoiding persistence of new/sub-shards
on the local/source node first?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/