Re: Straggler problem in Accumulo BatchScans
David,

Have you tried the TableLoadBalancer? I'd try it before rolling your own. I think it should spread the tablets in your one table across the tablet servers in a balanced way, rather than balancing all of the tablets for all tables across the nodes.

Other than that, I'd consider your key design and query plans. If you are routinely working with 0.5-5% of your data, I imagine things will be a little slow in general.

Good luck!

Jim
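The difference between balancing all tablets and balancing one table's tablets can be seen with a little arithmetic. The sketch below is not Accumulo code; the class name `BalanceCheck` and the tablet counts are made up for illustration. It shows a cluster whose total tablet count per server is perfectly even while a single table is badly skewed, which is the situation a per-table balancer is meant to avoid.

```java
final class BalanceCheck {
    /** Difference between the most and least loaded server, counting one table only. */
    static int tableSpread(int[][] tabletsPerServer, int table) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int[] server : tabletsPerServer) {
            min = Math.min(min, server[table]);
            max = Math.max(max, server[table]);
        }
        return max - min;
    }

    /** Same difference, but counting every tablet on a server regardless of table. */
    static int totalSpread(int[][] tabletsPerServer) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int[] server : tabletsPerServer) {
            int total = 0;
            for (int n : server) {
                total += n;
            }
            min = Math.min(min, total);
            max = Math.max(max, total);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Hypothetical layout: rows are servers, columns are tables.
        int[][] counts = { {6, 0}, {3, 3}, {0, 6} };
        System.out.println("total spread:   " + totalSpread(counts));
        System.out.println("table 0 spread: " + tableSpread(counts, 0));
    }
}
```

In this made-up layout every server hosts six tablets, yet a scan over table 0 would load one server heavily and another not at all.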
Straggler problem in Accumulo BatchScans
Hey,

I have a 7-node network running Accumulo 1.4.1 and Hadoop 1.0.4.

When I run large BatchScanner operations, the number of tablets scanned per node is not uniform, so the overloaded nodes take much longer to finish than the others. For queries that require all of the scans to finish before returning, this is a major latency issue. What are some practical means of load-balancing this to reduce delay?

Is it possible for tablets to be hosted on multiple tablet servers, up to the replication factor of the underlying HDFS? Are there reasons this might be an undesirable design?

Thanks in advance,
David
Re: Straggler problem in Accumulo BatchScans
David,

Each tablet is hosted by one tablet server, and there's no way around that. (This is actually quite reasonable; otherwise, we would receive duplicate results from multiple tablet servers.)

One strategy to deal with imbalanced data is to add a random partition prefix to your row IDs. This does complicate building queries, but in general, you'll be able to leverage all of your nodes. I did some testing with the number of such random shard IDs, and it seems like having 1-2x as many shards as tablet servers worked pretty well. (I'd suggest 2x in case you ever grow your cloud.)

In particular, if you can reingest your data, prepend a random 01-14~ prefix to the beginning of each row ID, and see if that helps. After that, you can help Accumulo decide where it should split tablets with addSplits 01 02 etc 14 from the Accumulo shell (or programmatically with addSplits). Then you can make sure that your 14+ splits are distributed across the 7 nodes in a reasonable way.

I hope that helps,
Jim

http://accumulo.apache.org/1.4/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#addSplits%28java.lang.String,%20java.util.SortedSet%29
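The sharding scheme described above can be sketched in plain Java. This is only an illustration under some assumptions: the class and method names are hypothetical, the shard number is passed in by the caller (at ingest it would be drawn at random from 1 to 14), and the row ID format follows the timestamp_netflowRecordID layout mentioned elsewhere in the thread. No Accumulo calls are made here; the split points would be handed to addSplits and the row pairs turned into scan ranges by the client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

final class ShardedRowIds {
    /** Two-digit shard label, e.g. 7 becomes "07". */
    static String shardLabel(int shard) {
        return String.format(Locale.ROOT, "%02d", shard);
    }

    /** Row ID with the shard prefix and '~' separator prepended. */
    static String shardedRow(int shard, String rowId) {
        return shardLabel(shard) + "~" + rowId;
    }

    /** Split points "01" .. "NN", matching the addSplits suggestion. */
    static SortedSet<String> splitPoints(int numShards) {
        SortedSet<String> splits = new TreeSet<>();
        for (int shard = 1; shard <= numShards; shard++) {
            splits.add(shardLabel(shard));
        }
        return splits;
    }

    /**
     * A query over [startRow, endRow] must be repeated once per shard.
     * Each element is a {start, end} pair of prefixed row IDs.
     */
    static List<String[]> scanRanges(String startRow, String endRow, int numShards) {
        List<String[]> ranges = new ArrayList<>();
        for (int shard = 1; shard <= numShards; shard++) {
            ranges.add(new String[] { shardedRow(shard, startRow), shardedRow(shard, endRow) });
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(shardedRow(7, "20130821_42"));
        System.out.println(splitPoints(14));
        for (String[] range : scanRanges("20130821", "20130822", 14)) {
            System.out.println(range[0] + " .. " + range[1]);
        }
    }
}
```

The fan-out in `scanRanges` is the cost mentioned above: one logical time range becomes one range per shard, which a BatchScanner can take as a set.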
Re: Straggler problem in Accumulo BatchScans
You can write your own balancer, and use it just for your table.

-Eric

On Wed, Aug 21, 2013 at 7:47 PM, Slater, David M. david.sla...@jhuapl.edu wrote:

Hi James,

I already had the data sharded into 7 partitions, and that works well to distribute the data into 7 tablets. (I have 2 GB tablet sizes, with about 1.2 TB of data, so there are numerous tablets per server.) The difficulty is that Accumulo seems to decide for itself where each tablet goes. When I only had 10 GB of data, it nicely divided into 7 tablets, one on each node. However, with dozens of tablets per tablet server, it assigns tablets to tablet servers orthogonally to my presplits.

Is there a way to force Accumulo to keep specific ranges on specific nodes? If not, I suppose that I could have 4x or more shards per tablet server to ensure that it was more uniformly placed.

D
Re: Straggler problem in Accumulo BatchScans
A new balancer is a plug-in class that instructs the Master process where to place tablets. If you know you need your tablets spread out over servers based on time (row id), you can do that. It's pretty common, in fact.

-Eric

On Wed, Aug 21, 2013 at 7:54 PM, Slater, David M. david.sla...@jhuapl.edu wrote:

Hi Dave,

The table is currently organizing netflow data with its rowID of timestamp_netflowRecordID, some columns corresponding to various netflow quantities, and one column representing the entire netflow in binary form.

The table is about 1.2 TB, and I am scanning 5-40 GB per scan, which scans about 7-28 tablets.

What do you mean by a custom load balancer? Do you mean balancing the data on ingest, or balancing the query load? What would you recommend for balancing the query load if I can only retrieve the data from a particular tablet?

I’ve played with index/data caches, though I haven’t used readahead threads or max open files. Is that referring to rfiles?

I’m noticing that most of the queries are CPU bound, and that read i/o is not being hit very hard. Is that a typical behavior for scans?

Thanks,
David

From: Dave Marion [mailto:dlmar...@comcast.net]
Sent: Wednesday, August 21, 2013 7:29 PM
To: user@accumulo.apache.org
Subject: RE: Straggler problem in Accumulo BatchScans

How is the table organized? What percent of the table are you scanning in these large operations? Have you considered writing a custom load balancer?

I don’t think that a tablet can be hosted on multiple servers. But you might be able to play around with the index/data caches, readahead threads (concurrent queries), and max open files to achieve better performance.
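One way to picture "spread out over servers based on row id" is to stripe tablets across servers in row order. The sketch below is plain Java and does not use Accumulo's balancer plug-in API; the class name, server names, and tablet labels are hypothetical. It only shows the placement rule and its effect on a scan that touches a run of consecutive tablets, such as the 7-28 tablets per scan mentioned above. A real balancer would also have to deal with tablet splits, migrations, and servers coming and going.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

final class StripedPlacement {
    /** Assign tablets to servers round-robin in sorted row order. */
    static Map<String, String> assign(SortedSet<String> tabletEndRows, List<String> servers) {
        Map<String, String> placement = new LinkedHashMap<>();
        int i = 0;
        for (String endRow : tabletEndRows) {
            placement.put(endRow, servers.get(i % servers.size()));
            i++;
        }
        return placement;
    }

    /** Largest number of scanned tablets that land on any single server. */
    static int maxPerServer(Map<String, String> placement, List<String> scannedTablets) {
        Map<String, Integer> load = new HashMap<>();
        int max = 0;
        for (String tablet : scannedTablets) {
            int count = load.merge(placement.get(tablet), 1, Integer::sum);
            max = Math.max(max, count);
        }
        return max;
    }

    public static void main(String[] args) {
        List<String> servers = new ArrayList<>();
        for (int s = 1; s <= 7; s++) {
            servers.add("tserver" + s);
        }
        SortedSet<String> tablets = new TreeSet<>();
        for (int t = 0; t < 70; t++) {
            tablets.add(String.format(Locale.ROOT, "t%03d", t));
        }
        Map<String, String> placement = assign(tablets, servers);

        // A time-range scan touches a run of 28 consecutive tablets.
        List<String> scanned = new ArrayList<>(tablets).subList(10, 38);
        System.out.println("busiest server scans " + maxPerServer(placement, scanned) + " tablets");
    }
}
```

With this rule, any run of k consecutive tablets puts at most ceil(k / 7) tablets on one of the 7 servers, so no server becomes a straggler for a contiguous time range.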
RE: Straggler problem in Accumulo BatchScans
Thanks Eric,

Just to make sure I'm going in the right direction, this would involve extending the TabletBalancer class, correct? How do I add it to the table after that (and remove the old one)? I don't see it under the Connector's TableOperations().

Is using a load-balancer what you would recommend if I wanted to make sure that two different tables stored related information (e.g. data and indexes) on the same tablets?

Thanks,
David
Re: Straggler problem in Accumulo BatchScans
You can set it in the shell on the table. Just override the default tablet balancer for the table. I think the master also has to use the TableLoadBalancer, if that is not already set by default.