Our memory problems might be as simple as not closing a scanner every time one is opened, but I know we had to implement Nagios-based restarts of Thrift because our 4 GB of Thrift memory gets eaten up and the server eventually freezes and stops responding to requests after less than 1 week of running. We are running the Thrift that is bundled with 0.90.4, so hopefully a lot of this is fixed now...
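For what it's worth, the scanner-closing pattern I have in mind looks roughly like the sketch below. It is not our production code. It assumes a `client` object exposing the HBase Thrift calls `scannerOpen(table, startRow, columns)`, `scannerGetList(id, nbRows)` and `scannerClose(id)` as in the 0.90-era Python bindings (newer bindings may take extra arguments), and the helper names `open_scanner` and `scan_rows` are made up for illustration.

```python
from contextlib import contextmanager


@contextmanager
def open_scanner(client, table, start_row, columns):
    """Open a Thrift scanner and guarantee it is closed on exit.

    `client` is assumed to be an HBase Thrift client (or anything with
    the same scannerOpen/scannerClose methods).
    """
    scanner_id = client.scannerOpen(table, start_row, columns)
    try:
        yield scanner_id
    finally:
        # Runs on normal exit, on exceptions, and when the caller stops early.
        client.scannerClose(scanner_id)


def scan_rows(client, table, start_row, columns, batch_size=100):
    """Yield rows batch by batch; the scanner is released in every case."""
    with open_scanner(client, table, start_row, columns) as scanner_id:
        while True:
            batch = client.scannerGetList(scanner_id, batch_size)
            if not batch:
                break  # an empty batch means the scanner is exhausted
            for row in batch:
                yield row
```

If the caller abandons the loop halfway, calling `close()` on the generator (or letting it be garbage collected) triggers the `finally` block, so the server-side scanner is still released.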
Thanks.

On Sat, Jan 21, 2012 at 9:29 AM, <[email protected]> wrote:

> Thrift has been upgraded to 0.8 in trunk. 0.92 still uses 0.7
>
> Can you provide Jira number which deals with memory leak ?
>
> Thanks
>
> On Jan 21, 2012, at 5:34 AM, Wayne <[email protected]> wrote:
>
> > Sorry but it would be too hard for us to be able to provide enough info in
> > a Jira to accurately reproduce. Our read problem is through thrift and has
> > everything to do with the row just being too big to bring back in its
> > entirety (13 million col row times out 1/3 of the time). Filters in .92 and
> > thrift should help us there. I just closed
> > https://issues.apache.org/jira/browse/HBASE-4187 as filters now support
> > offset, limit patterns for the get. Of course we would all prefer a
> > streaming model to avoid any of these issues and having to build our
> > own pseudo streaming model. Is Thrift still the best option for high
> > performance python based reads? From Hadoop World it seems some people are
> > pushing thrift and others are pushing Avro. Does .92 bundle/work with
> > Thrift .8 and are the memory leaks fixed in .8?
> >
> > As far as the write bottleneck it has a lot to do with memory, and other
> > low level config issues. I would hope that the automated tests of hbase can
> > eventually include patterns for large col counts. In order for hbase to
> > truly be a col based storage system it needs to scale cols into the 100s
> > millions and beyond. This is the pattern we have the hardest time modeling
> > in base because there is an unknown "limit" here we have to watch out for.
> > There is a known limit that a row must be stored within 1 and only one
> > region, but that should not be a problem. One single large region storing
> > one large row should still "work".
> >
> > Thanks.
> >
> > On Fri, Jan 20, 2012 at 3:45 PM, Stack <[email protected]> wrote:
> >
> >> On Fri, Jan 20, 2012 at 11:43 AM, Wayne <[email protected]> wrote:
> >>
> >>> Does 0.92 support a significant increase in row size over 0.90.x? With
> >>> 0.90.4 we have seen writes start choking at 30 million cols/row and reads
> >>> start choking at 10 million cols/row. Can we assume these numbers will go
> >>> up with .92 and if yes how much?
> >>>
> >> Any chance of a JIRA on issues you see Wayne when writes/read choke?
> >> Thanks,
> >>
> >> St.Ack
> >> P.S. I don't know of any comparison. We have new fileformat in 0.92.0 and
> >> both read/write paths have been amended so it could be different; not sure
> >> if better or worse.
