Re: 0.92 Max Row Size

Wayne Mon, 23 Jan 2012 06:42:59 -0800

Our memory problems might be as simple as not closing a scanner every time
one is opened, but I know we had to implement Nagios-based restarts of
Thrift because our 4 GB of Thrift memory gets eaten up, and it eventually
freezes and stops responding to requests after less than a week of running.
We are running the Thrift that is bundled with 0.90.4, so hopefully a lot of
this is fixed now...
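
For anyone hitting the same thing, here is a minimal sketch of making sure every scanner that gets opened also gets closed, even when the read fails partway. It assumes a `client` object exposing `scannerOpen`, `scannerGet`, and `scannerClose` in the style of the HBase Thrift gateway; the exact signatures differ between versions, so treat the argument lists here as assumptions rather than the real API. `open_scanner` and `scan_rows` are hypothetical helper names.

```python
from contextlib import contextmanager


@contextmanager
def open_scanner(client, table, start_row, columns):
    """Open a scanner and guarantee it is closed afterwards.

    `client` is assumed to expose scannerOpen/scannerGet/scannerClose;
    the argument lists are an assumption and vary by gateway version.
    """
    scanner_id = client.scannerOpen(table, start_row, columns)
    try:
        yield scanner_id
    finally:
        # Runs on normal exit and on exceptions, so the server-side
        # scanner is released either way.
        client.scannerClose(scanner_id)


def scan_rows(client, table, start_row, columns):
    """Yield rows batch by batch; the scanner is closed when the scan ends."""
    with open_scanner(client, table, start_row, columns) as scanner_id:
        while True:
            batch = client.scannerGet(scanner_id)
            if not batch:  # an empty batch is taken as the end of the scan
                break
            for row in batch:
                yield row
```

Whether leaked scanners are actually the cause of the memory growth is not something this sketch can tell you; it only rules that one possibility out.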


Thanks.


On Sat, Jan 21, 2012 at 9:29 AM, <[email protected]> wrote:

> Thrift has been upgraded to 0.8 in trunk. 0.92 still uses 0.7
>
> Can you provide the Jira number that deals with the memory leak?
>
> Thanks
>
>
>
> On Jan 21, 2012, at 5:34 AM, Wayne <[email protected]> wrote:
>
> > Sorry, but it would be too hard for us to provide enough info in a Jira
> > to accurately reproduce this. Our read problem is through Thrift and has
> > everything to do with the row just being too big to bring back in its
> > entirety (a 13 million col row times out 1/3 of the time). Filters in .92
> > and Thrift should help us there. I just closed
> > https://issues.apache.org/jira/browse/HBASE-4187 as filters now support
> > offset, limit patterns for the get. Of course we would all prefer a
> > streaming model to avoid any of these issues and having to build our
> > own pseudo streaming model. Is Thrift still the best option for high
> > performance Python-based reads? From Hadoop World it seems some people are
> > pushing Thrift and others are pushing Avro. Does .92 bundle/work with
> > Thrift .8, and are the memory leaks fixed in .8?
> >
> > As far as the write bottleneck, it has a lot to do with memory and other
> > low-level config issues. I would hope that the automated tests of HBase can
> > eventually include patterns for large col counts. In order for HBase to
> > truly be a col-based storage system it needs to scale cols into the 100s of
> > millions and beyond. This is the pattern we have the hardest time modeling
> > in HBase because there is an unknown "limit" here we have to watch out for.
> > There is a known limit that a row must be stored within one and only one
> > region, but that should not be a problem. One single large region storing
> > one large row should still "work".
> >
> > Thanks.
> >
> > On Fri, Jan 20, 2012 at 3:45 PM, Stack <[email protected]> wrote:
> >
> >> On Fri, Jan 20, 2012 at 11:43 AM, Wayne <[email protected]> wrote:
> >>
> >>> Does 0.92 support a significant increase in row size over 0.90.x? With
> >>> 0.90.4 we have seen writes start choking at 30 million cols/row and reads
> >>> start choking at 10 million cols/row. Can we assume these numbers will go
> >>> up with .92, and if yes, how much?
> >>>
> >>>
> >> Any chance of a JIRA on the issues you see, Wayne, when writes/reads choke?
> >> Thanks,
> >>
> >> St.Ack
> >> P.S. I don't know of any comparison.  We have a new file format in 0.92.0
> >> and both read/write paths have been amended, so it could be different; not
> >> sure if better or worse.
> >>
>
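
The offset/limit pattern mentioned in the thread, reading one very wide row in pieces instead of pulling all of its columns back in a single call, can be sketched roughly as below. `fetch_page` is a hypothetical callable that you would supply: it wraps whatever get-with-offset-and-limit call your client exposes (for example a get using a server-side filter such as ColumnPaginationFilter) and returns at most `limit` columns starting at `offset`. Its name, its signature, and the default page size are assumptions for illustration, not part of any HBase API.

```python
def iter_row_columns(fetch_page, page_size=10000):
    """Yield the columns of one wide row, fetched one page at a time.

    `fetch_page(offset, limit)` is a caller-supplied (hypothetical) function
    returning a list of at most `limit` columns starting at `offset`.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        for column in page:
            yield column
        # A short (or empty) page means the end of the row was reached.
        if len(page) < page_size:
            break
        offset += len(page)
```

This keeps each request small enough to finish before a timeout, at the cost of more round trips. It also assumes the row is not being modified while it is read; otherwise the offsets can shift between pages.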
