Yes, PHOENIX-1718 has about the same information.

On Dec 1, 2016 1:12 AM, "Ted Yu" <[email protected]> wrote:
> w.r.t. the "Unable to find cached index metadata" error, have you seen this?
>
> http://search-hadoop.com/m/Phoenix/9UY0h2YBSOhgbflB1?subj=Re+Global+Secondary+Index+ERROR+2008+INT10+Unable+to+find+cached+index+metadata+PHOENIX+1718
>
> Cheers
>
> On Wednesday, November 30, 2016 11:59 PM, Neelesh <[email protected]> wrote:
>
> Ted,
> We use HDP 2.3.4 (HBase 1.1.2, Phoenix 4.4, but with a lot of backports from later versions).
>
> The key of the data table is <customerid - 11 bytes><userid - 36 bytes, but it really is a right-padded long due to historic data><event type - long><timestamp><random long>.
>
> The two global indexes are <customerid - 11 bytes><event type - long><timestamp><user id> and <customerid - 11 bytes><campaign id - long><timestamp>.
> There are around 100B rows in the main table. The main issues we see are:
>
> # Sudden spikes in queueSize, going all the way to the 1 GB limit and staying there, without any correlated client traffic.
> # Boatloads of these errors:
>
>   2016-11-30 11:28:54,907 INFO [RW.default.writeRpcServer.handler=43,queue=9,port=16020] util.IndexManagementUtil: Rethrowing org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR 2008 (INT10): Unable to find cached index metadata. key=120521194876100862 region=<region-key>. Index update failed
>
> We have cross-datacenter WAL replication enabled.
> We saw PHOENIX-1718 and changed all recommended timeouts to 1 hour. Our HBase version has HBASE-11705. We also discovered that the queue size is global (across the general/replication/priority queues), so if it reaches the 1 GB limit, calls to all queues will be dropped. That was interesting because even though the replication handlers have a different queue, the size is counted globally, affecting the others. Please correct me on this. I hope I'm wrong on this one :)
>
> Our challenge has been to understand what HBase is doing under various scenarios. We monitor call queue lengths, sizes and latencies as the primary alerting mechanism to tell us something is going on with HBase.
>
> Thanks!
> -neelesh
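For concreteness, a minimal Phoenix DDL sketch that mirrors the key layout described above; all table and column names (EVENTS, CUSTOMER_ID, PAYLOAD and so on) are invented for illustration, only the key structure comes from this thread:

    -- Sketch only: hypothetical names, key layout as described in the thread
    CREATE TABLE IF NOT EXISTS EVENTS (
        CUSTOMER_ID CHAR(11)  NOT NULL,
        USER_ID     CHAR(36)  NOT NULL,   -- historically a right-padded long
        EVENT_TYPE  BIGINT    NOT NULL,
        EVENT_TS    TIMESTAMP NOT NULL,
        RND         BIGINT    NOT NULL,   -- random long at the end of the key
        CAMPAIGN_ID BIGINT,
        PAYLOAD     VARCHAR
        CONSTRAINT PK PRIMARY KEY (CUSTOMER_ID, USER_ID, EVENT_TYPE, EVENT_TS, RND)
    );

    -- The two global secondary indexes: every UPSERT into EVENTS also
    -- produces server-side writes to these index tables.
    CREATE INDEX IF NOT EXISTS EVENTS_BY_TYPE
        ON EVENTS (CUSTOMER_ID, EVENT_TYPE, EVENT_TS, USER_ID);

    CREATE INDEX IF NOT EXISTS EVENTS_BY_CAMPAIGN
        ON EVENTS (CUSTOMER_ID, CAMPAIGN_ID, EVENT_TS);

Roughly speaking, the data-table region servers maintain these index tables using index metadata that the writing client ships over and that is cached server-side for a limited time; when that cache entry is missing or has expired, the write fails with the "ERROR 2008 (INT10): Unable to find cached index metadata" message quoted above, which is what PHOENIX-1718 and the recommended timeout increases are about.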
> On Wed, Nov 30, 2016 at 1:15 PM, Ted Yu <[email protected]> wrote:
>
> Neelesh: Can you share more details about the sluggish cluster performance (such as the version of hbase / phoenix, your schema, region server log snippets, stack traces, etc.)?
> As hbase / phoenix evolve, I hope the performance keeps getting better for your use case.
> Cheers
>
> On Wednesday, November 30, 2016 10:07 AM, Neelesh <[email protected]> wrote:
>
> We use both, in different capacities. Cassandra is an x-DC archive store with mostly batch writes and occasional key-based reads. HBase is for real-time event ingestion. Our experience so far on HBase + Phoenix is that when it works, it is fast and scales like crazy. But if you ever hit a snag around data patterns, you will have a VERY hard time figuring out what's going on. A combination of global Phoenix indexes and heavy writes leaves an entire cluster sluggish if there is a hint of hotspotting.
>
> On the other hand, we had a big struggle with Cassandra when a node recovery was in progress, what with twice the amount of disk required during recovery, etc. Other than that, it is quiet. But the access patterns are not the same.
>
> I think the old rule still holds: if you are already on Hadoop, or interested in using/analysing data in several different ways, go with HBase. If you just need a big data store with a few predefined query patterns, Cassandra is good.
>
> Of course, I'm biased towards HBase.
>
> On Nov 30, 2016 7:02 AM, "Mich Talebzadeh" <[email protected]> wrote:
>
> > Hi Guys,
> >
> > I have used HBase on HDFS reasonably well, and I'm happy to stick with it, more so with Hive/Phoenix views and Phoenix indexes where I can.
> >
> > I have a bunch of users now vocal about the use case for Cassandra and whether it can do a better job than HBase.
> >
> > Unfortunately I am no expert on Cassandra, so some guidance on use-case fit would be very valuable.
> >
> > Thanks
> >
> > Dr Mich Talebzadeh
> >
> > LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> >
> > http://talebzadehmich.wordpress.com
> >
> > *Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
