How to scan only Memstore from end point co-processor

2015-06-01 Thread Gautam Borah
Hi all, Here is our use case: we have a very write-heavy cluster. We also run periodic end point co-processor based jobs, every 10 minutes, that operate on the data written in the last 10-15 minutes. Is there a way to query only the MemStore from the end point co-processor? The periodic job scans

Re: How to scan only Memstore from end point co-processor

2015-06-01 Thread ramkrishna vasudevan
We have a postScannerOpen hook in the CP, but that may not give you direct access to know which of the internal scanners are on the MemStore and which are on the store files. This is possible, but we may need to add some new hooks at the place where we explicitly add the internal

HBase client: refreshing the connection

2015-06-01 Thread Hariharan_Sethuraman
Hi All, We are using 0.94.15 in our Opendaylight/TSDR project currently. We observed a put operation hang for 20 mins (with all default timeouts) and then throw an IOException. Even when we re-attempt the same put operation, it hangs for 20 mins again. We observed there is a zxid mismatch on

Monitor off heap Bucket Cache

2015-06-01 Thread Dejan Menges
Hi, What's the best way to monitor / know how the bucket cache is being used, how much is cached there, etc.? Our RegionServers can use 32G of heap, so we exported HBASE_OFFHEAPSIZE as 24G in hbase-env.sh, set hfile.block.cache.size to 0.05, and set a couple of block sizes that we know we are
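For reference, the off-heap setup described above maps to configuration roughly like the following hbase-site.xml fragment (a sketch for the 0.98/1.0 era; values taken from the message, key names per the standard BucketCache configuration). Cache usage itself is exposed through the RegionServer web UI block-cache statistics and the JMX metrics endpoint (/jmx).

```xml
<!-- hbase-site.xml (sketch): off-heap BucketCache sized to fit within
     the HBASE_OFFHEAPSIZE=24G exported in hbase-env.sh -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <!-- size in MB in 0.98-era releases; leave headroom below the 24G cap -->
  <name>hbase.bucketcache.size</name>
  <value>22528</value>
</property>
<property>
  <!-- on-heap (L1) cache share, as set in the message -->
  <name>hfile.block.cache.size</name>
  <value>0.05</value>
</property>
```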

Java Hbase Client or Rest approach

2015-06-01 Thread Mahadevappa, Shobha
Hi, We have a Java-based web application. There is a requirement to fetch data from HBase and build some dashboards. What is the best way to go about fetching the data from HBase? 1) Using the Java HBase client API, or 2) Using the HBase REST API. I would appreciate it if anyone can provide the pros

Re: How to scan only Memstore from end point co-processor

2015-06-01 Thread Vladimir Rodionov
InternalScan has a ctor from a Scan object. See https://issues.apache.org/jira/browse/HBASE-12720 You can instantiate an InternalScan from a Scan, set checkOnlyMemStore, then open a RegionScanner, but the best approach is to cache data on write and run a regular RegionScanner over the memstore and block cache.
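The suggestion above can be sketched roughly as follows from inside an endpoint co-processor. Note that `InternalScan` lives in `org.apache.hadoop.hbase.regionserver` and is not public client API, so this is a sketch against internal classes (HBASE-12720 added the `InternalScan(Scan)` constructor); exact names and availability vary by version, and the wiring of `region` from the coprocessor environment is hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.regionserver.InternalScan;
import org.apache.hadoop.hbase.regionserver.RegionScanner;

// Sketch: restrict a scan to the MemStore only, skipping HFiles entirely.
// 'region' would come from the coprocessor environment, e.g.
// ((RegionCoprocessorEnvironment) env).getRegion().
public class MemstoreOnlyScanSketch {
  static List<Cell> scanMemstoreOnly(HRegion region, Scan clientScan) throws IOException {
    InternalScan iscan = new InternalScan(clientScan); // ctor added by HBASE-12720
    iscan.checkOnlyMemStore();                         // read only in-memory data
    List<Cell> all = new ArrayList<>();
    RegionScanner scanner = region.getScanner(iscan);
    try {
      List<Cell> batch = new ArrayList<>();
      boolean more;
      do {
        batch.clear();
        more = scanner.next(batch); // one row per call
        all.addAll(batch);          // or aggregate in place for an endpoint
      } while (more);
    } finally {
      scanner.close();
    }
    return all;
  }
}
```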

Re: hfile.bucket.BucketAllocatorException: Allocation too big size

2015-06-01 Thread Dejan Menges
Oh, cool, something that will push us to upgrade sooner rather than later :) Just for my information - what limit was used then in 2.1 as the maximum cache block size (or whatever name it was)? The size of the block, or something else? On Mon, Jun 1, 2015 at 5:00 PM Ted Yu yuzhih...@gmail.com wrote: Dejan:

Re: hfile.bucket.BucketAllocatorException: Allocation too big size

2015-06-01 Thread Ted Yu
Which HBase release are you using? I seem to recall that hbase.bucketcache.bucket.sizes was the key. Cheers On Mon, Jun 1, 2015 at 7:04 AM, Dejan Menges dejan.men...@gmail.com wrote: Hi, I'm getting messages like: 2015-06-01 14:02:29,529 WARN

Re: hfile.bucket.BucketAllocatorException: Allocation too big size

2015-06-01 Thread Ted Yu
Dejan: hbase.bucketcache.bucket.sizes was introduced by HBASE-10641 (Configurable Bucket Sizes in bucketCache), which was integrated into 0.98.4. HDP 2.2 has the fix while HDP 2.1 didn't. FYI On Mon, Jun 1, 2015 at 7:23 AM, Dejan Menges dejan.men...@gmail.com wrote: Hi Ted, It's 0.98.0 with
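With that fix in place, the override looks like the following hbase-site.xml fragment (a sketch: the value list is believed to be the shipped default set of bucket sizes in bytes, with one larger ~1MB bucket appended so that blocks like the 750465-byte one from the original error can be allocated; verify the defaults against your release):

```xml
<property>
  <name>hbase.bucketcache.bucket.sizes</name>
  <!-- default bucket sizes (bytes) plus an extra ~1MB bucket for oversized blocks -->
  <value>5120,9216,17408,33792,41984,50176,58368,66560,99328,132096,197632,263168,394240,525312,1049600</value>
</property>
```

The largest default bucket is 525312 bytes (513KB), which is why a 750465-byte block triggers "Allocation too big".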

Re: hfile.bucket.BucketAllocatorException: Allocation too big size

2015-06-01 Thread Dejan Menges
Hi Ted, It's 0.98.0 with a bunch of patches (from Hortonworks). Let me try with that key, on my way :) On Mon, Jun 1, 2015 at 4:19 PM Ted Yu yuzhih...@gmail.com wrote: Which HBase release are you using? I seem to recall that hbase.bucketcache.bucket.sizes was the key. Cheers On Mon, Jun

hfile.bucket.BucketAllocatorException: Allocation too big size

2015-06-01 Thread Dejan Menges
Hi, I'm getting messages like: 2015-06-01 14:02:29,529 WARN org.apache.hadoop.hbase.io.hfile.bucket.BucketCache: Failed allocating for block ce18012f4dfa424db88e92de29e76a9b_25809098330 org.apache.hadoop.hbase.io.hfile.bucket.BucketAllocatorException: Allocation too big size=750465 at

Re: hfile.bucket.BucketAllocatorException: Allocation too big size

2015-06-01 Thread Anoop John
Yes, Ted is right. hbase.bucketcache.bucket.sizes is the correct config name... I think the wrong name was added to hbase-default.xml. Was there a bug already raised for this? Something related to bucket cache was already there... I am not sure. We need a fix in the xml. -Anoop- On Mon, Jun 1, 2015 at

Re: Monitor off heap Bucket Cache

2015-06-01 Thread Nick Dimiduk
Also note that configuration is slightly changed between 0.98 and 1.0, see HBASE-11520. From the release note: Remove hbase.bucketcache.percentage.in.combinedcache. Simplifies config of block cache. If you are using this config., after this patch goes in, it will be ignored. The L1

PhoenixIOException resolved only after compaction, is there a way to avoid it?

2015-06-01 Thread Siva
Hi Everyone, We load data to HBase tables through bulk imports. If the data set is small, we can query the imported data from Phoenix with no issues. If the data size is huge (with respect to our cluster; we have a very small cluster), I'm encountering the following error

Re: Hbase vs Cassandra

2015-06-01 Thread Jerry He
Another point to add is the new HBase read high-availability using the timeline-consistent region replicas feature from HBase 1.0 onward, which brings HBase closer to Cassandra in terms of read availability during node failures. You have a choice for read availability now.
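The feature mentioned above is opt-in per request. A minimal client-side sketch, assuming a table `t1` (hypothetical name) already created with region replication greater than 1:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Consistency;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: a timeline-consistent read that may be served by a secondary
// replica if the primary region is unavailable.
public class TimelineReadSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("t1"))) {
      Get get = new Get(Bytes.toBytes("row1"));
      get.setConsistency(Consistency.TIMELINE); // allow possibly-stale replica reads
      Result result = table.get(get);
      // isStale() reports whether a secondary replica served this read
      System.out.println("stale=" + result.isStale());
    }
  }
}
```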

Re: zookeeper closing socket connection exception

2015-06-01 Thread Ted Yu
How many ZooKeeper servers do you have? Cheers On Mon, Jun 1, 2015 at 12:15 PM, jeevi tesh jeevitesh...@gmail.com wrote: Hi, I'm running into this issue several times but still not able to resolve it; kindly help me in this regard. I have written a crawler which keeps running for several

Re: zookeeper closing socket connection exception

2015-06-01 Thread Esteban Gutierrez
Hi Jeevi, Have you looked into why the ZooKeeper server is no longer accepting connections? What is the number of clients you have running per host, and what is the configured value of maxClientCnxns in the ZooKeeper servers? Also, is the issue impacting clients only, or is it also impacting the
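If the limit Esteban mentions turns out to be the cause, it is set in zoo.cfg on each ZooKeeper server (a sketch; 60 is the usual default, and 300 here is an illustrative value that should be tuned to how many client processes actually run per host):

```
# zoo.cfg: per-host client connection limit (default 60; 0 disables the check)
maxClientCnxns=300
```

A long-running crawler that opens new HBase connections without closing old ones can exhaust this limit from a single host, which matches the symptoms described below.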

zookeeper closing socket connection exception

2015-06-01 Thread jeevi tesh
Hi, I'm running into this issue several times but still not able to resolve it; kindly help me in this regard. I have written a crawler which keeps running for several days; after 4 days of continuous database interaction with my application system, the database fails to respond. I'm not able to

Re: Hbase vs Cassandra

2015-06-01 Thread Michael Segel
Well, since you brought up coprocessors… let's talk about the lack of security and stability that's been introduced by coprocessors. ;-) I'm not saying that you don't want server-side extensibility, but you need to recognize the risks introduced by coprocessors. On May 31, 2015, at 3:32 PM,

Re: Hbase vs Cassandra

2015-06-01 Thread Andrew Purtell
You are both making correct points, but FWIW HBase does not require use of Hadoop YARN or MapReduce. We do require HDFS of course. Some of the tools we ship are MapReduce applications but these are not core functions. We know of several large production use cases where the HBase(+HDFS) clusters

Re: Hbase vs Cassandra

2015-06-01 Thread Michael Segel
Saying Ambari rules is like saying that you like to drink MD 20/20 and calling it a fine wine. Sorry to all the Hortonworks guys, but Ambari has a long way to go…. very immature. What that has to do with Cassandra vs HBase? I haven't a clue. The key issue is that unless you need or want to

Re: Hbase vs Cassandra

2015-06-01 Thread Vladimir Rodionov
The key issue is that unless you need or want to use Hadoop, you shouldn't be using HBase. It's not a standalone product or system. Hello, what is the use case of a big data application w/o Hadoop? -Vlad On Mon, Jun 1, 2015 at 2:26 PM, Michael Segel michael_se...@hotmail.com wrote: Saying Ambari

Re: Hbase vs Cassandra

2015-06-01 Thread Otis Gospodnetic
Hi Ajay, You won't be able to get an unbiased opinion here easily. You'll need to try and see how each works for your use case. We use HBase for the SPM backend and it has worked well for us - it's stable, handles billions and billions of rows (I lost track of the actual number many moons ago) and

Re: Hbase vs Cassandra

2015-06-01 Thread Michael Segel
The point is that HBase is part of the Hadoop ecosystem, not a standalone database like Cassandra. This is one thing that gets lost when people want to compare NoSQL databases / data stores. As to Big Data without Hadoop? Well, there's Spark on Mesos … :-P And there are other Big Data

Re: How to scan only Memstore from end point co-processor

2015-06-01 Thread Gautam Borah
Thanks Vladimir. We will try this out soon. Regards, Gautam On Mon, Jun 1, 2015 at 12:22 AM, Vladimir Rodionov vladrodio...@gmail.com wrote: InternalScan has ctor from Scan object See https://issues.apache.org/jira/browse/HBASE-12720 You can instantiate InternalScan from Scan, set

Re: Hbase vs Cassandra

2015-06-01 Thread Russell Jurney
HBase can do range scans, and one can attack many problems with range scans. Cassandra can't do range scans. HBase has a master; Cassandra does not. Those are the two main differences. On Monday, June 1, 2015, Andrew Purtell andrew.purt...@gmail.com wrote: HBase can very well be a standalone

Re: Hbase vs Cassandra

2015-06-01 Thread Andrew Purtell
HBase can very well be a standalone database, but we are debating semantics not technology I suspect. HBase uses some Hadoop ecosystem technologies but is absolutely a first class data store. I need to look no further than my employer for an example of a rather large production deploy of HBase*

[OFFTOPIC] Big Data Application Meetup

2015-06-01 Thread Alex Baranau
Hi everyone, I wanted to drop a note about a newly organized developer meetup in the Bay Area: the Big Data Application Meetup (http://meetup.com/bigdataapps) and a call for speakers. The plan is for meetup topics to be focused on application use cases: how developers can build end-to-end solutions