Re: Issues with Spark On Hbase Connector

2016-08-28 Thread spats
Thanks Sachin. So it won't work with hbase 1.2.0 even if we use your code from shc branch?

Re: HBase for Small Key Value Tables

2016-08-28 Thread Dima Spivak
(Though if it is only 7 GB, why not just store it in memory?) On Sunday, August 28, 2016, Dima Spivak wrote: > If your data can all fit on one machine, HBase is not the best choice. I > think you'd be better off using a simpler solution for small data and leave > HBase for use cases that require

Re: HBase for Small Key Value Tables

2016-08-28 Thread Dima Spivak
If your data can all fit on one machine, HBase is not the best choice. I think you'd be better off using a simpler solution for small data and leave HBase for use cases that require proper clusters. On Sunday, August 28, 2016, Manish Maheshwari wrote: > We dont want to invest into another DB lik

Re: Issues with Spark On Hbase Connector

2016-08-28 Thread Sachin Jain
Hi Sudhir, There is a connection leak problem with the Hortonworks HBase connector if you use HBase 1.2.0. I tried to use Hortonworks' connector and ran into the same problem. Have a look at HBase issue HBASE-16017 [0]. The fix for this was backported to 1.3.0, 1.4.0 and 2.0.0. I have raised a tic

Re: HBase for Small Key Value Tables

2016-08-28 Thread Manish Maheshwari
We don't want to invest in another DB like Dynamo or Cassandra, and we are already in the Hadoop stack. Managing another DB would be a pain. The reason for HBase over an RDBMS is that we call HBase via Spark Streaming to look up the keys. Manish On Mon, Aug 29, 2016 at 1:47 PM, Dima Spivak wrote: > Hey Manish,

Re: HBase for Small Key Value Tables

2016-08-28 Thread Dima Spivak
Hey Manish, Just to ask the naive question, why use HBase if the data fits into such a small table? On Sunday, August 28, 2016, Manish Maheshwari wrote: > Hi, > > We have a scenario where HBase is used like a Key Value Database to map > Keys to Regions. We have over 5 Million Keys, but the tabl

HBase for Small Key Value Tables

2016-08-28 Thread Manish Maheshwari
Hi, We have a scenario where HBase is used like a key-value database to map keys to regions. We have over 5 million keys, but the table size is less than 7 GB. The read volume is pretty high, about 50x the put/delete volume. This causes hot spotting on the data node and the region is not split

Re: Issues with Spark On Hbase Connector

2016-08-28 Thread sudhir patil
Ok, thanks for the link Ted On Aug 29, 2016 9:54 AM, "Ted Yu" wrote: > For hortonworks product(s), consider raising question on > https://community.hortonworks.com > > FYI > > On Sun, Aug 28, 2016 at 6:45 PM, spats wrote: > > > Regarding hbase connector by hortonworks > > https://github.com/hor

Re: HBase Region Size of 2.5 TB

2016-08-28 Thread Ted Yu
Looking at source of IncreasingToUpperBoundRegionSplitPolicy, I don't see other parameters being used. FYI On Sun, Aug 28, 2016 at 5:58 PM, yeshwanth kumar wrote: > Hi Ted, > > thanks for the reply, > > i couldn't find the hbase.increasing.policy.initial.size in hbase conf, > we haven't changed

Re: Issues with Spark On Hbase Connector

2016-08-28 Thread Ted Yu
For hortonworks product(s), consider raising question on https://community.hortonworks.com FYI On Sun, Aug 28, 2016 at 6:45 PM, spats wrote: > Regarding hbase connector by hortonworks > https://github.com/hortonworks-spark/shc, it would be great if someone can > answer these > > 1. What version

Issues with Spark On Hbase Connector

2016-08-28 Thread spats
Regarding the HBase connector by Hortonworks, https://github.com/hortonworks-spark/shc, it would be great if someone can answer these: 1. What versions of HBase & Spark are expected? I could not run the examples provided using Spark 1.6.0 & HBase 1.2.0. 2. I get an error when I run the example provided here, any poi

Re: HBase Region Size of 2.5 TB

2016-08-28 Thread yeshwanth kumar
Hi Ted, thanks for the reply. I couldn't find hbase.increasing.policy.initial.size in the HBase conf; we haven't changed that value. So that means the initial region size should be 2 GB, but the region size is 2.5 TB. I can manually split the regions, but I'm trying to figure out the root cause. any other c

Re: Hbase Heap Size problem and Native API response is slow

2016-08-28 Thread Dima Spivak
And what kind of performance do you see vs. what you expect to see? How big is your cluster in production/how much total data will you be storing in production? On Sunday, August 28, 2016, Manjeet Singh wrote: > Hi > I performed this testing on 2 node cluster where its i7 core processor with > 1

Re: Hbase Heap Size problem and Native API response is slow

2016-08-28 Thread Manjeet Singh
Hi, I performed this testing on a 2-node cluster where each node has an i7 processor with 8 cores and 16 GB RAM. I have very frequent get/put operations on HBase using Spark Streaming and SQL, where we aggregate data in Spark by group and save it to HBase. Can you give us more specifics about what kind