Re: Change delimiter in column qualifier

2017-09-19 Thread Sachin Jain
If you look at the example from http://hbase.apache.org/book.html#_put_2, there is no delimiter: family and qualifier are two separate parameters to the add() method.
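A minimal sketch of what the linked book example describes, assuming a standard HBase 1.x client (where the older add() overload is exposed as addColumn()); the table, family, and qualifier names below are placeholders:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

object PutExample {
  def main(args: Array[String]): Unit = {
    val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val table = connection.getTable(TableName.valueOf("my_table")) // hypothetical table name
    try {
      // Family and qualifier are passed as separate byte[] arguments;
      // the client never builds a "family:qualifier" string, so no delimiter is involved.
      val put = new Put(Bytes.toBytes("row1"))
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"))
      table.put(put)
    } finally {
      table.close()
      connection.close()
    }
  }
}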

Change delimiter in column qualifier

2017-09-19 Thread Sachin Jain
Hi, I am using HBase in a system which does not allow using a colon between column name and column family. Is there any configuration where we can tell HBase to use underscore (_) as the delimiter instead of colon (:) between name and family? Thanks -Sachin

Re: Slow HBase write across data center

2017-06-29 Thread Sachin Jain
Try to figure out which region server is handling those writes; it could be that a particular region server is skewing your cluster's write performance. Another thing to check is whether your data is already skewed across regions/region servers. Once, when I faced this issue, I enabled multiwal and
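One rough way to check the first point, assuming an HBase 1.x client; the table name and row key below are placeholders, not from the thread:

import scala.collection.JavaConverters._
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.util.Bytes

object LocateWrites {
  def main(args: Array[String]): Unit = {
    val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val locator = connection.getRegionLocator(TableName.valueOf("my_table")) // hypothetical table name
    try {
      // List every region with the server hosting it; one server owning most of
      // the actively-written key range is a sign of skew.
      locator.getAllRegionLocations.asScala.foreach { location =>
        println(s"${location.getRegionInfo.getRegionNameAsString} -> ${location.getServerName}")
      }
      // Or locate the server that receives writes for one suspected hot row key.
      println(locator.getRegionLocation(Bytes.toBytes("someHotRowKey")).getServerName)
    } finally {
      locator.close()
      connection.close()
    }
  }
}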

Re: Implementation of full table scan using Spark

2017-06-28 Thread Sachin Jain
…flushedCellsSize, FlushMemstoreSize_num_ops. For Q2, there is no client side support for knowing where the data comes from.

Implementation of full table scan using Spark

2017-06-28 Thread Sachin Jain
Hi, I have used TableInputFormat and newAPIHadoopRDD defined on sparkContext to do a full table scan and get an RDD from it. A partial piece of the code looks like this: sparkContext.newAPIHadoopRDD( HBaseConfigurationUtil.hbaseConfigurationForReading(table.getName.getNameWithNamespaceInclAsString,
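A self-contained variant of that pattern, assuming Spark 1.x with the HBase 1.x mapreduce module on the classpath; HBaseConfigurationUtil in the post is the poster's own helper, so plain HBaseConfiguration and a placeholder table name are used here instead:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object FullTableScan {
  def main(args: Array[String]): Unit = {
    val sparkContext = new SparkContext(new SparkConf().setAppName("hbase-full-scan"))

    val hbaseConf = HBaseConfiguration.create()
    // TableInputFormat reads the table name from this configuration key.
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_namespace:my_table") // hypothetical table name

    val rdd = sparkContext.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])

    println(s"rows scanned: ${rdd.count()}")
    sparkContext.stop()
  }
}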

Re: Regarding Connection Pooling

2017-06-16 Thread Sachin Jain
On Mon, Jun 12, 2017 at 9:35 PM, Sachin Jain wrote: Thanks Allan, This is what I understood initially, that further calls will be serial if a request is already pending on some RS. I am running HBase 1.3.1. Is "hbase.client.ipc.pool.size" still valid? I thought it

Re: Regarding Connection Pooling

2017-06-12 Thread Sachin Jain
…socket to each RS, and the calls written to this socket are synchronized (or queued using another thread called CallSender). But usually, this won't become a bottleneck. If this is a problem for you, you can tune "hbase.client.ipc.pool.size".
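A small sketch of tuning that setting on the client Configuration before the Connection is created; the value shown is purely illustrative, not a recommendation:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.ConnectionFactory

object PooledConnection {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    // Number of RPC sockets the client keeps per region server (default is 1).
    conf.setInt("hbase.client.ipc.pool.size", 5) // illustrative value
    val connection = ConnectionFactory.createConnection(conf)
    // ... use the connection, then close it when the application shuts down.
    connection.close()
  }
}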

Re: Regarding Connection Pooling

2017-06-12 Thread Sachin Jain
On 12-Jun-2017 7:31 PM, "Allan Yang" wrote: Connection is thread safe. You can use it across different threads. And requests made by different threads are handled in parallel no matter whether the keys are in the same region or not.

Regarding Connection Pooling

2017-06-12 Thread Sachin Jain
Hi, I was going through connections in HBase. Here is a reference from the ConnectionFactory API doc: > Connection encapsulates all housekeeping for a connection to the cluster. All tables and interfaces created from returned connection share zookeeper connection, meta cache, and connections to region servers
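A minimal sketch of the usage pattern that javadoc implies, assuming HBase 1.x: one shared Connection per process, with short-lived Table instances per operation; the table, family, and qualifier names are placeholders:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
import org.apache.hadoop.hbase.util.Bytes

object SharedConnection {
  // One heavyweight, thread-safe Connection shared by the whole process.
  private val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())

  // Table instances are lightweight and not thread-safe: create, use, close per operation.
  def readCell(row: String): Option[Array[Byte]] = {
    val table = connection.getTable(TableName.valueOf("my_table")) // hypothetical table name
    try {
      Option(table.get(new Get(Bytes.toBytes(row))).getValue(Bytes.toBytes("cf"), Bytes.toBytes("col")))
    } finally {
      table.close()
    }
  }

  def main(args: Array[String]): Unit = {
    println(readCell("row1").map(Bytes.toString))
    connection.close()
  }
}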

Re: getting start and stop key

2017-06-06 Thread Sachin Jain
Just to add to @Ted Yu's answer, you can confirm this by looking at your HMaster UI and seeing the regions and their boundaries. On Tue, Jun 6, 2017 at 3:50 PM, Ted Yu wrote: > Looks like your table has only one region.
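Besides the HMaster UI, the same boundaries can be read programmatically; a short sketch assuming an HBase 1.x client and a placeholder table name:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.util.Bytes

object PrintRegionBoundaries {
  def main(args: Array[String]): Unit = {
    val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val locator = connection.getRegionLocator(TableName.valueOf("my_table")) // hypothetical table name
    try {
      // For a table with a single region, both arrays contain one empty key.
      val startEndKeys = locator.getStartEndKeys
      startEndKeys.getFirst.zip(startEndKeys.getSecond).foreach { case (start, stop) =>
        println(s"start=${Bytes.toStringBinary(start)} stop=${Bytes.toStringBinary(stop)}")
      }
    } finally {
      locator.close()
      connection.close()
    }
  }
}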

Re: Any Repercussions of using Multiwal

2017-06-06 Thread Sachin Jain
…if data ingestion continues but the flush is delayed, the memstore size might exceed the upper limit and thus throw RegionTooBusyException. Hope this information helps. Best Regards, Yu

Any Repercussions of using Multiwal

2017-06-05 Thread Sachin Jain
Hi, I was in the middle of a situation where I was getting *RegionTooBusyException* with a log message something like: *Above Memstore limit, regionName = X ... memstore size = Y and blockingMemstoreSize = Z*. This potentially hinted at *hotspotting* of a particular region. So I fixed my keyspace
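For context, one common way to spread a hot, mostly-sequential keyspace is to salt the row keys; a sketch of that idea (not necessarily the exact fix used in this thread), with an illustrative bucket count:

import org.apache.hadoop.hbase.util.Bytes

object SaltedKeys {
  val Buckets = 16 // illustrative bucket count

  // Prefix each row key with a hash-derived bucket so otherwise sequential keys
  // spread across regions (and hence region servers) instead of hitting one.
  def salt(rowKey: String): Array[Byte] = {
    val bucket = (rowKey.hashCode & Int.MaxValue) % Buckets
    Bytes.toBytes(f"$bucket%02d-$rowKey")
  }

  def main(args: Array[String]): Unit = {
    Seq("user-0001", "user-0002", "user-0003").foreach { k =>
      println(Bytes.toStringBinary(salt(k)))
    }
  }
}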

Re: Creating HBase table with presplits

2016-12-13 Thread Sachin Jain
…calculate your keyspace size by a lot, you are stuck with the hash function and range you selected even if you later get more regions, unless you're willing to do a complete migration to a new table. Hope the above helps. Saad

Re: Downsides of having large number of versions in hbase

2016-11-30 Thread Sachin Jain
…[0]: http://hbase.apache.org/book.html#schema.versions

Downsides of having large number of versions in hbase

2016-11-29 Thread Sachin Jain
Hi, I am curious to understand the impact of having a large number of versions in HBase. Suppose I want to maintain the previous 100 versions for a row/cell. My thoughts are: having a large number of versions means more HFiles, and more HFiles can increase the lookup time of a rowKey.
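For reference, a sketch of how the 100-version setup would be declared and read back with the HBase 1.x API; table, family, and qualifier names are placeholders:

import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
import org.apache.hadoop.hbase.util.Bytes

object HundredVersions {
  def main(args: Array[String]): Unit = {
    val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val admin = connection.getAdmin
    val tableName = TableName.valueOf("versioned_table") // hypothetical table name

    // Keep up to 100 versions per cell in family "cf".
    val descriptor = new HTableDescriptor(tableName)
    descriptor.addFamily(new HColumnDescriptor("cf").setMaxVersions(100))
    admin.createTable(descriptor)

    // Ask for all stored versions of one column on read.
    val table = connection.getTable(tableName)
    val get = new Get(Bytes.toBytes("row1"))
    get.setMaxVersions(100)
    val result = table.get(get)
    println(s"versions returned: ${result.getColumnCells(Bytes.toBytes("cf"), Bytes.toBytes("col")).size()}")

    table.close(); admin.close(); connection.close()
  }
}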

Re: Creating HBase table with presplits

2016-11-29 Thread Sachin Jain
…correct that there is no way to presplit your regions in an effective way. Either you need to make some starting guess, such as a small number of uniform splits, or wait until you have some information about what the data will look like. Dave
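A sketch of that "starting guess" approach with the HBase 1.x Admin API: a small number of uniform splits over the full byte range, which only pays off if row keys are roughly evenly distributed (for example, hashed or salted). Names and counts are placeholders:

import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory

object PresplitTable {
  def main(args: Array[String]): Unit = {
    val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val admin = connection.getAdmin

    val descriptor = new HTableDescriptor(TableName.valueOf("presplit_table")) // hypothetical table name
    descriptor.addFamily(new HColumnDescriptor("cf"))

    // Starting guess: 10 uniform regions between the single-byte keys 0x00 and 0xFF.
    admin.createTable(descriptor, Array[Byte](0x00), Array(0xFF.toByte), 10)

    admin.close()
    connection.close()
  }
}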

Creating HBase table with presplits

2016-11-28 Thread Sachin Jain
Hi, I was going through an article on pre-splitting a table [0] and it is mentioned that it is generally best practice to presplit your table. But don't we need to know the data in advance in order to presplit it? Question: What should be the best practice when we don't know what data is going to be inserted

Re: Default value of caching in Scanner

2016-11-01 Thread Sachin Jain
…issues.apache.org/jira/browse/HBASE-16973 recently, you can get more details there. Small world, isn't it? (Smile) Best Regards, Yu

Default value of caching in Scanner

2016-10-31 Thread Sachin Jain
Hi, I am using HBase v1.1.2. I have a few questions regarding full table scans: 1. When we instantiate a Scanner and do not set any caching on it, what is the value it picks by default? By looking at the code, I have found the following, from the documentation at the top of the Scan.java class:
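Whatever the default resolves to in a given version, it can always be pinned explicitly; a short sketch assuming an HBase 1.x client, with illustrative values and a placeholder table name:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Scan}

object ExplicitCaching {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    // Client-wide default used when a Scan does not set caching itself.
    conf.setInt("hbase.client.scanner.caching", 500) // illustrative value

    val connection = ConnectionFactory.createConnection(conf)
    val table = connection.getTable(TableName.valueOf("my_table")) // hypothetical table name

    // Per-scan override: number of rows fetched per RPC to the region server.
    val scan = new Scan()
    scan.setCaching(500)

    val scanner = table.getScanner(scan)
    try {
      var count = 0
      val it = scanner.iterator()
      while (it.hasNext) { it.next(); count += 1 }
      println(s"rows: $count")
    } finally {
      scanner.close()
      table.close()
      connection.close()
    }
  }
}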

Re: Issues with Spark On Hbase Connector

2016-08-29 Thread Sachin Jain
If you take my code then it should work. I have tested it on HBase 1.2.1. On Aug 29, 2016 12:21 PM, "spats" wrote: > Thanks Sachin. So it won't work with HBase 1.2.0 even if we use your code from the shc branch?

Re: Issues with Spark On Hbase Connector

2016-08-28 Thread Sachin Jain
Hi Sudhir, There is a connection leak problem with the Hortonworks HBase connector if you use HBase 1.2.0. I tried to use Hortonworks' connector and ran into the same problem. Have a look at this HBase issue, HBASE-16017 [0]. The fix for this was backported to 1.3.0, 1.4.0 and 2.0.0. I have raised a ticket

Re: How to get size of Hbase Table

2016-07-21 Thread Sachin Jain
…getTableRegions(final TableName tableName). From HRegion: public static HDFSBlocksDistribution computeHDFSBlocksDistribution(final Configuration conf, final HTableDescriptor tableDescriptor, final HRegionInfo regionInfo) throws IOException { FYI
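A different, rougher way to approximate table size than the APIs quoted above is to sum the table's directory on HDFS; a sketch assuming HBase 1.x and that the client can reach the HBase root dir ("my_table" is a placeholder):

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.util.FSUtils

object TableSizeOnDisk {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    val rootDir = FSUtils.getRootDir(conf)
    val fs = rootDir.getFileSystem(conf)

    // Sum the bytes under the table's directory in the HBase root dir.
    // This counts logical file lengths (store files), not HDFS replicas.
    val tableDir = FSUtils.getTableDir(rootDir, TableName.valueOf("my_table")) // hypothetical table name
    val bytes = fs.getContentSummary(tableDir).getLength
    println(s"approximate size on disk: $bytes bytes")
  }
}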

How to get size of Hbase Table

2016-07-20 Thread Sachin Jain
*Context* I am using Spark (1.5.1) with HBase (1.1.2) to dump the output of Spark jobs into HBase, which will be further available as lookups from the HBase table. BaseRelation extends HadoopFSRelation and is used to read and write to HBase. The Spark Default Source API is used. *Use Case* Now, whenever I