Re: HBase maven package built against Hadoop 2.X

2015-07-28 Thread Varun Sharma
Is it possible to upload one or is there a guide on how to upload ? Thanks Varun On Mon, Jul 27, 2015 at 2:24 PM, Ted Yu yuzhih...@gmail.com wrote: To my knowledge, there is no published maven artifact for 0.94 built against hadoop 2.X Cheers On Mon, Jul 27, 2015 at 2:21 PM, Varun Sharma

HBase maven package built against Hadoop 2.X

2015-07-27 Thread Varun Sharma
Hi, Is there a maven package for HBase 0.94 built against hadoop 2.X ? I think the default one available is built against Hadoop 1.X ? Thanks Varun

Questions about custom helix rebalancer/controller/agent

2014-07-31 Thread Varun Sharma
Hi, I am trying to write a customized rebalancing algorithm. I would like to run the rebalancer every 30 minutes inside a single thread. I would also like to completely disable Helix triggering the rebalancer. I have a few questions: 1) What's the best way to run the custom controller ? Can I

Re: Questions about custom helix rebalancer/controller/agent

2014-07-31 Thread Varun Sharma
Sorry - wrong user mailing list - please ignore... On Thu, Jul 31, 2014 at 12:12 PM, Varun Sharma va...@pinterest.com wrote: Hi, I am trying to write a customized rebalancing algorithm. I would like to run the rebalancer every 30 minutes inside a single thread. I would also like

Question about HFile indices

2014-07-14 Thread Varun Sharma
Hi folks, I am wondering why we have a tiered index in the HFile format. Is it because the root index must fit in memory and hence must be limited in size ? Does the bound on the root index pretty much dictate the index tiers ? Thanks Varun

Question about Prefix Encodings

2014-05-30 Thread Varun Sharma
Hi, I have a question about prefix encodings. When we specify encoding to be FAST_DIFF, PREFIX, are the index/bloom filter blocks also encoded with the same encoding. Also, are these blocks such as index/bloom blocks kept in the encoded form inside the block cache ? Thanks Varun

Re: Question about Prefix Encodings

2014-05-30 Thread Varun Sharma
Seems like it's called DATA_BLOCK_ENCODING, so it should only apply to data blocks ? On Fri, May 30, 2014 at 11:36 AM, Varun Sharma va...@pinterest.com wrote: Hi, I have a question about prefix encodings. When we specify encoding to be FAST_DIFF, PREFIX, are the index/bloom filter blocks

Re: no-flush based snapshot policy?

2014-04-02 Thread Varun Sharma
Seems like those JIRAs are 1.0 - did not see a 0.94 version # there ? On Wed, Apr 2, 2014 at 1:40 PM, Ted Yu yuzhih...@gmail.com wrote: Tianying: Have you seen the design doc attached to HBASE-7912 'HBase Backup/Restore Based on HBase Snapshot' ? Cheers On Tue, Mar 25, 2014 at 2:38

Cell values larger than Column Family Block Size

2014-02-24 Thread Varun Sharma
Hi, What happens if my block size is 32K while the cells are 50K. Do Hfile blocks round up to 50K or are values split across blocks ? Also how does this play with the block cache ? Thanks Varun

Re: Cell values larger than Column Family Block Size

2014-02-24 Thread Varun Sharma
:07 AM, Ted Yu yuzhih...@gmail.com wrote: Cycling old bits: http://search-hadoop.com/m/DHED4v7stT1/larger+HFile+block+size+for+very+wide+row&subj=larger+HFile+block+size+for+very+wide+row+ On Mon, Feb 24, 2014 at 11:51 AM, Varun Sharma va...@pinterest.com wrote: Hi, What happens if my

Re: Slow Get Performance (or how many disk I/O does it take for one non-cached read?)

2014-02-02 Thread Varun Sharma
Actually there are 2 read aheads in linux (from what I learned last time, I did benchmarking on random reads). One is the filesystem readahead which linux does and then there is also a disk level read ahead which can be modified by using the hdparm command. IIRC, there is no sure way of removing

Re: Sporadic memstore slowness for Read Heavy workloads

2014-01-28 Thread Varun Sharma
...@gmail.com wrote: Varun: Take a look at http://hbase.apache.org/book.html#dm.sort There's no contradiction. Cheers On Jan 27, 2014, at 11:40 PM, Varun Sharma va...@pinterest.com wrote: Actually, I now have another question because of the way our work load is structured. We use a wide

Re: Sporadic memstore slowness for Read Heavy workloads

2014-01-28 Thread Varun Sharma
, 'above' means before. 'below it' would mean after. So 'smaller' would mean before. Cheers On Tue, Jan 28, 2014 at 8:47 AM, Varun Sharma va...@pinterest.com wrote: Hi Ted, Not satisfied with your answer, the document you sent does not talk about Delete ColumnFamily marker sort order

Balancer switch runs causing problems

2014-01-27 Thread Varun Sharma
We are seeing one other issue with high read latency (p99 etc.) on one of our read heavy hbase clusters which is correlated with the balancer runs - every 5 minutes. If there is no balancing to do, does the balancer only scan the table every 5 minutes - does it do anything on top of that if the

Re: Balancer switch runs causing problems

2014-01-27 Thread Varun Sharma
(), we have (same for 0.94 and 0.96): for (RegionPlan plan: plans) { LOG.info("balance " + plan); Do you see such log in master log ? On Mon, Jan 27, 2014 at 7:26 PM, Varun Sharma va...@pinterest.com wrote: We are seeing one other issue with high read latency (p99 etc

Re: Balancer switch runs causing problems

2014-01-27 Thread Varun Sharma
Actually not sometimes but we are always seeing a large # of .META. reads every 5 minutes. On Mon, Jan 27, 2014 at 7:47 PM, Varun Sharma va...@pinterest.com wrote: The default one with 0.94.7... - I dont see any of those logs. Also we turned off the balancer switch - but looks like sometimes

Re: Sporadic memstore slowness for Read Heavy workloads

2014-01-27 Thread Varun Sharma
profiling has been focused on data in the blockcache since in the normal case the vast majority of all data is found there and only recent changes are in the memstore. -- Lars From: Varun Sharma va...@pinterest.com To: user@hbase.apache.org user

Re: Sporadic memstore slowness for Read Heavy workloads

2014-01-27 Thread Varun Sharma
and test them. -Vladimir On Mon, Jan 27, 2014 at 9:36 PM, Varun Sharma va...@pinterest.com wrote: Hi lars, Thanks for the background. It seems that for our case, we will have to consider some solution like the Facebook one, since the next column is always the next one - this can

Sporadic memstore slowness for Read Heavy workloads

2014-01-26 Thread Varun Sharma
Hi, We are seeing some unfortunately low performance in the memstore - we have researched some of the previous JIRA(s) and seen some inefficiencies in the ConcurrentSkipListMap. The symptom is a RegionServer hitting 100 % cpu at weird points in time - the bug is hard to reproduce and there isn't

Re: Sporadic memstore slowness for Read Heavy workloads

2014-01-26 Thread Varun Sharma
) org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4750) org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2152) org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3700) On Sun, Jan 26, 2014 at 1:14 PM, Varun Sharma va...@pinterest.com wrote: Hi, We

Question about hbase.rpc.timeout

2013-11-11 Thread Varun Sharma
Hi, Can hbase rpc timeout be changed across different HBase rpc calls for HBase 0.94. From the code, it looks like this is not possible ? I am wondering if there is a way to fix this ? Thanks Varun

Re: Question about hbase.rpc.timeout

2013-11-11 Thread Varun Sharma
, Varun Sharma va...@pinterest.com wrote: Hi, Can hbase rpc timeout be changed across different HBase rpc calls for HBase 0.94. From the code, it looks like this is not possible ? I am wondering if there is a way to fix this ? I'm guessing you have already dug in and noticed

Re: HBase Random Read latency 100ms

2013-10-08 Thread Varun Sharma
How many reads per second per region server are you throwing at the system - also is 100ms the average latency ? On Mon, Oct 7, 2013 at 2:04 PM, lars hofhansl la...@apache.org wrote: He still should not see 100ms latency. 20ms, sure. 100ms seems large; there are still 8 machines serving the

Re: Is there a problem with having 4000 tables in a cluster?

2013-09-24 Thread Varun Sharma
It's better to do some salting in your keys for the reduce phase. Basically, make your key be something like KeyHash + Key and then decode it in your reducer and write to HBase. This way you avoid the hotspotting problem on HBase due to MapReduce sorting. On Tue, Sep 24, 2013 at 2:50 PM, Jean-Marc
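The KeyHash + Key idea above can be sketched without any HBase dependency. This is only a minimal illustration, assuming a prefix made of the first 4 hex characters of an MD5 digest and a '|' separator; the class name HashPrefixedKey and both constants are hypothetical choices for this example, not something taken from the thread or from the HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: prefixes a key with a short, deterministic hash so that
// keys which sort next to each other are spread across the key space.
final class HashPrefixedKey {
    // Assumption for this sketch: 4 hex characters (2 digest bytes) of prefix.
    static final int PREFIX_HEX_CHARS = 4;
    // Assumption for this sketch: a single-character separator after the prefix.
    static final char SEPARATOR = '|';

    // Builds "<hash prefix>|<original key>". The same key always gets the same prefix.
    static String encode(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < PREFIX_HEX_CHARS / 2; i++) {
                sb.append(String.format("%02x", digest[i] & 0xff));
            }
            return sb.append(SEPARATOR).append(key).toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest not available", e);
        }
    }

    // Strips the fixed-width prefix and separator to recover the original key,
    // which is the "decode it in your reducer" step.
    static String decode(String prefixedKey) {
        return prefixedKey.substring(PREFIX_HEX_CHARS + 1);
    }
}
```

Because the prefix is derived from the key rather than chosen at random, a reader can recompute it and still look up a single row directly, which is the distinction drawn later in this thread between a truncated hash and a true salt.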

Re: Is there a problem with having 4000 tables in a cluster?

2013-09-24 Thread Varun Sharma
I would need to do a full table scan for every lookup. Jean-Marc : What problems do you see with my solution? Do you have a better suggestion? --Jeremy On Tue, Sep 24, 2013 at 3:16 PM, Varun Sharma va...@pinterest.com wrote: It's better to do some salting in your keys

Re: Is there a problem with having 4000 tables in a cluster?

2013-09-24 Thread Varun Sharma
is implied by the term salt.) What you really want to do is take the hash of the key, and then truncate the hash. Use that instead of a salt. Much better than a salt. Sent from a remote device. Please excuse any typos... Mike Segel On Sep 24, 2013, at 5:17 PM, Varun Sharma va

Re: HBase - stable versions

2013-09-04 Thread Varun Sharma
We, at Pinterest, are also going to stay on 0.94 for a while since it has worked well for us and we don't have the resources to test 0.96 in the EC2 environment. That may change in the future but we don't know when... On Wed, Sep 4, 2013 at 1:53 PM, Andrew Purtell apurt...@apache.org wrote: If

Re: Excessive .META scans

2013-08-01 Thread Varun Sharma
at 11:27 AM, Varun Sharma va...@pinterest.com wrote: JD, it's a big problem. The region server holding .META. has 2X the network traffic and 2X the cpu load, I can easily spot the region server holding .META. by just looking at the ganglia graphs of the region servers side by side - I don't

Re: Excessive .META scans

2013-07-30 Thread Varun Sharma
hbase.client.prefetch.limit to 0 Also, is it even causing a problem or you're just worried it might since it doesn't look normal? J-D On Mon, Jul 29, 2013 at 10:32 AM, Varun Sharma va...@pinterest.com wrote: Hi folks, We are seeing an issue with hbase 0.94.3 on CDH 4.2.0 with excessive .META

Excessive .META scans

2013-07-29 Thread Varun Sharma
Hi folks, We are seeing an issue with hbase 0.94.3 on CDH 4.2.0 with excessive .META. reads... In the steady state where there are no client crashes and there are no region server crashes/region movement, the server holding .META. is serving an incredibly large # of read requests on the .META.

Re: EC2 instance type recommendation ?

2013-07-16 Thread Varun Sharma
We have both c1.xlarge and hi1.4xlarge clusters at Pinterest. We have used the following guidelines: 1) hi1.4xlarge - small data sets, random read heavy and IOPs bound - very expensive per GB but very cheap per IOP 2) c1.xlarge/m1.xlarge - larger data sets, medium to low read load - cheap per GB

Re: GC recommendations for large Region Server heaps

2013-07-09 Thread Varun Sharma
Hi Suraj, One thing I have observed is that if you have very high block cache churn, which happens in a read heavy workload - a full GC eventually happens because more block cache blocks bleed into the old generation (LRU based caching). I have seen this happen particularly when the read load is

Re: optimizing block cache requests + eviction

2013-07-08 Thread Varun Sharma
FYI, if you disable your block cache - you will ask for Index blocks for every single request. So such a high rate of requests is plausible for Index blocks even when your requests are totally random on your data. Varun On Mon, Jul 8, 2013 at 4:45 PM, Viral Bajaria viral.baja...@gmail.com wrote:

Re: Issues with delete markers

2013-07-01 Thread Varun Sharma
- Original Message - From: Varun Sharma va...@pinterest.com To: d...@hbase.apache.org d...@hbase.apache.org; user@hbase.apache.org Cc: Sent: Sunday, June 30, 2013 1:56 PM Subject: Re: Issues with delete markers Sorry, typo, i meant that for user scans, should we be passing delete

Re: Issues with delete markers

2013-07-01 Thread Varun Sharma
I mean version tracking with delete markers... On Mon, Jul 1, 2013 at 8:17 AM, Varun Sharma va...@pinterest.com wrote: So, yesterday, I implemented this change via a coprocessor which basically initiates a scan which is raw, keeps tracking of # of delete markers encountered and stops when

Issues with delete markers

2013-06-30 Thread Varun Sharma
Hi, We are having an issue with the way HBase does handling of deletes. We are looking to retrieve 300 columns in a row but the row has tens of thousands of delete markers in it before we span the 300 columns something like this row DeleteCol1 Col1 DeleteCol2 Col2 ...

Re: Issues with delete markers

2013-06-30 Thread Varun Sharma
I tried this a little bit and it seems that filters are not called on delete markers. For raw scans returning delete markers, does it make sense to do that ? Varun On Sun, Jun 30, 2013 at 12:03 PM, Varun Sharma va...@pinterest.com wrote: Hi, We are having an issue with the way HBase does

Re: Issues with delete markers

2013-06-30 Thread Varun Sharma
Sorry, typo, I meant that for user scans, should we be passing delete markers through the filters as well ? Varun On Sun, Jun 30, 2013 at 1:03 PM, Varun Sharma va...@pinterest.com wrote: For user scans, I feel we should be passing delete markers through as well. On Sun, Jun 30, 2013 at 12

Question about local reads and HDFS 347

2013-06-25 Thread Varun Sharma
I was looking at HDFS 347 and the nice long story with impressive benchmarks and that it should really help with region server performance. The question I had was whether it would still help if we were already using the short circuit local reads setting already provided by HBase. Are there any

Re: Replication not suited for intensive write applications?

2013-06-20 Thread Varun Sharma
What is the ageOfLastShippedOp as reported on your Master region servers (should be available through the /jmx) - it tells the delay your edits are experiencing before being shipped. If this number is 1000 (in milliseconds), I would say replication is doing a very good job. This is the most

Re: Writing unit tests against HBase

2013-06-20 Thread Varun Sharma
-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java On Tue, Jun 18, 2013 at 4:22 PM, Stack st...@duboce.net wrote: On Tue, Jun 18, 2013 at 4:17 PM, Varun Sharma va...@pinterest.com wrote: Hi, If I wanted to write a unit test against HTable/HBase

Re: Replication not suited for intensive write applications?

2013-06-20 Thread Varun Sharma
On Thu, Jun 20, 2013 at 11:10 AM, Asaf Mesika asaf.mes...@gmail.com wrote: On Thu, Jun 20, 2013 at 7:12 PM, Varun Sharma va...@pinterest.com wrote: What is the ageOfLastShippedOp as reported on your Master region servers (should be available through the /jmx) - it tells the delay your edits

Re: Writing unit tests against HBase

2013-06-20 Thread Varun Sharma
org.apache.hadoop.hbase.regionserver.HRegionServer cleanup SEVERE: Failed init Is there a fix for this or can I disable the WAL here completely ? Varun On Thu, Jun 20, 2013 at 12:12 PM, Christophe Taton ta...@wibidata.com wrote: Hey Varun, On Thu, Jun 20, 2013 at 11:56 AM, Varun Sharma va...@pinterest.com wrote: Now that I think

Writing unit tests against HBase

2013-06-18 Thread Varun Sharma
Hi, If I wanted to write a unit test against HTable/HBase, is there an already available utility to do that for unit testing my application logic ? I don't want to write code that either touches production or requires me to mock an HTable. I am looking for a test htable object which behaves

Re: Multiple different failures

2013-06-01 Thread Varun Sharma
Are you saying 97 % data was lost or was it offlined until the region servers came back up ? Varun On Sat, Jun 1, 2013 at 6:31 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi, Today I faced a power outage. 4 computers stayed up. The 3 ZK servers, the Master, the NN and 2 DN/RS.

Major vs Minor compactions

2013-05-29 Thread Varun Sharma
Hi, I am working on some compaction coprocessors for a column family with versions set to 1. I am using the preCompact hook to wrap a scanner around the compaction scanner. I wanted to know what to expect from the minor compaction output. Can I assume the following: 1) Versions for each cell

Re: RS crash upon replication

2013-05-23 Thread Varun Sharma
for sure. Could you create a jira with logs/znode dump/steps to reproduce it? Thanks, himanshu On Wed, May 22, 2013 at 5:01 PM, Varun Sharma va...@pinterest.com wrote: It seems I can reproduce this - I did a few rolling restarts and got screwed with NoNode exceptions - I am running 0.94.7

Re: RS crash upon replication

2013-05-23 Thread Varun Sharma
thing and I'm still trying to figure out how to replace their jars with mine in a clean and non intrusive way On Thu, May 23, 2013 at 10:33 AM, Varun Sharma va...@pinterest.com wrote: Actually, it seems like something else was wrong here - the servers came up just fine on trying

Re: RS crash upon replication

2013-05-22 Thread Varun Sharma
Basically, you had va-p-hbase-02 crash - that caused all the replication related data in zookeeper to be moved to va-p-hbase-01 and have it take over for replicating 02's logs. Now each region server also maintains an in-memory state of what's in ZK, it seems like when you start up 01, it's trying

Re: RS crash upon replication

2013-05-22 Thread Varun Sharma
Also what version of HBase are you running ? On Wed, May 22, 2013 at 1:38 PM, Varun Sharma va...@pinterest.com wrote: Basically, You had va-p-hbase-02 crash - that caused all the replication related data in zookeeper to be moved to va-p-hbase-01 and have it take over for replicating 02's

Re: RS crash upon replication

2013-05-22 Thread Varun Sharma
,60020,1369249873379 [1] [zk: va-p-zookeeper-01-c:2181(CONNECTED) 2] ls /hbase/replication/rs/va-p-hbase-01-c,60020,1369249873379/1 [] I'm on hbase-0.94.2-cdh4.2.1 Thanks On Wed, May 22, 2013 at 11:40 PM, Varun Sharma va...@pinterest.com wrote: Also what version of HBase

Re: RS crash upon replication

2013-05-22 Thread Varun Sharma
amit.mor.m...@gmail.com wrote: empty return: [zk: va-p-zookeeper-01-c:2181(CONNECTED) 10] ls /hbase/replication/rs/va-p-hbase-01-c,60020,1369249873379/1 [] On Thu, May 23, 2013 at 12:05 AM, Varun Sharma va...@pinterest.com wrote: Do an ls not a get here and give the output ? ls /hbase

Re: RS crash upon replication

2013-05-22 Thread Varun Sharma
I see - so looks okay - there's just a lot of deep nesting in there - if you look into these nodes by doing ls - you should see a bunch of WAL(s) which still need to be replicated... Varun On Wed, May 22, 2013 at 2:16 PM, Varun Sharma va...@pinterest.com wrote: 2013-05-22 15:31:25,929

Re: RS crash upon replication

2013-05-22 Thread Varun Sharma
Can you do ls /hbase/rs and see what you get for 02-d - instead of looking in /replication/, could you look in /hbase/replication/rs - I want to see if the timestamps are matching or not ? Varun On Wed, May 22, 2013 at 2:17 PM, Varun Sharma va...@pinterest.com wrote: I see - so looks okay

Re: RS crash upon replication

2013-05-22 Thread Varun Sharma
Basically ls /hbase/rs and what do you see for va-p-02-d ? On Wed, May 22, 2013 at 2:19 PM, Varun Sharma va...@pinterest.com wrote: Can you do ls /hbase/rs and see what you get for 02-d - instead of looking in /replication/, could you look in /hbase/replication/rs - I want to see

Re: RS crash upon replication

2013-05-22 Thread Varun Sharma
, Varun Sharma va...@pinterest.com wrote: Basically ls /hbase/rs and what do you see for va-p-02-d ? On Wed, May 22, 2013 at 2:19 PM, Varun Sharma va...@pinterest.com wrote: Can you do ls /hbase/rs and see what you get for 02-d - instead of looking in /replication/, could you

Re: RS crash upon replication

2013-05-22 Thread Varun Sharma
:) be ? I have no serious problem running copyTable with a time period corresponding to the outage and then to start the sync back again. One question though, how did it cause a crash ? On Thu, May 23, 2013 at 12:32 AM, Varun Sharma va...@pinterest.com wrote: I believe there were

Re: Questions about HBase replication

2013-05-20 Thread Varun Sharma
a compaction happens at the slave after the Deletes are shipped to the slave, but before the Puts are shipped... The Puts will reappear. -- Lars From: Varun Sharma va...@pinterest.com To: user@hbase.apache.org Sent: Sunday, May 19, 2013 12:13 PM Subject

Re: Questions about HBase replication

2013-05-20 Thread Varun Sharma
the WAL has been replicated - is it purged immediately or soonish from the zookeeper ? Thanks Varun On Mon, May 20, 2013 at 9:57 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: On Mon, May 20, 2013 at 12:35 AM, Varun Sharma va...@pinterest.com wrote: Hi Lars, Thanks for the response

Re: Questions about HBase replication

2013-05-20 Thread Varun Sharma
On Mon, May 20, 2013 at 3:54 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: On Mon, May 20, 2013 at 3:48 PM, Varun Sharma va...@pinterest.com wrote: Thanks JD for the response... I was just wondering if issues have ever been seen with regards to moving over a large number of WAL(s

Re: Questions about HBase replication

2013-05-20 Thread Varun Sharma
So, we have a separate thread doing the recovered logs. That is good to know. I was mostly concerned about any potential races b/w the master renaming the log files, doing the distributed log split and doing a lease recovery over the final file when the DN also dies. Apart from that, it seemed to

Re: Question about HFile seeking

2013-05-17 Thread Varun Sharma
concatenated key (row+col. qualifier+timestamp) in which case, it would be difficult to run prefix scans since prefixes could potentially bleed across row and col. Varun On Thu, May 16, 2013 at 11:54 PM, Michael Stack st...@cloudera.com wrote: On Thu, May 16, 2013 at 3:26 PM, Varun Sharma va

Key Value collision

2013-05-16 Thread Varun Sharma
Hi, I am wondering what happens when we add the following: row, col, timestamp -- v1 A flush happens. Now, we add row, col, timestamp -- v2 A flush happens again. In this case if MAX_VERSIONS == 1, how is the tie broken during reads and during minor compactions, is it arbitrary ? Thanks

Question about HFile seeking

2013-05-16 Thread Varun Sharma
Let's say I have the following in my table: col1 row1 v1 -- HFile entry would be row1,col1,ts1--v1 ol1 row1c v2 -- HFile entry would be row1c,ol1,ts1--v2 Now I issue a prefix scan asking for row row1c, how do we seek - do we seek

Re: Question about HFile seeking

2013-05-16 Thread Varun Sharma
Or do we use some kind of demarcator b/w rows and columns and timestamps when building the HFile keys and the indices ? Thanks Varun On Thu, May 16, 2013 at 1:56 PM, Varun Sharma va...@pinterest.com wrote: Lets say I have the following in my table: col1 row1 v1
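The demarcation question can be made concrete with the two entries from the original post (row1/col1 and row1c/ol1). The sketch below is a simplified, hypothetical key layout, not HBase's actual one: it only compares a flat concatenation of row and qualifier with a layout that writes a 2-byte row length ahead of the row bytes. As far as I know the real KeyValue format also carries a row length, but the class DemarcatedKey and its two-field layout are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical two-field key, used only to show why a boundary marker matters.
final class DemarcatedKey {
    // Flat concatenation: the boundary between row and qualifier is lost.
    static byte[] flat(String row, String qualifier) {
        return (row + qualifier).getBytes(StandardCharsets.UTF_8);
    }

    // Length-prefixed layout: a 2-byte row length, then the row, then the qualifier.
    static byte[] lengthPrefixed(String row, String qualifier) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(2 + r.length + q.length);
        buf.putShort((short) r.length); // the row boundary is recoverable from this
        buf.put(r);
        buf.put(q);
        return buf.array();
    }
}
```

With flat concatenation both entries serialize to the same bytes ("row1col1"), so a reader cannot tell which part is the row. With the length prefix they differ, and a comparator can compare the row portion on its own before looking at the qualifier.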

Re: Question about HFile seeking

2013-05-16 Thread Varun Sharma
wrote: What you seeing Varun (or think you are seeing)? St.Ack On Thu, May 16, 2013 at 2:30 PM, Stack st...@duboce.net wrote: On Thu, May 16, 2013 at 2:03 PM, Varun Sharma va...@pinterest.com wrote: Or do we use some kind of demarcator b/w rows and columns and timestamps when building

Re: Question about HFile seeking

2013-05-16 Thread Varun Sharma
to grab the real portion from the concatenated HFile key and discard all row1 entries. Does that make my query clearer ? On Thu, May 16, 2013 at 2:42 PM, Varun Sharma va...@pinterest.com wrote: Nothing, I am just curious... So, we will do a bunch of wasteful scanning - that's lets say row1 has

Re: Question about HFile seeking

2013-05-16 Thread Varun Sharma
On Thu, May 16, 2013 at 2:55 PM, Varun Sharma va...@pinterest.com wrote: Sorry I may have misunderstood what you meant. When you look for row1c in the HFile index - is it going to also match for row1,col1 or only match row1c. It all depends how the index is organized, if its only on HFile keys

Running prefix scans in hbase

2013-05-15 Thread Varun Sharma
Hi, I was looking at PrefixFilter but going by the implementation - it looks we scan every row until we hit the prefix instead of seeking to the row with the required prefix. I was wondering if there are more efficient alternatives which would do a real seek rather than scanning all rows. Would

Re: Running prefix scans in hbase

2013-05-15 Thread Varun Sharma
Not after but only before hitting the prefix - I will check the startRow stuff - I could not find where the seek happens for that... On Wed, May 15, 2013 at 7:51 AM, Stack st...@duboce.net wrote: On Tue, May 14, 2013 at 11:33 PM, Varun Sharma va...@pinterest.com wrote: Hi, I
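One way to read the startRow suggestion is to turn the prefix into a row range: start at the prefix itself and stop at the smallest key that sorts after every key carrying that prefix. The helper below is a minimal sketch of that computation, assuming unsigned lexicographic byte ordering, an inclusive start row and an exclusive stop row. The class name PrefixRange is hypothetical, and returning an empty array to mean "no upper bound" is a convention chosen for this example.

```java
import java.util.Arrays;

// Hypothetical helper that computes the exclusive upper bound for a row prefix.
final class PrefixRange {
    // Returns the smallest key that sorts after all keys starting with prefix.
    // If the prefix is empty or consists only of 0xFF bytes there is no such key,
    // and an empty array is returned to signal "scan to the end".
    static byte[] stopRowForPrefix(byte[] prefix) {
        int i = prefix.length - 1;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++; // increment the last byte that can still be incremented
        return stop;
    }
}
```

For the prefix "row1" this yields "row2", so a scan over [row1, row2) covers exactly the rows with that prefix and can begin with a seek to the start row instead of filtering every earlier row.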

Re: Failed deleting my ephemeral node

2013-05-07 Thread Varun Sharma
Do you have NTP on your cluster - I have seen this manifest due to clock skew.. Varun On Tue, May 7, 2013 at 6:05 AM, Fabien Chung chung.fab...@gmail.com wrote: Hi all, I have a cluster with 8 machines (CDH4). I use an ETL (Talend) to insert data into hbase. Most of the time that works

Re: JVM seg fault in HBase region server

2013-05-06 Thread Varun Sharma
Did you have the jvm error logging enabled -XX:ErrorLog or something and if yes, did that spew anything out ? Thanks Varun On Sun, May 5, 2013 at 10:18 PM, tsuna tsuna...@gmail.com wrote: On Thu, May 2, 2013 at 1:49 PM, Andrew Purtell apurt...@apache.org wrote: In that blog post Benoît

Re: JVM seg fault in HBase region server

2013-05-02 Thread Varun Sharma
1.6.0u38. Is this possibly too old ? Thanks Varun On Thu, May 2, 2013 at 12:08 PM, Andrew Purtell apurt...@apache.org wrote: Can you pastebin or post somewhere the entire hs_err* file? On Wed, May 1, 2013 at 1:54 PM, Varun Sharma va...@pinterest.com wrote: Hi, I am seeing the following

Re: JVM seg fault in HBase region server

2013-05-02 Thread Varun Sharma
:39 PM, Varun Sharma va...@pinterest.com wrote: I don't have one unfortunately - We did not have the -XX:ErrorLog turned on :( But I did some digging following what Benoit wrote in his Blog. Basically the segfault happens in the same place inside a clearerr() function in glibc which

JVM seg fault in HBase region server

2013-05-01 Thread Varun Sharma
Hi, I am seeing the following which is a JVM segfault: hbase-regionser[28734]: segfault at 8 ip 7f269bcc307e sp 7fff50f7e638 error 4 in libc-2.15.so[7f269bc51000+1b5000] Benoit Tsuna reported a similar issue a while back -

Re: JVM seg fault in HBase region server

2013-05-01 Thread Varun Sharma
On Wed, May 1, 2013 at 1:54 PM, Varun Sharma va...@pinterest.com wrote: Hi, I am seeing the following which is a JVM segfault: hbase-regionser[28734]: segfault at 8 ip 7f269bcc307e sp 7fff50f7e638 error 4 in libc-2.15.so[7f269bc51000+1b5000] Benoit Tsuna reported a similar issue a while

Re: Slow region server recoveries

2013-04-21 Thread Varun Sharma
Hi Ted, Nicholas, Thanks for the comments. We found some issues with lease recovery and I patched HBASE 8354 to ensure we don't see data loss. Could you please look at HDFS 4721 and HBASE 8389 ? Thanks Varun On Sat, Apr 20, 2013 at 10:52 AM, Varun Sharma va...@pinterest.com wrote

Re: Slow region server recoveries

2013-04-20 Thread Varun Sharma
clarify this point it would be great. Cheers, Nicolas On Fri, Apr 19, 2013 at 10:10 PM, Varun Sharma va...@pinterest.com wrote: This is 0.94.3 hbase... On Fri, Apr 19, 2013 at 1:09 PM, Varun Sharma va...@pinterest.com wrote: Hi Ted, I had a long offline discussion

Re: Slow region server recoveries

2013-04-20 Thread Varun Sharma
The important thing to note is the block for this rogue WAL is UNDER_RECOVERY state. I have repeatedly asked HDFS dev if the stale node thing kicks in correctly for UNDER_RECOVERY blocks but failed. On Sat, Apr 20, 2013 at 10:47 AM, Varun Sharma va...@pinterest.com wrote: Hi Nicholas

Re: Slow region server recoveries

2013-04-19 Thread Varun Sharma
if + * the namenode has not received heartbeat msg from a + * datanode for more than staleInterval (default value is + * {@link DFSConfigKeys#DFS_NAMENODE_STALE_DATANODE_INTERVAL_MILLI_DEFAULT}), + * the datanode will be treated as stale node. On Fri, Apr 19, 2013 at 10:28 AM, Varun Sharma va

Re: Slow region server recoveries

2013-04-19 Thread Varun Sharma
, Apr 19, 2013 at 10:28 AM, Varun Sharma va...@pinterest.com wrote: Is there a place to upload these logs ? On Fri, Apr 19, 2013 at 10:25 AM, Varun Sharma va...@pinterest.com wrote: Hi Nicholas, Attached are the namenode, dn logs (of one of the healthy replicas of the WAL

Re: Slow region server recoveries

2013-04-19 Thread Varun Sharma
This is 0.94.3 hbase... On Fri, Apr 19, 2013 at 1:09 PM, Varun Sharma va...@pinterest.com wrote: Hi Ted, I had a long offline discussion with Nicholas on this. Looks like the last block, which was still being written to, took an enormous time to recover. Here's what happened. a) Master

Slow region server recoveries

2013-04-18 Thread Varun Sharma
Hi, We are facing problems with really slow HBase region server recoveries ~ 20 minutes. Version is hbase 0.94.3 compiled with hadoop.profile=2.0. Hadoop version is CDH 4.2 with HDFS 3703 and HDFS 3912 patched and stale node timeouts configured correctly. Time for dead node detection is still 10

Full row delete followed by Put

2013-04-08 Thread Varun Sharma
Hi, If I perform a full row Delete using the Delete API for a row and then after few milliseconds, issue a Put(row, Map of columns, values) - will that go through assuming that timestamps are applied in increasing order ? Thanks Varun

Re: Interactions between max versions and filters

2013-04-06 Thread Varun Sharma
0.94.6.1 to see if the problem is solved ? Writing a unit test probably is the easiest way for validation. Thanks On Fri, Apr 5, 2013 at 5:25 PM, Varun Sharma va...@pinterest.com wrote: HBASE 5257 is probably what lars is talking about - that fixed a bug with version

Recovery failure during single Get()

2013-04-06 Thread Varun Sharma
Hi, We have been observing this bug for a while when we use the HTable.get() operation to do a single Get call using the Result get(Get get) API and I thought it's best to bring it up. Steps to reproduce this bug: 1) Gracefully restart a region server causing regions to get redistributed. 2) Client call to

Re: Adding String offset for ColumnPaginationFilter

2013-04-05 Thread Varun Sharma
4, 2013 at 10:31 AM, Varun Sharma va...@pinterest.com wrote: Hi, I am thinking of adding a string offset to ColumnPaginationFilter. There are two reasons: 1) For deep pagination, you can seek using SEEK_NEXT_USING_HINT. 2) For correctness reasons, this approach is better if the list

Re: Interactions between max versions and filters

2013-04-05 Thread Varun Sharma
HBASE 5257 is probably what Lars is talking about - that fixed a bug with version tracking on ColumnPaginationFilter - there is a patch for 0.92, 0.94 and 0.96 but not for the cdh versions... On Fri, Apr 5, 2013 at 3:28 PM, lars hofhansl la...@apache.org wrote: Normally Filters are evaluated

Adding String offset for ColumnPaginationFilter

2013-04-04 Thread Varun Sharma
Hi, I am thinking of adding a string offset to ColumnPaginationFilter. There are two reasons: 1) For deep pagination, you can seek using SEEK_NEXT_USING_HINT. 2) For correctness reasons, this approach is better if the list of columns is mutating. Let's say you get 1st 50 columns using the current

GC performance/benchmark

2013-03-05 Thread Varun Sharma
Hey folks, I was wondering what kind of GC times do people see (preferably on ec2). Such as what is the typical time to collect 256M new generation on an X core machine. We are seeing a pause time of ~50 milliseconds on a c1.xlarge machine for 256M - this has 8 virtual cores. Is that typical ?

Re: HBase Thrift inserts bottlenecked somewhere -- but where?

2013-03-03 Thread Varun Sharma
What is the size of your writes ? On Sat, Mar 2, 2013 at 2:29 PM, Dan Crosta d...@magnetic.com wrote: Hm. This could be part of the problem in our case. Unfortunately we don't have very good control over which rowkeys will come from which workers (we're not using map-reduce or anything like

Re: HBase Thrift inserts bottlenecked somewhere -- but where?

2013-03-01 Thread Varun Sharma
Hi, I don't know how many worker threads you have at the thrift servers. Each thread gets dedicated to a single connection and only serves that connection. New connections get queued. Also, are you sure that you are not saturating the client side making the calls ? Varun On Fri, Mar 1, 2013 at

Re: HBase Thrift inserts bottlenecked somewhere -- but where?

2013-03-01 Thread Varun Sharma
Did you try running 30-40 proc(s) on one machine and another 30-40 proc(s) on another machine to see if that doubles the throughput ? On Fri, Mar 1, 2013 at 10:46 AM, Varun Sharma va...@pinterest.com wrote: Hi, I don't know how many worker threads you have at the thrift servers. Each thread

Re: HBase Thrift inserts bottlenecked somewhere -- but where?

2013-03-01 Thread Varun Sharma
: We are generating the load from multiple machines, yes. Do you happen to know what the name of the setting for the number of ThriftServer threads is called? I can't find anything that is obviously about that in the CDH manager. - Dan On Mar 1, 2013, at 1:46 PM, Varun Sharma wrote

Re: RE: GC frequency

2013-02-21 Thread Varun Sharma
if NewSize too low:) Generally speaking, most of YGC should be less than 5ms for a normal size heap. Maybe your load is too high or some VM options are misconfigured ? From: Varun Sharma [va...@pinterest.com] Sent: February 21, 2013 15:32 To: user

Re: RE: RE: GC frequency

2013-02-21 Thread Varun Sharma
size needs to go down in this case ? On Thu, Feb 21, 2013 at 12:50 AM, 谢良 xieli...@xiaomi.com wrote: Here is a good formula to estimate: http://blog.ragozin.info/2011/06/understanding-gc-pauses-in-jvm-hotspots.html Hope it helps :) From: Varun Sharma

GC frequency

2013-02-20 Thread Varun Sharma
Hi, I have a system tuned with new Gen 512M with a lot of write load. The system has 4 cores - ParNewGC and GCThreads is set to 4. I am using ConcMarkGC and the CMSInitiating fraction is set to 60 %. I am observing the 90th/99th percentile of latency and see it highly correlated with GC pauses. There

Re: Optimizing Multi Gets in hbase

2013-02-19 Thread Varun Sharma
filter (may have to implement your own filter, though). Maybe we could add a version of RowFilter that matches against multiple keys. -- Lars From: Varun Sharma va...@pinterest.com To: user@hbase.apache.org Sent: Monday, February 18, 2013 1:57 AM

Re: Optimizing Multi Gets in hbase

2013-02-19 Thread Varun Sharma
of the scan, and create multiple scan if necessary. I'm also interested in Lars' opinion on this. Nicolas On Tue, Feb 19, 2013 at 4:52 PM, Varun Sharma va...@pinterest.com wrote: I have another question, if I am running a scan wrapped around multiple rows in the same region

Optimizing Multi Gets in hbase

2013-02-18 Thread Varun Sharma
Hi, I am trying to do batched get(s) on a cluster. Here is the code: List<Get> gets = ... // Prepare my gets with the rows I need myHTable.get(gets); I have two questions about the above scenario: i) Is this the most optimal way to do this ? ii) I have a feeling that if there are multiple gets in
