Re: Uneven write request to regions

2013-11-19 Thread Himanshu Vashishtha
Re: "The 32 limit makes HBase go into stress mode, and dumps all regions contained in those 32 WAL files." Pardon, I haven't read all your data points/details thoroughly, but the above statement is not true. Rather, it looks at the oldest WAL file, and flushes those regions which would free
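The flush-selection behavior described above can be sketched as a small simulation. This is an illustrative stand-in, not HBase source: the function names and data layout are assumptions, but the logic mirrors the description — when the WAL count exceeds the limit, flush the regions whose unflushed edits pin the oldest WAL, so that file can be archived.

```python
# Illustrative sketch (not HBase code): when the WAL file count exceeds the
# limit, the region server flushes the regions that still hold unflushed
# edits in the oldest WAL, which frees that file for archiving.
def regions_to_flush(wals, max_logs):
    """wals: list of WAL files, oldest first; each is the set of region
    names that still have unflushed edits in that file."""
    if len(wals) <= max_logs:
        return set()      # under the limit, no forced flush needed
    return set(wals[0])   # regions pinning the oldest WAL file

# Example: 3 WALs with a limit of 2 -> flush the oldest WAL's regions.
wals = [{"region-a", "region-b"}, {"region-b", "region-c"}, {"region-c"}]
print(sorted(regions_to_flush(wals, max_logs=2)))  # ['region-a', 'region-b']
```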

Re: Migrate 0.94 to remote 0.96

2013-11-11 Thread Himanshu Vashishtha
hbase upgrade... It seems to be pretty easy. I will update this thread with the result... JM 2013/11/10 Himanshu Vashishtha hv.cs...@gmail.com JM, Did you look at the upgrade section in the book, http://hbase.apache.org/upgrading.html#upgrade0.96 It does in-place upgrade

Re: Migrate 0.94 to remote 0.96

2013-11-10 Thread Himanshu Vashishtha
JM, Did you look at the upgrade section in the book, http://hbase.apache.org/upgrading.html#upgrade0.96 It does in-place upgrade of a 94 installation to 96. In case your 96 is fresh, you could dump/copy all your 94 data under root dir, and run the upgrade script. No, 96 doesn't convert each

Re: Unable to run HBase 0.96 with Hadoop 2.1-beta

2013-10-07 Thread Himanshu Vashishtha
0.96 doesn't have a ROOT table; and it looks like yours is not a new installation (hdfs://localhost:8020/hbase). If you are coming from a 0.94.x, follow the upgrade steps here: http://hbase.apache.org/book.html#upgrade0.96 On Mon, Oct 7, 2013 at 10:42 AM, Thomas Bailet

Re: Warning messages in hbase logs

2013-09-26 Thread Himanshu Vashishtha
On Thu, Sep 26, 2013 at 4:07 AM, Kiru Pakkirisamy kirupakkiris...@yahoo.com wrote: I was on 0.94.11 and still got the same msgs as Vimal. Ultimately, I moved to 0.95.2. You know it is a dev release, right? Regards, - kiru Kiru Pakkirisamy | webcloudtech.wordpress.com

Re: HBase Table Row Count Optimization - A Solicitation For Help

2013-09-26 Thread Himanshu Vashishtha
Sorry for chiming in late here. The Aggregation coprocessor works well for smaller datasets, or in case you are computing it on a range of a table. During its development phase, I used to do row count of 1m, 10m rows (spanning across about 25 regions for the test table). In its current form, I

Re: 0.95.2 upgrade errors

2013-09-16 Thread Himanshu Vashishtha
From: Himanshu Vashishtha hv.cs...@gmail.com To: user@hbase.apache.org; Kiru Pakkirisamy kirupakkiris...@yahoo.com Cc: Ted Yu yuzhih...@gmail.com Sent: Friday, September 13, 2013 4:18 PM Subject: Re: 0.95.2 upgrade errors Is there something odd (exceptions on starting) in Master logs? Does

Re: 0.95.2 upgrade errors

2013-09-13 Thread Himanshu Vashishtha
:1861) Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:111) Regards, - kiru From: Himanshu Vashishtha hv.cs...@gmail.com To: user@hbase.apache.org; Kiru Pakkirisamy

Re: HBase replication - EOF while reading

2013-07-02 Thread Himanshu Vashishtha
Patrick, What is the HBase version you are using for the master cluster? If 0.94.8, does it have HBASE-7122? https://issues.apache.org/jira/browse/HBASE-7122 Thanks, Himanshu On Tue, Jul 2, 2013 at 3:09 PM, Patrick Schless patrick.schl...@gmail.com wrote: I've just enabled replication (to 1 peer), and I'm

Re: 答复: flushing + compactions after config change

2013-07-01 Thread Himanshu Vashishtha
bq. On Thu, Jun 27, 2013 at 4:27 PM, Viral Bajaria viral.baja...@gmail.com wrote: It's not random, it picks the region with the most data in its memstores. That's weird, because I see some of my regions which receive the least amount of data in a given time period flushing before the regions

Re: Weird Replication exception

2013-06-02 Thread Himanshu Vashishtha
Hey Asaf, It looks like you only need 7122. Either upgrade, or you could also patch it up. Syncing up the master and slave clusters is also advised, but that applies in case you are using master-master replication. bq. 172.25.98.74,60020, 1369903540894/172.25.98.74

Re: HConnectionManager$HConnectionImplementation.locateRegionInMeta

2013-05-30 Thread Himanshu Vashishtha
bq. Anoop attached backported patch in HBASE-8655. It should go into 0.94.9, the next release - current is 0.94.8 In case you want it sooner, you can apply 8655 patch and test/verify it. Thanks, Himanshu On Thu, May 30, 2013 at 7:26 AM, Ted Yu yuzhih...@gmail.com wrote: Anoop attached

Re: RS crash upon replication

2013-05-22 Thread Himanshu Vashishtha
I'd suggest patching the code with 8207; cdh4.2.1 doesn't have it. With hyphens in the name, ReplicationSource gets confused and tries to set data in a znode which doesn't exist. Thanks, Himanshu On Wed, May 22, 2013 at 2:42 PM, Amit Mor amit.mor.m...@gmail.com wrote: yes, indeed -

Re: RS crash upon replication

2013-05-22 Thread Himanshu Vashishtha
- I am running 0.94.7 which has the fix but my nodes don't contain hyphens - nodes are no longer coming back up... Thanks Varun On Wed, May 22, 2013 at 3:02 PM, Himanshu Vashishtha hv.cs...@gmail.com wrote: I'd suggest to please patch the code with 8207; cdh4.2.1 doesn't have

Re: Under Heavy Write Load + Replication On : Brings All My Region Servers Dead

2013-04-17 Thread Himanshu Vashishtha
Hello Ameya, Sorry to hear that. You have two options: 1) Apply the HBASE-8099 patch to your version (https://issues.apache.org/jira/browse/HBASE-8099). The patch is simple, so it should be easy to do, OR, 2) Turn off the zk.multi feature (see hbase-default.xml). (You can refer to the CDH4.2.0 docs for that)

Re: HBASE HA

2013-03-30 Thread Himanshu Vashishtha
Hey Azurry, I tried to answer your comments on the jira. Please have a look. Thanks, Himanshu On Fri, Mar 29, 2013 at 11:37 PM, Azuryy Yu azury...@gmail.com wrote: but I dont think HBASE-8211 is really support HDFS-HA, I have comments on it. On Mar 29, 2013 11:08 PM, Ted Yu

Re: NPE in log cleaner

2013-03-26 Thread Himanshu Vashishtha
How did you enable it? Going by the line numbers of the stacktrace, it is very unlikely to have an NPE there. Did you see anything suspicious in the rs-logs on enabling it? Can you pastebin more rs logs when you enable it? Himanshu On Tue, Mar 26, 2013 at 9:57 AM, Jeff Whiting je...@qualtrics.com

Re: Replication Configuration Multiple Peers, Which Correct?

2013-03-15 Thread Himanshu Vashishtha
1 zk1,zk2,zk3,zk4,zk5:2181:/hbase This is correct. You only want one peer, right? So, adding separate zk servers as different peers is not the right way. Read up some of the replication blogs we have; that should help. Thanks, Himanshu On Thu, Mar 14, 2013 at 8:13 PM, Time
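The point above — one peer entry carries the entire slave quorum — can be made explicit with a rough parser for the `hosts:port:znode` peer string shown. This is an illustrative sketch of the string format, not HBase client code:

```python
# Illustrative parser for the replication peer string format
# "host1,host2,...:port:znode" -- a single peer names the whole quorum.
def parse_peer(peer):
    hosts, port, znode = peer.rsplit(":", 2)
    return hosts.split(","), int(port), znode

hosts, port, znode = parse_peer("zk1,zk2,zk3,zk4,zk5:2181:/hbase")
print(hosts)  # ['zk1', 'zk2', 'zk3', 'zk4', 'zk5']
print(port)   # 2181
print(znode)  # /hbase
```

Adding each zk server as its own peer would instead create five peers, i.e., five independent replication streams — which is why the single-entry form is the correct one.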

Re: Regionserver goes down while endpoint execution

2013-03-14 Thread Himanshu Vashishtha
the fetch would be from rowkey11 to rowkey20, irrespective of the region servers? Regards, Deepak -----Original Message----- From: hv.cs...@gmail.com [mailto:hv.cs...@gmail.com] On Behalf Of Himanshu Vashishtha Sent: Wednesday, March 13, 2013 12:09 PM

Re: Regionserver goes down while endpoint execution

2013-03-13 Thread Himanshu Vashishtha
On Wed, Mar 13, 2013 at 8:19 AM, Kumar, Deepak8 deepak8.ku...@citi.com wrote: Thanks guys for assisting. I am still getting the OOM exception. I have one query about Endpoints. As endpoints execute in parallel, if I have a table which is distributed across 101 regions on 5 regionservers, would it

Re: Regionserver goes down while endpoint execution

2013-03-12 Thread Himanshu Vashishtha
I don't see the RS dying with this. It says that it is taking more time than 60 sec (the default timeout for clients), and therefore it stops processing the coprocessor call as the client is disconnected. Is your cluster okay? How many rows are in the table? Does a normal scan work fine? Can you share more about

Re: Is there a way to view multiple versions of a cell in the HBase shell?

2013-03-07 Thread Himanshu Vashishtha
Hey Natty,
hbase(main):020:0> get 't', 'r', { COLUMN => 'f', VERSIONS => 2 }
COLUMN    CELL
 f:l      timestamp=1362676782345, value=value2
 f:l      timestamp=1362676779492, value=value1
2 row(s) in

Re: exception during coprocessor run

2013-01-13 Thread Himanshu Vashishtha
How much time do your CP calls take? More than 60 sec? On Sun, Jan 13, 2013 at 11:42 AM, Andrew Purtell apurt...@apache.org wrote: This means your client disconnected. On Sun, Jan 13, 2013 at 6:04 AM, Skovronik, Amir askov...@akamai.com wrote: Hi I am using version 0.94.3, while running an end

Re: [ANNOUNCE] New Apache HBase Committers - Matteo Bertozzi and Chunhui Shen

2013-01-02 Thread Himanshu Vashishtha
Congrats Matteo and Chunhui! :) On Wed, Jan 2, 2013 at 11:40 AM, Jimmy Xiang jxi...@cloudera.com wrote: Congratulations! Matteo and Chunhui!! On Wed, Jan 2, 2013 at 11:37 AM, Jonathan Hsieh j...@cloudera.com wrote: Along with bringing in the new year, we've brought in two new Apache HBase

Re: Master Master replication

2012-11-13 Thread Himanshu Vashishtha
On Tue, Nov 13, 2012 at 1:36 PM, Varun Sharma va...@pinterest.com wrote: Hi, I want to setup a master-master replicated cluster using Hbase 0.94 - the two clusters will be in different availability zones. I was wondering if the following would work: 1) Setup cluster A and start

Re: RS not processing any requests

2012-09-05 Thread Himanshu Vashishtha
Your RS priority handlers are blocked on a meta lookup, so it becomes unresponsive. Looks like you are hitting https://issues.apache.org/jira/browse/HBASE-6165 Are you running HBase replication? Just confirming. Himanshu On Wed, Sep 5, 2012 at 4:39 PM, Stack st...@duboce.net wrote: On Wed, Sep 5, 2012 at

Re: RS not processing any requests

2012-09-05 Thread Himanshu Vashishtha
would you have to prevent the problem? ~Jeff On 9/5/2012 5:23 PM, Himanshu Vashishtha wrote: The number of PRI handlers is governed by hbase.regionserver.metahandler.count; the default is 10. Increasing their number will not solve it, but will delay its occurrence (I don't know about your load etc

Re: Table is neither in disabled nor in enabled state

2012-08-16 Thread Himanshu Vashishtha
I am assuming you initiated the disable table request at the shell. Is it possible to have the master server logs since you initiated the above request? I think the znode Stack is referring to is in DISABLING state; deleting it should resolve it, but it would be good to know the root cause. Can you look at the UI,

Re: Coprocessor POC

2012-07-30 Thread Himanshu Vashishtha
TestAggregateProtocol methods for usage. Thanks Himanshu Regards Cyril SCETBON On Jul 30, 2012, at 12:50 AM, Himanshu Vashishtha hvash...@cs.ualberta.ca wrote: And also, what are your cell values look like? Himanshu On Sun, Jul 29, 2012 at 3:54 PM, yuzhih...@gmail.com wrote: Can you use

Re: Coprocessor POC

2012-07-30 Thread Himanshu Vashishtha
SCETBON On Jul 30, 2012, at 5:56 PM, Himanshu Vashishtha hvash...@cs.ualberta.ca wrote: On Mon, Jul 30, 2012 at 6:55 AM, Cyril Scetbon cyril.scet...@free.fr wrote: I've given the values returned by scan 'table' command in hbase shell in my first email. Somehow I missed the scan result in your

Re: Coprocessor POC

2012-07-29 Thread Himanshu Vashishtha
And also, what do your cell values look like? Himanshu On Sun, Jul 29, 2012 at 3:54 PM, yuzhih...@gmail.com wrote: Can you use 0.94 for your client jar? Please show us the NullPointerException stack. Thanks On Jul 29, 2012, at 2:49 PM, Cyril Scetbon cyril.scet...@free.fr wrote: Hi,

Re: Unable to run aggregation using AggregationClient in HBase0.92

2012-05-07 Thread Himanshu Vashishtha
org.apache.hadoop.hbase.ipc.HBaseRPC$UnknownProtocolException: means the coprocessor is not registered. You should read https://blogs.apache.org/hbase/entry/coprocessor_introduction, especially the deployment section. Thanks, Himanshu On Mon, May 7, 2012 at 12:44 PM, anil gupta

Re: aggregation performance

2012-05-03 Thread Himanshu Vashishtha
I did some experiments which compares scan, coprocessor and mapreduce approach, in an ec2 environment. You may find it interesting: http://hbase-coprocessor-experiments.blogspot.com/2011/05/extending.html Thanks, Himanshu On Thu, May 3, 2012 at 11:02 AM, James Taylor jtay...@salesforce.com

Re: HBase Cyclic Replication Issue: some data are missing in the replication for intensive write

2012-05-01 Thread Himanshu Vashishtha
Hello Jerry, Did you try this again. Whenever you try next, can you please share the logs somehow. I tried replicating your scenario today, but no luck. I used the same workload you have copied here; master cluster has 5 nodes and slave has just 2 nodes; and made tiny regions of 8MB (memstore

Re: HBase Cyclic Replication Issue: some data are missing in the replication for intensive write

2012-05-01 Thread Himanshu Vashishtha
. By the way, is your test running with master-slave replication or master-master replication? I will resume this again. I was busy on something else for the past week or so. Best Regards, Jerry On 2012-05-01, at 6:41 PM, Himanshu Vashishtha wrote: Hello Jerry, Did you try this again

Re: Coprocessor execution

2012-03-21 Thread Himanshu Vashishtha
Hello, Any info about specific numbers (number of regions vs response time, etc.) will help. Btw, for rowcount, you should use FirstKeyOnlyFilter. And in your code, you should add a callback to sum individual Region-side results (though that is not related to response time, but to your rowcount
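The per-region callback pattern suggested above can be sketched as follows. This is an illustrative stand-in for the endpoint client's callback (the class and method names here are assumptions, not HBase's actual Batch.Callback API): each region reports its partial count, and the callback accumulates the total.

```python
# Illustrative sketch of endpoint-style rowcount aggregation: each region
# returns a partial count, and a client-side callback sums them up.
class RowCountCallback:
    def __init__(self):
        self.total = 0

    def update(self, region, partial_count):
        # invoked once per region with that region's row count
        self.total += partial_count

cb = RowCountCallback()
for region, count in [("region-1", 120), ("region-2", 80), ("region-3", 300)]:
    cb.update(region, count)
print(cb.total)  # 500
```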

Re: Scan.addFamiliy reduces results

2012-03-15 Thread Himanshu Vashishtha
Let's also say there are 1000 rows with A,B,C and 500 rows with only B and C. If I add families A, B and C and scan with no filter will I get 1500, 1000 or 500 results? In this case, you will get 1000 rows. In case you add only B, you will get 500 rows. It's not like if you add families A, B

Re: AggregateProtocol Help

2012-01-03 Thread Himanshu Vashishtha
1, 2012 at 5:53 PM, Ted Yu yuzhih...@gmail.com wrote: Thanks for the reminder Himanshu. Royston: From this blog you can get some history on this subject: http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor .html On Sun, Jan 1, 2012 at 5:18 PM, Himanshu Vashishtha hvash

Re: AggregateProtocol Help

2012-01-03 Thread Himanshu Vashishtha
. Royston: From this blog you can get some history on this subject: http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocess or .html On Sun, Jan 1, 2012 at 5:18 PM, Himanshu Vashishtha hvash...@cs.ualberta.ca wrote: Hello Royston, Sorry

Re: AggregateProtocol Help

2012-01-01 Thread Himanshu Vashishtha
Hello Royston, Sorry to hear that you are getting trouble while using Aggregation functionalities. 557k rows seems to be a small table and a SocketTimeout does not seem to be an ok response. It will be good to know the region distribution as such. (how many regions? Is it a full table scan?)

Re: speeding up rowcount

2011-10-09 Thread Himanshu Vashishtha
Since a MapReduce is a separate process, try with a high Scan cache value. http://hbase.apache.org/book.html#perf.hbase.client.caching Himanshu On Sun, Oct 9, 2011 at 9:09 AM, Ted Yu yuzhih...@gmail.com wrote: I guess your hbase.hregion.max.filesize is quite high. If possible, lower its value
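Why a higher scan cache helps here: each scanner round trip fetches up to `caching` rows, so the RPC count drops roughly in proportion. A back-of-the-envelope sketch (a rough model, not actual client code):

```python
import math

# Rough model: one RPC per batch of `caching` rows fetched by the scanner.
def scanner_rpcs(total_rows, caching):
    return math.ceil(total_rows / caching)

rows = 1_000_000
print(scanner_rpcs(rows, 1))    # 1000000 round trips with caching = 1
print(scanner_rpcs(rows, 500))  # 2000 round trips with caching = 500
```

For a full-table rowcount job, cutting round trips by several orders of magnitude is usually where most of the speedup comes from.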

Re: speeding up rowcount

2011-10-09 Thread Himanshu Vashishtha
set the high Scan cache values? On Sun, Oct 9, 2011 at 11:19 AM, Himanshu Vashishtha hvash...@cs.ualberta.ca wrote: Since a MapReduce is a separate process, try with a high Scan cache value. http://hbase.apache.org/book.html#perf.hbase.client.caching Himanshu On Sun, Oct 9, 2011

Re: Using Scans in parallel

2011-10-09 Thread Himanshu Vashishtha
I don't think it will work without an exception in that case. These scanner IDs are generated from a Random instance in HRegionServer. In case there is the same scannerId, one will get a LeaseStillHeldException in the addScanner method? Himanshu On Sun, Oct 9, 2011 at 3:53 PM, lars hofhansl

Re: speeding up rowcount

2011-10-09 Thread Himanshu Vashishtha
MapReduce support in HBase inherently provides parallelism such that each Region is given to one mapper. Himanshu On Sun, Oct 9, 2011 at 6:44 PM, lars hofhansl lhofha...@yahoo.com wrote: Be aware that the contract for a scan is to return all rows sorted by rowkey, hence it cannot scan regions

Re: Using Scans in parallel

2011-10-09 Thread Himanshu Vashishtha
Interesting. Hey Bryan, can you please share the stats about: how many Regions, how many Region Servers, time taken by Serial scanner and with 8 parallel scanners. Himanshu On Sun, Oct 9, 2011 at 6:49 PM, Bryan Keller brya...@gmail.com wrote: This is 100% reproducible for me, so I doubt it is

Re: how to make tuning for hbase (every couple of days hbase region sever/s crashe)

2011-08-31 Thread Himanshu Vashishtha
Sorry, I missed the fact that you guys were talking about the oome thing (the exceptions were of sockettimeout) Can you give the log snippet where it oome'd? I want to explore this use case :) You have about 200 regions per server, and each region configured to 500MB makes it 100GB data per

Re: how to make tuning for hbase (every couple of days hbase region sever/s crashe)

2011-08-23 Thread Himanshu Vashishtha
Are you doing some intensive tasks at the RegionServer side (which take more than the default timeout of 60 sec, iirc)? One can get these exceptions when the client-side socket connection is closed (probably a timeout from the client side). As per the exception, when the RegionServer tried to send the result

Re: Coprocessors and batch processing

2011-08-11 Thread Himanshu Vashishtha
Client-side batch processing is done at the RegionServer level, i.e., all Action objects are grouped together per-RS and sent in one RPC. Once the batch arrives at a RS, it gets distributed across the corresponding Regions, and these Action objects are processed one by one. This includes
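The grouping step described above — bucket the batch's actions by the region server hosting each target region, one RPC per server — can be sketched like this (the region-to-server mapping and action shapes are illustrative, not HBase's internal types):

```python
from collections import defaultdict

# Illustrative sketch: group client Actions by the region server hosting
# each target region, so the batch goes out as one RPC per server.
def group_by_server(actions, region_to_server):
    per_server = defaultdict(list)
    for action in actions:
        server = region_to_server[action["region"]]
        per_server[server].append(action)
    return dict(per_server)  # one entry per outgoing RPC

region_to_server = {"r1": "rs-a", "r2": "rs-a", "r3": "rs-b"}
actions = [{"region": "r1", "op": "put"},
           {"region": "r3", "op": "put"},
           {"region": "r2", "op": "delete"}]
batches = group_by_server(actions, region_to_server)
print(sorted(batches))       # ['rs-a', 'rs-b'] -> two RPCs total
print(len(batches["rs-a"]))  # 2 actions travel together to rs-a
```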

Re: Coprocessors and batch processing

2011-08-11 Thread Himanshu Vashishtha
From: Himanshu Vashishtha hvash...@cs.ualberta.ca To: user@hbase.apache.org; lars hofhansl lhofha...@yahoo.com Sent: Wednesday, August 10, 2011 11:21 PM Subject: Re: Coprocessors and batch processing Client side batch processing is done at RegionServer level

Re: Hbase performance with HDFS

2011-07-07 Thread Himanshu Vashishtha
Mohit, just like how SSTables are stored on GFS? BigTable sstable = HBase HFile. Does this help? Himanshu On Thu, Jul 7, 2011 at 12:53 PM, Mohit Anchlia mohitanch...@gmail.comwrote: I have looked at bigtable and it's ssTables etc. But my question is directly related to how it's used with

Re: Best practices for HBase in EC2?

2011-06-23 Thread Himanshu Vashishtha
Hey Wilson, I will be rerunning experiments up there on ec2 and am interested to know your experience with Whirr, in case you tried it. Interested in the bash scripts vs Whirr question for a scenario where all one needs is to start a cluster, run some experiments and then terminate it. Running experiments

Re: tech. talk at imageshack/yfrog

2011-06-08 Thread Himanshu Vashishtha
+1 to Matt's opinion (if possible?). I am interested in your use case; it sounds very impressive by the stats you gave. You said 1000 tables? Looking forward to seeing what optimizations/config tweaks you had to do to cope with your read/write requirements. Thanks, Himanshu On Wed, Jun 8, 2011 at

Re: full table scan

2011-06-06 Thread Himanshu Vashishtha
Also, how big is each row? Are you using scanner cache? You are just fetching all the rows to the client, and? 300k is not big (it seems you have 1'ish region, which could explain the similar timing). Add more data and mapreduce will pick up! Thanks, Himanshu On Mon, Jun 6, 2011 at 8:59 AM, Christopher

Re: Best practices for HBase in EC2?

2011-06-04 Thread Himanshu Vashishtha
I used ec2, but just for experiments. Here is what I did: a) used the ephemeral disks. My experiment datasets were persisted on S3, and I copied them onto the cluster. b) Use the hbase-ec2 scripts. get this repo https://github.com/ekoontz/hbase-ec2.git. c) Consult Andrew's pdf:

Re: Best practices for HBase in EC2?

2011-06-04 Thread Himanshu Vashishtha
about it's working. Himanshu On Sat, Jun 4, 2011 at 1:02 PM, Himanshu Vashishtha hvash...@cs.ualberta.ca wrote: I used ec2, but just for experiments. Here is what I did: a) used the ephemeral disks. My experiment datasets were persisted on S3, and I copied them onto the cluster. b) Use

Re: Row count without iterating over ResultScanner?

2011-05-01 Thread Himanshu Vashishtha
If you are interested in the row count only (and do not want to fetch the table rows to your client side), you can also try out https://issues.apache.org/jira/browse/HBASE-1512. PS: Which version are you on? The above patch is in the main trunk as of now, so to use it you would have to checkout the code and

Re: Row count without iterating over ResultScanner?

2011-05-01 Thread Himanshu Vashishtha
...@gmail.comwrote: Hi, On 01.05.2011 20:03, Himanshu Vashishtha wrote: If you are interested row count only (and not want to fetch the table rows to your client side), you can also try out https://issues.apache.org/jira/browse/HBASE-1512. Yes, I only want to count rows and apply filters or select

Re: How to scan rows starting with a particular string?

2011-04-27 Thread Himanshu Vashishtha
On Wed, Apr 27, 2011 at 11:00 AM, Joe Pallas pal...@cs.stanford.edu wrote: On Apr 26, 2011, at 11:54 PM, Himanshu Vashishtha wrote: HBase uses utf-8 encoding to store the row keys, so it can store non-ascii characters too (yes they will be larger than 1 byte). That statement may

Re: HDFS reports corrupted blocks after HBase reinstall

2011-04-26 Thread Himanshu Vashishtha
Could it be the /tmp/hbase-userID directory that is playing the culprit? Just a wild guess, though. On Tue, Apr 26, 2011 at 5:56 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Unless HBase was running when you wiped that out (and even then), I don't see how this could happen. Could you match

Re: A possible bug in the scanner.

2011-04-13 Thread Himanshu Vashishtha
Vidhya, Did you try setting scanner time range. It takes min and max timestamps, and when instantiating the scanner at RS, a time based filtering is done to include only selected store files. Have a look at StoreFile.shouldseek(Scan, Sortedsetbyte[]). I think it should improve the response time.
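The store-file pruning that a scan time range enables can be sketched as a simple interval-overlap check. This is an illustrative model of the behavior described (file names and time ranges are made up; the real check lives in StoreFile): a file is seeked only if its timestamp range intersects the scan's range.

```python
# Illustrative sketch of time-range pruning: a store file is included in
# the scan only if its [min, max] timestamp range overlaps the scan's
# requested time range, as described above.
def files_to_seek(store_files, scan_min, scan_max):
    """store_files: mapping of file name -> (min_ts, max_ts)."""
    return [name for name, (f_min, f_max) in store_files.items()
            if f_min <= scan_max and f_max >= scan_min]

store_files = {"hfile-1": (0, 100), "hfile-2": (90, 200), "hfile-3": (300, 400)}
print(files_to_seek(store_files, 150, 250))  # ['hfile-2']
```

Only one of the three files overlaps [150, 250], so the scanner opens just that one — which is where the response-time improvement comes from.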

Re: A possible bug in the scanner.

2011-04-13 Thread Himanshu Vashishtha
the underlying problem. When a scanner is busy, but doesn't have any rows to return yet, neither the client nor the region server should mistake it for an unresponsive scanner. V On 4/13/11 8:43 AM, Himanshu Vashishtha hvash...@cs.ualberta.ca wrote: Vidhya, Did you try setting scanner time range

Re: Number of column families vs Number of column family qualifiers

2010-10-09 Thread Himanshu Vashishtha
Doesn't it depend on your app's data access pattern? Are you reading all those columns against a pk simultaneously or not? That would help in discerning which way to go. :) Himanshu. On Sat, Oct 9, 2010 at 7:42 PM, weliam.cl...@gmail.com wrote: Hi folks, I have a question about the scheme design

Re: try to read content of stored file

2010-09-09 Thread Himanshu Vashishtha
just a minor point: ./bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f filename, where the filename should be on the configured fs (local/hdfs). The -f option is necessary to read a given file. In case one gives the wrong dfs, it will give an error. hbase org.apache.hadoop.hbase.io.hfile.HFile -vf

Re: Limits on HBase

2010-09-07 Thread Himanshu Vashishtha
but yes, you will not be having different versions of those objects as they are not stored as such in a table. So that's the downside. In case your objects are write-once-read-many types, I think it should work. Let's see what others say :) ~Himanshu On Tue, Sep 7, 2010 at 12:49 AM, Himanshu