Re: question about incremental backup and cluster replication

2014-09-06 Thread Suraj Varma
The answer to "can they solve your problem" is yes. For how to do this, start reading about your options here so you can pick what works best for your needs (and your version of hbase): http://hbase.apache.org/book/ops.backup.html (and the links out of this page)

Re: Bulk load to multiple tables

2014-06-27 Thread Suraj Varma
See this : https://issues.apache.org/jira/browse/HBASE-3727 And see this thread: http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/21724 You may need to rebase the code to your specific version of hbase, though. --Suraj On Thu, Jun 26, 2014 at 10:28 AM, Kevin kevin.macksa...@gmail.com

Re: Custom TableInputFormat and TableOutputFormat classes

2014-06-27 Thread Suraj Varma
See this thread that seems similar to your use case http://apache-hbase.679495.n3.nabble.com/Hbase-sequential-row-merging-in-MapReduce-job-td4033194.html --Suraj On Wed, Jun 25, 2014 at 2:58 AM, Kuldeep Bora kuldeep.b...@gmail.com wrote: Hello, I have keys in hbase of form `abc:xyz` and i

Re: Hbase Performance Issue

2014-01-07 Thread Suraj Varma
Akhtar: There is no manual step for bulk load. You essentially have your script that runs the map reduce job that creates the HFiles. On success of this script/command, you run the completebulkload command ... the whole bulk load can be automated, just like your map reduce job. --Suraj On Mon,
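
The two steps described can be chained in one script. A hedged sketch, using the bundled ImportTsv tool to stand in for "your map reduce job" (table name, paths, and column mapping are hypothetical; `completebulkload` invokes the same LoadIncrementalHFiles step):

```shell
# Step 1: an MR job writes HFiles instead of live Puts.
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col \
  -Dimporttsv.bulk.output=/tmp/bulk_hfiles \
  mytable /input/data.tsv

# Step 2: on success, hand the generated HFiles to the cluster.
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/bulk_hfiles mytable
```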

Re: GC recommendations for large Region Server heaps

2013-07-09 Thread Suraj Varma
normally hit a full GC. I can surely try this out on a test cluster ... thanks for the pointer on the 3x increase if it hits full gc. That quantifies it for me much better. Thanks --Suraj On Mon, Jul 8, 2013 at 11:56 AM, Stack st...@duboce.net wrote: On Mon, Jul 8, 2013 at 11:09 AM, Suraj Varma

Re: GC recommendations for large Region Server heaps

2013-07-09 Thread Suraj Varma
, 2013 at 11:09 AM, Suraj Varma svarma...@gmail.com wrote: Hello: We have an HBase cluster with region servers running on 8GB heap size with a 0.6 block cache (it is a read heavy cluster, with bursty write traffic via MR jobs). (version: hbase-0.94.6.1) During HBaseCon, while speaking

Re: GC recommendations for large Region Server heaps

2013-07-09 Thread Suraj Varma
://sematext.com/spm On Mon, Jul 8, 2013 at 2:56 PM, Stack st...@duboce.net wrote: On Mon, Jul 8, 2013 at 11:09 AM, Suraj Varma svarma...@gmail.com wrote: Hello: We have an HBase cluster with region servers running on 8GB heap size with a 0.6 block cache (it is a read heavy cluster

Re: GC recommendations for large Region Server heaps

2013-07-09 Thread Suraj Varma
...@duboce.net wrote: On Mon, Jul 8, 2013 at 11:09 AM, Suraj Varma svarma...@gmail.com wrote: Hello: We have an HBase cluster with region servers running on 8GB heap size with a 0.6 block cache (it is a read heavy cluster, with bursty write traffic via MR jobs

GC recommendations for large Region Server heaps

2013-07-08 Thread Suraj Varma
Hello: We have an HBase cluster with region servers running on 8GB heap size with a 0.6 block cache (it is a read heavy cluster, with bursty write traffic via MR jobs). (version: hbase-0.94.6.1) During HBaseCon, while speaking to a few attendees, I heard some folks were running region servers as

Re: Logging for MR Job

2013-06-22 Thread Suraj Varma
Did you try passing in the log level via generic options? E.g. I can switch the log level of a running job via: hadoop jar hadoop-mapreduce-examples.jar pi -D mapred.map.child.log.level=DEBUG 10 10 hadoop jar hadoop-mapreduce-examples.jar pi -D mapred.map.child.log.level=INFO 10 10 --Suraj

Re: Any mechanism in Hadoop to run in background

2013-06-22 Thread Suraj Varma
Yes, you can change your task tracker startup script to use nice and ionice and restart the task tracker process. The mappers and reducers spun off this task tracker will inherit the niceness. See the first comment in http://blog.cloudera.com/blog/2011/04/hbase-dos-and-donts/ Quoting: change the
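
A hedged sketch of that startup-script change (daemon script path and niceness values are illustrative, not prescriptive):

```shell
# Launch the TaskTracker under reduced CPU and I/O priority;
# the mapper/reducer child JVMs it spawns inherit the niceness.
nice -n 10 ionice -c2 -n7 \
  /usr/lib/hadoop/bin/hadoop-daemon.sh start tasktracker
```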

Re: difference between major and minor compactions?

2013-06-22 Thread Suraj Varma
"In contrast, the major compaction is invoked in offpeak time and usually can be assumed to have resources exclusively." There is no resource exclusivity with major compactions. It is just more resource _intensive_ because a major compaction will rewrite all the store files to end up with a single
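
For reference, a major compaction can also be triggered manually from the HBase shell during an off-peak window (table name hypothetical):

```shell
echo "major_compact 'mytable'" | hbase shell
```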

Re: HBase Availability - Mean Time to Failure Recovery Time

2013-06-11 Thread Suraj Varma
Read this: http://hortonworks.com/blog/introduction-to-hbase-mean-time-to-recover-mttr/ There are some jira ticket links in there that will give you more reading material on MTTR. --Suraj On Tue, Jun 11, 2013 at 2:19 AM, Pankaj Misra pankaj.mi...@impetus.co.inwrote: Hi, We are using

Re: Examples of Multi Get and Multi Put using Stargate (JSON)

2013-05-14 Thread Suraj Varma
Moving your post to the user mailing list. Here are more examples: http://wiki.apache.org/hadoop/Hbase/HbaseRest For your curl calls, escape your ampersands, else everything after the ampersand is interpreted by your shell as a command to run in the background. curl -v -H Accept: application/json
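
The ampersand point can be illustrated like this (host, port, table, and row keys are hypothetical; quoting the whole URL keeps the shell from treating & as a backgrounding operator):

```shell
# Unquoted, the & would background the command and truncate the query string.
curl -v -H "Accept: application/json" \
  "http://localhost:8080/mytable/multiget?row=key1&row=key2"
```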

Re: Loading text files from local file system

2013-04-17 Thread Suraj Varma
Have you considered using hfile.compression, perhaps with snappy compression? See this thread: http://grokbase.com/t/hbase/user/10cqrd06pc/hbase-bulk-load-script --Suraj On Tue, Apr 16, 2013 at 9:31 PM, Omkar Joshi omkar.jo...@lntinfotech.comwrote: The background thread is here :

Re: Loading text files from local file system

2013-04-17 Thread Suraj Varma
LoadIncrementalHFiles to process compressed input files and so forth ... See if the dfs.replication + hfile.compression option works for you first. --Suraj On Wed, Apr 17, 2013 at 1:00 AM, Suraj Varma svarma...@gmail.com wrote: Have you considered using hfile.compression, perhaps with snappy
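
A hedged sketch of the hfile.compression suggestion, assuming the 0.9x-era property name from the linked thread (table, paths, and column mapping hypothetical):

```shell
# Generate snappy-compressed HFiles during the bulk-load MR job.
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dhfile.compression=snappy \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col \
  -Dimporttsv.bulk.output=/tmp/bulk_hfiles \
  mytable /input/data.tsv
```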

Re: Stargate / Rest - Get Multiple Rows by their keys in one call

2013-03-15 Thread Suraj Varma
Per http://svn.apache.org/repos/asf/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/rest/TestMultiRowResource.java and http://svn.apache.org/repos/asf/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/TableResource.java, it appears you can. Also, see

Re: Stargate / Rest - Get Multiple Rows by their keys in one call

2013-03-15 Thread Suraj Varma
Also - this was added in 0.92 (https://issues.apache.org/jira/browse/HBASE-3541) ... since you didn't mention your hbase version. --Suraj On Fri, Mar 15, 2013 at 4:12 PM, Suraj Varma svarma...@gmail.com wrote: Per http://svn.apache.org/repos/asf/hbase/trunk/hbase-server/src/test/java/org

Re: Why only check1-and-putMany and check1-and-deleteMany?

2012-11-29 Thread Suraj Varma
There was this https://issues.apache.org/jira/browse/HBASE-4999 ticket for supporting atomic row level mutations based on generic constraints with some discussion on possible solution. Jonathan - is there another jira that you were working on for a similar checkAndMutate functionality? Thanks,

Re: Configuration setup

2012-11-29 Thread Suraj Varma
The directory where hbase-site.xml is located should be in your client side classpath. Can you check if this is the case (dump your classpath right before you call to check if the directory is present). Here's the HBaseConfiguration.addHBaseResources code.
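
A quick way to perform the suggested check (the classpath value here is hypothetical; the key point is that the *directory* containing hbase-site.xml, not the file itself, must be an entry):

```shell
# Hypothetical client classpath: conf directory first, then the hbase jar.
CLASSPATH="/etc/hbase/conf:/opt/hbase/hbase-0.94.6.jar"

# Dump one entry per line and eyeball it for the conf directory.
echo "$CLASSPATH" | tr ':' '\n'
```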

Re: Expert suggestion needed to create table in Hbase - Banking

2012-11-27 Thread Suraj Varma
Ian Varley's excellent HBaseCon presentation is another great resource. http://ianvarley.com/coding/HBaseSchema_HBaseCon2012.pdf On Mon, Nov 26, 2012 at 5:43 AM, Doug Meil doug.m...@explorysmedical.com wrote: Hi there, somebody already wisely mentioned the link to the # of CF's entry, but here

Re: Runs in Eclipse but not as a Jar

2012-11-26 Thread Suraj Varma
The difference is your classpath. So - for problem 1, you need to specify the jars under /hbase-0.94.2/lib on your classpath. You only need a subset ... but first, to get over the problem, set your classpath with all these jars. I don't think specifying a wildcard * works ... like below

Re: Does MapReduce need HBase to be running?

2012-11-21 Thread Suraj Varma
Right - if your map is not accessing HBase at all ... it can be down. --Suraj On Wed, Nov 14, 2012 at 11:03 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hum. I'm calling HTables, and doing puts and gets. So it will not be working for my scenario. But if I simply need to map the data

Re: memory leak

2012-11-21 Thread Suraj Varma
Have you turned on mslab in your hbase-site.xml? --Suraj On Wed, Nov 21, 2012 at 12:16 AM, Yusup Ashrap aph...@gmail.com wrote: Does your hadoop support append/sync? current version of hadoop supports append/sync ,but we have not trun on explicitely in hdfs-site.xml. Yeah, there is a whole

Re: Region hot spotting

2012-11-21 Thread Suraj Varma
Ajay: Why would you not want to specify splits while creating table? If your 0-10 prefix is at random ... why not pre-split with that? Without presplitting, as Ram says, you cannot avoid region hotspotting until table starts automatic splits. --S On Wed, Nov 21, 2012 at 3:46 AM, Ajay Bhosle
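
Pre-splitting on a known single-digit prefix might look like this in the shell (table and family names hypothetical; nine split points yield ten regions):

```shell
echo "create 'mytable', 'cf', {SPLITS => ['1','2','3','4','5','6','7','8','9']}" | hbase shell
```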

Re: Question about Master Master replication

2012-11-20 Thread Suraj Varma
Per the "Peers znode" section of http://hbase.apache.org/replication.html, the cluster key is of the format "zk1.host.com,zk2.host.com,zk3.host.com:2181:/hbase". So, that's the value you would provide to the add_peer command. --Suraj On Mon, Nov 19, 2012 at 2:00 PM, Varun Sharma
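
A hedged example of the resulting add_peer call (peer id and zookeeper hosts are hypothetical):

```shell
echo "add_peer '1', 'zk1.host.com,zk2.host.com,zk3.host.com:2181:/hbase'" | hbase shell
```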

Re: Question about Master Master replication

2012-11-20 Thread Suraj Varma
until I just gave one peer. Have you yourself tried this? On Tue, Nov 20, 2012 at 9:17 AM, Suraj Varma svarma...@gmail.com wrote: Per the "Peers znode" section of http://hbase.apache.org/replication.html, the cluster key is of the format "zk1.host.com,zk2.host.com,zk3.host.com:2181:/hbase". So

Re: how to work Htable connection

2012-11-05 Thread Suraj Varma
For your first question, see Section 9.3.1 on http://hbase.apache.org/book.html, which says: "When creating HTable instances, it is advisable to use the same HBaseConfiguration instance. This will ensure sharing of ZooKeeper and socket instances to the RegionServers, which is usually what you want."

Re: more regionservers does not improve performance

2012-10-12 Thread Suraj Varma
What have you configured your hbase.hstore.blockingStoreFiles and hbase.hregion.memstore.block.multiplier? Both of these block updates when the limit is hit. Try increasing these to say 20 and 4 from the default 7 and 2 and see if it helps. If this still doesn't help, see if you can set up
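
The suggested values would go in hbase-site.xml on the region servers; for example (20 and 4 are the values from the advice above, to be tuned for your workload):

```xml
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>20</value>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value>
</property>
```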

Re: Reply: [Stand alone - distributed mode] HBase master isn't initializing completely

2012-10-12 Thread Suraj Varma
Shutdown the cluster and remove all the *.pid files from the configured tmp.dir - especially of the master. Then bring up the cluster again. See if this resolves it. Also - did you check your hosts file and refer to the host configuration specified in the hbase online guide? --S On Fri, Oct 12,

Re: more regionservers does not improve performance

2012-10-12 Thread Suraj Varma
in 90second blocks. Check the RS logs for those messages as well and then Suraj's advice. This is where I would start to optimize your write path. I hope the above helps. On Fri, Oct 12, 2012 at 3:34 AM, Suraj Varma svarma...@gmail.com wrote: What have you configured your

Re: more regionservers does not improve performance

2012-10-12 Thread Suraj Varma
TableMapReduceUtil based splits ... or custom splits? Are the mappers going across the network to region servers on other nodes? Or are they all local calls? Just trying to understanding your cluster setup a bit more ... --Suraj On Fri, Oct 12, 2012 at 7:30 PM, Suraj Varma svarma...@gmail.com

Re: Query a version of a column efficiently

2012-07-30 Thread Suraj Varma
You may need to set up your Eclipse workspace and search using references etc. To get started, this is one class that uses TimeRange based matching ... org.apache.hadoop.hbase.regionserver.ScanQueryMatcher Also - Get is internally implemented as a Scan over a single row. Hope this gets you started.

Re: Cluster load

2012-07-28 Thread Suraj Varma
You can also do an online merge to merge the regions together and then resplit it ... https://issues.apache.org/jira/browse/HBASE-1621 --S On Sat, Jul 28, 2012 at 11:07 AM, Mohit Anchlia mohitanch...@gmail.com wrote: On Fri, Jul 27, 2012 at 6:03 PM, Alex Baranau alex.barano...@gmail.comwrote:

Re: Problem with ColumnPaginationFilter: after put twice,get half of limit columns

2012-07-18 Thread Suraj Varma
It's not clear what your question is ... can you provide your hbase shell session or code snippet that shows the below scenario? --S On Tue, Jul 17, 2012 at 8:01 PM, deanforwever2010 deanforwever2...@gmail.com wrote: it only happened when i put same data twice more in the column any ideas?

Re: Load balancer repeatedly close and open region in the same regionserver.

2012-07-18 Thread Suraj Varma
You can use pastebin.com or similar services to cut/paste your logs. --S On Tue, Jul 17, 2012 at 7:11 PM, Howard rj03...@gmail.com wrote: this problem just only once,Because it happens two day before,I remember I check the master-status and only always see regions is pending open in Regions

Re: HBase Fault tolerance

2012-07-18 Thread Suraj Varma
My question is how is this HLog file different from a StoreFile? Why is it faster to write to an HLog file and not write directly to a StoreFile? Read this: http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html and

Re: Mapred job failing with LeaseException

2012-07-11 Thread Suraj Varma
The reason you get LeaseExceptions is that the time between two scanner.next() calls exceeded your hbase.regionserver.lease.period setting which defaults to 60s. Whether it is your client or your map task, if it opens a Scan against HBase, scanner.next() should continue to get invoked within this
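
If the gaps between scanner.next() calls are unavoidable, the lease can be raised in hbase-site.xml; the 120s below is an illustrative value, not a recommendation:

```xml
<property>
  <name>hbase.regionserver.lease.period</name>
  <!-- milliseconds; default 60000 (60s) -->
  <value>120000</value>
</property>
```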

Re: HBaseClient recovery from .META. server power down

2012-07-10 Thread Suraj Varma
this... I don't know. N. On Mon, Jul 9, 2012 at 6:16 PM, Suraj Varma svarma...@gmail.com wrote: Hello: I'd like to get advice on the below strategy of decreasing the ipc.socket.timeout configuration on the HBase Client side ... has anyone tried this? Has anyone had any issues with configuring

Re: HBaseClient recovery from .META. server power down

2012-07-10 Thread Suraj Varma
Created https://issues.apache.org/jira/browse/HBASE-6364 for this issue. Thanks, --Suraj On Tue, Jul 10, 2012 at 9:46 AM, Suraj Varma svarma...@gmail.com wrote: I will create a JIRA ticket ... The only side-effect I could think of is ... if a RS is having a GC of a few seconds, any _new_

Re: HBaseClient recovery from .META. server power down

2012-07-10 Thread Suraj Varma
initial :-). This said there is a retry on error... On Tue, Jul 10, 2012 at 6:46 PM, Suraj Varma svarma...@gmail.com wrote: I will create a JIRA ticket ... The only side-effect I could think of is ... if a RS is having a GC of a few seconds, any _new_ client trying to connect would get connect

Re: HBASE -- YCSB ?

2012-07-10 Thread Suraj Varma
Search for hadoop-dns-checker in http://hbase.apache.org/book.html That tool might help figure out if your cluster networking is all right. --S On Mon, Jul 9, 2012 at 3:03 PM, Dhaval Shah prince_mithi...@yahoo.co.in wrote: There is definitely a debug flag on hbase.. You can find out details on

Re: HBaseClient recovery from .META. server power down

2012-07-09 Thread Suraj Varma
Hello: I'd like to get advice on the below strategy of decreasing the ipc.socket.timeout configuration on the HBase Client side ... has anyone tried this? Has anyone had any issues with configuring this lower than the default 20s? Thanks, --Suraj On Mon, Jul 2, 2012 at 5:51 PM, Suraj Varma
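
For concreteness, the change under discussion would be a client-side hbase-site.xml entry like the following (10s is the experimental value being asked about, not an endorsed setting):

```xml
<property>
  <name>ipc.socket.timeout</name>
  <!-- milliseconds; default 20000 (20s) -->
  <value>10000</value>
</property>
```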

Re: Blocking Inserts

2012-07-03 Thread Suraj Varma
In your case, likely you are hitting the blocking store files limit (hbase.hstore.blockingStoreFiles, default: 7) and/or hbase.hregion.memstore.block.multiplier - check out http://hbase.apache.org/book/config.files.html for more details on these configurations and how they affect your insert performance.

Re: HBASE -- Regionserver and QuorumPeer ?

2012-07-02 Thread Suraj Varma
The error you are getting is: 2012-07-02 12:39:02,205 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server devrackA-05/172.18.0.6:2181 2012-07-02 12:39:02,211 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection

Re: regions unlinked from table

2012-07-02 Thread Suraj Varma
Try doing an hbase hbck to see if it reports inconsistency. And do an hbase hbck -fix to see if it fixes it for you. See http://hbase.apache.org/book.html#hbck.in.depth Note that since 0.90.4 is old, some of the documented options won't be available ... but hbase hbck -fix will be available. --S
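
The two commands, for reference (as noted, the available -fix* options vary by version):

```shell
hbase hbck        # report inconsistencies only
hbase hbck -fix   # attempt to repair what it found
```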

Re: hbase-site.xml Content is not allowed in prolog.

2012-07-02 Thread Suraj Varma
It would be good to update the thread on how you fixed it ... for users who tread the same path tomorrow. :) Was it dos2unix on your conf / bin directories that fixed it? --S On Mon, Jul 2, 2012 at 6:49 AM, syed kather in.ab...@gmail.com wrote: Thanks Marcin Cylke now its is working

Re: HBASE -- HMaster Aborts after 28 minutes.

2012-07-02 Thread Suraj Varma
Session expired usually results from a long GC that exceeds the zookeeper.session.timeout. 2012-07-01 18:20:00,961 FATAL org.apache.hadoop.hbase.master.HMaster:master:6-0x238444cf77e master:6-0x238444cf77e received expired from ZooKeeper, aborting

Re: Finding the correct region server

2012-07-02 Thread Suraj Varma
If I understand you right, you are asking about how region splitting works ... See http://hbase.apache.org/book/regions.arch.html section 9.7.4 In a nutshell, the parent region on your RS1 will split into two daughter regions on the same RS1. If you have load balancer turned on, the master can

Re: HBASE -- Regionserver and QuorumPeer ?

2012-07-02 Thread Suraj Varma
for the help. --- Jay Wilson On 7/2/2012 2:43 PM, Suraj Varma wrote: The error you are getting is: 2012-07-02 12:39:02,205 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server devrackA-05/172.18.0.6:2181 2012-07-02 12:39:02,211 WARN org.apache.zookeeper.ClientCnxn

HBaseClient recovery from .META. server power down

2012-07-02 Thread Suraj Varma
Hello: We've been doing some failure scenario tests by powering down a .META. holding region server host and while the HBase cluster itself recovers and reassigns the META region and other regions (after we tweaked down the default timeouts), our client apps using HBaseClient take a long time to

Re: regions unlinked from table

2012-07-02 Thread Suraj Varma
there. Thanks again! On Mon, Jul 2, 2012 at 7:36 PM, Suraj Varma svarma...@gmail.com wrote: Are you using apache hbase 0.90.4 ... or the one from CDH3? Check what other hbck options you have (do you have -fixMeta?) on the version you are on. What the uber-hbck (part of hbase-0.90.6 and later releases

Re: HBaseClient recovery from .META. server power down

2012-07-02 Thread Suraj Varma
By power down below, I mean powering down the host with the RS that holds the .META. table. (So - essentially, the host IP is unreachable and the RS/DN is gone.) Just wanted to clarify my below steps ... --S On Mon, Jul 2, 2012 at 5:36 PM, Suraj Varma svarma...@gmail.com wrote: Hello: We've

Re: HBASE -- Regionserver and QuorumPeer ?

2012-07-02 Thread Suraj Varma
/2/2012 4:43 PM, Suraj Varma wrote: Ok - thanks for checking connectivity. I presume you already have doublechecked the hbase-site.xml in your region server that points to the zookeeper and hdfs-site.xml pointed to the namenode. I once got a similar error when HBase was picking up a stray

Re: understanding the client code

2012-06-01 Thread Suraj Varma
The way thrift and avro fits in here is ... Thrift Client (your code) - (thrift on the wire) - Thrift Server (provided by HBase) - (uses HTable) - HBase Cluster. Same with Avro. So - use HTable if you want to interact with the cluster using a Java API ... use the others if you want non-Java

Re: Version issue

2012-05-25 Thread Suraj Varma
I added this to hbase-site.xml, and that got hbase started but trying to run a program to Put rows throws the above error. This seems to indicate that your program is picking up a different version of hbase jars than your hbase cluster, perhaps? Check your classpath to ensure that the versions

Re: Combining filterlists

2012-05-06 Thread Suraj Varma
I think the problem is this line in your code: FilterList listOfFilters = new FilterList (FilterList.Operator.MUST_PASS_ALL); Can you try with the top level filter being FilterList listOfFilters = new FilterList (FilterList.Operator.MUST_PASS_ONE); and see if that satisfies your requirement? I

Re: Not able to Disable table : ERROR: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took too long to wait for the table testtable2 to be disabled.

2012-04-17 Thread Suraj Varma
See this: http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/13441 --Suraj On Mon, Apr 16, 2012 at 9:20 PM, Narayanan K knarayana...@gmail.com wrote: Hi all, Any advise on this? Thanks, Narayanan On Mon, Apr 16, 2012 at 3:27 PM, Narayanan K knarayana...@gmail.com wrote: Hi,

Re: hbase map/reduce questions

2012-04-09 Thread Suraj Varma
on this node programmatically? On 08/04/2012 18:37, Suraj Varma wrote: if i do a custom input that split the table by 100 rows, can i distribute manually each part on a node regardless where the data is ? Yes - if you do a custom split, and have sufficient map slots in your cluster, you

Re: hbase map/reduce questions

2012-04-08 Thread Suraj Varma
if i do a custom input that split the table by 100 rows, can i distribute manually each part on a node regardless where the data is ? Yes - if you do a custom split, and have sufficient map slots in your cluster, you can parallelize the map tasks to run on other nodes as well. But if you

Re: Starting Abnormally After Shutting Down For Some Time

2012-03-28 Thread Suraj Varma
Bing: Your pid file location can be setup via hbase-env.sh; default is /tmp ... # The directory where pid files are stored. /tmp by default. # export HBASE_PID_DIR=/var/hadoop/pids On Wed, Mar 28, 2012 at 3:04 PM, Peter Vandenabeele pe...@vandenabeele.com wrote: On Wed, Mar 28, 2012 at 9:53
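
The relevant hbase-env.sh line, uncommented (the directory is the example from the comment above; any persistent writable path works):

```shell
# In conf/hbase-env.sh: move pid files out of /tmp so they survive tmp cleanup.
export HBASE_PID_DIR=/var/hadoop/pids
```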

Re: Scan startRow/stopRow vs. filter

2012-03-15 Thread Suraj Varma
According to http://hbase.apache.org/book.html#client.filter.row, in general it is preferable to use start/stopRow rather than RowFilter. I believe with a RowFilter, you would be doing a full table scan ... --Suraj On Thu, Mar 15, 2012 at 11:48 AM, Andy Lindeman alinde...@gmail.com wrote: Hi
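
In shell terms, the range-scan form (table and keys hypothetical) touches only the regions covering the key range, whereas the RowFilter form examines every row:

```shell
echo "scan 'mytable', {STARTROW => 'user100', STOPROW => 'user200'}" | hbase shell
```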

Re: hbase - CassNotFound while connecting through mapper

2012-02-08 Thread Suraj Varma
Perhaps your HADOOP_CLASSPATH is not getting set properly. export HADOOP_CLASSPATH=`hbase classpath`:$ZK_CLASSPATH:$HADOOP_CLASSPATH Can you set the absolute path to hbase above? Also - try echo-ing the hadoop classpath to ensure that HADOOP_CLASSPATH indeed has the hbase jars and conf directory.
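
An absolute-path version of that export, plus the suggested echo check (the install path is hypothetical):

```shell
# Use the absolute path to the hbase launcher script.
export HADOOP_CLASSPATH=$(/opt/hbase/bin/hbase classpath):$HADOOP_CLASSPATH

# Verify the hbase jars and conf directory actually made it in.
echo "$HADOOP_CLASSPATH" | tr ':' '\n' | grep -i hbase
```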

Re: hbase - CassNotFound while connecting through mapper

2012-02-08 Thread Suraj Varma
these jars to distributed cache myself? thanks Vrushali  From: Suraj Varma svarma...@gmail.com To: user@hbase.apache.org; Vrushali C vrush...@ymail.com Sent: Wednesday, February 8, 2012 12:26 AM Subject: Re: hbase - CassNotFound while connecting through mapper

Re: hbase - CassNotFound while connecting through mapper

2012-02-08 Thread Suraj Varma
getConf() in the run function instead of creating a new configuration again.  This new conf was overriding the libjars parameters of conf created in main. Thanks Vrushali  From: Vrushali C vrush...@ymail.com To: Suraj Varma svarma...@gmail.com; user

Re: Re: How to solve the problem of bulk load data is overwritten

2012-01-07 Thread Suraj Varma
I'm interpreting your question as "I bulk loaded multiple versions of a row, but when I issue a get I only get one version back". If so - use the Get#setMaxVersions() API, set to the required number, to get multiple versions back. If the above interpretation is wrong ... please clarify what you were
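
The shell equivalent of setMaxVersions (table, row, and column hypothetical):

```shell
echo "get 'mytable', 'row1', {COLUMN => 'cf:col', VERSIONS => 3}" | hbase shell
```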

Re: java.lang.UnsatisfiedLinkError

2011-12-23 Thread Suraj Varma
Check this thread - let us know if this solves your issue as well. http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/20303 --Suraj On Thu, Dec 22, 2011 at 9:00 PM, Greg Pelly gfpe...@gmail.com wrote: Hi, I'm trying to setup HBase on a Ubuntu 11.04 virtual server using jdk1.6.0_29,

Re: Trouble with Boolean,integer and timestamp

2011-12-23 Thread Suraj Varma
1. You're using the hbql library to interact with hbase, not the java client. I don't know how up-to-date hbql is with hbase versions. From their site it appears to be close to a year old (http://hbql.com/download.html) 2. I'm not sure why you want to store a number and retrieve it as a boolean. If

Re: Aggregations in HBase

2011-12-11 Thread Suraj Varma
Coprocessors are available with 0.92 which now has a release candidate (RC0). So - you can probably try and build 0.92 RC0 to test this functionality out. --Suraj On Sun, Dec 11, 2011 at 10:52 AM, Royston Sellman royston.sell...@googlemail.com wrote: I'm a newbie learning HBase using 0.90.4.

Re: zookeeper quorum verification

2011-12-04 Thread Suraj Varma
a different way of handling the situation I described. On Sat, Dec 3, 2011 at 4:45 PM, Suraj Varma svarma...@gmail.com wrote: Yes - this makes sense. But, I thought what Rita suggested was a single appquorum dns entry ... which was surprising. Hence my question. --Suraj On Sat, Dec

Re: zookeeper quorum verification

2011-12-04 Thread Suraj Varma
AM, Suraj Varma svarma...@gmail.com wrote: Thanks for summarizing this - ok, so I see the setup. I'm wondering what the implications are: So - let's say you decide to add more zookeeper nodes or to replace a zookeeper node due to failure or whatever. 1) Addition case: a) You would add 6

Re: zookeeper quorum verification

2011-12-03 Thread Suraj Varma
J-D: Did you mean that a _single_ dns entry returns all five ips belonging to individual zk nodes? Is this used only by clients ... or even within the cluster? And ... the zk nodes self-identify by IP ... and is this how region server nodes reach out specifically to the leader zk node? --Suraj

Re: zookeeper quorum verification

2011-12-03 Thread Suraj Varma
cluster per DC. Our configs then just point to <value>0.zookeeper,1.zookeeper,2.zookeeper,3.zookeeper,4.zookeeper</value> --Dave On Sat, Dec 3, 2011 at 6:15 AM, Suraj Varma svarma...@gmail.com wrote: J-D: Did you mean that a _single_ dns entry returns all five ips belonging to individual zk nodes

Re: HRegionserver daemon is not running on region server node

2011-11-28 Thread Suraj Varma
) , its running fine on all machines, in distributed mode. Hbase is perfectly running on master node ONLY. This is my present situation, i am feeling like just struck with this problem, please help. On Mon, Nov 28, 2011 at 2:04 AM, Suraj Varma svarma...@gmail.com wrote: So - first of all

Re: HRegionserver daemon is not running on region server node

2011-11-26 Thread Suraj Varma
Vamshi: What OS are you trying this on? Is it Linux / Windows? You can use the *.dns.interface configurations to use the specific network interface that you want. That is, set the following configurations in your hbase-site.xml on all hbase nodes hbase.zookeeper.dns.interface

Re: Schema design question - Hot Key concerns

2011-11-18 Thread Suraj Varma
-in-an-HBase-row I will let the experts comment further. On Fri, Nov 18, 2011 at 9:33 AM, Suraj Varma svarma...@gmail.com wrote: I have an HBase schema design question that I wanted to discuss with the list. Let's say we have a wide table design that has a table with one column family

Re: upgrade to 0.92 from 0.90.x

2011-11-08 Thread Suraj Varma
Hi Ted: Would a 0.92 client be able to talk to a 0.90.x cluster? (i.e. the other way around?) --Suraj On Mon, Nov 7, 2011 at 9:45 AM, Ted Yu yuzhih...@gmail.com wrote: HBASE-3581 would disallow 0.90 HBase client to talk to 0.92 HBase server. So in order to utilize the new cluster, you should

Re: readonly performance?

2011-11-02 Thread Suraj Varma
I don't think there is any performance benefit as such. From the source code, it appears that the only use is to avoid accidentally updating / mutating a table that you want to keep immutable. So - setting the read only flag will prevent accidental updates to that table. --Suraj On Wed, Nov 2,

Re: Building HBase trunk on windows using cygwin

2011-10-10 Thread Suraj Varma
on windows). Thanks Suraj, St.Ack On Sun, Oct 9, 2011 at 4:39 PM, Suraj Varma svarma...@gmail.com wrote: Just like trunk, it also fails for 0.92-SNAPSHOT on cygwin; a hack similar to the below was needed to get it built. I would certainly like the maven build to work cross-platform. I can

Re: Building HBase trunk on windows using cygwin

2011-10-09 Thread Suraj Varma
Just like trunk, it also fails for 0.92-SNAPSHOT on cygwin; a hack similar to the below was needed to get it built. I would certainly like the maven build to work cross-platform. I can open a Jira if there are no objections ... --Suraj On Wed, Oct 5, 2011 at 5:43 AM, Mayuresh

Re: MR on HBase - java.io.IOException: Pass a Delete or a Put

2011-08-03 Thread Suraj Varma
);        } } Then why am I getting this exception *java.io.IOException: Pass a Delete or a Put*? Any insights into what I am missing here would be really helpful. Thanks, Narayanan On Wed, Jul 27, 2011 at 12:07 AM, Suraj Varma svarma...@gmail.com wrote: I found

Re: data loss due to regionserver going down

2011-07-27 Thread Suraj Varma
When you shutdown the region server, check the master logs to see if master has detected this condition. I've seen weird things happen if dns is not setup correctly - so, check if master (logs ui) is correctly detecting that the region server is down after step 2. --Suraj 2011/7/27 吴限

Re: MR on HBase - java.io.IOException: Pass a Delete or a Put

2011-07-26 Thread Suraj Varma
I found this older thread that _might_ help you ... but as Stack says, better to upgrade to 0.90.x if possible. http://search-hadoop.com/m/egk1n1T1Sw8/java.io.IOException%253A+Pass+a+Delete+or+a+Putsubj=Re+Type+mismatch --Suraj On Tue, Jul 26, 2011 at 11:25 AM, Stack st...@duboce.net wrote: On

Re: Multiple VM Arguments for daemons

2011-06-23 Thread Suraj Varma
Check your regionserver's hbase-env.sh to see if you have accidentally duplicated the jmx lines in there. --Suraj On Thu, Jun 23, 2011 at 8:45 AM, sulabh choudhury sula...@gmail.com wrote: While monitoring JMX attributes via JConsole I observed that there are some VM arguments being reported

Re: view data in hbase tables

2011-06-07 Thread Suraj Varma
See these threads: http://search-hadoop.com/m/vZwR4BIimP/hbaseexplorersubj=Re+HBase+Admin+Web+UI http://search-hadoop.com/m/oQ4po1kEoBZ/hbaseexplorersubj=Web+based+Hbase+Query+Tool+ http://blog.sematext.com/2010/09/06/hbase-digest-august-2010/ --Suraj On Tue, Jun 7, 2011 at 7:43 AM, abhay

Re: Migrating pseudo distribute map reduce with user jar file to 0.90.1

2011-04-25 Thread Suraj Varma
With CDH3B4, the hadoop processes run as separate users (like hdfs, mapred, etc). Did you set the CDH3B4 directory permissions correctly as described in the install document? See: https://ccp.cloudera.com/display/CDHDOC/Upgrading+to+CDH3 and search for permissions. Also see this:

Re: HBase - Column family

2011-04-24 Thread Suraj Varma
If you only want some of the columns, you could return a subset by using server side Filters. Your schema can be designed in multiple ways - it all depends on what your access patterns are. Here's a good thread on various schema design alternatives for one-to-many relationships. There are many

Re: Should I be afraid by 'put','get'...

2011-04-24 Thread Suraj Varma
The only real way to find this out for your application is to do your own performance testing - this is because the numbers are totally dependent on your data size, number of requests, hardware, network, etc etc. If you are just looking to get a rough idea, search for hbase performance benchmarks

Re: Should I be afraid by 'put','get'...

2011-04-24 Thread Suraj Varma
Here's a recent thread that also gives you some numbers from OpenTSDB. http://search-hadoop.com/m/Rwf0d16qeo81/hbase+performancesubj=HBase+Performance --Suraj On Sun, Apr 24, 2011 at 12:20 AM, Suraj Varma svarma...@gmail.com wrote: The only real way to find this out for your application is to do

Re: Using split command in shell

2011-03-24 Thread Suraj Varma
It is the full region name - something like this: TestTable,262335,1300510101703.372a66d40705a4f2338b0219767602d3. If you go to the master web UI and click on the table name, you will see all the regions for that table. It is the string under the Table Regions / Name column. The other thing
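Putting that together, a shell invocation would look like this (using the example region name from this thread; copy your own from the master UI's Table Regions / Name column):

```shell
# hbase shell -- split a specific region by its full region name:
split 'TestTable,262335,1300510101703.372a66d40705a4f2338b0219767602d3.'
```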

Re: PerformanceEvaluation

2011-03-21 Thread Suraj Varma
Look at the Tool Description section - the current usage is detailed there. There is no shell script as such - if that's what you're looking for. You just kick off the command as: bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation With no arguments, it will give you the usage and options.
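For illustration, the invocation looks like this (the exact commands and flags vary by HBase version, so check the printed usage first):

```shell
# With no arguments, prints usage and the available options:
bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation
# Example run -- sequential writes with 4 client threads:
bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 4
```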

Re: hbase bulk loading and update operations

2011-03-21 Thread Suraj Varma
Yes, it supports incremental updates as well (with 0.90.x). See this: https://issues.apache.org/jira/browse/HBASE-1923 --Suraj On Mon, Mar 21, 2011 at 4:49 AM, Oleg Ruchovets oruchov...@gmail.com wrote: Hi ,  Assuming ,   I inserted data to the hbase using bulk loading. What is the way to
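As a sketch, once a MapReduce job (e.g. via HFileOutputFormat) has written HFiles to HDFS, the incremental load step is a single command (the path and table name below are placeholders):

```shell
# Load pre-built HFiles from /bulk/output into the existing table 'mytable';
# this works for incremental loads into a table that already holds data:
bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /bulk/output mytable
```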

Re: hbase heap size

2011-03-20 Thread Suraj Varma
Oleg: Instead of setting the heap size using the common HBASE_HEAP_SIZE, use the process specific OPTS to set it. As Stack says, for instance to set zookeeper specific heap size, you can uncomment and set the heap size export HBASE_ZOOKEEPER_OPTS=-Xmx1000m $HBASE_JMX_BASE
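A sketch of what that looks like in conf/hbase-env.sh (heap values here are illustrative, not recommendations):

```shell
# Per-daemon heap settings instead of one global HBASE_HEAPSIZE:
export HBASE_ZOOKEEPER_OPTS="-Xmx1000m $HBASE_JMX_BASE"
export HBASE_MASTER_OPTS="-Xmx2g $HBASE_JMX_BASE"
export HBASE_REGIONSERVER_OPTS="-Xmx8g $HBASE_JMX_BASE"
```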

Re: Bulk Load question.

2011-03-20 Thread Suraj Varma
Is there a way to split this across regions in the beginning? Since you didn't mention your HBase version, I'm assuming you are using 0.90.1 or later. If so, yes, there is a way to pre-split the regions. See this: http://hbase.apache.org/book/important_configurations.html#d0e1975 Also - as
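As an illustration of pre-splitting at creation time (the SPLITS option is supported in the shell of newer releases; the split keys are placeholders and should match your rowkey distribution):

```shell
# hbase shell -- create a table pre-split into four regions so that the
# initial load is spread across region servers from the start:
create 'mytable', 'cf', SPLITS => ['a', 'm', 's']
```

On 0.90.x, the equivalent via the Java API is the HBaseAdmin.createTable overload that accepts an array of split keys.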

Re: zookeeper lost connection error

2011-03-20 Thread Suraj Varma
Are you trying to install HBase in standalone mode? Can you give some more details, like what your local system looks like (cygwin / Mac / Linux etc)? Try the following things: 1) Start hbase and use the master web ui (http://localhost:60010/master.jsp) to ensure that *all* hostnames are

Re: Is there any influence to the performance of hbase if we use TTL to clean data?

2011-03-16 Thread Suraj Varma
So, yes, a major compaction is disk io intensive and can influence performance. Here's a thread on this http://search-hadoop.com/m/PI1dl1pXgEg2 And here's a more recent one: http://search-hadoop.com/m/BNxKZeI8z --Suraj On Wed, Mar 16, 2011 at 7:49 PM, Zhou Shuaifeng zhoushuaif...@huawei.com

Re: hbase 0.90.1 upgrade issue - mapreduce job

2011-03-16 Thread Suraj Varma
Does this help?: http://search-hadoop.com/m/JI3ro1EKY0u --Suraj On Tue, Mar 15, 2011 at 7:39 PM, Venkatesh vramanatha...@aol.com wrote:  Hi When I upgraded to 0.90.1, mapreduce fails with exception.. system/job_201103151601_0121/libjars/hbase-0.90.1.jar does not exist. I have the jar file

Re: major hdfs issues

2011-03-12 Thread Suraj Varma
java.lang.OutOfMemoryError: unable to create new native thread. This indicates that you are oversubscribed on your RAM to the extent that the JVM doesn't have any space to create native threads (which are allocated outside of the JVM heap.) You may actually have to _reduce_ your heap sizes to

Re: Hbase shell does not start

2011-03-11 Thread Suraj Varma
Thanks for posting back - nice to know that it worked. --Suraj On Thu, Mar 10, 2011 at 7:39 PM, Sumeet M Nikam sumni...@in.ibm.com wrote: Hi Suraj, Thanks, it worked, replaced jruby-complete-1.0.3 jar with jruby-complete-1.2.0.jar and shell started without any exception.

Re: Modeling Multi-Valued Fields

2011-03-11 Thread Suraj Varma
ease of storage. --Suraj On Fri, Mar 11, 2011 at 4:10 PM, Rickm ricardo_maur...@yahoo.com wrote: Suraj Varma svarma.ng@... writes: It is a bit unusual, I think. To begin with, the number of versions is set when you create a ColumnFamily - so, you are signing up for every column
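For reference, the number of retained versions is fixed per column family when the table is created (or changed later via alter); a minimal sketch with placeholder names:

```shell
# hbase shell -- keep up to 5 versions of each cell in family 'f1':
create 't1', {NAME => 'f1', VERSIONS => 5}
# Retrieve up to 3 versions of one cell:
get 't1', 'row1', {COLUMN => 'f1:col', VERSIONS => 3}
```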
