Re: Install hbase on Ubuntu 11.10

2012-07-23 Thread Debarshi Bhattacharya
I am getting the same type of error while installing HBase. I have Apache Hadoop 0.20.2 installed on my system (Ubuntu 11.10). HBase gets installed, but when I try to create a table I get the following error :- THE COMPLETE STACKTRACE : ERROR:

Re: Install hbase on Ubuntu 11.10

2012-07-23 Thread Jean-Marc Spaggiari
Hi Debarshi, You can use jps to list the Java daemons running on your different machines. You should see at least these: 3420 QuorumPeerMain 2564 TaskTracker 2252 NameNode 2456 DataNode 2360 JobTracker 2703 HMaster 2815 HRegionServer The number is the process ID, so it will change each

Re: Hbase bkup options

2012-07-23 Thread Michael Segel
Amlan, Like always, the answer to your question is... it depends. First, how much data are we talking about? What's the value of the underlying data? One possible scenario... You run an M/R job to copy data from the table to an HDFS file, which is then copied to attached storage on an edge

host:port problem

2012-07-23 Thread Rajendra Manjunath
I have HBase configured in pseudo-distributed mode and I am accessing it through a web service. Until now everything was running fine, but since this morning I am getting this error: 8774@rajendra rajendra,60020,1343050590428 host:port pair: � All the processes are running fine and I am able to access my

RE: Hbase bkup options

2012-07-23 Thread Amlan Roy
Hi Michael, Thanks a lot for the reply. What I want to achieve is: if my cluster goes down for some reason, I should be able to create a new cluster and import all the backed-up data. As I want to store all the tables, I expect the data size to be huge (on the order of terabytes)

Re: host:port problem

2012-07-23 Thread Amandeep Khurana
This is most likely because of a mismatch in the ZK library version between your web service and the HBase install. Can you confirm you got the same version in both places? On Monday, July 23, 2012 at 8:31 AM, Rajendra Manjunath wrote: i have hbase configured in pseudo distributed mode and

Re: host:port problem

2012-07-23 Thread Mohammad Tariq
Hi Rajendra, If the web service core was written with the ZooKeeper jar included in some older HBase release, and you have now upgraded your HBase version, then this could happen. Try adding the new jar to your web service core. Regards, Mohammad Tariq On Mon, Jul 23, 2012 at 9:03 PM,

Shell - scripts

2012-07-23 Thread Claudiu Olteanu
Is there a security measure for running scripts like while(1) do;end in the shell? How is the thread closed?

Re: Hbase bkup options

2012-07-23 Thread Alok Kumar
Hello everyone, I too have a similar use-case, where I've set up a separate HBase replica cluster and enabled REPLICATION_SCOPE for tables. Q. Do I need to create 'table + ColFamily' in the backup cluster every time a new *table* gets created in the 'production' cluster? Or is there a way where table
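
Replication in this era does not create schema on the slave: the table and column family have to exist on the backup cluster, with matching names, before edits ship over, so each new production table does need a matching create on the replica. A minimal Java sketch of that create, assuming the 0.92-era admin API (table and family names are illustrative; REPLICATION_SCOPE itself matters on the source side):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CreateReplicaTable {
      public static void main(String[] args) throws Exception {
        // conf pointed at the backup cluster's quorum
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTableDescriptor desc = new HTableDescriptor("mytable");
        HColumnDescriptor cf = new HColumnDescriptor("cf");
        cf.setScope(1); // same effect as REPLICATION_SCOPE => 1 in the shell
        desc.addFamily(cf);
        admin.createTable(desc);
      }
    }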

Re: Use of MD5 as row keys - is this safe?

2012-07-23 Thread Jonathan Bishop
Hi, Thanks everyone for the informative discussion on this topic. I think that for the project I am involved in I must remove the risk, however small, of a row key collision, and append the original id (in my case a long) to the hash, whatever hash I use. I don't want to be in the situation where
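
A minimal sketch of the hash-plus-id key described above: 16 bytes of MD5 followed by the 8-byte long, so two distinct ids can never collide even if their hashes somehow do, while writes still spread evenly across regions thanks to the hashed prefix:

    import java.security.MessageDigest;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RowKeys {
      /** 16-byte MD5 of the id, followed by the 8-byte id itself. */
      public static byte[] hashedKey(long id) throws Exception {
        byte[] md5 = MessageDigest.getInstance("MD5").digest(Bytes.toBytes(id));
        return Bytes.add(md5, Bytes.toBytes(id)); // distinct ids => distinct keys
      }
    }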

Basic Question on Partitioner,Combiner and Co-Processor

2012-07-23 Thread syed kather
Hi, I am very much interested to know how to implement a custom Partitioner. Is there any blog that covers it? As I understand it, the number of reducers depends upon the partitioner; correct me if I am wrong. How do I implement a Co-Processor (Min, Max)? Is there any tutorial available on

Re: Basic Question on Partitioner,Combiner and Co-Processor

2012-07-23 Thread shashwat shriparv
Check out these links, maybe they will help you: http://developer.yahoo.com/hadoop/tutorial/module5.html#partitioning http://www.ashishpaliwal.com/blog/2012/05/hadoop-recipe-implementing-custom-partitioner/ Regards ∞ Shashwat Shriparv On Mon, Jul 23, 2012 at 11:22 PM, syed kather

Re: Basic Question on Partitioner,Combiner and Co-Processor

2012-07-23 Thread syed kather
Thanks Shashwat Shriparv. Is there any partitioner interface or abstract class available for HBase specifically? Thanks and Regards, S SYED ABDUL KATHER On Mon, Jul 23, 2012 at 11:54 PM, shashwat shriparv dwivedishash...@gmail.com wrote: Check out this link may be will

Re: Use of MD5 as row keys - is this safe?

2012-07-23 Thread Amandeep Khurana
On Mon, Jul 23, 2012 at 9:58 AM, Jonathan Bishop jbishop@gmail.comwrote: Hi, Thanks everyone for the informative discussion on this topic. I think that for project I am involved in I must remove the risk, however small, of a row key collision, and append the original id (in my case a

Re: Basic Question on Partitioner,Combiner and Co-Processor

2012-07-23 Thread syed kather
In my use case I have a mapper function which emits (userid, seqid). The number of map outputs will be around 10 billion x 2000 users. Which is the best partitioning method I can follow for my use case? Sorry, I am writing my partitioner for the first time, hence the doubt. Thanks in advance S SYED ABDUL
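
On the Hadoop side a custom partitioner is just a subclass of org.apache.hadoop.mapreduce.Partitioner; routing on the userid keeps all of one user's seqids in the same reducer. A sketch under the assumption that the mapper emits Text userids and LongWritable seqids (both types hypothetical here); for writes into HBase there is also org.apache.hadoop.hbase.mapreduce.HRegionPartitioner, which partitions by the target table's regions:

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class UserIdPartitioner extends Partitioner<Text, LongWritable> {
      @Override
      public int getPartition(Text userId, LongWritable seqId, int numReduceTasks) {
        // mask the sign bit so the modulo is never negative
        return (userId.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
      }
    }

It is wired into a job with job.setPartitionerClass(UserIdPartitioner.class), and the number of partitions it sees is whatever job.setNumReduceTasks was given.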

Re: Hbase bkup options

2012-07-23 Thread Minh Duc Nguyen
Once your backup data has been put back into HDFS, you can import it into HBase using this command: bin/hbase org.apache.hadoop.hbase.mapreduce.Import tablename inputdir See http://hbase.apache.org/book/ops_mgt.html#import for more information. HTH, Minh On Mon, Jul 23, 2012 at 11:33 AM, Amlan
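
The matching export command is bin/hbase org.apache.hadoop.hbase.mapreduce.Export tablename outputdir, covered in the same section of the book. Both jobs can also be launched programmatically; a sketch assuming the 0.92-era Export driver, with a hypothetical backup path:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.mapreduce.Export;
    import org.apache.hadoop.mapreduce.Job;

    public class TableBackup {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // same arguments the CLI form takes: table name, then output directory
        Job job = Export.createSubmittableJob(conf, new String[] { "tablename", "/backup/tablename" });
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }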

Re: Basic Question on Partitioner,Combiner and Co-Processor

2012-07-23 Thread shashwat shriparv
Check out this link too: http://sharepointorange.blogspot.in/2012/07/how-to-working-with-hbase-coprocessor.html On Tue, Jul 24, 2012 at 12:09 AM, syed kather in.ab...@gmail.com wrote: In my use case i have Mapper function which emit userid,seqid. Number of Map will be around 10 billion X

hbase threw NotServingRegionException

2012-07-23 Thread Ey-Chih chow
Hi, We have a Map/Reduce job that threw NotServingRegionException when the reducer was about to insert data into an HBase table. The error message is as follows. I also copied the corresponding region server log at the end of the message. Also, we browsed through the HBase administrative page

Re: hbase threw NotServingRegionException

2012-07-23 Thread Ey-Chih chow
Sorry I pasted the wrong portion of the region server log. The right portion should be as follows: == 2012-07-22T00:48:57.147-0700: [GC [ParNew: 18863K->2112K(19136K), 0.0029870 secs] 57106K->42831K(97048K) icms_dc=0 , 0.0030480 secs] [Times: user=0.01

Re: Index building process design

2012-07-23 Thread Eric Czech
Hmm, maybe that was too long -- I'll keep this one shorter, I swear: Would it make sense to build indexes with two Hadoop/HBase clusters by simply pointing client traffic at the cluster that is currently NOT building indexes via M/R jobs? Basically, has anyone ever tried switching back and forth

drop table

2012-07-23 Thread Mohit Anchlia
I am trying to drop one of the tables, but in the shell I am told to run major_compact. I have a couple of questions: 1. How do I see if this table has more than one region? 2. And why do I need to run major_compact? hbase(main):010:0* drop 'SESSION_TIMELINE' ERROR: Table SESSION_TIMELINE is enabled.

Re: hbase threw NotServingRegionException

2012-07-23 Thread Mohammad Tariq
Hello sir, A possible reason could be that your client is contacting the given regionserver, and that regionserver keeps rejecting the requests. Are you sure your table and all its regions are online? Run hbck once and see if you find anything interesting. Regards, Mohammad Tariq On Tue, Jul

Re: drop table

2012-07-23 Thread Jean-Marc Spaggiari
Hi Mohit, You have the response in your question ;) Simply type: major_compact .META. on the shell. To drop your table, just do: disable 'SESSION_TIMELINE' drop 'SESSION_TIMELINE' -- JM 2012/7/23, Mohit Anchlia mohitanch...@gmail.com: I am trying to drop one of the tables but on the shell

Re: drop table

2012-07-23 Thread Mohammad Tariq
Hi Mohit, A table must be disabled first in order to get deleted. Regards, Mohammad Tariq On Tue, Jul 24, 2012 at 1:38 AM, Mohit Anchlia mohitanch...@gmail.com wrote: I am trying to drop one of the tables but on the shell I get run major_compact. I have couple of questions: 1. How

Re: drop table

2012-07-23 Thread Mohit Anchlia
Thanks! But I am still trying to understand these two questions: 1. How do I see if this table has more than one region? 2. And why do I need to run major_compact if I have more than one region? On Mon, Jul 23, 2012 at 1:14 PM, Mohammad Tariq donta...@gmail.com wrote: Hi Mohit, A table

Re: drop table

2012-07-23 Thread Jean-Marc Spaggiari
1) http://URL_OF_YOUR_MASTER:60010/table.jsp?name=NAME_OF_YOUR_TABLE will show you all the regions of your table. 2) I have no clue ;) 2012/7/23, Mohit Anchlia mohitanch...@gmail.com: Thanks! but I am still trying to understand these 2 questions: 1. How to see if this table has more than one
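
If you'd rather answer question 1 from code than from the web UI, the client API exposes the region boundaries directly; a minimal sketch using the 0.92-era HTable#getStartEndKeys():

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Pair;

    public class RegionCount {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "SESSION_TIMELINE");
        // one start key per region, so the array length is the region count
        Pair<byte[][], byte[][]> keys = table.getStartEndKeys();
        System.out.println("regions: " + keys.getFirst().length);
        table.close();
      }
    }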

Re: drop table

2012-07-23 Thread Rob Roland
You don't have to run the major compaction - the shell is doing that for you. You must disable the table first, like: disable 'session_timeline' drop 'session_timeline' See the admin.rb file: def drop(table_name) tableExists(table_name) raise ArgumentError, Table #{table_name}
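
For completeness, the Java-API equivalent of that disable-then-drop sequence, a minimal sketch assuming the 0.92-era HBaseAdmin:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class DropTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        if (admin.tableExists("SESSION_TIMELINE")) {
          if (admin.isTableEnabled("SESSION_TIMELINE")) {
            // same rule as the shell: a table must be disabled before deletion
            admin.disableTable("SESSION_TIMELINE");
          }
          admin.deleteTable("SESSION_TIMELINE");
        }
      }
    }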

Re: drop table

2012-07-23 Thread Mohammad Tariq
The HBase processes expose a web-based user interface (UI for short), which you can use to gain insight into the cluster's state, as well as the tables it hosts. Just point your web browser to http://hmaster:60010. Although the majority of the functionality is read-only, there are a few selected

Re: drop table

2012-07-23 Thread Mohammad Tariq
Also, we don't have to worry about compaction under normal conditions. When something is written to HBase, it is first written to an in-memory store (memstore); once this memstore reaches a certain size, it is flushed to disk into a store file (everything is also written immediately to a log file

Re: drop table

2012-07-23 Thread Mohit Anchlia
Thanks everyone for your help On Mon, Jul 23, 2012 at 1:40 PM, Mohammad Tariq donta...@gmail.com wrote: Also, we don't have to worry about compaction under normal conditions. When something is written to HBase, it is first written to an in-memory store (memstore), once this memstore reaches a

Re: hbase threw NotServingRegionException

2012-07-23 Thread Ey-Chih chow
Thanks. But if we do a scan on the table via the HBase shell, the data in the table did show up. Ey-Chih On Jul 23, 2012, at 1:10 PM, Mohammad Tariq wrote: Hello sir, A possible reason could be, your client is contacting the given regionserver, and that regionserver kept on rejecting

Re: hbase threw NotServingRegionException

2012-07-23 Thread Elliott Clark
hbck should help expose more problems than a single scan would. With that said, the logs are the best bet for understanding what is going on with the cluster at the time of the issue. You posted logs that seem to only contain GC info. Do you have more information about the cluster state

Insert blocked

2012-07-23 Thread Mohit Anchlia
I am writing a stress tool to test my specific use case. In my current implementation HTable is a global static variable that I initialize just once and use across multiple threads. Is this ok? My row key consists of (timestamp - (timestamp % 1000)) and the cols are counters. What I am seeing is

Re: Insert blocked

2012-07-23 Thread Elliott Clark
HTable is not thread safe[1]. It's better to use HTablePool if you want to share things across multiple threads.[2] 1 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html 2 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTablePool.html On Mon, Jul 23, 2012
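
A minimal sketch of that pattern (0.92-era API; pool size, table, and column family are illustrative). One pool is shared per process, and each thread checks a table out and back in around its writes; in 0.92+ releases closing the pooled table returns it to the pool (earlier versions used pool.putTable instead):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTableInterface;
    import org.apache.hadoop.hbase.client.HTablePool;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PooledWriter {
      private static final HTablePool POOL =
          new HTablePool(HBaseConfiguration.create(), 10); // shared across threads

      // safe to call concurrently from many threads
      static void write(byte[] rowKey, long count) throws Exception {
        HTableInterface table = POOL.getTable("SESSION_TIMELINE");
        try {
          Put put = new Put(rowKey);
          put.add(Bytes.toBytes("S_T_MTX"), Bytes.toBytes("count"), Bytes.toBytes(count));
          table.put(put);
        } finally {
          table.close(); // hands the instance back to the pool
        }
      }
    }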

Re: Hbase bkup options

2012-07-23 Thread Michael Segel
There are a couple of nits... 1) Compression. This will help a bit when moving the files around. 2) Data size. You may have bandwidth issues. Moving TBs of data over a 1GbE network can impact your cluster's performance. (Even with compression.) Depending on your cluster(s) and

Re: Efficient read/write - Iterative M/R jobs

2012-07-23 Thread Ioakim Perros
Thank you very much for responding :-) I also found this one: http://www.deerwalk.com/bulk_importing_data , which seems very informative. The thing is that I tried to create and run a simple (custom) bulk loading job locally (in pseudo-distributed mode) - and the

Re: Efficient read/write - Iterative M/R jobs

2012-07-23 Thread Jean-Daniel Cryans
... INFO mapred.JobClient: Task Id : attempt_201207232344_0001_m_00_0, Status : FAILED java.lang.IllegalArgumentException: *Can't read partitions file* at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111) ... I followed

Re: Index building process design

2012-07-23 Thread Michael Segel
Ok, I'll take a stab at the shorter one. :-) You can create a base data table which contains your raw data. Depending on your index... like an inverted table, you can run a map/reduce job that builds up a second table. And a third, a fourth... depending on how many inverted indexes you want.
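
A sketch of one such inverted-index job: a TableMapper over the base table that swaps an indexed cell value into the row key of a Put against the index table (0.92-era API; the family and qualifier names are hypothetical):

    import java.io.IOException;
    import org.apache.hadoop.hbase.HConstants;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;

    public class InvertMapper extends TableMapper<ImmutableBytesWritable, Put> {
      @Override
      protected void map(ImmutableBytesWritable row, Result value, Context ctx)
          throws IOException, InterruptedException {
        byte[] indexed = value.getValue(Bytes.toBytes("cf"), Bytes.toBytes("name"));
        if (indexed == null) return;
        Put put = new Put(indexed); // the indexed value becomes the row key...
        // ...and the base table's row key becomes the qualifier
        put.add(Bytes.toBytes("idx"), row.get(), HConstants.EMPTY_BYTE_ARRAY);
        ctx.write(new ImmutableBytesWritable(indexed), put);
      }
    }

Wired up with TableMapReduceUtil.initTableMapperJob on the base table and initTableReducerJob (with IdentityTableReducer) on the index table.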

Re: Insert blocked

2012-07-23 Thread Mohit Anchlia
On Mon, Jul 23, 2012 at 3:54 PM, Elliott Clark ecl...@stumbleupon.comwrote: HTable is not thread safe[1]. It's better to use HTablePool if you want to share things across multiple threads.[2] 1 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html 2

Re: hbase threw NotServingRegionException

2012-07-23 Thread Ey-Chih chow
Besides GC info, the master log is as follows. Ey-Chih === 12/07/22 00:47:18 WARN master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in

Re: Insert blocked

2012-07-23 Thread Mohit Anchlia
I am now using HTablePool but still the call hangs at put. My code is something like this: hTablePool = new HTablePool(config, MAX_POOL_SIZE); result = new SessionTimelineDAO(hTablePool.getTable(t.name()), ColumnFamily.S_T_MTX); public SessionTimelineDAO(HTableInterface

Re: Efficient read/write - Iterative M/R jobs

2012-07-23 Thread Ioakim Perros
Update (for anyone ending up here after a possible Google search on the issue): Running an M/R job to bulk import data in a pseudo-distributed setup is feasible (for testing purposes). The error concerning TotalOrderPartitioner had something to do with a trivial bug in the keys I
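
For anyone reproducing this: the piece that writes the TotalOrderPartitioner partitions file is HFileOutputFormat.configureIncrementalLoad, which derives the partition boundaries from the target table's regions. A minimal wiring sketch (0.92-era API; the mapper class, paths, and table name are hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class BulkLoadPrepare {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "bulk-load-prepare");
        job.setJarByClass(BulkLoadPrepare.class);
        job.setMapperClass(MyMapper.class); // hypothetical mapper emitting ImmutableBytesWritable / Put
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(Put.class);
        FileInputFormat.addInputPath(job, new Path("/input"));
        FileOutputFormat.setOutputPath(job, new Path("/hfiles"));
        HTable table = new HTable(conf, "mytable");
        // sets TotalOrderPartitioner and writes its partitions file from region boundaries
        HFileOutputFormat.configureIncrementalLoad(job, table);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }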

Re: Insert blocked

2012-07-23 Thread lars hofhansl
Or you can pre-create your HConnection and thread pool and use the HTable constructor that takes these as arguments. That is faster and less byzantine compared to the HTablePool monster. Also see here (if you don't mind the plug):
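
A sketch of the constructor being referred to, assuming the 0.94-era HConnectionManager/HTable API (pool size and table name illustrative): the connection and executor are created once per process, after which HTable instances are cheap to make, one per thread:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SharedConnection {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HConnection connection = HConnectionManager.createConnection(conf); // shared, created once
        ExecutorService pool = Executors.newFixedThreadPool(10);            // shared, created once
        HTable table = new HTable(Bytes.toBytes("SESSION_TIMELINE"), connection, pool);
        // ... puts/gets from the thread that owns this HTable instance ...
        table.close();
        pool.shutdown();
        connection.close();
      }
    }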