Save the bandwidth usage

2013-11-14 Thread Jia Wang
Hi Folks, We are tuning an HBase cluster. It seems the current bottleneck is network bandwidth: during a performance test, the bidirectional bandwidth usage (sending + receiving) between our nodes is around 1Gb and almost hits the limit (we had a pure network test before), so any ideas on

Re: Uneven write request to regions

2013-11-14 Thread Jia Wang
Hi, Are the regions from the same table? If they are, check your row key design: you can find the start and end row key for each region, from which you can see why a request with a specific row key doesn't hit a given region. If the regions are from different tables, you may consider to
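Jia's suggestion amounts to a lookup: a row key lands in the region whose start key is the greatest one less than or equal to it. A minimal sketch of that check (region boundaries and row keys below are made up for illustration):

```python
from bisect import bisect_right

def region_for_row(start_keys, row_key):
    """Return the index of the region that holds row_key.

    start_keys must be sorted; the first region's start key is ""
    (the empty key), matching how HBase bounds its first region.
    """
    # bisect_right finds the first start key greater than row_key;
    # the row belongs to the region just before that position.
    return bisect_right(start_keys, row_key) - 1

# Hypothetical table split into four regions at keys 'f', 'm', 's'.
starts = ["", "f", "m", "s"]
print(region_for_row(starts, "apple"))   # lands in region 0
print(region_for_row(starts, "mango"))   # lands in region 2
```

If every write shares the same key prefix, this lookup returns the same region for all of them, which is exactly the uneven-load symptom described in the thread.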

Re: 答复 (Reply): Save the bandwidth usage

2013-11-14 Thread Jia Wang
Yes, SNAPPY compression has been enabled already, which I don't think helps much because we are generating random characters. The replication factor is 3 by default in Hadoop. We have a 4-server cluster, three of them shared with RegionServers. I have disabled the auto-split policy for my table,
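The point about compression and random data is easy to verify: a general-purpose codec cannot shrink incompressible input, so it saves no bandwidth on random payloads. A quick check using zlib as a stand-in (it ships with Python; SNAPPY behaves the same way on random bytes):

```python
import os
import zlib

random_payload = os.urandom(64 * 1024)              # incompressible
repetitive_payload = b"abcdef" * (64 * 1024 // 6)   # highly compressible

for name, data in [("random", random_payload),
                   ("repetitive", repetitive_payload)]:
    compressed = zlib.compress(data)
    print(f"{name}: {len(data)} -> {len(compressed)} bytes")
# The random input comes out essentially the same size (or slightly
# larger, due to framing overhead); the repetitive input shrinks
# dramatically.
```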

Re: Get HBase Read and Write requests per second separately

2013-11-14 Thread Jia Wang
I don't think so. Thanks Ramon On Thu, Nov 14, 2013 at 12:07 PM, Sandeep L sandeepvre...@outlook.com wrote: Is it possible to get it from the API instead of hbase_metrics? Thanks, Sandeep. Date: Wed, 13 Nov 2013 17:07:00 +0800 Subject: Re: Get HBase Read and Write requests per second separately

Frequent scan time outs in multiple region servers in same time intervals

2013-11-14 Thread Sandeep L
Hi, In recent times we are seeing frequent scan time outs in multiple region servers of our production cluster. Due to this our HBase cluster does not respond to any queries for 5 to 10 minutes, which is causing a huge problem for us. We are getting the following error in multiple region servers almost

Re: Frequent scan time outs in multiple region servers in same time intervals

2013-11-14 Thread Jia Wang
Is it a Scan for Map/Reduce? Or is it a regular scan, in which case maybe you could consider setting the start and end positions of your scan to limit the operation. Thanks Ramon On Thu, Nov 14, 2013 at 6:23 PM, Sandeep L sandeepvre...@outlook.com wrote: Hi, In recent times we are seeing
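In the HBase shell, a bounded scan of the kind Ramon suggests looks like this (table name, keys, and caching value here are illustrative, not taken from the thread):

```ruby
# HBase shell: restrict the scan to one key range, and batch rows per
# RPC with CACHING so an individual next() call stays short enough not
# to hit the scanner timeout.
scan 'mytable', {STARTROW => 'user100', STOPROW => 'user200', CACHING => 100}
```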

RE: Frequent scan time outs in multiple region servers in same time intervals

2013-11-14 Thread Sandeep L
It's not a Scan for Map/Reduce. Internally we are using HBase for one of our applications, where the application scans data from HBase tables and we use the scan results. Due to regular scan time outs our service is getting interrupted. Thanks, Sandeep. Date: Thu, 14 Nov 2013 18:54:28 +0800

Re: copyTable from 0.94 to 0.96?

2013-11-14 Thread Jean-Marc Spaggiari
Hum. I let it run overnight and got that: 13/11/13 22:24:17 INFO zookeeper.ClientCnxn: Session establishment complete on server hbasetest1/192.168.23.51:2181, sessionid = 0x1423ef50f7d0241, negotiated timeout = 4 13/11/13 23:24:41 ERROR mapreduce.TableOutputFormat:

Re: copyTable from 0.94 to 0.96?

2013-11-14 Thread Stack
On Thu, Nov 14, 2013 at 9:23 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hum. I let it run overnight and got that: 13/11/13 22:24:17 INFO zookeeper.ClientCnxn: Session establishment complete on server hbasetest1/192.168.23.51:2181, sessionid = 0x1423ef50f7d0241, negotiated

Re: copyTable from 0.94 to 0.96?

2013-11-14 Thread Jean-Marc Spaggiari
Thanks for looking at it St.Ack. Here are my regions on the source table: Table RegionsNameRegion ServerStart KeyEnd Key Requests dns,,1379202070789.bb65f685cdefc4f2491d246f376fc1f0. node3:60030 gestion-v.com 0 dns,gestion-v.com ,1379202070789.d02ce8e3fa1a200c7f034b349acf8cc8. buldo:60030

Re: Frequent scan time outs in multiple region servers in same time intervals

2013-11-14 Thread Ted Yu
Which version of HBase are you using? If this problem only started appearing recently, was there a noticeable change in load on the cluster? Thanks On Nov 14, 2013, at 3:08 AM, Sandeep L sandeepvre...@outlook.com wrote: It's not a Scan for Map/Reduce. Internally we are using HBase for one of
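If the scans are legitimately slow, one common mitigation (not stated in the thread, offered here as a hedged suggestion) is to raise the scanner lease so the server does not expire slow scanners. The values below are examples, not recommendations:

```xml
<!-- hbase-site.xml: raise the scanner lease so slow scans are not
     expired server-side. This is the 0.94-era property name; later
     releases split it into hbase.client.scanner.timeout.period. -->
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>120000</value> <!-- milliseconds; the default is 60000 -->
</property>
```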

Re: hbase suitable for churn analysis ?

2013-11-14 Thread Jean-Marc Spaggiari
Hi Sam, So are you saying that you will have about 30 column families? If so, I don't think it's a good idea. JM 2013/11/13 Sam Wu swu5...@gmail.com Hi all, I am thinking about using Random Forest to do churn analysis with HBase as the NoSQL data store. Currently, we have all the user

Re: hbase suitable for churn analysis ?

2013-11-14 Thread sam wu
Thanks for the advice. What about a key of userId + no_day (since user registered), where the column family is each typeEvent and the qualifier is the detailed trxs? On Thu, Nov 14, 2013 at 8:51 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Sam, So are you saying that you will have about 30

Re: hbase suitable for churn analysis ?

2013-11-14 Thread Pradeep Gollakota
I'm a little curious as to how you would be able to use no_of_days as a column qualifier at all... it changes every day for all users, right? So how will you keep your table updated? On Thu, Nov 14, 2013 at 9:07 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: You can use your no_day as a

Re: hBase - the server has too many connections (maxClientConn property set to 0 does not help)

2013-11-14 Thread Renato Marroquín Mogrovejo
Hi, Did you give this [1] a look? Looks like the error you are getting. Renato M. [1] https://wiki.apache.org/nutch/ErrorMessagesInNutch2 2013/11/14 Jean-Marc Spaggiari jean-m...@spaggiari.org Hi, JobTracker is MapReduce... So you should look on the MapReduce side. Few links which

Re: hbase suitable for churn analysis ?

2013-11-14 Thread sam wu
We ingest data from logs (one file/table, per event, per date) into HBase offline on a daily basis, so we can get the no_day info. My thoughts for churn analysis are based on two types of user: green (young, maybe 7 days in system), predict churn based on first 7? days activity, ideally predict while the
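This answers Pradeep's concern: no_day is computed at write time from the event date and the registration date, so it is a fixed property of each row, not something that must be updated daily. A sketch of such a row key (the separator, padding width, and names are illustrative choices, not from the thread):

```python
from datetime import date

def row_key(user_id, registered, event_day):
    """Compose a row key of userId + days-since-registration.

    The day number is fixed at write time (it is a property of the
    event, not of 'today'), so existing rows never need updating.
    Zero-padding keeps lexicographic order equal to numeric order,
    which matters because HBase sorts rows as bytes.
    """
    no_day = (event_day - registered).days
    return f"{user_id}:{no_day:05d}"

reg = date(2013, 11, 1)
print(row_key("user42", reg, date(2013, 11, 14)))  # user42:00013
```

With this layout, "first 7 days of activity" for a user is a single bounded scan from `user42:00000` to `user42:00007`.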

Re: hbase suitable for churn analysis ?

2013-11-14 Thread James Taylor
We ingest logs using Pig to write Phoenix-compliant HFiles, load those into HBase and then use Phoenix (https://github.com/forcedotcom/phoenix) to query directly over the HBase data through SQL. Regards, James On Thu, Nov 14, 2013 at 9:35 AM, sam wu swu5...@gmail.com wrote: we ingest data
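Once the data is mapped through Phoenix, the churn analysis described upthread becomes plain SQL. The table and column names below are hypothetical, just to show the shape of such a query:

```sql
-- Hypothetical churn query over Phoenix-mapped event data:
-- count events per user within the first 7 days after registration.
SELECT user_id, COUNT(*) AS events
FROM user_events
WHERE no_day < 7
GROUP BY user_id;
```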

Re: hBase - the server has too many connections (maxClientConn property set to 0 does not help)

2013-11-14 Thread glumet
Hi, I don't think that this is the same error as mine. My error and the error from [1] have this warning in common: "HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default)" but nothing else.

Re: hbase suitable for churn analysis ?

2013-11-14 Thread sam wu
Thanks for the great info On Thu, Nov 14, 2013 at 9:40 AM, James Taylor jtay...@salesforce.com wrote: We ingest logs using Pig to write Phoenix-compliant HFiles, load those into HBase and then use Phoenix (https://github.com/forcedotcom/phoenix) to query directly over the HBase data through

Re: hBase - the server has too many connections (maxClientConn property set to 0 does not help)

2013-11-14 Thread Ted Yu
You should modify the mapred properties in mapred-site.xml. Cheers On Thu, Nov 14, 2013 at 9:44 AM, glumet jan.bouch...@gmail.com wrote: Hi, I don't think that this is the same error as I have. My error and error from [1] have in common this warning: /HBase is able to connect to ZooKeeper but

Re: hBase - the server has too many connections (maxClientConn property set to 0 does not help)

2013-11-14 Thread glumet
I see. But I don't have such a file in my file system because I use only HBase, not Hadoop.

Re: HBase with multiple interfaces

2013-11-14 Thread Jean-Marc Spaggiari
Hum. 2502 is a pretty old JIRA, and it's not even fixed. I have been able to find this property in the default conf file but not in the code yet. It seems to be a left-over... I don't think this will have any effect. And I don't know how to achieve that, sorry. JM 2013/11/13 Sudarshan Kadambi

Re: copyTable from 0.94 to 0.96?

2013-11-14 Thread Jean-Marc Spaggiari
I have added some logs on the TableOutputFormat class and it seems to receive the right parameters. However, it's still not able to connect to the other cluster. 13/11/14 20:38:42 mapreduce.TableOutputFormat: address=hbasetest1.distparser.com:2181:/hbase 13/11/14 20:38:42

Re: Uneven write request to regions

2013-11-14 Thread Jia Wang
Then the case is simple. As I said, check your row key design: you can find the start and end row key for each region, from which you can know why your request with a specific row key doesn't hit a specified region. Cheers Ramon On Thu, Nov 14, 2013 at 8:47 PM, Asaf Mesika asaf.mes...@gmail.com

Re: Uneven write request to regions

2013-11-14 Thread Bharath Vissapragada
How about forcing a region split and moving the splits to RSs with less load? On Fri, Nov 15, 2013 at 7:21 AM, Jia Wang ra...@appannie.com wrote: Then the case is simple, as I said check your row key design, you can find the start and end row key for each region, from which you can know why
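Bharath's suggestion can be done from the HBase shell. The table name, split key, encoded region name, and server name below are placeholders, not values from the thread:

```ruby
# HBase shell: split a hot region at an explicit key, then move one of
# the resulting daughter regions to a lightly loaded RegionServer.
# The server name has the form 'host,port,startcode'.
split 'mytable', 'user150'
move 'ENCODED_REGION_NAME', 'host2,60020,1384400000000'
```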

Re: HBase with multiple interfaces

2013-11-14 Thread Asaf Mesika
We are using both of the following properties: hbase.regionserver.dns.interface and hbase.master.dns.interface, both set to the interface name we want. We have two interfaces as you described - one for internal communication and one for external. What exactly is not working for you? On Wed, Nov 13,
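Asaf's setup corresponds to a fragment like the following in hbase-site.xml (the interface name eth1 is an example; use whichever interface carries your internal traffic):

```xml
<!-- hbase-site.xml: make HBase daemons resolve their hostnames via
     the internal interface rather than the default one. -->
<property>
  <name>hbase.regionserver.dns.interface</name>
  <value>eth1</value>
</property>
<property>
  <name>hbase.master.dns.interface</name>
  <value>eth1</value>
</property>
```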

Re: Not able to get data if Master cluster goes down in case of data replication

2013-11-14 Thread Hanish Bansal
Thanks for the response, Demai :) On Wed, Nov 13, 2013 at 6:00 PM, Demai Ni nid...@gmail.com wrote: Hanish, I guess you are looking for HBase to automatically switch to the 2nd cluster. Unfortunately, that is not the architecture design of replication, and I think many such designs will rely