Re: Hbase 0.94

2012-02-09 Thread Jean-Daniel Cryans
(please don't leave trailing discussions if you do a reply on another email just to get the email of this mailing list, in any case I removed it) 0.94 isn't even branched yet from trunk, are you talking about 0.90.4? If so, there's no migration there. J-D On Thu, Feb 9, 2012 at 2:52 PM, Dalia

Re: hbasecon date at the website

2012-02-08 Thread Jean-Daniel Cryans
(please don't cross-post) Stack corrected the date he gave for the CFP (20th instead of 14th), not the conference. J-D On Wed, Feb 8, 2012 at 6:03 PM, Dani Rayan dani.ra...@gmail.com wrote: Hi, Could someone correct the date at http://www.hbasecon.com/ ? Some of us are considering to

Re: xceiver count, regionserver shutdown

2012-02-07 Thread Jean-Daniel Cryans
up, this did indeed fix the problem I was having with hitting the xceiver limit. Thanks a bunch for the help, I have a much better understanding of how heap, memstore size, and number of regions all play a role in performance and resource usage. On Feb 6, 2012, at 5:03 PM, Jean-Daniel Cryans

Re: randomWrite tests gives random results

2012-02-06 Thread Jean-Daniel Cryans
If you didn't configure anything more than the heap, PE will by default create a table with 1 region and a low (albeit default) memstore size. This means it's spending its time waiting on splits and it's recompacting your data all the time which wastes a lot of iops. You didn't tell us which

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Jean-Daniel Cryans
The number of regions is the first thing to check, then it's about the actual number of blocks opened. Is the issue happening during a heavy insert? In this case I guess you could end up with hundreds of opened files if the compactions are piling up. Setting a bigger memstore flush size would

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Jean-Daniel Cryans
to 1000 more rows (which may be the same rows or different than the previous batch), and so forth. BTW, I tried upping the xcievers parameter to 8192 but now I'm getting a Too many open files error. I have the file limit set to 32k. On Feb 6, 2012, at 11:59 AM, Jean-Daniel Cryans wrote

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Jean-Daniel Cryans
6, 2012, at 1:33 PM, Jean-Daniel Cryans wrote: Ok this helps, we're still missing your insert pattern regarding but I bet it's pretty random considering what's happening to your cluster. I'm guessing you didn't set up metrics else you would have told us that the compaction queues are through

Re: xceiver count, regionserver shutdown

2012-02-06 Thread Jean-Daniel Cryans
On Mon, Feb 6, 2012 at 4:47 PM, Bryan Keller brya...@gmail.com wrote: I increased the max region file size to 4gb so I should have fewer than 200 regions per node now, more like 25. With 2 column families that will be 50 memstores per node. 5.6gb would then flush files of 112mb. Still not
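The arithmetic in the quoted message can be checked with a short sketch. The figures (25 regions per node, 2 column families, 5.6gb of memstore space) come from the message; counting 1gb as 1000mb is an assumption, chosen because it reproduces the quoted 112mb. The function names are hypothetical.

```python
def memstores_per_node(regions: int, column_families: int) -> int:
    # Each region keeps one memstore per column family.
    return regions * column_families


def avg_flush_size_mb(memstore_space_mb: float, memstores: int) -> float:
    # If every memstore fills evenly, this is roughly how large each one is
    # when the shared memstore space is used up.
    return memstore_space_mb / memstores


memstores = memstores_per_node(regions=25, column_families=2)
flush_mb = avg_flush_size_mb(memstore_space_mb=5600, memstores=memstores)
print(memstores, flush_mb)
```

This assumes writes are spread evenly over all memstores, which a real workload may not do.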

Re: Thrift hang ups with no apparent reason

2012-02-02 Thread Jean-Daniel Cryans
It seems like the thrift servers are doing something, I see they are reading inputs from your application and one is scanning. Earlier you mentioned that setting bigger heaps only delayed the issue, so it seems there's a memory leak. Which HBase version are you using? Earlier 0.90 versions had

Re: One table or multiple tables?

2012-02-02 Thread Jean-Daniel Cryans
You're not telling us much about your read patterns and data distribution, but I would go with the former solution for the sake of simplicity. You'd want to write your row keys in the same format as OpenTSDB does: http://opentsdb.net/schema.html J-D On Wed, Feb 1, 2012 at 8:59 AM, Mark

Re: HBase 0.92.0 Master won't start

2012-02-02 Thread Jean-Daniel Cryans
Actually hbase.rootdir must really be set to a directory, not just the location of the Namenode. The appended .oldlogs is the master opening up a path to where the old write ahead logs are archived but hdfs://10.40.0.156:9000 is not a folder. At the very least OP should use this config if he
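A quick way to catch this kind of misconfiguration is to check that the value has a path after the host and port. The sketch below is only an illustration: the helper name is hypothetical, and the /hbase directory in the second example is an assumed value, not one taken from the thread.

```python
from urllib.parse import urlparse


def rootdir_has_directory(rootdir: str) -> bool:
    # True when the URI names a path, not just scheme://host:port.
    return urlparse(rootdir).path != ""


print(rootdir_has_directory("hdfs://10.40.0.156:9000"))        # Namenode only
print(rootdir_has_directory("hdfs://10.40.0.156:9000/hbase"))  # assumed directory
```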

Re: count the rows in a table from java

2012-01-26 Thread Jean-Daniel Cryans
That, or use the count command in the shell if your table is on the small side. J-D On Wed, Jan 25, 2012 at 7:49 PM, shashwat shriparv dwivedishash...@gmail.com wrote: What i found out are as follows: 12.1.8. RowCounter RowCounter is a utility that will count all the rows of a table. This

Re: Questions about compression

2012-01-26 Thread Jean-Daniel Cryans
On Thu, Jan 26, 2012 at 10:54 AM, Yves Langisch y...@langisch.ch wrote: Hi, I've a few questions concerning compression: 1) What are the steps to change the compression algorithm (e.g. LZO - Snappy) for an existing table? Change the schema, make sure you have the libs. 2) How can I copy

Re: early memstore flushings

2012-01-23 Thread Jean-Daniel Cryans
It flushes when it reaches the memstore size, not plus the global max memstore size. J-D On Mon, Jan 23, 2012 at 2:50 PM, Yves Langisch y...@langisch.ch wrote: Hi, I'm currently looking through all the metrics hbase provides and I don't understand the memstore flushing behavior I see. I

Re: Slides from meetup @ EBay

2012-01-20 Thread Jean-Daniel Cryans
Thanks to everyone who came and special thanks to eBay for hosting, Ted for organizing and Stack for gluing it all together. Just like releasing new versions, we should have meetups more often! J-D On Fri, Jan 20, 2012 at 7:16 AM, Ted Yu yuzhih...@gmail.com wrote: Hi, We had a nice meetup

Re: HBase 0.92rc3 rest performance

2012-01-17 Thread Jean-Daniel Cryans
This seems to single out the REST server since thrift and native clients stayed the same. Can you provide us your test so we can do testing on our side too? Maybe doing a few jstacks on the REST server could point out the obvious bottlenecks. J-D On Tue, Jan 17, 2012 at 11:25 AM, Ben West

Re: client thread stuck on HBaseClient.call

2012-01-17 Thread Jean-Daniel Cryans
That stack trace is really just a debug message left in the Hadoop code (not even HBase!). Also it's surprising that we create a Configuration there, but that's another issue... So there's something weird with that row, or maybe the following rows too? Could you start a scanner after that row and

Re: EC2 remote client woes

2012-01-13 Thread Jean-Daniel Cryans
: Retrying zk create for another 6603ms; set 'hbase.zookeeper.recoverable.waittime' to change wait time); KeeperErrorCode = ConnectionLoss for /hbase On 1/12/12 5:36 PM, Jean-Daniel Cryans wrote: Interesting, could you start the shell with -d and pastebin all the debug that comes out after

Re: EC2 remote client woes

2012-01-13 Thread Jean-Daniel Cryans
.  Is this a problem with 'hbase', or did I mis-understand? Thanks P On 1/13/12 12:43 PM, Jean-Daniel Cryans wrote: Sorry what I meant by pastebin all the debug was to use a service like pastebin.com to keep the emails short. So in there I see: 12/01/13 02:21:22 INFO zookeeper.ClientCnxn

Re: heavy writing and compaction storms

2012-01-12 Thread Jean-Daniel Cryans
Hi, First you should consider using bulk import instead of a massive MR job. If you decide against that, then
- make sure you pre-split: http://hbase.apache.org/book/important_configurations.html#disable.splitting
- regarding major compactions, usually people switch off the automatic mode and

Re: EC2 remote client woes

2012-01-12 Thread Jean-Daniel Cryans
Your config file on the remote machine has: ip-XX-YYY-Z-QQQ.ec2.internal.ec2.internal You sure about the extra ec2.internal? J-D On Thu, Jan 12, 2012 at 9:26 AM, Peter Wolf opus...@gmail.com wrote: Oh yeah!  The code did it :-D For those that come after, I guess 'hbase shell' is broken for

Re: EC2 remote client woes

2012-01-12 Thread Jean-Daniel Cryans
shell', which is odd as I would have thought it sat on top of the Java API. P On 1/12/12 1:22 PM, Jean-Daniel Cryans wrote: Your config file on the remote machine has: ip-XX-YYY-Z-QQQ.ec2.internal.ec2.internal You sure about the extra ec2.internal? J-D On Thu, Jan 12, 2012 at 9:26 AM

Re: EC2 remote client woes

2012-01-12 Thread Jean-Daniel Cryans
in config files and code. P On 1/12/12 4:24 PM, Jean-Daniel Cryans wrote: Yes, it's the same thing, which is why I think the additional ec2.internal in your hbase-site is suspicious. Let me reiterate: This works: echo stat|nc ip-XX-YYY-Z-QQQ.ec2.internal 2181 But this config doesn't

Re: upgrade 0.90 to 0.92 - HFile v2

2012-01-12 Thread Jean-Daniel Cryans
It's automatic when you start with the 0.92 binaries and it does a compaction. There's no rollback. Stack is supposed to be writing the upgrade documentation (no pressure dude! wink wink) J-D On Thu, Jan 12, 2012 at 2:44 PM, Neil Yalowitz neilyalow...@gmail.com wrote: Is anyone familiar with

Re: bulk import and counting increments

2012-01-12 Thread Jean-Daniel Cryans
You could MR the data while it's still in HDFS, a simple count, and then insert those counts separately from the data. It would also reduce the number of increment calls (unless you have a number of incremented cells that is close to the number of increments you have to do). J-D On Thu, Jan 12,
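The idea of counting first and inserting the totals afterwards can be sketched in memory. This is not a MapReduce job, just a stand-in for what the count step would produce; the row and column names are made up for illustration.

```python
from collections import Counter


def aggregate_increments(events):
    # events: iterable of (row, column) pairs, one pair per raw increment.
    totals = Counter()
    for row, column in events:
        totals[(row, column)] += 1
    return totals


# Hypothetical raw data: four increments touching two distinct cells.
events = [
    ("row1", "cf:hits"),
    ("row1", "cf:hits"),
    ("row2", "cf:hits"),
    ("row1", "cf:hits"),
]
totals = aggregate_increments(events)
print(len(events), "raw increments ->", len(totals), "writes")
```

As the message notes, the saving disappears when the number of distinct cells is close to the number of increments, since each cell still needs one write.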

Re: Multiple Clients and Standalone HBase

2012-01-10 Thread Jean-Daniel Cryans
On Tue, Jan 10, 2012 at 9:42 AM, Peter Wolf opus...@gmail.com wrote: Another Standalone question- Can I have multiple clients, running on multiple machines access a Standalone HBase? Yes. Is there a problem with having 10's of clients hitting a Standalone system? Now that's more than one

Re: How does HBase treat end keys?

2012-01-09 Thread Jean-Daniel Cryans
From Scan's javadoc: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setStopRow(byte[]) stopRow - row to end at (exclusive) Hope this helps, J-D On Mon, Jan 9, 2012 at 12:14 PM, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote: Hi, Whilst working on some tests
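The exclusive stop row can be illustrated with a toy scan over a sorted list of row keys. This is a simulation of the semantics only, not the HBase client API; the start row is treated as inclusive and the row keys are made up.

```python
from bisect import bisect_left


def scan(sorted_rows, start_row, stop_row):
    # Return the rows in [start_row, stop_row): start inclusive, stop exclusive.
    lo = bisect_left(sorted_rows, start_row)
    hi = bisect_left(sorted_rows, stop_row)
    return sorted_rows[lo:hi]


rows = [b"a", b"b", b"c", b"d"]
result = scan(rows, b"b", b"d")
print(result)
```

The row b"d" is left out of the result because it equals the stop row.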

Re: information, whether a GET Request inside Map-Task is data local or not

2012-01-09 Thread Jean-Daniel Cryans
Short answer: no. Painful way to get around the problem: You *could* by looking up the machine's hostname when the job starts and then from the HConnection that HTables can give you through getConnection() do getRegionLocation for the row you are going to Get and then get the hostname by

Re: information, whether a GET Request inside Map-Task is data local or not

2012-01-09 Thread Jean-Daniel Cryans
report back when i get some usable results. Maybe some more people are interested in that. Christopher Am 09.01.2012 23:15, schrieb Jean-Daniel Cryans: Short answer: no. Painful way to get around the problem: You *could* by looking up the machines hostname when the job starts

Re: hbase RegionServer dead suddenly

2012-01-03 Thread Jean-Daniel Cryans
An exception rarely comes alone, look up your region server log and look for the first exceptions that showed up before that. Please pastebin.com the whole thing if you need help figuring it out. J-D On Fri, Dec 30, 2011 at 7:32 PM, xiaochao cschy.2...@163.com wrote: Hello: hbase RegionServer

Re: NotReplicatedYetException when gracefully shutting down a region server.

2012-01-03 Thread Jean-Daniel Cryans
Since you did the region mover thing then all the regions should be off that region server, so data loss is highly unlikely. My theory would be that a compaction was underway and got cancelled when the region server was told to close the region, but the DFSOutputStream thread hasn't been

Re: HBase Custom Comparator

2012-01-03 Thread Jean-Daniel Cryans
Is your comparator on the region server's classpath? Else it won't be able to guess what that comparator looks like :) Also your region server log should tell you it's not able to find CustomComparator. J-D On Fri, Dec 30, 2011 at 12:56 AM, Nageswaran Surendhar nagesfor...@gmail.com wrote:

Re: Failed to start master in standalone mode

2012-01-03 Thread Jean-Daniel Cryans
(This is a user question, sending to user@ and putting dev@ in BCC) If you read what the exceptions tell you (which might be easy to miss in between the other stuff, I agree), you'll see the important bit which is: Caused by: java.net.BindException: Problem binding to /8.15.7.117:0 : Cannot

Re: warning in log - question

2011-12-21 Thread Jean-Daniel Cryans
That line means that appending to the log took x ms, we print it out when it takes more than 1 second. I'd be worried to see that a lot in my logs if I had a realtime-ish system because it would mean that my writes are taking seconds. The usual reasons: - Heavy IO, I would expect those lines if

Re: What version of HDFS is compatible with HBase?

2011-12-21 Thread Jean-Daniel Cryans
I added dfs.support.append=true configuration everywhere but I get the same got version 3 expected version 4 problem. Setting this won't solve your version problem. How do I update the jars in the hbase lib directory? Remove the hadoop jar that's in there, replace it with the 0.20.205 one.

Re: ANN: HBase 0.90.5RC0 available for download

2011-12-20 Thread Jean-Daniel Cryans
+1 We did a rolling on all our clusters, this is running in production and hasn't caused any issues. J-D On Fri, Dec 9, 2011 at 12:35 PM, Stack st...@duboce.net wrote: The first hbase 0.90.5 release candidate is available for download:  

Re: RegionServer unable to connect to master

2011-12-15 Thread Jean-Daniel Cryans
Hi, A few notes: Remove the 127.0.1.1 lines, they usually mess things up. The hbase.master configuration has been removed from the HBase code more than 2 years ago, you can remove it too. Setting hbase.master.dns.interface alone without hbase.master.dns.nameserver doesn't do anything if I

Re: Delivery Status Notification (Failure)

2011-12-13 Thread Jean-Daniel Cryans
Do you have a more specific question? Have you tried anything yet? Thanks for helping us helping you, J-D On Tue, Dec 13, 2011 at 5:24 AM, shashwat shriparv dwivedishash...@gmail.com wrote: We are putting data into HBase in specific format, since the data in HBase will be very large hence we

Re: No changes or progress status on web UI during mapreduce program running

2011-12-12 Thread Jean-Daniel Cryans
That setting also needs to be in your job's classpath, it won't guess it. J-D On Thu, Dec 8, 2011 at 10:14 PM, Vamshi Krishna vamshi2...@gmail.com wrote: Hi harsh, ya, i no jobs are seen in that jobtracker page, under RUNNING JOBS it is none, under FINISHED JOBS it is none,FAILED JOBS it is

Re: hbase mapreduce running though command line

2011-12-09 Thread Jean-Daniel Cryans
You don't need the conf dir in the jar, in fact you really don't want it there. I don't know where that alert is coming from, would be nice if you gave more details. J-D On Fri, Dec 9, 2011 at 6:45 AM, Vamshi Krishna vamshi2...@gmail.com wrote: Hi, i want to run mapreduce program to insert

Re: Hbase export / import Why doubling the Table Size ?

2011-12-09 Thread Jean-Daniel Cryans
How are you measuring the size? hadoop dfs -dus /hbase or only that table's folder? J-D On Fri, Dec 9, 2011 at 1:50 PM, Lord Khan Han khanuniver...@gmail.com wrote: Hi , We are using CDH3B4  and want to upgrade to CDH3u2.  Before doing this we make a separate cluster with same config and

Re: Hbase export / import Why doubling the Table Size ?

2011-12-09 Thread Jean-Daniel Cryans
thing table with lzo  size bigger than the exported FILE size also.. really strange... On Fri, Dec 9, 2011 at 11:53 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: How are you measuring the size? hadoop dfs -dus /hbase or only that table's folder? J-D On Fri, Dec 9, 2011 at 1:50 PM

Re: Hbase export / import Why doubling the Table Size ?

2011-12-09 Thread Jean-Daniel Cryans
No, like I wrote, they are at /hbase/.logs J-D On Fri, Dec 9, 2011 at 2:00 PM, Lord Khan Han khanuniver...@gmail.com wrote: is this logs files inside the tables directory ? On Fri, Dec 9, 2011 at 11:58 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: The region servers store their write

Re: CopyTable to remote cluster runs OK but doesn't copy anything

2011-12-07 Thread Jean-Daniel Cryans
It would most likely be this bug: https://issues.apache.org/jira/browse/HBASE-4614 On Wed, Dec 7, 2011 at 12:27 AM, Jorn Argelo - Ephorus jorn.arg...@ephorus.com wrote: Hi all, I'm trying to copy a table from one cluster to another cluster but this does not seem to do what I expect it to

Re: hbase-regionserver1: bash: {HBASE_HOME}/bin/hbase-daemon.sh: No such file or directory

2011-12-06 Thread Jean-Daniel Cryans
successful setup. Soon i want to start a blog in which i would post clear setup of hbase and problems along with corresponding solutions i faced during setting up all these days. On Fri, Dec 2, 2011 at 11:38 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: i could not find 10.0.1.54

Re: EOFException in HBase 0.94

2011-12-06 Thread Jean-Daniel Cryans
HBase prints the classpath when you start it, make sure you see the jars in there. For example, my local HBase 0.90.4 that runs on 0.20.205.0 has:

Re: hbase-regionserver1: bash: {HBASE_HOME}/bin/hbase-daemon.sh: No such file or directory

2011-12-02 Thread Jean-Daniel Cryans
set up. On Thu, Dec 1, 2011 at 11:28 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: So since I don't see the rest of the log I'll have to assume that the region server was never able to connect to the master. Connection refused could be a firewall, start the master and then try to telnet

Re: HBase and Consistency in CAP

2011-12-02 Thread Jean-Daniel Cryans
No, data is only served by one region server (even if it resides on multiple data nodes). If it dies, clients need to wait for the log replay and region reassignment. J-D On Fri, Dec 2, 2011 at 11:57 AM, Mohit Anchlia mohitanch...@gmail.com wrote: Why is HBase considered high in consistency

Re: HBase and Consistency in CAP

2011-12-02 Thread Jean-Daniel Cryans
questions, but I want to read more specific information about how it works and why it's designed that way. On Fri, Dec 2, 2011 at 11:59 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: No, data is only served by one region server (even if it resides on multiple data nodes). If it dies, clients

Re: hbase-regionserver1: bash: {HBASE_HOME}/bin/hbase-daemon.sh: No such file or directory

2011-12-01 Thread Jean-Daniel Cryans
for every one minute. But i am not understanding where to check and modify the things.. please help. i feel all connections are OK. On Thu, Dec 1, 2011 at 12:28 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: stop-hbase.sh only tells the master to stop, which in turn will tell the region

Re: Strategies for aggregating data in a HBase table

2011-12-01 Thread Jean-Daniel Cryans
support this natively i.e. one more level of partitioning above the row key , but below a table can be beneficial for use cases like these ones. Comments ... ? On Wed, Nov 30, 2011 at 11:53 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: Inline. J-D On Mon, Nov 28, 2011 at 1:55 AM

Re: Constant error when putting large data into HBase

2011-12-01 Thread Jean-Daniel Cryans
Here's my take on the issue. I monitored the process and when any node fails, it has not used all the heaps yet. So it is not a heap space problem. I disagree. Unless you load a region server heap with more data than there's heap available (loading batches of humongous rows for example), it

Re: Failed in connection to self

2011-11-30 Thread Jean-Daniel Cryans
On Tue, Nov 29, 2011 at 10:28 PM, Mikael Sitruk mikael.sit...@gmail.com wrote: First thanks for answering, second i still have some question see inline Thanks Mikael.S --- Shouldn't be an added value to add some information in the original message like (failed to load ... will try later),

Re: Strategies for aggregating data in a HBase table

2011-11-30 Thread Jean-Daniel Cryans
Inline. J-D On Mon, Nov 28, 2011 at 1:55 AM, Steinmaurer Thomas thomas.steinmau...@scch.at wrote: Hello, ... While it is an option processing the entire HBase table e.g. every night when we go live, it probably isn't an option when data volume grows over the years. So, what options are

Re: How to retrieve qualifier information from TRowResult in Thrift(v0.80) Hbase(v0.90) Php

2011-11-30 Thread Jean-Daniel Cryans
This looks like the right way to do it, are you sure you are using the qualifiers correctly? J-D On Fri, Nov 25, 2011 at 8:56 PM, Zebra Zhao zebra2mea...@gmail.com wrote: Hi All, I created a Hbase table with following schema: row_id  --  {cf_id: [qualifier_id1,qualifier_id2,...]} Row record

Re: zookeeper quorum verification

2011-11-30 Thread Jean-Daniel Cryans
It's pretty much what we do, works well. J-D On Wed, Nov 30, 2011 at 3:49 PM, Rita rmorgan...@gmail.com wrote: Hello, Previously, I assigned 5 servers as part of the zookeeper quorum. Everything works fine but I was hard coding these 5 servers everywhere and I was thinking of creating a

Re: Failed in connection to self

2011-11-30 Thread Jean-Daniel Cryans
, nothing can really be done to clarify the confusion until the user gets used to it. Thanks. Mikael.S On Wed, Nov 30, 2011 at 8:38 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: On Tue, Nov 29, 2011 at 10:28 PM, Mikael Sitruk mikael.sit...@gmail.com wrote: First thanks for answering

Re: Failed in connection to self

2011-11-29 Thread Jean-Daniel Cryans
Inline. J-D On Tue, Nov 29, 2011 at 12:46 AM, Mikael Sitruk mikael.sit...@gmail.com wrote: Hi I have a strange error in log of HBase (0.90.2) 2011-11-28 15:12:53,049 INFO This part right up here is important, it's INFO level. then few line later i can see 2011-11-28 15:12:53,139 INFO

Re: Use single cluster or two clusters for log analysis and HBase?

2011-11-29 Thread Jean-Daniel Cryans
At StumbleUpon we use two clusters because high throughput (like MapReduce) will always kill low latency (like serving random reads). J-D On Mon, Nov 28, 2011 at 11:37 PM, jingguo yao yaojing...@gmail.com wrote: I want to set up Hadoop clusters. There are two workloads. One is log analysis

Re: hbase-regionserver1: bash: {HBASE_HOME}/bin/hbase-daemon.sh: No such file or directory

2011-11-29 Thread Jean-Daniel Cryans
You posted this in two threads, please refrain from doing this in the future. From what I can read in there, it tried to get the master address that's supposed to be in zookeeper but it failed because it was missing and then died. The way it's handled is a bit ugly but the effect is the same,

Re: hbase scan job property

2011-11-29 Thread Jean-Daniel Cryans
Well if you use the java API then you can pass your own prefix to the Scan object, but it seems that this is missing for streaming. Actually pretty much every filter is missing from the streaming API, which kinda makes sense since specifying this on the command line would be pretty awkward and

Re: Problem in host resolving

2011-11-29 Thread Jean-Daniel Cryans
Sounds like the typical Ubuntu issue where by default localhost binds to the local loopback interface and you are trying to reach it via an IP. Change either your /etc/hosts configuration or have fs.default.name listen on 192.168.2.106:9000 (or the machine's hostname if it isn't binding on lo

Re: Why the map reduce job shows 17125 records wrote but hbase table only have 6790 rows?

2011-11-29 Thread Jean-Daniel Cryans
If it really reports at the MapReduce level that it has inserted less rows than you expected, then it's most probably an error in your code or a misunderstanding of some sort. Looking at the code itself might help us helping you. J-D On Sat, Nov 26, 2011 at 9:17 AM, R.L. xmur...@gmail.com

Re: Flushing memstore

2011-11-29 Thread Jean-Daniel Cryans
It's handled by the region server when it closes each region. J-D On Tue, Nov 29, 2011 at 4:49 PM, Mark static.void@gmail.com wrote: Before shutting down a regionserver should one flush the memstore or is this automatically handled? If not, how would one manually flush it? I also noticed

Re: About the feasibility of hbase in cloud storage

2011-11-29 Thread Jean-Daniel Cryans
Doable but generally not recommended, for example look at the discussion that happened on this mailing list 2 days ago regarding S3: http://search-hadoop.com/m/1XSnhmWRhH1 J-D 2011/11/29 庄阳 zhuangy...@asiainfo-linkage.com: Hi, Our team would like to use Hbase in cloud storage.Please introduce

Re: Zookeeper Connection Issues in Pseudo Distributed Mode

2011-11-29 Thread Jean-Daniel Cryans
So... what's in that ZK log? Also try running the shell with -d as an option and post the stack traces as well. Thx, J-D On Sun, Nov 27, 2011 at 8:12 PM, Sid Kumar sqlsid...@gmail.com wrote: I installed Hbase to run in pseudo distributed mode and was able to start the shell, but when I try

Re: Can hbase delete the old regions after region split automatically?

2011-11-23 Thread Jean-Daniel Cryans
Inline. J-D On Tue, Nov 22, 2011 at 10:00 PM, 吕鹏 lvpengd...@gmail.com wrote: Thanks so much for your advise. I have some other problems: 1 Can the patch

Re: Can hbase delete the old regions after region split automatically?

2011-11-22 Thread Jean-Daniel Cryans
It's supposed to but there are a few leaks such as: https://issues.apache.org/jira/browse/HBASE-4799 https://issues.apache.org/jira/browse/HBASE-4238 J-D On Mon, Nov 21, 2011 at 7:32 PM, 吕鹏 lvpengd...@gmail.com wrote: I set hbase.hregion.max.filesize to 9223372036854775807. When an average

Re: run hbase using specified username/password

2011-11-22 Thread Jean-Daniel Cryans
It will run as the current user you are logged in with (the old permissions system was pretty broken). If you really really want to run with different users, set dfs.permissions to false, but I would recommend just using the same user to start/stop hbase, J-D On Tue, Nov 22, 2011 at 2:55 AM,

Re: importtsv bulk upload fail

2011-11-22 Thread Jean-Daniel Cryans
Same answer as last time this was asked: http://search-hadoop.com/m/rUV9on6kWA1 You can't do this without a fully distributed setup. J-D On Tue, Nov 22, 2011 at 10:33 AM, Ales Penkava ales.penk...@neuralitic.com wrote: Hello, I am on CDH3 trying to perform bulk upload but following error

Re: network spike every 2 hours

2011-11-22 Thread Jean-Daniel Cryans
Look for the compaction queue metrics, see if it fits. Could be major compactions. J-D On Tue, Nov 22, 2011 at 3:55 PM, Jeff Whiting je...@qualtrics.com wrote: We have a hadoop cluster that is only running hbase.  We recently installed ganglia to monitor those servers.  Looking over the

Re: network spike every 2 hours

2011-11-22 Thread Jean-Daniel Cryans
but are several times slower).  Is there any way to even those out? Or prevent any compactions from impacting the rpc calls? ~Jeff On 11/22/2011 5:00 PM, Jean-Daniel Cryans wrote: Look for the compaction queue metrics, see if it fits. Could be major compactions. -- Jeff Whiting Qualtrics

Re: java.io.IOException: Connection reset by peer

2011-11-18 Thread Jean-Daniel Cryans
That's a nice exception you got there, is there anything in particular you'd like to discuss about it? J-D On Thu, Nov 17, 2011 at 10:01 PM, 陈加俊 cjjvict...@gmail.com wrote: 2011-11-18 13:35:06,252 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region EH,

Re: block caching

2011-11-17 Thread Jean-Daniel Cryans
And it will probably evict everyone else that was already present. Hello latency. J-D On Thu, Nov 17, 2011 at 2:08 PM, lars hofhansl lhofha...@yahoo.com wrote: Hi Sam, The idea is that the entire result of the scan will not fit into the cache if the scan scans a reasonable number of cells,

Re: HRegionserver daemon is not running on one machine.

2011-11-17 Thread Jean-Daniel Cryans
My understanding of your problem is that the folder needs to be exactly at the same place as on the machine where you're starting HBase from, even if you set HBASE_HOME to a different folder on hbase-regionserver1 it's not going to use it. Hope this helps, J-D On Thu, Nov 17,

Re: Scan with Stop/Start or Prefix?

2011-11-17 Thread Jean-Daniel Cryans
Using a filter requires a few more CPU cycles. J-D On Thu, Nov 17, 2011 at 3:38 PM, Mark static.void@gmail.com wrote: Are they both the same in terms of performance? On 11/17/11 2:13 PM, Jean-Daniel Cryans wrote: Use a prefix when you don't want to invent a stop row key (in your case
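If you would rather "invent" the stop row than use a prefix filter, one common approach is to derive it from the prefix by incrementing its last byte. The sketch below assumes row keys compare as unsigned bytes and that an empty stop row means "scan to the end of the table"; the function name is hypothetical and not part of the HBase API.

```python
def stop_row_for_prefix(prefix: bytes) -> bytes:
    # Smallest key that sorts after every key starting with `prefix`.
    # Trailing 0xff bytes cannot be incremented, so drop them first.
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        # Empty or all-0xff prefix: there is no upper bound.
        return b""
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


print(stop_row_for_prefix(b"abc"))
print(stop_row_for_prefix(b"a\xff"))
```

A scan from the prefix (inclusive) to this stop row (exclusive) covers the same rows as the prefix filter without evaluating a filter on each row.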

Re: n00b trying to run HBase example code

2011-11-16 Thread Jean-Daniel Cryans
-Daniel Cryans Sent: 15 November 2011 21:30 To: user@hbase.apache.org Subject: Re: n00b trying to run HBase example code If I remember correctly, this NPE doesn't come alone but you have to go into that tasks' log to find the rest as the job output you have only shows what ended up killing the task

Re: Not able to change the VERSION of hbase row

2011-11-16 Thread Jean-Daniel Cryans
You need to tell the Get or Scan to fetch more versions. For example, the help for the get command gives this example: hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4} In the API you would use

Re: Facing Issues with RowCounter

2011-11-16 Thread Jean-Daniel Cryans
What I can decrypt from those outputs is that you have a total of 7 rows, and none of them have data in the Set column family. Is it the case or not? Without more info from you, it's hard to tell. J-D On Tue, Nov 15, 2011 at 11:41 PM, Stuti Awasthi stutiawas...@hcl.com wrote: Hi, I tried to

Re: zookeeper session establishing time out and stopping

2011-11-15 Thread Jean-Daniel Cryans
On Tue, Nov 15, 2011 at 6:43 AM, Vamshi Krishna vamshi2...@gmail.com wrote: Hi , i am using hadoop-0.20.2 and hbase-0.90.4, both are running, from eclipse i ran one small program creating and inserting a row of data. It is dispalying following on the console and stopping. I added following

Re: What costs must we expect when adding a new column family?

2011-11-15 Thread Jean-Daniel Cryans
None until you actually put data in. J-D On Wed, Nov 9, 2011 at 12:05 PM, Denis Kreis de.kr...@gmail.com wrote: A noob question: What costs should we expect when adding a new column family to an existing table?

Re: HBase Master dies with an unexpected exception

2011-11-15 Thread Jean-Daniel Cryans
Fixed in https://issues.apache.org/jira/browse/HBASE-3617, upgrade to 0.90.4 J-D On Tue, Nov 8, 2011 at 6:08 PM, Amit Phadke apha...@yahoo-inc.com wrote: Adding right address. On Nov 7, 2011, at 2:45 PM, Amit Phadke wrote: Hey Guys, We are seeing an issue where Master dies with something

Re: storing MB sized files in HBase

2011-11-15 Thread Jean-Daniel Cryans
You *can*. You don't have to adjust the HBase HFile block size since each object will just take exactly one block. You do want to adjust the HDFS block size higher. The region size should always be managed regardless of your application. One thing to keep in mind is that fat cells' not the

Re: n00b trying to run HBase example code

2011-11-15 Thread Jean-Daniel Cryans
If I remember correctly, this NPE doesn't come alone but you have to go into that task's log to find the rest as the job output you have only shows what ended up killing the task. Go to the jobtracker's web ui, click on your job, click on the number of failed tasks, look at the log of one of those

Re: daughter region issue

2011-11-15 Thread Jean-Daniel Cryans
It's harmless. J-D On Fri, Nov 11, 2011 at 1:57 PM, Corbin Hoenes cor...@tynt.com wrote: Using hbase 0.90.4 and seeing this in the logs: 2011-11-11 13:36:23,130 WARN org.apache.hadoop.hbase.master.CatalogJanitor: Daughter regiondir does not exist:

Re: daughter region issue

2011-11-15 Thread Jean-Daniel Cryans
case it could be either splitA or B, the log line will tell. J-D On Tue, Nov 15, 2011 at 1:43 PM, Corbin Hoenes cor...@tynt.com wrote: Great news... Is there a way to clean it out of my logs? I see this MSG every 5 minutes. Sent from my iPhone On Nov 15, 2011, at 2:33 PM, Jean-Daniel

Re: mapreduce on two tables

2011-11-07 Thread Jean-Daniel Cryans
You don't really need to store that into another HBase table, just dump it into HDFS (unless you want to do random access on that second table, which acts as a secondary index for documents by authors). It's a workable solution, it's just brute force. J-D On Mon, Nov 7, 2011 at 11:02 AM, Rohit

Re: multiple rows with same unique row+dynamic update

2011-11-07 Thread Jean-Daniel Cryans
What do you mean by dynamic update? Replace one value by another? HBase does that. J-D On Mon, Nov 7, 2011 at 3:29 PM, Jignesh Patel jigneshmpa...@gmail.com wrote: Companies like Facebook and StumbleUpon use HBase for day-to-day transactions. Since HBase and HDFS both are not going to

Re: multiple rows with same unique row+dynamic update

2011-11-07 Thread Jean-Daniel Cryans
(rowid): jigneshmpatel columnfamily:Data column1name:address1: Value4 column2name:address2: value5 column3name:Phone: 5678. -Jignesh On Mon, Nov 7, 2011 at 10:38 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: What do you mean by dynamic update? Replace one value by another? HBase does

Re: multiple rows with same unique row+dynamic update

2011-11-07 Thread Jean-Daniel Cryans
column3name:Phone would have a new version, yes. On Mon, Nov 7, 2011 at 4:02 PM, Jignesh Patel jigneshmpa...@gmail.com wrote: Is that new entry or update in HBase? -Jignesh On Mon, Nov 7, 2011 at 10:46 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: ok that works... what's the issue
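The point made in this reply, that writing again to an existing row and column adds a new version instead of creating a conflicting entry, can be illustrated with a toy in-memory model. This is only a sketch in Python and not the HBase client API: the class `ToyTable`, its methods, the `max_versions` limit of 3, the explicit integer timestamps, and the earlier value "1234" are all assumptions made up for illustration.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of versioned cells; NOT the HBase client API (hypothetical)."""

    def __init__(self, max_versions=3):
        # Assumed limit on how many versions of a cell are retained.
        self.max_versions = max_versions
        # (row, column) -> list of (timestamp, value), newest first
        self.cells = defaultdict(list)

    def put(self, row, column, value, timestamp):
        """Record a value; an existing cell gains a version, nothing conflicts."""
        versions = self.cells[(row, column)]
        versions.append((timestamp, value))
        # Keep the newest versions first and drop those beyond the limit.
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]

    def get(self, row, column):
        """Return the newest value, or None if the cell was never written."""
        versions = self.cells.get((row, column))
        return versions[0][1] if versions else None

    def get_versions(self, row, column):
        """Return all retained (timestamp, value) pairs, newest first."""
        return list(self.cells.get((row, column), []))


table = ToyTable()
table.put("jigneshmpatel", "Data:Phone", "1234", timestamp=1)  # made-up old value
table.put("jigneshmpatel", "Data:Phone", "5678", timestamp=2)  # same row key again
latest = table.get("jigneshmpatel", "Data:Phone")
history = table.get_versions("jigneshmpatel", "Data:Phone")
```

In this model a plain read sees only the newest value, while the older one stays available as a previous version until it falls outside the retained count.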

Re: multiple rows with same unique row+dynamic update

2011-11-07 Thread Jean-Daniel Cryans
is still the same: jigneshmpatel, shouldn't it create a conflict if it is a new version? -Jignesh On Mon, Nov 7, 2011 at 11:07 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: column3name:Phone would have a new version, yes. On Mon, Nov 7, 2011 at 4:02 PM, Jignesh Patel jigneshmpa...@gmail.com

Re: Getting EOF Exception when starting HBASE

2011-11-02 Thread Jean-Daniel Cryans
Without more info about your setup or logs, I would guess that you forgot to replace the hadoop jar in hbase's lib folder per this documentation: http://hbase.apache.org/book/hadoop.html J-D On Wed, Nov 2, 2011 at 5:14 PM, LoveIR shiva2...@gmail.com wrote: Hi, I am using Hbase 0.90.4 version

Re: Data node problem after reinstall

2011-11-01 Thread Jean-Daniel Cryans
I don't see anything wrong in that log, do you actually have WARN or ERROR level log lines? Those might be a better start. Also please explain how you figured that the DN can only accept replicas and nothing from HBase, hopefully with evidence. This will greatly help. Thx, J-D On Tue, Nov 1,

Re: region size/count per regionserver

2011-11-01 Thread Jean-Daniel Cryans
These days I think the recommendation is more like 20 regions per region server, and the region size set accordingly. The major caveat is that when you start compacting the bigger store files you can really take a massive IO hit, so most of the time major compactions are tuned to run only every
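The "region size set accordingly" part of this advice is simple arithmetic, sketched below in Python. The function names and the 400 GB per-server figure are hypothetical, chosen only to make the numbers easy to follow; the default target of 20 regions per region server comes from the reply above, and sizes are taken to mean size on disk.

```python
import math


def region_size_gb(data_per_server_gb, target_regions=20):
    """Region size (GB on disk) that spreads the data over about target_regions regions."""
    if target_regions <= 0:
        raise ValueError("target_regions must be positive")
    return data_per_server_gb / target_regions


def regions_needed(data_per_server_gb, region_size):
    """Number of regions one server ends up hosting for a given region size (GB)."""
    if region_size <= 0:
        raise ValueError("region_size must be positive")
    return math.ceil(data_per_server_gb / region_size)


# Hypothetical server holding 400 GB of store files on disk.
size_for_20_regions = region_size_gb(400)      # region size for ~20 regions
count_with_4gb_regions = regions_needed(400, 4)  # what a small region size leads to
```

With these made-up numbers, a 4 GB region size puts 100 regions on the server, while aiming for about 20 regions calls for regions of roughly 20 GB each.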

Re: region size/count per regionserver

2011-11-01 Thread Jean-Daniel Cryans
On Tue, Nov 1, 2011 at 2:34 PM, Sujee Maniyam su...@sujee.net wrote: optimizations for compactions in 0.92. In our case we have a pretty old setup and had way too many regions so we ran a few online merges to bring this down to like 80 regions/RS and it's working pretty well. J-D what is

Re: region size/count per regionserver

2011-11-01 Thread Jean-Daniel Cryans
On Tue, Nov 1, 2011 at 2:46 PM, Sujee Maniyam su...@sujee.net wrote: 20GB, compressed? If so, is it LZO or Snappy? The region size is expressed in terms of size on disk, in our case it's LZOed. J-D

Re: save video with hbase

2011-10-31 Thread Jean-Daniel Cryans
Why would you use HBase for that? Regarding your two questions: On Mon, Oct 31, 2011 at 4:17 PM, xtliwen xtli...@gmail.com wrote: Hi everybody, When a client visits a video from another website through my website, it will be transcoded with our video codec server. As time goes on, the

Re: Lease does not exist exceptions

2011-10-26 Thread Jean-Daniel Cryans
on debug on the client side. Regards, Lucian On Mon, Oct 24, 2011 at 8:22 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: So you should see the SocketTimeoutException in your *client* logs (in your case, mappers), not LeaseException. At this point yes you're going to timeout, but if you spend

Re: CopyTable Usage/Exception

2011-10-26 Thread Jean-Daniel Cryans
I remember an issue with CopyTable in early versions of 0.90, can you retry it with 0.90.4 or cdh3u1? Thx, J-D On Wed, Oct 26, 2011 at 1:09 PM, sagar naik sn...@attributor.com wrote: Hi, Hbase version: cdh3-u0/0.90.1 I'm trying to use copytable. The arguments are: copytable

Re: Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.

2011-10-26 Thread Jean-Daniel Cryans
You can't do this without running a fully distributed setup, see this very similar thread: http://search-hadoop.com/m/hqc6F1U9S6e J-D On Wed, Oct 26, 2011 at 2:30 PM, danoomistmatiste kkhambadk...@yahoo.com wrote: Hi, I am facing a strange problem. I am running a HBase bulkload job and I
