Re: which hadoop and zookeeper version should I use with hbase 0.90.1

2011-02-23 Thread Oleg Ruchovets
I found a couple of hadoop 0.20.0 links: 1) http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-append/ 2) https://github.com/facebook/hadoop-20-append 3)

I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread 陈加俊
I can't get many versions of the specified column, but only get the latest version of the specified column. Can anyone help me? //put data by version final Put p = new Put(key); // key final long ts = System.currentTimeMillis(); p.add(FAMILY, q1, ts, v1); p.add(FAMILY, q2,

Re: Trying to contact region Some region

2011-02-23 Thread Jean-Daniel Cryans
It could be due to slow splits, heavy GC, etc. Make sure your machines don't swap at all, that HBase has plenty of memory, and that you're not trying to use more CPUs than your machines actually have (like setting 4 maps on a 4-core machine that is also running HBase), etc. Also upgrading to 0.90.1 will

Re: huge .oldlogs

2011-02-23 Thread Ted Yu
Please look for other exceptions. I have been stress testing 0.90.1 and my .oldlogs folder is empty. On Wed, Feb 23, 2011 at 11:18 AM, charan kumar charan.ku...@gmail.com wrote: Hi J-D, There are no NPEs in the log. Thanks, Charan On Wed, Feb 23, 2011 at 11:04 AM, Jean-Daniel Cryans

Re: huge .oldlogs

2011-02-23 Thread Jean-Daniel Cryans
I'll have to trust you on that :) The other possible situation is that you are inserting a ton of data and logs are generated faster than they get cleaned. 0.90.0 has a limiter that was later removed in 0.90.1 by https://issues.apache.org/jira/browse/HBASE-3501 so you should upgrade and see if it

Re: Trying to contact region Some region

2011-02-23 Thread Ryan Rawson
We fixed a lot of the exception handling in 0.90. The exception text is much better. Check it out! -ryan On Wed, Feb 23, 2011 at 11:18 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: It could be due to slow splits, heavy GC, etc. Make sure your machines don't swap at all, that HBase has

Re: huge .oldlogs

2011-02-23 Thread Jean-Daniel Cryans
Yes, you can delete the content of the folder (not the folder itself) safely. J-D On Wed, Feb 23, 2011 at 11:37 AM, charan kumar charan.ku...@gmail.com wrote: I have been inserting a ton of data for the past few days. This looks like the issue. If the issue is related to that, can I delete

Re: huge .oldlogs

2011-02-23 Thread charan kumar
Excellent! Thank you J-D On Wed, Feb 23, 2011 at 11:45 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: Yes, you can delete the content of the folder (not the folder itself) safely. J-D On Wed, Feb 23, 2011 at 11:37 AM, charan kumar charan.ku...@gmail.com wrote: I have been inserting a

table creation is failing now and then (CDH3b3)

2011-02-23 Thread Dmitriy Lyubimov
Hi all, from time to time we come to a situation where the .META. table seems to be stuck in some corrupted state. In particular, attempts to create more tables cause ERROR: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in .META. for region

Re: table creation is failing now and then (CDH3b3)

2011-02-23 Thread Ryan Rawson
You should consider upgrading to hbase 0.90.1, a lot of these kinds of issues were fixed. -ryan On Wed, Feb 23, 2011 at 12:02 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: Hi all, from time to time we come to a situation where the .META. table seems to be stuck in some corrupted state. In

Re: TableInputFormat configuration problems with 0.90

2011-02-23 Thread Jean-Daniel Cryans
How do you create the configuration object, Dan? Are you doing: Configuration conf = HBaseConfiguration.create(); Job job = new Job(conf, "somename"); or are you just creating a normal Configuration? BTW the code I wrote is what I expect people to do and what I'm doing myself. J-D On Wed, Feb 23,

when does put return to the caller?

2011-02-23 Thread Hiller, Dean (Contractor)
I was wondering if put returns after writing the data into memory on two out of the three nodes, letting my client continue so we don't have to wait for the memory to then go to disk. After all, if it is replicated, we probably don't need to wait for it to be written to disk (i.e. kind of like the

Re: when does put return to the caller?

2011-02-23 Thread Ryan Rawson
There is a batch put call, should be trivial to use some kind of background thread to invoke callbacks when it returns. Check out the HTable API, javadoc, etc. All available via http://hbase.org ! -ryan On Wed, Feb 23, 2011 at 1:25 PM, Hiller, Dean (Contractor) dean.hil...@broadridge.com
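Ryan's suggestion, wrapping the batch put in a background thread and invoking a callback when it returns, can be sketched with an ExecutorService. Here `putBatch` is a hypothetical stand-in for `HTable.put(List<Put>)`, since a real HBase call needs a running cluster; the threading pattern is the point:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class AsyncPutSketch {
    // Hypothetical stand-in for HTable.put(List<Put>); replace with the real call.
    static void putBatch(List<String> rows) throws InterruptedException {
        Thread.sleep(50); // simulate RPC latency
    }

    static Future<?> putAsync(ExecutorService pool, List<String> rows,
                              Consumer<List<String>> onDone) {
        return pool.submit(() -> {
            try {
                putBatch(rows);      // blocks only the background thread
                onDone.accept(rows); // callback fires once the batch returns
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<?> f = putAsync(pool, List.of("row1", "row2"),
                rows -> System.out.println("acked " + rows.size() + " puts"));
        // The caller is free to do other work here before waiting.
        f.get();
        pool.shutdown();
    }
}
```

The Future lets the caller decide later whether to block for the result or fire and forget while still keeping an error channel.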

Re: which hadoop and zookeeper version should I use with hbase 0.90.1

2011-02-23 Thread Mike Spreitzer
I have now installed the recommended way --- build Hadoop branch-0.20-append, install and config it, then smash its Hadoop core jar into the HBase lib/. Very light testing revealed no problems. But the testing is still so little that I do not recommend drawing any conclusions about

async table updates?

2011-02-23 Thread Vishal Kapoor
I have two tables called LIVE and MASTER. LIVE reports on MASTER activity, and I need to process records in LIVE almost in real time (some business logic). If I need to store the activity of entities reported by LIVE rows in MASTER, say in ACTIVITY:LAST_REPORTED, I could process my data in LIVE

Re: async table updates?

2011-02-23 Thread Ryan Rawson
In thrift there is a 'oneway' or 'async' or 'fire and forget' call type. I can't recommend those kinds of approaches, since once your system runs into problems you have no feedback. So if you are asking for a one-shot, no-reply, assume-it-worked call, we don't have one (nor would I wish that hell

Multiple scans vs single scan with filters

2011-02-23 Thread Alex Baranau
Hello, It would be great if somebody could share thoughts/ideas/some numbers on the following problem. We have a reporting app. To fetch data for some chart/report we currently use multiple scans, usually 10-50. We fetch about 100 records with each scan, which we use to construct a report. I've
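The trade-off behind the question can be modeled with a sorted map standing in for a region's row order (hypothetical row counts, not the HBase client API): many narrow scans read only the rows they need but pay one round trip each, while a single wide scan is one pass that must touch, and filter out, every row in between.

```java
import java.util.TreeMap;

public class ScanTradeoffModel {
    public static void main(String[] args) {
        // 1,000 sorted rows as a stand-in for a region.
        TreeMap<Integer, String> region = new TreeMap<>();
        for (int i = 0; i < 1000; i++) region.put(i, "row" + i);

        // Option A: 10 narrow scans of 10 rows each, spread across the key space.
        int touchedA = 0, roundTripsA = 0;
        for (int start = 0; start < 1000; start += 100) {
            touchedA += region.subMap(start, start + 10).size();
            roundTripsA++;
        }

        // Option B: one scan spanning first to last wanted row,
        // filtering unwanted rows as they stream past.
        int touchedB = region.subMap(0, 910).size();

        System.out.println(touchedA + " rows / " + roundTripsA + " scans vs "
                + touchedB + " rows / 1 scan");
    }
}
```

Which side wins depends on how much of the in-between data a server-side filter can skip cheaply versus the per-scan setup cost.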

Re: TableInputFormat configuration problems with 0.90

2011-02-23 Thread Dan Harvey
Ah ok, most of the time we were using the default Hadoop configuration object and not the HBase one. I guess that's a change between 0.20 and 0.90? Would it not make sense for the TableMapReduceUtil class to do that for you, as you'll need it in every HBase MapReduce job? Anyway, I guess we

Re: TableInputFormat configuration problems with 0.90

2011-02-23 Thread Jean-Daniel Cryans
Yeah it should, also I'm pretty sure you're right to say that this regression comes from HBASE-2036... would you mind opening a jira? Thanks for the report and the digging Dan! J-D On Wed, Feb 23, 2011 at 3:30 PM, Dan Harvey danharve...@gmail.com wrote: Ah ok, most of the time we were using

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread 陈加俊
I execute it five times at different times. //put data by version final Put p = new Put(key); // key final long ts = System.currentTimeMillis(); p.add(FAMILY, q1, ts, v1); p.add(FAMILY, q2, ts, v2); p.add(FAMILY, q3, ts, v3); table.put(p); So I can get five versions

RE: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread Buttler, David
What is your table schema set to? By default it holds 3 versions. Also, you might try iterating over KeyValues instead of using the Map, since you don't really care about the organization, just the time. Dave -Original Message- From: 陈加俊 [mailto:cjjvict...@gmail.com] Sent: Wednesday,

Stack Overflow?

2011-02-23 Thread Buttler, David
Hi all, It seems that we are getting a lot of repeated questions now. Perhaps it would be useful to start migrating the simple questions off to stackoverflow (or whichever stack exchange website is most appropriate), and just pointing people there? Obviously there are still a lot of questions

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread 陈加俊
Thank you David! I altered the table schema as follows: alter 'cjjIndexPageModify', {NAME => 'log', VERSIONS => 5, METHOD => 'add'} How do I iterate over KeyValues? Which method in Result? On Thu, Feb 24, 2011 at 9:27 AM, Buttler, David buttl...@llnl.gov wrote: What is your table schema set to?

RE: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread Buttler, David
Result.list()? Putting the hbase source into your IDE of choice (yay Eclipse!) is really helpful. Dave -Original Message- From: 陈加俊 [mailto:cjjvict...@gmail.com] Sent: Wednesday, February 23, 2011 5:42 PM To: user@hbase.apache.org Cc: Buttler, David Subject: Re: I can't get many

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread 陈加俊
final List<KeyValue> list = result.list(); for (final KeyValue it : list) { System.out.println(Bytes.toString(it.getKey())); System.out.println(Bytes.toString(it.getValue())); } I can only get the last version! Why? Is there

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread Ryan Rawson
There are test cases for this, the functionality DOES work, something is up... Without full code and full descriptions of your tables, debugging is harder than it needs to be. It's probably a simple typo or something, check your code and table descriptions again. Many people rely on the multi

Number of regions

2011-02-23 Thread Nanheng Wu
What are some of the trade-offs of using larger region files and fewer regions vs the other way round? Currently each of my hosts has ~700 regions with the default hfile size; is this an acceptable number? (Hosts have 16GB of RAM.) Another totally unrelated question: I have Gzip enabled on the hfile

Re: Number of regions

2011-02-23 Thread Ryan Rawson
There have been threads about this lately, check out the search box on hbase.org which searches the list archives. On Feb 23, 2011 6:56 PM, Nanheng Wu nanhen...@gmail.com wrote: What are some of the trade-offs of using larger region files and less regions vs the other way round? Currently each

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread 陈加俊
I will check my code and table descriptions again. And the test case is TestGetRowVersions. I believe that I made a mistake. 2011/2/24 Ryan Rawson ryano...@gmail.com There are test cases for this, the functionality DOES work, something is up... Without full code and full descriptions of your

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread Tatsuya Kawano
Hi Jiajun, Make sure you don't have the same timestamp on every version you put; try adding Thread.sleep() to your test code when necessary. You might not want to specify the timestamp yourself but rather let HBase store appropriate ones. -- Tatsuya Kawano (Mr.) Tokyo, Japan

Re: Stack Overflow?

2011-02-23 Thread Otis Gospodnetic
Hi David, When I see people asking questions that others have asked before (and received answers) I tend to point them to those questions/answers via a tool, so they become aware of the tool, hopefully start using it, and thus check before asking next time around. For Lucene, Solr, etc. I

Re: HBase 0.90.0 cannot be put more data after running hours

2011-02-23 Thread Anty
1) when there are only 2 client threads, vmstat output is procs ---memory-- ---swap-- -io --system-- -cpu-- r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 29236 161 557488 12027072 0 0 19 53 0 0 3 1 97 0 0 0 0 29236 1610152 557488 12027076 0 0 0 0

Re: HBase 0.90.0 cannot be put more data after running hours

2011-02-23 Thread Anty
Sorry, the vmstat output for 2) is wrong. 1) when there are only 2 client threads, vmstat output is procs ---memory-- ---swap-- -io --system-- -cpu-- r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 29236 161 557488 12027072 0 0 19 53 0 0 3 1 97 0

Re: HBase 0.90.0 cannot be put more data after running hours

2011-02-23 Thread Schubert Zhang
On Sat, Jan 29, 2011 at 1:02 AM, Stack st...@duboce.net wrote: On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang zson...@gmail.com wrote: 1. The .META. table seems ok I can read my data table (one thread for reading). I can use hbase shell to scan my data table. And I can use

Re: HBase 0.90.0 cannot be put more data after running hours

2011-02-23 Thread Schubert Zhang
Currently, with 0.90.1, this issue happens when there are only 8 regions on each RS, 64 regions in total across all 8 RSes. The CPU% of the client is very high. On Thu, Feb 24, 2011 at 10:55 AM, Schubert Zhang zson...@gmail.com wrote: Now, I am trying the 0.90.1, but this issue is still

Install problem - HBase 0.90.1 cannot connect to zookeeper

2011-02-23 Thread sun sf
I have installed HBase 0.20.6 successfully, but I met the following problem when trying to install HBase 0.90.1. It always says zookeeper cannot be connected when we use the same configuration as HBase 0.20.6. At last, I reinstalled CentOS 5.5 and started HBase 0.90.1 in standalone mode; the following

Re: Stack Overflow?

2011-02-23 Thread Stack
Hey David: Yeah, a few of us have started to refer to the 'two week cycle' where it seems the same questions come around again. Karl Fogel's Producing Open Source Software, http://producingoss.com/en/producingoss.pdf, has a good section on this topic. In it he advocates 'Conspicuous Use of

Re: Install problem - HBase 0.90.1 cannot connect to zookeeper

2011-02-23 Thread sun sf
Thank you for your quick reply. I know there are several different default configurations between HBase 0.90.1 and HBase 0.20.6, so I tried both pseudo-distributed and standalone installs; it seems both of them hit the same zookeeper error. In standalone, I only added the root.dir to the hbase-site.xml and

Re: Install problem - HBase 0.90.1 cannot connect to zookeeper

2011-02-23 Thread Stack
Zookeeper ensemble is not running. See logs for why. St.Ack On Wed, Feb 23, 2011 at 9:46 PM, sun sf revlet...@gmail.com wrote: Thank you for your quick reply. I know there are several different default configurations between HBase0.90.1 and HBase0.20.6. And so I tried pseudo and standalone

Re: Install problem - HBase 0.90.1 cannot connect to zookeeper

2011-02-23 Thread sun sf
St.Ack I found the past question you answered. I have checked the out log file and it gives the same errors -

Re: Stargate

2011-02-23 Thread Lars George
Hi Mike, The values are Base64 encoded, so you need to use a decoder. HBase ships with one in the REST package that you can use, for example. Lars On Wed, Feb 23, 2011 at 7:22 PM, Mike mi...@yesmail.com wrote: I'm having some issues converting the results of a restful call through stargate.

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread Ryan Rawson
Which line is line 89? Also it's preferable to do: assertEquals(3, versionMap.size()); vs: assertTrue(versionMap.size() == 3); since the error messages from the former are more descriptive: "expected 3 was 2". Looking at the code it looks like it should work... On Wed, Feb 23, 2011 at 11:07 PM,
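The difference Ryan points out is purely in the failure message. A tiny stand-alone illustration, using a hypothetical helper that mimics JUnit's assertEquals message format (no JUnit dependency):

```java
public class AssertMessageDemo {
    // Hypothetical helper mimicking JUnit's assertEquals failure message.
    static void assertEquals(int expected, int actual) {
        if (expected != actual)
            throw new AssertionError("expected " + expected + " was " + actual);
    }

    public static void main(String[] args) {
        try {
            assertEquals(3, 2); // fails just like assertTrue(2 == 3) would...
        } catch (AssertionError e) {
            // ...but the message tells you what differed, not merely "false"
            System.out.println(e.getMessage()); // prints "expected 3 was 2"
        }
    }
}
```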

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread 陈加俊
line 89: final NavigableMap<byte[], NavigableMap<Long, byte[]>> familyMap = map.get(family); map is null, and strangely I use r.list() instead, final List<KeyValue> list = r.list(); r is null! 2011/2/24 Ryan Rawson ryano...@gmail.com Which line is line 89? Also it's preferable to do:

Re: How to limit the number of logs that producted by DailyRollingFileAppender

2011-02-23 Thread 陈加俊
I uncommented MaxBackupIndex and restarted the regionserver, but got this warning: starting regionserver, logging to /app/cloud/hbase/bin/../logs/hbase-uuwatch-regionserver-gentoo_uuwatch_183.out log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.DailyRollingFileAppender. On Thu,

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread Ryan Rawson
Does the HTable object have setAutoFlush(false) turned on by any chance? On Wed, Feb 23, 2011 at 11:22 PM, 陈加俊 cjjvict...@gmail.com wrote: line 89: final NavigableMap<byte[], NavigableMap<Long, byte[]>> familyMap = map.get(family); map is null, and strangely I use r.list() instead,

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread 陈加俊
The HTable object does not call setAutoFlush; its default value is true on my cluster. So I set it to true as follows, but the error is still the same. public class GetRowVersionsTest extends TestCase { private final byte[] family = Bytes.toBytes("log"); private final byte[] qualifier =

Re: I can't get many versions of the specified column, but only get the latest version of the specified column

2011-02-23 Thread Lars George
What error are you getting? The NPE? As Tatsuya pointed out, you are using the same timestamps: private final long ts2 = ts1 + 100; private final long ts3 = ts1 + 100; That cannot work, you are overwriting cells. Lars On Thu, Feb 24, 2011 at 8:34 AM, 陈加俊
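Lars's diagnosis can be modeled with plain Java collections: HBase keys each cell by (row, family, qualifier, timestamp), so two puts to the same column with an identical timestamp occupy the same slot and the later one silently replaces the earlier. A minimal sketch, using a TreeMap as a stand-in for the per-qualifier version map (not the real client API; the hypothetical timestamps mirror the ts1/ts2/ts3 from the test code above):

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

public class VersionOverwriteDemo {
    public static void main(String[] args) {
        // Per-qualifier version map, newest timestamp first -- roughly the
        // inner NavigableMap<Long, byte[]> shape that Result.getMap() exposes.
        NavigableMap<Long, String> versions =
                new TreeMap<>(Comparator.reverseOrder());

        long ts1 = 1000L;
        long ts2 = ts1 + 100;
        long ts3 = ts1 + 100; // same value as ts2 -- the bug in question

        versions.put(ts1, "v1");
        versions.put(ts2, "v2");
        versions.put(ts3, "v3"); // identical key: replaces "v2", no new cell

        System.out.println(versions.size());                  // prints 2, not 3
        System.out.println(versions.firstEntry().getValue()); // prints v3
    }
}
```

Giving each put a distinct timestamp, or letting HBase assign server-side timestamps, makes all three versions survive (up to the column family's VERSIONS limit).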