Re: ANN: HBase 0.94.2 is available for download

2012-10-15 Thread Stack
On Mon, Oct 15, 2012 at 10:06 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Lars, To install it, can we just remove the .jar on the root directory and replace it with this one? I'm running 0.94.0 so it might be compatible, right? There could be changes other than those bundled

Re: hbase 0.92.1 hadoop compatibility

2012-10-15 Thread Stack
On Mon, Oct 15, 2012 at 9:52 AM, Amit Sela am...@infolinks.com wrote: Hi everyone, I have a cluster running Hadoop 0.20.3-snapshot with HBase 0.90.2. I want to use bulk loading with HFileOutPutFormat, which works for me when writing to 1 CF but fails for more. I know this is solved in

Re: status of bin/rename_table.rb

2012-10-15 Thread Stack
On Mon, Oct 15, 2012 at 8:25 AM, Norbert Burger norbert.bur...@gmail.com wrote: Hi folks, Does anyone have a good working process for renaming tables? From the links below, I gather that the bin/rename_table.rb (last included in 0.90.x) had a few issues.

Re: Question on IdentityTableReducer

2012-10-15 Thread Stack
On Mon, Oct 15, 2012 at 3:17 AM, Mahadevappa, Shobha shobha.mahadeva...@nttdata.com wrote: Hi, I want my Reducer to act as IdentityTableReducer based on a certain condition. Can you please let me know if there is a neat way to achieve this instead of populating the Put objects explicitly in

Re: Connecting Remote Hbase pentaho

2012-10-15 Thread Stack
On Mon, Oct 15, 2012 at 12:19 AM, Kuldeep Chitrakar kuldeep.chitra...@synechron.com wrote: Hi, To connect to HBase, do we always need to have Pentaho installed on one of the Hadoop cluster machines? Can't we connect to HBase remotely using Pentaho, like Pentaho on a Windows machine and HBase on

Re: is it possible to control physical deletion of data in hbase?

2012-10-13 Thread Stack
On Sat, Oct 13, 2012 at 6:27 AM, Richard Tang tristartom.t...@gmail.com wrote: Hi, I want to manually control the number of versions of data physically stored in hbase. I am aware that physically deleting a record in hbase occurs only in major compaction

Re: MapReduce vs hosts (Cannot resolve the host name)

2012-10-11 Thread Stack
On Thu, Oct 11, 2012 at 6:17 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Any idea where I can start to look at? Sounds like DNS -- forward and reverse lookups -- work on the machine you are launching your job from but not out on your cluster. Check DNS on the cluster members? St.Ack
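
A minimal sketch of the DNS check suggested above, to be run on each cluster member. It assumes plain Python 3 with only the standard library; the function name `check_dns` and the injectable `forward`/`reverse` parameters are illustrative, not part of HBase or Hadoop. It resolves a hostname to an address, resolves that address back, and reports whether the two agree.

```python
import socket


def check_dns(host, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Check that forward and reverse lookups for `host` agree.

    `forward` and `reverse` default to the standard resolver calls and are
    parameters only so the check can be exercised without a network.
    """
    result = {"host": host, "ip": None, "reverse": None, "ok": False, "error": None}
    try:
        result["ip"] = forward(host)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result["error"] = "forward: %s" % exc
        return result
    try:
        name, aliases, _addresses = reverse(result["ip"])
    except OSError as exc:  # socket.herror is a subclass of OSError
        result["error"] = "reverse: %s" % exc
        return result
    result["reverse"] = name
    # Accept a match on the canonical name, any alias, or the short host label.
    candidates = {name.lower()} | {alias.lower() for alias in aliases}
    candidates |= {candidate.split(".")[0] for candidate in candidates}
    result["ok"] = host.lower() in candidates
    return result
```

Calling `check_dns(name)` for every node name, on every node, should show which machines fail the forward or the reverse step.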

Re: NoSuchColumnFamilyException with rowcounter

2012-10-11 Thread Stack
On Thu, Oct 11, 2012 at 10:43 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: :( That's where the -D column name is coming :( I tried to move it to few places before and it was not working. That's the only place where it's not crashing right from the launch. If you place it

Re: HBase Tuning

2012-10-10 Thread Stack
On Wed, Oct 10, 2012 at 5:51 AM, Ricardo Vilaça rmvil...@di.uminho.pt wrote: However, when adding an additional client node, with also 400 clients, the latency increases 3 times, but the RegionServers remains idle more than 80%. I had tried different values for the

Re: HBase in Mahout (for storing models)

2012-10-03 Thread Stack
On Wed, Oct 3, 2012 at 5:49 PM, David Arthur mum...@gmail.com wrote: I recently got a patch accepted into Mahout (MAHOUT-202) that includes an implementation of their DataModel for HBase (aptly named HBaseDataModel). The DataModel specifies a means to store user ratings (preferences) for

Re: [Programmatic cluster monitoring] How to use the HBase monitoring APIs

2012-10-02 Thread Stack
On Mon, Oct 1, 2012 at 10:52 PM, techbuddy techbuddy...@gmail.com wrote: As for programmatic monitoring, I was trying to figure out how to extend the already available metrics capture mechanism (available for Region server and Master server processes) to dump some *custom *metrics into a file

Re: [Programmatic cluster monitoring] How to use the HBase monitoring APIs

2012-10-01 Thread Stack
On Mon, Oct 1, 2012 at 3:17 PM, techbuddy techbuddy...@gmail.com wrote: Hi all Hbase veterans, gurus, newbies, I'm trying to figure out how to go about programmatically monitoring an Hbase cluster using the APIs listed @

Re: Tuning HBase for random reads

2012-09-26 Thread Stack
On Wed, Sep 26, 2012 at 9:05 AM, Jonathan Bishop jbishop@gmail.com wrote: I am using block size in HDFS of 64MB - the default I believe. I'll try something smaller, say 16MB or even 4MB. I'll also give bloom filters a try, but I don't believe that will help because I have so few columns.

Re: Tuning HBase for random reads

2012-09-26 Thread Stack
On Wed, Sep 26, 2012 at 12:01 PM, Jonathan Bishop jbishop@gmail.com wrote: Kevin, So, setting HBase block size is which configuration? Just tried the hadoop shortcircuit option and I see it does improve the performance, perhaps twice as fast, although it is hard to tell whether this was

Re: hbase cluster high loads

2012-09-26 Thread Stack
On Wed, Sep 26, 2012 at 1:13 AM, Yusup Ashrap aph...@gmail.com wrote: hi Stack, thanks for reply. *hbase version* is 0.90.2. This is an extremely old version. we use ganglia to monitor our cluster.write/read is normal/equally distributed all day long. 1k write , 4k read. it's kinda

Re: hbase cluster high loads

2012-09-25 Thread Stack
On Tue, Sep 25, 2012 at 9:02 PM, Yusup Ashrap aph...@gmail.com wrote: Hi Otis thanks for reply, servers are identical in terms of hardware, jvm. right now I cannot afford to restart my any machines, it's in the production environment :D. I will give a shot for some other clusters some time

Re: question CDH license

2012-09-25 Thread Stack
On Tue, Sep 25, 2012 at 7:50 PM, Xiang Hua bea...@gmail.com wrote: Hi, As we know CDH could be used freely. Question: can CDH be used for our custom? Is it legal or not? It's Apache licensed. This article is pretty good on what that means: http://en.wikipedia.org/wiki/Apache_license

Re: HBase ChecksumException IllegalArgumentException

2012-09-24 Thread Stack
On Mon, Sep 24, 2012 at 6:17 AM, Bai Shen baishen.li...@gmail.com wrote: I'm still getting checksum errors for some reason. Things run fine and then start erroring out with checksum errors. Any ideas for what I can look at to figure out why I'm getting the checksum errors? You say you are

Re: Online Schema Edit Stability/Maturity

2012-09-22 Thread Stack
On Sat, Sep 22, 2012 at 7:30 AM, Jacques whs...@gmail.com wrote: I know that HBASE-4213 and HBASE-1730 are both listed as in 94 and 92 respectively. I also see that HBASE-4741 is marked for 96. I remember off-handed comments at some point saying that both mechanisms had challenges. How

Re: Usage of the task monitor

2012-09-21 Thread Stack
On Fri, Sep 21, 2012 at 9:02 AM, Tom Brown tombrow...@gmail.com wrote: Hi all, I was having some odd server pauses that appeared to be related to my usage of a coprocessor endpoint. To help me monitor these, I attempted to use the task monitor; Now I've got a memory leak and I suspect it's

Re: Status of HBASE-3529 (Add search to HBase)?

2012-09-20 Thread Stack
On Thu, Sep 20, 2012 at 12:43 PM, Andrew Purtell apurt...@apache.org wrote: The issue with the patch on HBASE-3529 is it relies on modifications to HDFS that the author of HBASE-3529 proposed to the HDFS project as https://issues.apache.org/jira/browse/HDFS-2004. The proposal was vetoed.

Re: Status of HBASE-3529 (Add search to HBase)?

2012-09-20 Thread Stack
On Thu, Sep 20, 2012 at 6:51 PM, Andrew Purtell apurt...@apache.org wrote: But, what stopped progress here is a veto of HDFS side changes needed for the implementation to get that performance. We could have another go and even do it ourselves if enough of us thought it worth it. If we're

Re: Status of HBASE-3529 (Add search to HBase)?

2012-09-20 Thread Stack
On Thu, Sep 20, 2012 at 9:00 PM, Andrew Purtell apurt...@apache.org wrote: Data wouldn't go in ES, just index. For us a generic indexing service may make sense but hey for others maybe not. That could be so. In my experience it starts out that way and then you start adding more and more data

Re: IllegalArgumentException when trying to split an empty region

2012-09-20 Thread Stack
On Thu, Sep 20, 2012 at 10:36 AM, John Edstrom jedst...@nearinfinity.com wrote: Apologies, this sent before I had finished writing it :X The stack trace is below, but what we are attempting to do is load data into HBase via MapReduce. When we're doing the load, we write the HFiles using

Re: How to specify empty value in HBase shell

2012-09-20 Thread Stack
On Thu, Sep 20, 2012 at 7:31 AM, Jerry Lam chiling...@gmail.com wrote: Hi HBase Community: I have been struggling to find a way to specify empty value/empty column qualifier in the hbase shell, but unsuccessful. I google it, nothing comes up. I don't know JRuby so that might be why. Do you

Re: IOException: Cannot append; log is closed -- data lost?

2012-09-19 Thread Stack
On Tue, Sep 18, 2012 at 11:37 AM, Bryan Beaudreault bbeaudrea...@hubspot.com wrote: We are running cdh3u2 on a 150 node cluster, where 50 are HBase and 100 are map reduce. The underlying hdfs spans all nodes. This is a 0.90.4 HBase and then some Bryan? What was the issue serving data that

Re: IOException: Cannot append; log is closed -- data lost?

2012-09-19 Thread Stack
On Tue, Sep 18, 2012 at 12:39 PM, Bryan Beaudreault bbeaudrea...@hubspot.com wrote: Looking closer at it I guess the flush and the IOException probably weren't related. So the multi call to delete must have failed at the client (which is good). It does seem very strange to me that the pattern

Re: Problem with Hadoop and /etc/hosts file

2012-09-18 Thread Stack
On Tue, Sep 18, 2012 at 12:01 AM, Alberto Cordioli cordioli.albe...@gmail.com wrote: Sorry, maybe I didn't explain well. I don't know how to set up rDNS. I'd just like to know if this problem could generate the error I reported in the first post (since I get in any case the correct results). No need

Re: Problem with Hadoop and /etc/hosts file

2012-09-17 Thread Stack
On Mon, Sep 17, 2012 at 3:09 AM, Alberto Cordioli cordioli.albe...@gmail.com wrote: I already did this, but the problem still there. When I try this command: host 10.220.55.41 I get: Host 41.55.220.10.in-addr.arpa. not found: 3(NXDOMAIN) The same for each host. Is this normal? The

Re: lookup table

2012-09-17 Thread Stack
On Sun, Sep 16, 2012 at 4:27 PM, Rita rmorgan...@gmail.com wrote: Yes, I am trying to save on disk space because of limited resources and the table will be around 30 billion rows. The lookup table itself will be around 9k rows so it's not too bad. A character's range will be from 1 to 4. I

Re: java.io.IOEcxeption key k1 followed by a smaller key k2

2012-09-16 Thread Stack
On Sun, Sep 16, 2012 at 5:59 AM, Mohamed Ibrahim m0b...@gmail.com wrote: Here is the stack dump: at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:266) at Can you figure what is being read when you see this exception? Which region? Can you figure

Re: support checksums in HBase block

2012-09-16 Thread Stack
/9/16 Stack st...@duboce.net Go ahead, but, it seems like something is broke? We should file an issue to fix this broken checksuming? St.Ack

Re: lookup table

2012-09-16 Thread Stack
On Sat, Sep 15, 2012 at 8:09 AM, Rita rmorgan...@gmail.com wrote: I am debating if a lookup table would help my situation. I have a bunch of codes which map with timestamp (unsigned int). The codes look like this AA4 AAA5 A21 A4 ... Z435 The size range from 1 character to 4 characters

Re: support checksums in HBase block

2012-09-15 Thread Stack
On Sat, Sep 15, 2012 at 10:31 AM, jlei liu liulei...@gmail.com wrote: I use Hbase0.94.1 and hadoop-0.20.2-cdh3u5. The Hbase0.94.1 write checksums in HBase block, so we don't need to read checksums from metadata file of block file. But I find the BlockSender still read checksums from metadata

User meetup on 10/29?

2012-09-13 Thread Stack
The folks at wizecommerce have kindly offered to host a meetup down in San Mateo on the evening of 10/29. Are you all up for a user meetup at the end of October after Hadoop World? If so, I'll stick it up in meetup. If you are interested in presenting, write me off list and we'll get you signed

Re: 0.92 client fail to log in when using JNI

2012-09-13 Thread Stack
operation by directly using Java Api, and they succeed. So, is there any difference between using java api through JNI with directly calling the same java api?? Can anyone help me out? Thank you very much. Here is the call stack when using jni to call java api: java.lang.RuntimeException

Re: java.io.IOException: Pass a Delete or a Put

2012-09-11 Thread Stack
On Mon, Sep 10, 2012 at 7:06 PM, Jothikumar Ekanath kbmku...@gmail.com wrote: Hi, Getting this error while using hbase as a sink. Error java.io.IOException: Pass a Delete or a Put Would suggest you study the mapreduce jobs that ship with hbase both in main and under test. Looking at

Re: Doubt in performance tuning

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 9:58 AM, Ramasubramanian ramasubramanian.naraya...@gmail.com wrote: Hi, Currently it takes 11 odd minutes to load 1.2 million record into hbase from hdfs. Can u pls share some tips to do the same in few seconds? We tried doing this in both pig script and in pentaho.

Re: 答复: for CDH4.0, where can i find the hbase-default.xml file if using RPM install

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 9:02 AM, huaxiang huaxi...@asiainfo-linkage.com wrote: Hi, I don't find the hbase-default.xml file using following command, any other way? To be clear, this hadoop was installed with CDH RPM package. Is it not bundled inside the hbase-*.jar? St.Ack

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 8:33 AM, Norbert Burger norbert.bur...@gmail.com wrote: Hi all -- we're currently on cdh3u3 (0.90.4 + patches). I have one table in our cluster which seems to functioning fine (gets/puts/scans are all working), but for which no regions are listed on the UI. The

Re: Getting ScannerTimeoutException even after several calls in the specified time limit

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 10:13 AM, Dhirendra Singh dps...@gmail.com wrote: I am facing this exception while iterating over a big table, by default i have specified caching as 100, i am getting the below exception, even though i checked there are several calls made to the scanner before it

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Stack
? Mind checking again? What does the Master UI page look like? Complete? Or is it cut off where its should be listing regions (maybe look at html src?). If shell can scan .META., odd that UI can't. Lets try and figure the difference. St.Ack Thanks, Srinivas M On Sep 10, 2012 12:19 PM, Stack

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Stack
scanned .META. in his own environment, not mine. ;-) On Sep 10, 2012 12:19 PM, Stack st...@duboce.net wrote: What happens if you scan .META. in shell? hbase scan .META. Does it all show? Thanks, Stack. Strangely, all regions do show up in .META. The table in question has 256 regions

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 10:50 AM, Norbert Burger norbert.bur...@gmail.com wrote: On Mon, Sep 10, 2012 at 1:37 PM, Stack st...@duboce.net wrote: What version of hbase? We're on cdh3u3, 0.90.4 + patches. Can you disable and reenable the table? I will try disabling/re-enabling at the next

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 12:05 PM, Norbert Burger norbert.bur...@gmail.com wrote: Mind putting up full listing in pastebin? Let me have a look. We could try a master restart too... so it refreshes its in-memory state. That might do it. St.Ack

Re: why hbase doesn't provide Encryption

2012-09-06 Thread Stack
On Wed, Sep 5, 2012 at 11:16 PM, Farrokh Shahriari mohandes.zebeleh...@gmail.com wrote: Tnx Stack for giving your time to me. No problem. St.Ack

Re: Unable to init HBaseAdmin

2012-09-06 Thread Stack
On Wed, Sep 5, 2012 at 4:57 PM, Ashish Nigam nigamash...@gmail.com wrote: readAndProcess threw exception java.io.EOFException. Count of bytes read: 0 java.io.EOFException Mismatched versions (client and server)? St.Ack

Re: Managing MapReduce jobs with concurrent client reads

2012-09-06 Thread Stack
On Wed, Sep 5, 2012 at 6:25 AM, Eric Czech e...@nextbigsound.com wrote: Hi everyone, Does anyone have any recommendations on how to maintain low latency for small, individual reads from HBase while MapReduce jobs are being run? Is replication a good way to handle this (i.e. run small,

Re: RS not processing any requests

2012-09-05 Thread Stack
On Wed, Sep 5, 2012 at 2:58 PM, Nathaniel Cook nathani...@qualtrics.com wrote: We ran a jstack on the both the RS process and the hbase shell process trying to do the scan. Jstack log for RS: http://pastebin.com/9Y9t5ERE What JVM (I don't know what (20.10-b01 mixed mode) is). I see a

Re: why hbase doesn't provide Encryption

2012-09-05 Thread Stack
On Wed, Sep 5, 2012 at 10:31 PM, Farrokh Shahriari mohandes.zebeleh...@gmail.com wrote: But it doesn't have any performance,I mean for each row it should encrypt/decrypt the cell,so for a query that has a lot of rows ,it will take a long time. How else would you see it working? (We can't

Re: connection error to remote hbase node

2012-09-04 Thread Stack
On Sun, Sep 2, 2012 at 6:38 AM, Richard Tang tristartom.t...@gmail.com wrote: Hi, I have a connection problem on setting up hbase on remote node. The ``hbase`` instance is on a machine ``nodeA``. when I am trying to use hbase on ``nodeA`` from another machine (say ``nodeB``), it complains

Re: Reading in parallel from table's regions in MapReduce

2012-09-04 Thread Stack
On Tue, Sep 4, 2012 at 8:17 AM, Ioakim Perros imper...@gmail.com wrote: Hello, I would be grateful if someone could shed a light to the following: Each M/R map task is reading data from a separate region of a table. From the jobtracker 's GUI, at the map completion graph, I notice that

Re: batch update question

2012-09-04 Thread Stack
On Sun, Sep 2, 2012 at 2:13 AM, Lin Ma lin...@gmail.com wrote: Hello guys, I am reading the book HBase, the definitive guide, at the beginning of chapter 3, it is mentioned in order to reduce performance impact for clients to update the same row (lock contention issues for automatic write),

Re: Is there a way to replicate root and meta table in HBase?

2012-09-04 Thread Stack
On Tue, Sep 4, 2012 at 2:52 PM, Gen Liu ge...@zynga.com wrote: We are running into a case that if the region server that serves meta table is down, all request will timeouts because region lookup is not available. Only requests to .META. fail (and most of the time, .META. info is cached so

Re: why region server exited ?

2012-09-03 Thread Stack
Please say what version of hbase and pastebin the log rather than copy it into mail. Looking at the below it seems like we are missing actual reason server aborted. This is telling 'after 32254 ms, since caller disconnected...' seemingly saying the server halted -- GCing? -- but there is no

Re: md5 hash key and splits

2012-08-31 Thread Stack
On Thu, Aug 30, 2012 at 5:04 PM, Mohit Anchlia mohitanch...@gmail.com wrote: In general isn't it better to split the regions so that the load can be spread across the cluster to avoid HotSpots? Time series data is a particular case [1] and the sematextians have tools to help w/ that

Re: md5 hash key and splits

2012-08-31 Thread Stack
On Fri, Aug 31, 2012 at 6:09 AM, Doug Meil doug.m...@explorysmedical.com wrote: Stack, re: Where did you read that?, I think he might also be referring to this... http://hbase.apache.org/book.html#important_configurations I'd say we need to revist that paragraph. It gives a 'wrong

Re: md5 hash key and splits

2012-08-31 Thread Stack
On Fri, Aug 31, 2012 at 7:55 AM, Mohit Anchlia mohitanch...@gmail.com wrote: My data is timeseries and to get random distribution and still have the keys in the same region for a user I am thinking of using md5(userid)+reversetimestamp as a row key. But with this type of key how can one do
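
A minimal sketch of the `md5(userid)+reversetimestamp` row key mentioned above. The layout is an assumption made for illustration: a 16-byte MD5 digest of the user id followed by an 8-byte big-endian value of `Long.MAX_VALUE - timestamp`; the function name `row_key` is hypothetical. With this layout all rows for one user share a prefix, and newer rows sort before older ones under byte-wise ordering.

```python
import hashlib
import struct

LONG_MAX = 2**63 - 1  # same value as Java's Long.MAX_VALUE


def row_key(user_id: str, ts_millis: int) -> bytes:
    """Build md5(user_id) + reversed timestamp as a 24-byte row key."""
    prefix = hashlib.md5(user_id.encode("utf-8")).digest()  # 16 bytes
    reverse_ts = LONG_MAX - ts_millis  # larger timestamps give smaller values
    # Big-endian so that byte-wise comparison matches numeric order
    # (valid here because reverse_ts is non-negative for ts_millis >= 0).
    return prefix + struct.pack(">q", reverse_ts)
```

Because the hash prefix is uniformly distributed, split points could be chosen evenly over the first byte or two of the key; whether to pre-split or rely on automatic splitting is the open question in the thread.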

Re: Inconsistent scan performance with caching set to 1

2012-08-30 Thread Stack
On Wed, Aug 29, 2012 at 10:42 AM, Wayne wav...@gmail.com wrote: This is basically a read bug/performance problem. The execution path followed when the caching is used up is not consistent with the initial execution path/performance. Can anyone help shed light on this? Was there any changes to

Re: [maybe off-topic?] article: Solving Big Data Challenges for Enterprise Application Performance Management

2012-08-30 Thread Stack
On Thu, Aug 30, 2012 at 7:51 AM, Cristofer Weber cristofer.we...@neogrid.com wrote: About HMasters, yes, it's not clear. In section 6.1 they say that “Since we focused on a setup with a maximum of 12 nodes, we did not assign the master node and jobtracker to separate nodes instead we

Re: Inconsistent scan performance with caching set to 1

2012-08-30 Thread Stack
On Thu, Aug 30, 2012 at 9:24 AM, Jay T jay.pyl...@gmail.com wrote: Thanks Stack for pointing us in the right direction. Indeed it was the tcpNodeDelay setting. We set these to be true. ipc.server.tcpnodelay == true hbase.ipc.client.tcpnodelay == true All reads that previously had the 40ms

Re: Occasional regionserver crashes following socket errors writing to HDFS

2012-08-30 Thread Stack
On Wed, Aug 29, 2012 at 11:26 PM, dva dva...@gmail.com wrote: 12/08/29 08:02:10 ERROR security.UserGroupInformation: PriviledgedActionException as:root cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/export/hadoop-1.0.1/bin/out.txt already exists Your

Re: md5 hash key and splits

2012-08-30 Thread Stack
On Thu, Aug 30, 2012 at 7:35 AM, Mohit Anchlia mohitanch...@gmail.com wrote: From what I've read it's advisable to do manual splits since you are able to spread the load in a more predictable way. If I am missing something please let me know. Where did you read that? St.Ack

Re: sorting by value

2012-08-30 Thread Stack
On Tue, Aug 28, 2012 at 4:11 PM, Pamecha, Abhishek apame...@x.com wrote: Hi I probably know the usual answer but are there any tricks to do some sort of sort by value in HBase. The only option I know is to somehow embed value in the key part. The value is not a timestamp but a normal

Re: [maybe off-topic?] article: Solving Big Data Challenges for Enterprise Application Performance Management

2012-08-30 Thread Stack
On Thu, Aug 30, 2012 at 4:28 PM, Cristofer Weber cristofer.we...@neogrid.com wrote: On the other hand, I think that I can help in a way or another, documenting undocumented features, collecting more data on effects of changes over default values and relating this changes to different HBase

Re: HBase and unit tests

2012-08-30 Thread Stack
On Thu, Aug 30, 2012 at 4:44 PM, Cristofer Weber cristofer.we...@neogrid.com wrote: Hi there! After I started studying HBase, I've searched for open source projects backed by HBase and I found Titan distributed graph database (you probably heard about it). As soon as I read in their

Re: Inconsistent scan performance with caching set to 1

2012-08-30 Thread Stack
On Thu, Aug 30, 2012 at 9:15 PM, Ramkrishna.S.Vasudevan ramkrishna.vasude...@huawei.com wrote: Thanks Stack for giving a pointer to this. Yes it does seems this property is very important. I moved config. up to 'important configs' section out of troubleshooting section. St.Ack

Re: md5 hash key and splits

2012-08-29 Thread Stack
On Wed, Aug 29, 2012 at 3:56 PM, Mohit Anchlia mohitanch...@gmail.com wrote: If I use md5 hash + timestamp rowkey would hbase automatically detect the difference in ranges and perform a split? How does split work in such cases or is it still advisable to manually split the regions? Yes. On how

Re: md5 hash key and splits

2012-08-29 Thread Stack
On Wed, Aug 29, 2012 at 9:38 PM, Mohit Anchlia mohitanch...@gmail.com wrote: On Wed, Aug 29, 2012 at 9:19 PM, Stack st...@duboce.net wrote: On Wed, Aug 29, 2012 at 3:56 PM, Mohit Anchlia mohitanch...@gmail.com wrote: If I use md5 hash + timestamp rowkey would hbase automatically detect

Re: Disk space usage of HFilev1 vs HFilev2

2012-08-28 Thread Stack
On Mon, Aug 27, 2012 at 8:30 PM, anil gupta anilgupt...@gmail.com wrote: Hi All, Here are the steps i followed to load the table with HFilev1 format: 1. Set the property hfile.format.version to 1. 2. Updated the conf across the cluster. 3. Restarted the cluster. 4. Ran the bulk loader.

Re: MemStore and prefix encoding

2012-08-28 Thread Stack
On Tue, Aug 28, 2012 at 9:59 AM, Joe Pallas pal...@cs.stanford.edu wrote: On Aug 25, 2012, at 2:57 PM, lars hofhansl wrote: Each column family is its own store. All stores are flushed together, so have many add overhead (especially if a few tend to hold a lot of data, but the others don't,

Re: Disk space usage of HFilev1 vs HFilev2

2012-08-28 Thread Stack
On Tue, Aug 28, 2012 at 11:42 AM, lars hofhansl lhofha...@yahoo.com wrote: Are we terribly concerned about 3.5% of extra disk usage? HFileV2 was designed to be more main memory efficient, which is in much shorter supply than disk space (bloom filters and index blocks are interspersed with

Re: Pig, HBaseStorage, HBase, JRuby and Sinatra

2012-08-27 Thread Stack
On Mon, Aug 27, 2012 at 6:32 AM, Russell Jurney russell.jur...@gmail.com wrote: I wrote a tutorial around HBase, JRuby and Pig that I thought would be of interest to the HBase users list: http://hortonworks.com/blog/pig-as-hadoop-connector-part-two-hbase-jruby-and-sinatra/ Thanks Russell.

Re: Pig, HBaseStorage, HBase, JRuby and Sinatra

2012-08-27 Thread Stack
On Mon, Aug 27, 2012 at 10:31 AM, Russell Jurney russell.jur...@gmail.com wrote: Yes, and if possible the HBase and JRuby page needs to be updated. If you can grant me wiki access, I can edit it myself. http://wiki.apache.org/hadoop/Hbase/JRuby I added access for a login of RussellJurney

Re: Pig, HBaseStorage, HBase, JRuby and Sinatra

2012-08-27 Thread Stack
On Mon, Aug 27, 2012 at 11:04 AM, Doug Meil doug.m...@explorysmedical.com wrote: I think somewhere in here in the RefGuide would work… http://hbase.apache.org/book.html#other.info.sites That looks good. We don't have a pig section in the refguide? You up for adding a paragraph Russell?

Re: MemStore and prefix encoding

2012-08-27 Thread Stack
On Mon, Aug 27, 2012 at 9:20 AM, Tom Brown tombrow...@gmail.com wrote: Lars, I have been relying on the expected behavior (if I write another cell with the same {key, family, qualifier, version} it won't return the previous one) so your answer was confusing to me. I did more research and I

Re: How to avoid stop-the-world GC for HBase Region Server under big heap size

2012-08-23 Thread Stack
On Wed, Aug 22, 2012 at 11:06 PM, Gen Liu ge...@zynga.com wrote: Hi, We are running Region Server on big memory machine (70G) and set Xmx=64G. Most heap is used as block cache for random read. Stop-the-world GC is killing the region server, but using less heap (16G) doesn't utilize our

Re: HBase row level cache for random read

2012-08-23 Thread Stack
On Thu, Aug 23, 2012 at 12:06 PM, Gen Liu ge...@zynga.com wrote: On 8/18/12 12:33 PM, Stack st...@duboce.net wrote: On Fri, Aug 17, 2012 at 4:42 PM, Gen Liu ge...@zynga.com wrote: I assume block cache store compressed data, Generally its not, not unless you use block encoding. Can you

Re: Using HBase serving to replace memcached

2012-08-22 Thread Stack
On Wed, Aug 22, 2012 at 6:28 AM, Lin Ma lin...@gmail.com wrote: Thanks Anoop, My question is answered. Are you writing related part of code in HBase? From your detailed and knowledgeable description, you seems to be the author. :-) Anoop did not write that particular piece of code. He has

Re: Hbase Shell: UnsatisfiedLinkError

2012-08-22 Thread Stack
On Wed, Aug 22, 2012 at 4:39 AM, o brbrs obr...@gmail.com wrote: Thanks for your reply. I send this issue to the user mail list, but i haven't got any reply. I have installed jdk 1.6 and hbase 0.94, and have made configuration that are said in http://hbase.apache.org/book.html#configuration.

Re: Problem - Bringing up the HBase cluster

2012-08-22 Thread Stack
On Wed, Aug 22, 2012 at 8:43 AM, Jothikumar Ekanath kbmku...@gmail.com wrote: Hi, Thanks for the response, sorry i put this email in the dev space. My data replication is 2. and yes the region and master server connectivity is good Initially i started with 4 data nodes and 1 master,

Re: Problem - Bringing up the HBase cluster

2012-08-22 Thread Stack
On Wed, Aug 22, 2012 at 10:41 AM, Jothikumar Ekanath kbmku...@gmail.com wrote: Hi Stack, Ok, i will cleanup everything and start from fresh. This time i will add one more data node so 1 hbase master and 2 regions. Zookeeper managed by hbase is started in region1. This is my configuration, i

Re: is there a way to switch to once-a-day digest format for this Hbase user list?

2012-08-22 Thread Stack
On Wed, Aug 22, 2012 at 10:46 AM, Taylor, Ronald C ronald.tay...@pnnl.gov wrote: Hello folks, Getting a lot of Hbase email, so was wondering - is it possible to get a digest format for this email list, once a day? Ron Try user-digest-subscribe? I just sent mail to it and it made a

Re: Thrift2 interface

2012-08-21 Thread Stack
On Mon, Aug 20, 2012 at 6:18 PM, Joe Pallas joseph.pal...@oracle.com wrote: Anyone out there actively using the thrift2 interface in 0.94? Thrift bindings for C++ don’t seem to handle optional arguments too well (that is to say, it seems that optional arguments are not optional).

Re: Supervisord

2012-08-21 Thread Stack
On Tue, Aug 21, 2012 at 2:57 PM, Marco Gallotta ma...@gallotta.co.za wrote: Is it possible to run the hbase processes in the foreground so that they can be run and monitored by supervisord? Try defining HBASE_NOEXEC when you run it. St.Ack

Re: Help with parser

2012-08-21 Thread Stack
On Mon, Aug 20, 2012 at 6:02 PM, Harish Krishnan harish.t.krish...@gmail.com wrote: I'm trying to write an application that gets the hbase queries from users and returns the results. I wanted to use the parser class to validate user queries. Users will be using the shell to query hbase?

Re: Help with parser

2012-08-20 Thread Stack
On Mon, Aug 20, 2012 at 2:58 PM, Harish Krishnan harish.t.krish...@gmail.com wrote: Hi all, I'm a novice. I wanted to implement hbase parser for an application that I'm developing. I believe that hbase parser for the hbase shell is part of jruby jar (correct me if i'm wrong). Can someone

Re: HBase row level cache for random read

2012-08-18 Thread Stack
On Fri, Aug 17, 2012 at 4:42 PM, Gen Liu ge...@zynga.com wrote: I assume block cache store compressed data, Generally its not, not unless you use block encoding. one block can hold 6 rows, but in random read, maybe 1 row is ever accessed, 5/6 of the cache space is wasted. Is there a better

Re: HBase replication

2012-08-18 Thread Stack
On Fri, Aug 17, 2012 at 5:36 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Are clients local to slave DC able to read data from HBase slave when replicating data from one DC to remote DC? Yes. If not then is there a way to design such a thing where clients are able to actively read/write

Re: how client location a region/tablet?

2012-08-18 Thread Stack
On Sat, Aug 18, 2012 at 2:13 AM, Lin Ma lin...@gmail.com wrote: Hello guys, I am referencing the Big Table paper about how a client locates a tablet. In section 5.1 Tablet location, it is mentioned that client will cache all tablet locations, I think it means client will cache root tablet in

Re: Table is neither in disabled nor in enabled state

2012-08-17 Thread Stack
On Fri, Aug 17, 2012 at 4:05 AM, Pavel Vozdvizhenskiy pvozdvizhens...@griddynamics.com wrote: The question I have now is whether I had to stop whole hbase cluster or not? Is is safe to remove stale *znode* while HBase is operating, if I sure no compaction / splitting is going? My guess is

Re: Issue: WARN client.HTable: Null regioninfo cell in keyvalues

2012-08-16 Thread Stack
On Thu, Aug 16, 2012 at 5:46 AM, David Koch ogd...@googlemail.com wrote: Hello, Thank you for your detailed response. I did the delete in .META. - the table does now not exist anymore according to hbase hbck and hbase shell however the warning message persists. Perhaps another row still in

Re: Table is neither in disabled nor in enabled state

2012-08-16 Thread Stack
On Thu, Aug 16, 2012 at 3:48 PM, Pavel Vozdvizhenskiy pvozdvizhens...@griddynamics.com wrote: I would appreciate on any help how to fix it. I've not come across this one before. If you list whats under /hbase/table? Does the table show there? You could try removing the znode? You can look

Re: Put w/ timestamp - Deleteall - Put w/ timestamp fails

2012-08-15 Thread Stack
On Wed, Aug 15, 2012 at 9:13 AM, lars hofhansl lhofha...@yahoo.com wrote: I also have a short blog post about this here: http://hadoop-hbase.blogspot.com/2011/12/deletion-in-hbase.html I added link to this discussion into the Versioning section of our reference guide (thanks all above).

Re: Issue: WARN client.HTable: Null regioninfo cell in keyvalues

2012-08-15 Thread Stack
On Tue, Aug 14, 2012 at 7:10 AM, David Koch ogd...@googlemail.com wrote: Hello, I created an HBase table programmatically like so: String tableName = _myTable HBaseAdmin admin = new HBaseAdmin(some_configuration); if (admin.tableExists(outTable) == false) { HTableDescriptor desc = new

Re: Issue: WARN client.HTable: Null regioninfo cell in keyvalues

2012-08-15 Thread Stack
On Tue, Aug 14, 2012 at 3:42 PM, David Koch ogd...@googlemail.com wrote: Hello, It's a fully-distributed environment (CDH3). Hbase hbck sometimes reports inconsistencies like: ERROR: Region { meta = _myTable,,1344936991240.979b3fe3ced9016372a82b7af5d33c27., hdfs = null, deployed =

Re: Bulk loading job failed when one region server went down in the cluster

2012-08-15 Thread Stack
On Mon, Aug 13, 2012 at 6:05 PM, anil gupta anilgupt...@gmail.com wrote: It would be great if you can answer this simple question of mine: Is HBase Bulk Loading fault tolerant to Region Server failures in a viable/decent environment? Bulk Loading is an MapReduce job. Bulk Loading is as

Re: Slow full-table scans

2012-08-15 Thread Stack
On Mon, Aug 13, 2012 at 6:10 PM, Gurjeet Singh gurj...@gmail.com wrote: I am beginning to think that this is a configuration issue on my cluster. Do the following configuration files seem sane ? hbase-env.sh https://gist.github.com/3345338 Nothing wrong w/ this (Remove the -ea, you don't

Re: HBase configuration using two hadoop servers

2012-08-15 Thread Stack
On Mon, Aug 13, 2012 at 1:09 PM, Asaf Mesika asaf.mes...@gmail.com wrote: I've decided to write an end-to-end Installation guide for HBase, which also includes HDFS, user configuration and tons of other stuff no guide ever mentions, in a blog post:

Re: Coprocessor NPE

2012-08-15 Thread Stack
On Mon, Aug 13, 2012 at 11:57 AM, Julian Wissmann julian.wissm...@sdace.de wrote: In the regionserver log, I get the following output ten times: 2012-08-13 20:51:59,779 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.io.IOException: java.lang.NullPointerException at
