You read my mind, Ted, because I downloaded the patch on this issue today looking
to merge it with CDH3B4.
Best regards,
- Andy
Problems worthy of attack prove their worth by hitting back.
- Piet Hein (via Tom White)
--- On Mon, 3/7/11, Ted Yu yuzhih...@gmail.com wrote:
From: Ted Yu
Since we are using EC2 Large instances, it seems
unlikely that network or some other virtualization
related resources crunch are affecting our
performance measurement.
Your assumptions are wrong. It seems only c1.xlarge and m2.4xlarge may be
assigned dedicated hardware. Reference:
For example, there is a topic table with a time column (indexed).
When a new post is inserted, the topic time column is updated too (by deleting
the index entry and reinserting the new value?)
So while a table scan is in progress it can miss a record which is updated
(reinserted among already scanned values - am I right?).
Since HBase
From: Todd Lipcon t...@cloudera.com
[...]
So, my vote is either:
plan a: hybrid model - 0.91.X becomes a time-based release series
where we drop trunk once every month or two, and 0.92.0 is gated on
features
or:
plan b: strict time-based: we release 0.92.0 around summit, and lock
down the
Stack and I were chatting on IRC about settling on what should get into 0.92
before pulling the trigger on the release.
Stack thinks we need online region schema editing. I agree because per-table
coprocessor loading is configured via table attributes. We'd also need some
kind of notification of
=jgit.git;a=commit;h=d8fafe4e6a2b91a0bb37fed7f83f70fba334c25a
Problems worthy of attack prove their worth by hitting back.
- Piet Hein (via Tom White)
--- On Tue, 2/1/11, Andrew Purtell apurt...@apache.org wrote:
From: Andrew Purtell apurt...@apache.org
Subject: host Git repositories on HBase
Completely up to the designer. Could be via Configuration (hbase-site.xml).
Could be an API added via Endpoint / dynamic RPC. Could be table or column
descriptor attributes ({HTD,HCD}.{get,set}Value()). Could be via some embedded
library.
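For illustration, a rough sketch of the descriptor-attribute route (untested; the "COPROCESSOR$1" key and its "path|class|priority" value layout follow the 0.92-era convention, and the jar path and class name below are made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class AttachCoprocessor {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    byte[] table = Bytes.toBytes("mytable");
    // Without online schema editing the table has to go offline to change
    // its descriptor -- which is why that feature matters here.
    admin.disableTable(table);
    HTableDescriptor htd = admin.getTableDescriptor(table);
    htd.setValue("COPROCESSOR$1",
        "hdfs:///cp/example-cp.jar|com.example.ExampleObserver|1001");
    admin.modifyTable(table, htd);
    admin.enableTable(table);
  }
}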
I would suggest static configuration via table and/or
On Wed, 2/16/11, Ryan Rawson ryano...@gmail.com wrote:
Again, see HDFS-347, which is a huge
clear win but still no plans to include it in any hadoop
version.
I ported Ryan's patch on HDFS-347 for 0.20-append on top of CDH3B3 and it's
going into preproduction. We might be a bit more aggressive
Ok, so in my fixed up version of the patch the DN validates the block token
before handing out the file location, so this is not arbitrary access, but it
does mean that the hbase user and the hdfs user must both have read permissions
to the local DFS data directories for the sharing to then
See https://github.com/trendmicro/jgit-hbase
Use branch 'jgit.storage.hbase.v4'
Last night I loaded all of the following repositories into a small HBase
cluster running on my laptop (zk + master + 3 rs):
cascading
cascading.hbase
cascading.jruby
cascalog
flume
gremlins
From: Oleg Ruchovets oruchov...@gmail.com
1) We want to use multi-column-family bulk loading. The
question is: what is the status of 0.92? Is it possible to
use it in production?
The status of 0.92 is that it does not exist yet.
There has been recent talk of us putting out a developer preview
as in text/plain
or plain text within the xml/json?
Thanks a lot,
Hari
On Fri, Feb 4, 2011 at 12:37 AM, Andrew Purtell apurt...@apache.org
wrote:
Thanks guys for the replies. Is there any
difference if I use json
representation (application/json)?
Key, column, and value will also
The problem is how do you represent what could be binary
data [...]
Exactly. Base64 is required if you are using XML representation (text/xml)
because the basic data type in HBase is byte[].
You also have the option of binary representations, either protobuf
(application/x-protobuf) or raw
Host git repositories on HBase: https://github.com/trendmicro/jgit-hbase
Thanks to Shawn Pearce for jgit.storage.dht
(http://egit.eclipse.org/r/#change,2295)!
Only thing I'd advise is wait for the commit message on
org.apache.hadoop.hbase.jgit to change from Initial implementation to
Hi, does anyone know of any implementation of GeoIndexing on
HBase as of yet?
Given the lack of responses, I think not.
If not I was thinking of writing one using CoProcessors to
increment the substrings of a GeoHash to help with number
of neighbors and being able to filter out points
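A rough sketch of that counting idea (untested; the table and column names are made up, and a coprocessor could do the same server side):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class GeoHashCounters {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable index = new HTable(conf, "geo_index");
    String geohash = "u4pruydqqvj"; // example point
    // Bump a counter for every prefix of the geohash so coarser cells
    // can answer neighbor-density queries.
    for (int len = 1; len <= geohash.length(); len++) {
      index.incrementColumnValue(Bytes.toBytes(geohash.substring(0, len)),
          Bytes.toBytes("c"), Bytes.toBytes("n"), 1L);
    }
    index.close();
  }
}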
To: user@hbase.apache.org
Date: Monday, January 24, 2011, 4:21 AM
Hello,
in one old thread regarding hadoop/hbase 0.19.x Andrew
Purtell wrote, that running DFS balancer while HBase is
running, is not recommended. I didn't find any remarks about
this in Hadoop or HBase documentation.
http
, Andrew Purtell apurt...@apache.org wrote:
From: Andrew Purtell apurt...@apache.org
Subject: Re: DFS rebalancing with running HBase
To: user@hbase.apache.org
Date: Monday, January 24, 2011, 5:42 AM
Martin,
The trouble was due to a defect in how HDFS managed
partitioning deletion work
There are restrictions enforced client side. User table names can only start
with [a-zA-Z0-9\_] and otherwise can contain only [a-zA-Z0-9\-\_\.]
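A quick sketch of what such a client-side check could look like (the pattern below is transcribed from the description above, not from the HBase source):

import java.util.regex.Pattern;

public class TableNameCheck {
  // First character: [a-zA-Z0-9_]; remainder: [a-zA-Z0-9._-]
  private static final Pattern VALID =
      Pattern.compile("[a-zA-Z0-9_][a-zA-Z0-9._-]*");
  public static boolean isValidUserTableName(String name) {
    return VALID.matcher(name).matches();
  }
}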
Best regards,
- Andy
Problems worthy of attack prove their worth by hitting back.
- Piet Hein (via Tom White)
--- On Thu, 1/20/11, Ted
, 2010 at 6:44 PM, Andrew Purtell apurt...@apache.org
wrote:
The latest CDH3 beta includes security changes that
currently HBase 0.90 and trunk don't incorporate. Of course
we can help out with clear HBase issues, but for security
exceptions or similar, what about that? Do we draw a line?
Where
From: Stack
If we draw a line, then as an ASF community we should
have a fallback option somewhere in ASF-land for the user to
try. Vanilla Hadoop is not sufficient for HBase. Therefore,
I propose we make a Hadoop 0.20-append tarball available.
What you thinking Andrew? I was
What is your -Xmx ?
Easiest way to scale is run more REST gateways.
Best regards,
- Andy
Problems worthy of attack prove their worth by hitting back.
- Piet Hein (via Tom White)
--- On Tue, 12/21/10, Jack Levin magn...@gmail.com wrote:
From: Jack Levin magn...@gmail.com
Subject:
This is a mailing list for the Apache version of HBase. Since you are using the
HBase shipped with CDH3, you need to ask Cloudera for help.
HBase Version 0.87.20100924+28
The current version of HBase in CDH3 is 0.89.20100621+17. I think you should
start there.
Best regards,
- Andy
Problems
/10, Andrew Purtell apurt...@apache.org wrote:
From: Andrew Purtell apurt...@apache.org
Subject: Re: I give up, help please
To: user@hbase.apache.org
Cc: Pete Haidinyak javam...@cox.net
Date: Tuesday, December 21, 2010, 4:20 PM
This is a mailing list for the Apache
version of HBase. Since you
Yes, a good point. Swappiness is set to 60 -- suppose I should set it to 0?
Yes.
Best regards,
- Andy
Problems worthy of attack prove their worth by hitting back.
- Piet Hein (via Tom White)
-XX:+PrintGCDetails -XX:+PrintGCDateStamps
-Xloggc:/usr/lib/hbase/bin/../logs/gc-hbase.log
I think we are hitting some sort of GC issue.
-Jack
On Tue, Dec 21, 2010 at 4:15 PM, Andrew Purtell
apurt...@apache.org
wrote:
What is your -Xmx ?
Easiest way to scale is run more REST
Hey Friso,
The GC is G1, so it may look different from what you expect
from a GC log. I know it is considered experimental, but I
like the concept of it and think it's nice to gain some
experience with it.
You are probably the first to use the G1 GC seriously with HBase. Would love to
hear
I didn't see a pointer here, so here you go:
Facebook Messages Team Tech Talk, Tuesday December 7 2010
http://www.livestream.com/facebookevents/video?clipId=pla_601a75c2-7bd4-4240-91a2-c9ad0f643e36
Best regards,
- Andy
We will have php querying hbase over tcp, and we need a
connector on the hbase end to return content the fastest
way possible
Typically the Thrift connector is used for this.
- Andy
I'm sorry, I'm having trouble following what seems like two XY turns in this
conversation. Or it could be that I'm just suffering from sleep debt
accumulated over the week.
We suggest the Thrift interface not because of language/interoperability
considerations but because the operations
Use hadoop-lzo-0.4.7 or higher from https://github.com/toddlipcon/hadoop-lzo
Best regards,
- Andy
--- On Thu, 12/16/10, Sandy Pratt prat...@adobe.com wrote:
From: Sandy Pratt prat...@adobe.com
Subject: RE: Simple OOM crash?
To: user@hbase.apache.org user@hbase.apache.org
Cc: Cosmin
Nobody is running Hadoop on Gentoo in production, either.
Did you tweak CFLAGS by chance?
Anyway, don't do it this way.
Run the Sun JVM on a stable version of CentOS/RedHat, Debian, or Ubuntu.
Best regards,
- Andy
--- On Thu, 12/9/10, Gary Helmling ghelml...@gmail.com wrote:
From:
The REST gateway (Stargate) is a long lived client. :-)
It uses HTablePool internally so this will keep some warm table references
around in addition to the region location caching that HConnectionManager does
behind the scenes. (10 references, but this could be made configurable.)
Best
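What the gateway does internally is roughly the following (a hand-written sketch against the 0.90-era client API, with a made-up table and row):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.HTablePool;
import org.apache.hadoop.hbase.util.Bytes;

public class PoolSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTablePool pool = new HTablePool(conf, 10); // 10 references, as above
    HTableInterface t = pool.getTable("mytable");
    try {
      t.get(new Get(Bytes.toBytes("row1")));
    } finally {
      pool.putTable(t); // hand the warm reference back to the pool
    }
  }
}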
, apurt...@apache.org
Date: Wednesday, November 24, 2010, 2:21 PM
Btw, does it mean I can send in a
compressed query? Or only receive
compressed data from REST, or both?
-Jack
On Wed, Nov 24, 2010 at 10:15 AM, Andrew Purtell apurt...@apache.org
wrote:
Regards compressing the HTTP
Fixed in
https://github.com/apurtell/hbase-ec2/commit/e2384222afdad33d49cdd60ede48604d76ac1600
Dumb bug.
Thanks for reporting it!
Best regards,
- Andy
--- On Mon, 11/22/10, Gary Helmling ghelml...@gmail.com wrote:
From: Gary Helmling ghelml...@gmail.com
Subject: Re: Ganglia website
Right now you can't get 0.90 from CDH3. It is an 0.89-mumble. It will not be
a better choice than 0.90 once 0.90 is released.
We are looking at deploying CDH3B3 plus a custom RPM built in house that
updates the CDH3B3 HBase package to 0.90.
I'm not sure we forgo support for our ops team just
On Mon, 11/22/10, Todd Lipcon t...@cloudera.com wrote:
Once 0.90 is released, we plan on spending a week or two to suss
out any possible integration issues, and then release CDH3b4
including 0.90.
I'm sure that will make everyone happy. :-) Glad to hear the projected time
between releases
free space on the / device?
There is plenty of space on /mnt, but then I'd need to instruct yum
to install packages elsewhere. I'd have to link /lib/ etc. to
/mnt or something.
Cheers
J
On Fri, Nov 19, 2010 at 2:27 PM, Andrew Purtell apurt...@apache.org
wrote:
The root device on instance-store
I would say yes, conditionally.
But indeed you have to use add_table.rb to add the copied over regions to the
META region of the target cluster.
And of course if you copy over table data as HFiles you have to at least
disable the table on the source cluster or shut it down before the copy, so
I can get about 1000 regions per node operating comfortably on a 5 node
c1.xlarge EC2 cluster using:
Somewhere out of /etc/rc.local:
echo "root soft nofile 65536" >> /etc/security/limits.conf
echo "root hard nofile 65536" >> /etc/security/limits.conf
sysctl -w fs.file-max=65536
sysctl -w
We have had ixSystems build hardware for us as well.
Best regards,
- Andy
- Original Message
From: Daniel Einspanjer deinspan...@mozilla.com
To: user@hbase.apache.org
Cc: M. C. Srivas mcsri...@gmail.com
Sent: Sun, November 7, 2010 1:27:58 AM
Subject: Re: Where do you get your
Hi,
For early stages of new development, an all-localhost setup is enough for basic
testing.
However to test with any appreciable data size or load, indeed you need to
think 5-10 servers. I use on demand clusters of ~5 nodes on EC2 for development
and for functional testing of new changes.
Thanks for reporting back Jeremy. We really appreciate it when users who have
figured out their issues write back to the list for others to find via searches
later.
- Andy
--- On Mon, 10/25/10, Jeremy Hinegardner jer...@hinegardner.org wrote:
When I have 3 concurrent clients querying
Any tips on how to find out which dfs 'client' is
talking to namenode?
I've never needed to do this, sorry. Maybe others know.
Odd and unhelpful that the audit messages don't carry this information.
- Andy
This is at the root of the trouble with the REST server also I expect.
You said your ZooKeeper ensemble peer was unhappy? Can we see the logs? Did you
report this to the ZK guys?
Best regards,
- Andy
--- On Fri, 10/22/10, Jack Levin magn...@gmail.com wrote:
From: Jack Levin
[Changed the title of the mail, probably not a REST server issue.]
This is the HBase client library embedded in the REST server warning that the
znode in ZooKeeper corresponding to the root regionserver went away. It waits
and waits for it to come back, but it never does.
Nothing in the
Jeremy,
Have you given any thought to trying out the latest 0.89 release?
The Stargate package has been moved into org.apache.hadoop.hbase.rest but
otherwise it is the same.
If this is a concurrency problem with the HBase client library it would be
better to try and deal with it on what is
Whoops, I hit send before considering the other meaning of overhead, sorry.
Can't say yet, probably not much, but will profile.
Best regards,
- Andy
--- On Mon, 10/18/10, Andrew Purtell apurt...@apache.org wrote:
From: Andrew Purtell apurt...@apache.org
Subject: Re: question about
Hi Jack,
There are three representations the REST server can return, selectable by the
client via the Accept header:
text/xml: XML representation, base64 encoding of data
application/x-protobuf: pbuf representation, see the *.proto files
application/octet-stream: plain binary
It's the
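For what it's worth, a rough sketch of selecting the binary representation from a Java client (untested; gateway host, port, table, row, and column are placeholders):

import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class RestBinaryGet {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://restgateway:8080/mytable/row1/cf:qual");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    // Ask for the plain binary representation; no base64 involved.
    conn.setRequestProperty("Accept", "application/octet-stream");
    InputStream in = conn.getInputStream();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    for (int n; (n = in.read(buf)) != -1; ) {
      out.write(buf, 0, n);
    }
    in.close();
    byte[] value = out.toByteArray(); // raw cell value
    System.out.println(value.length + " bytes");
  }
}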
Hi Fleming,
First, Sanel is correct, whatever you are attempting to use is not Stargate.
Kindly follow the rest of the advice.
HBase 20.2
You should be using HBase 0.20.6. We can't help much with problems with 0.20.2
any more -- in just about all cases the first advice will be to upgrade to
This seems to be the real issue:
SEVERE: Mapped exception to response: 503 (Service Unavailable)
javax.ws.rs.WebApplicationException: java.io.IOException:
java.lang.RuntimeException: java.lang.OutOfMemoryError: Direct buffer memory
If you're using a Sun JVM, you can change that by using the
Hi William,
I think you are asking about HBASE-2000:
https://issues.apache.org/jira/browse/HBASE-2000
Work on an in-process parallel execution framework for HBase is in progress,
yes. We have some initial patches up for review which are the start of this.
Best regards,
- Andy
--- On
Is a security feature available that I am not aware of? If
not, what is the point of creating a database that can be
edited/deleted by anonymous users?
That's kind of a loaded question but I'll bite.
Single tenancy is common in systems of this type, which are meant for
deployment into back
HTML directory listing is not Stargate output. Got something else running on
port 8080?
Best regards,
- Andy
--- On Thu, 9/30/10, mike anderson saidthero...@gmail.com wrote:
From: mike anderson saidthero...@gmail.com
Subject: stargate troubles
To: hbase-u...@hadoop.apache.org
Date:
Matt,
Since you are using ZooKeeper already, conceivably you could keep a hosts file
in ZooKeeper somewhere, use a strategy for updates similar to what is done for
implementing locking to ensure a new slave gets and updates the latest version
atomically, and use Twitcher to trigger updates on
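The version-checked write ZooKeeper gives you makes the "gets and updates atomically" part straightforward; a rough sketch (untested, znode path made up):

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class HostsFileUpdater {
  // Read-modify-write guarded by the znode version: setData() fails with
  // BadVersion if someone else wrote in between, so we re-read and retry.
  static void addEntry(ZooKeeper zk, byte[] entry) throws Exception {
    while (true) {
      Stat stat = new Stat();
      byte[] data = zk.getData("/hosts", false, stat);
      byte[] merged = new byte[data.length + entry.length];
      System.arraycopy(data, 0, merged, 0, data.length);
      System.arraycopy(entry, 0, merged, data.length, entry.length);
      try {
        zk.setData("/hosts", merged, stat.getVersion());
        return;
      } catch (KeeperException.BadVersionException e) {
        // lost the race; loop and retry against the new version
      }
    }
  }
}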
A working equivalent of sync() in HDFS, and support for it.
See http://www.cloudera.com/blog/2010/07/whats-new-in-cdh3-b2-hbase/ ,
especially: HDFS improvements for HBase – along with the HDFS team at
Facebook, we have contributed a number of important bug fixes and improvements
for HDFS
Yes. The DFS client sets the parameter when the file is created. (In this case
HBase.) So the setting needs to be changed in hbase-site.xml or you should
symlink your hdfs-site.xml into hbase/conf/.
Best regards,
--- On Sun, 9/26/10, Jack Levin magn...@gmail.com wrote:
From: Jack Levin
there is an alternative. We'll
attempt to go to 0.89 but if we can't get reliable indexing, we
may have to go with this hadoop-append branch.
-GS
On Wed, Sep 22, 2010 at 5:57 PM, Andrew Purtell apurt...@apache.org
wrote:
While 0.89/0.90 is the way to go, there is also the
0.20-append branch
From: Stack st...@duboce.net
What's your frontend? Why REST? It might be more efficient if you
could run with thrift given REST base64s its payload IIRC (check the
src yourself).
Stargate (and rest in trunk) supports binary puts via protobufs or
application/octet-stream.
Best regards,
While 0.89/0.90 is the way to go, there is also the 0.20-append branch of
Hadoop, in the hadoop-common repo, which is better than nothing if using HBase
0.20:
http://github.com/apache/hadoop-common/tree/branch-0.20-append
There is also an amalgamation of 0.20-append and Yahoo Secure Hadoop
how it is actually done.
Unfortunately it confirmed my suspicion that the current TTL is
implemented purely based on active compaction. And for a log
table/history data table, the current implementation is not
sufficient.
Jimmy
--
From: Andrew
Yeah, indeed the TTL feature is not broken. It works as advertised if you
understand how HBase internals work.
But we can accommodate the expectations communicated on this thread; it sounds
reasonable.
- Andy
--- On Wed, 9/15/10, Ryan Rawson ryano...@gmail.com wrote:
From: Ryan Rawson
I did a test with 2 key structures: 1. time:random,
and 2. random:time.
The TTL is set to 10 minutes. The time is the current system
time. The random is a random string 2-10 characters
long.
This use case doesn't make much sense the way HBase currently works. You can
set the TTL
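For reference, TTL is a column family attribute in the schema; a rough sketch of setting the 10-minute TTL from the test above (untested, table name made up):

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;

public class TtlSchema {
  public static void main(String[] args) {
    HTableDescriptor htd = new HTableDescriptor("events"); // made-up name
    HColumnDescriptor hcd = new HColumnDescriptor("cf");
    // TTL is in seconds; expired cells are physically dropped only when
    // a compaction rewrites the store files, per the discussion above.
    hcd.setTimeToLive(600);
    htd.addFamily(hcd);
  }
}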
So what criterion does HBase use to sort the returned result? By row key?
Yes, by row key.
- Andy
--- On Wed, 9/15/10, Jeff Zhang zjf...@gmail.com wrote:
From: Jeff Zhang zjf...@gmail.com
Subject: Does HBase guarantee return the same result when I invoke scan
operation ?
To:
In addition to what Jon said please be aware that if compression is specified
in the table schema, it happens at the store file level -- compression happens
after write I/O, before read I/O, so if you transmit a 100MB object that
compresses to 30MB, the performance impact is that of 100MB, not
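For reference, compression is declared per column family in the table schema; a rough sketch (untested; assumes the LZO codec is actually installed):

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.io.hfile.Compression;

public class CompressedFamily {
  public static void main(String[] args) {
    HColumnDescriptor hcd = new HColumnDescriptor("cf");
    // Applies to store files on disk only; RPCs still carry the
    // uncompressed bytes, which is the point made above.
    hcd.setCompressionType(Compression.Algorithm.LZO);
  }
}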
From: Bradford Stephens
A small improvement, but nowhere near what I'm used to,
even from vague memories of old clusters on EC2.
Those days are gone.
Used to be m1.small provided reasonable performance for some apps.
Now comments to the effect that the platform is simply too oversubscribed to
I've tried to post the below comment twice at
The problems with ACID, and how to fix them without going NoSQL
http://dbmsmusings.blogspot.com/2010/08/problems-with-acid-and-how-to-fix-them.html
For whatever reason, it has appeared in the comments section from my
perspective briefly
From: Bradford Stephens
I'm banging my head against some perf issues on EC2. I'm
using .20.6 on ASF hadoop .20.2, and tweaked the ec2 hbase
scripts to handle the new version.
I'm trying to insert about 22G of data across nodes on EC2
m1.large instances [...]
c1.xlarge provides (barely)
From: Matthew LeMieux
I'm starting to find that EC2 is not reliable enough to support
HBase.
[...]
(I've been using m1.large and m2.xlarge running CDH3)
I personally don't use EC2 for anything more than on demand ad hoc testing, but
I do know of successful deployments there.
However, I at
From: Gary Helmling
If you're using AMIs based on the latest Ubuntu (10.04),
there's a known kernel issue that seems to be causing
high loads while idle. More info here:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/574910
Seems best to avoid using Lucid on EC2 for now, then.
Seems Stargate saves state of previous requests.
If so that's unintentional, and not the way the Jersey/JAX-RS framework works
according to my understanding.
if I try to put a row and supply a wrong column name, a
NoSuchColumnFamilyException will be thrown for all
other requests (even
I'll update the wiki to remove the bit about alpha status. Stargate surely has
bugs somewhere but is known to operate stably under load in several
applications.
- Andy
From: sasha.maksimenko sasha.maksime...@gmail.com
Subject: [stargate] status
To: user@hbase.apache.org
Date:
You are asking Stargate for XML representation. HBase stores data of arbitrary
byte[]. Also, characters like '<' and '>' will confuse various XML parsers.
Therefore Stargate must base 64 encode the row key, column, qualifier, and data
to be XML safe.
You can also ask for protobufs
Trend Micro will provide a solution for this ("Cons: No security control")
by end of 2010 Q3.
- Andy
From: y_823...@tsmc.com y_823...@tsmc.com
Subject: HBase's pros and cons
To: user@hbase.apache.org
Cc: kevin_h...@tsmc.com
Date: Monday, June 21, 2010, 8:02 PM
Hi there,
I would
tcp_tw_recycle did not do what you needed?
- Andy
On Mon, Jun 14, 2010 at 11:40 PM, Friso van Vollenhoven wrote:
Hi all,
Since I got no replies to my previous message (see
below), I went ahead and set the tcp_tw_recycle to true.
This worked like a charm. The number of sockets in
From: Mark Laffoon
Subject: RE: ICV concurrency problem (?)
1. I have multiple clients (map/reduce task executors)
hitting an HBase cluster with multiple region servers.
Assuming the client code doesn't explicitly set the
timestamp, which box actually generates the timestamp
for a put?
From: Vidhyashankar Venkataraman
What do you mean by pastebinning it? I will try hosting it on a
webserver..
No need... http://pastebin.com/
- Andy
Hi Daniel,
My concern is that if we don't take advantage of your
coprocessor work, we will end up needing to write our
own callback code from scratch anyway, and that doesn't
seem to be a better choice than helping you flesh out a
solid use case for co-processors and implement it.