The table has hashed keys so rows are evenly distributed among the
regionservers, and the load on each regionserver is roughly the same. I also
have per-table balancing turned on. I get mostly data-local mappers with only a
few rack-local (maybe 10 of the 250 mappers).
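As a toy illustration of the hashed-key idea (this is not HBase code; the bucket count and key names here are made up for the sketch), prefixing each row key with a stable hash of itself spreads lexicographically sorted keys evenly across buckets, and hence across regions:

```python
import hashlib
from collections import Counter

def hashed_key(user_key: bytes, buckets: int = 16) -> bytes:
    # Prefix the key with a stable hash of itself so that
    # lexicographically sorted rows spread across all regions
    # instead of piling onto one "hot" end of the keyspace.
    prefix = int(hashlib.md5(user_key).hexdigest(), 16) % buckets
    return b"%02x-" % prefix + user_key

keys = [hashed_key(b"user%d" % i) for i in range(1000)]
dist = Counter(k.split(b"-", 1)[0] for k in keys)
print(sorted(dist.values()))  # bucket sizes should all be close to 1000/16
```

The trade-off, discussed further down this thread, is that gets must recompute the prefix and range scans must fan out across all buckets.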
Currently the table is
Hi,
We are using HBase replication (over Apache 0.94.2) between two sites and we
need to define firewall rules between the two sites.
Can anyone provide some information regarding the ports that are used between
the sites?
Our understanding is:
Replication is from site1 to site2.
*
If you can, try 0.94.4+; it should significantly reduce the amount of bytes
copied around in RAM during scanning, especially if you have wide rows and/or
large key portions. That in turn makes scans scale better across cores, since
RAM is a shared resource between cores (much like disk).
It's
Not that it's a long-term solution, but try major-compacting before running
the benchmark. If the LSM tree is CPU bound in merging HFiles/KeyValues
through the PriorityQueue, then reducing to a single file per region should
help. The merging of HFiles during a scan is not heavily optimized yet.
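To see why fewer files per region helps, here is a toy model (my own sketch, not HBase internals) of the priority-queue merge a scan performs over multiple sorted HFiles; HBase's KeyValueHeap does the analogous work per KeyValue:

```python
import heapq

def scan(hfiles):
    # Toy model of an LSM scan: each "HFile" is a sorted run of keys,
    # and the scanner merges the runs through a priority queue.
    # More files per region means more heap operations per key returned.
    return list(heapq.merge(*hfiles))

# Same data before and after a major compaction:
three_files = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
one_file = [sorted(x for f in three_files for x in f)]

# The scan result is identical, but the single-file scan
# pays no per-key merge cost.
print(scan(three_files) == scan(one_file))
```

This is why major-compacting to one HFile per region can help a CPU-bound scan benchmark even though it changes nothing about the data.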
Hi, I have two questions regarding HDFS and the jps utility.
I am new to Hadoop and started learning Hadoop over the past week.
1. Whenever I run start-all.sh, jps in the console shows the
processes started:
naidu@naidu:~/work/hadoop-1.0.4/bin$ jps
22283 NameNode
23516 TaskTracker
26711
This happens when your Java process is running in debug mode with the
suspend=y option selected.
Regards
Ram
On Wed, May 1, 2013 at 12:55 PM, Naidu MS
sanyasinaidu.malla...@gmail.com wrote:
Hi i have two questions regarding hdfs and jps utility
I am new to Hadoop and started leraning hadoop
Sorry. I think someone hijacked this thread and I replied to this.
Naidu,
Request you to post a new thread if you have queries and do not hijack the
thread.
Regards
Ram
On Wed, May 1, 2013 at 12:57 PM, ramkrishna vasudevan
ramkrishna.s.vasude...@gmail.com wrote:
This happens when your java
Yes, true according to the docs.
However, there is still something strange with the classpath:
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HBaseAdmin, HTable, Put, Get}
import org.apache.hadoop.hbase.util.Bytes
val conf = HBaseConfiguration.create()
val admin = new HBaseAdmin(conf)
@Lars, how have you calculated the 35K/row size? I'm not able to find the
same number.
@Bryan, Matt's idea below is good. With the Hadoop test you always had data
locality. With your HBase test, maybe not. Can you take a look at the JMX
console and tell us your locality %? Also, over those 45
Unfortunately as this idea keeps popping up, you are going to have this
discussion.
1) As you admit... salting is bad when your primary access vector is get()s.
2) Range scans. Instead of 1 range scan, you now have N where N is the number
of salt values. In this case 10.
You wouldn't think
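Point 2 can be sketched concretely (a hypothetical illustration; the 10-bucket salt and key format are made up): with salted keys, a single logical range scan has to be issued once per salt prefix, and the client must then merge-sort the result streams.

```python
SALTS = [b"%02d" % i for i in range(10)]  # 10 salt values, as in the example

def salted_scan_ranges(start: bytes, stop: bytes):
    # One logical [start, stop) range scan fans out into N physical
    # scans, one per salt prefix, because salted rows for the same
    # logical range are scattered across N disjoint key ranges.
    return [(s + start, s + stop) for s in SALTS]

ranges = salted_scan_ranges(b"row-aaa", b"row-zzz")
print(len(ranges))  # 10 physical scans for a single logical scan
```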
I see what you are saying, Michael, but I think the following is a blanket
assumption:
bq. Think of it this way... the operation was a success but the patient
died.
This is not always the case. Yes, if your use-case/system is such that it
will have lots of users trying to access it, then perhaps N users
I'd say go to Avro over protobufs in terms of redesigning your schema.
With respect to CPUs, you don't say what your system looks like: Intel vs. AMD,
number of physical cores, what else you're running on the machine (# mapper/reducer
slots), etc.
In terms of the schema...
How are you
Yes I would like to try this, if you can point me to the pom.xml patch that
would save me some time.
On Tuesday, April 30, 2013, lars hofhansl wrote:
If you can, try 0.94.4+; it should significantly reduce the amount of
bytes copied around in RAM during scanning, especially if you have wide
What about unpacking the jar, getting the file, and putting it manually on the
classpath?
At least it will help in terms of debugging the underlying problem.
On May 1, 2013, at 3:24 AM, Håvard Wahl Kongsgård
haavard.kongsga...@gmail.com wrote:
yes, true according to the docs.
however,
Yes I have monitored GC, CPU, disk and network IO, anything else I could think
of. Only the CPU usage by the regionserver is on the high side.
I mentioned data local jobs make up generally 240 of the 250 mappers (96%) - I
get this information from the jobtracker. Does the JMX console give more
Hi,
Currently in Gora, we support the following table attributes, which we
specify when mapping data into HBase:
compression, blockCache, blockSize, bloomFilter, maxVersions, timeToLive,
inMemory.
These expand to the following
HColumnDescriptor columnDescriptor = getOrCreateFamily(familyName,
What version of HBase are you using?
Assuming it is 0.94.x, you can find the default values
in src/main/resources/hbase-default.xml
e.g.
<property>
  <name>hfile.block.cache.size</name>
  <value>0.25</value>
  <description>
  Percentage of maximum heap (-Xmx setting) to allocate to block cache
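As a quick back-of-the-envelope check (the 8 GB heap below is my own example, not a value from this thread), the setting is a fraction of the -Xmx heap, not an absolute size:

```python
def block_cache_bytes(xmx_bytes: int, fraction: float = 0.25) -> int:
    # hfile.block.cache.size is a fraction of the -Xmx heap,
    # not an absolute cache size.
    return int(xmx_bytes * fraction)

GB = 1024 ** 3
print(block_cache_bytes(8 * GB) // GB)  # an 8 GB heap gives a 2 GB block cache
```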
Hi Ted,
Thank you for reply.
This is where I drop a bomb... for which I apologize in advance; I should
have dropped it in the original email.
We currently pull 0.90.4 maven artifact within Gora trunk!
We plan to upgrade to 0.94.X [0] after our next release (next few weeks)
Thanks Ted
[0]
0.90.x code base is no longer actively maintained.
Looking forward to the upgrade of HBase in Gora.
On Wed, May 1, 2013 at 11:49 AM, Lewis John Mcgibbney
lewis.mcgibb...@gmail.com wrote:
Hi Ted,
Thank you for reply.
This is where I drop a bomb... which I reservedly apologize for, I should
Hi,
As far as I remember, there were attempts to add filtering on the HBase side to
nutch-2.x commands, which could use SingleColumnValue filters that are
available in hbase-0.95. So I think it is advisable to upgrade HBase in Gora
to this version.
Thanks.
Alex.
Thanks to both of you.
We've had a struggle with lots of other stuff.
All of this HBase related stuff will be addressed in the next development
drive.
On Wed, May 1, 2013 at 12:03 PM, alx...@aim.com wrote:
Hi,
As far as I remember, there were attempts to add filtering on hbase side
to
The following filters are in 0.94 code base as well:
./src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueExcludeFilter.java
./src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java
On Wed, May 1, 2013 at 12:03 PM, alx...@aim.com wrote:
Hi,
As far as I remember,
Ok, I spoke too soon. I tried going from Nutch to Hive... not supported
(but adding support to it sounds like a fun side-project :) ). But, I can
go to HBase.
Jean-Marc, I do have a question for you. When you said that I should
get the UI going before anything else, what did you mean? I'm
Hi Yves,
Nice to see you back ;)
The UI is http://192.168.x.x:60010/master-status
If you don't have the master UI working, there is no need to try the shell,
it will not work.
JM
2013/5/1 Yves S. Garret yoursurrogate...@gmail.com
Ok, I spoke too soon. I tried going from Nutch to Hive...
Hello Yves,
I think by that JM means that you should first make sure that all
your HBase daemons are running fine. The web UI is a pretty convenient tool
which allows you to monitor everything in a simpler way. If you are able to
see the web UI properly, it means everything is in place and
Comment out everything in the /etc/hosts file, add 127.0.0.1 localhost, and then try.
Thanks & Regards
∞
Shashwat Shriparv
On Thu, May 2, 2013 at 1:57 AM, Mohammad Tariq donta...@gmail.com wrote:
Hello Yves,
I think by that JM means that you should first make sure that all
you
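A minimal /etc/hosts along the lines of that advice might look like this (the hostname is a placeholder, not one from this thread):

```
127.0.0.1   localhost
# 127.0.1.1  yourhostname   <- commented out; this kind of entry often makes HBase bind to the wrong address
```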
Hi,
I am seeing the following which is a JVM segfault:
hbase-regionser[28734]: segfault at 8 ip 7f269bcc307e sp 7fff50f7e638 error 4 in libc-2.15.so[7f269bc51000+1b5000]
Benoit Tsuna reported a similar issue a while back -
From what I see from
ldd --version
ldd (Ubuntu EGLIBC 2.15-0ubuntu10.3) 2.15
We are running eglibc which is somewhat different from glibc -
http://en.wikipedia.org/wiki/Embedded_GLIBC.
It seems that this is a problem with Ubuntu; have folks seen this on
non-Ubuntu installs?
Thanks
Varun
On
Sudarshan,
Below are the results that Mujtaba put together. He put together two
versions of your schema: one with the ATTRIBID as part of the row key
and one with it as a key value. He also benchmarked the query time both
when all of the data was in the cache and when all of the data was
read
Hi Jean-Marc, I'll go back through this tutorial once more.
http://hbase.apache.org/book/quickstart.html
On Wed, May 1, 2013 at 4:27 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:
Hi Yves,
Nice to see you back ;)
The UI is http://192.168.x.x:60010/master-status
If you don't
Done. I'll go through the previously mentioned tutorial with
this in mind.
Thank you for your help.
On Wed, May 1, 2013 at 4:39 PM, shashwat shriparv dwivedishash...@gmail.com
wrote:
Comment out everything in the /etc/hosts file, add 127.0.0.1 localhost, and then
try.
*Thanks Regards*
I tried running my test with 0.94.4, unfortunately performance was about the
same. I'm planning on profiling the regionserver and trying some other things
tonight and tomorrow and will report back.
On May 1, 2013, at 8:00 AM, Bryan Keller brya...@gmail.com wrote:
Yes I would like to try this,
Hi guys, one more question.
I'm looking at this http://hbase.apache.org/book/quickstart.html link, in
section 1.2.1, where I have to modify
conf/hbase-site.xml for the hbase.rootdir and
hbase.zookeeper.property.dataDir properties. To what should I set hbase.rootdir?
At the moment, I have hbase.rootdir
One more little update.
I ran this command in HBASE_HOME [ $ bin/start-hbase.sh ],
using these configuration settings:
http://bin.cakephp.org/view/1134614486
After I ran it, I checked
$HBASE_HOME/logs/hbase-ysg-master-ysg.connect.log and
this is what I saw:
http://bin.cakephp.org/view/823736802
Forgot to add, out of the 3 files in logs:
-rw-rw-r--. 1 ysg ysg 26439 May 1 22:04 hbase-ysg-master-ysg.connect.log
-rw-rw-r--. 1 ysg ysg 0 May 1 22:04 hbase-ysg-master-ysg.connect.out
-rw-rw-r--. 1 ysg ysg 0 May 1 22:04 SecurityAuth.audit
Only the .log file has anything in it. Not
hbase.rootdir is the directory HBase writes data to. If you are planning to
have a distributed HBase setup, then set this property to a directory
in your HDFS, like hdfs://NN_MACHINE:9000/hbase. Otherwise, point it to some dir
on your local FS. And for hbase.zookeeper.property.dataDir, create a
separate
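Put together, that advice might look like the following hbase-site.xml sketch (NN_MACHINE and the local paths are placeholders, not values from this thread):

```
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <!-- distributed: an HDFS directory; standalone: a local path like file:///home/ysg/hbase -->
    <value>hdfs://NN_MACHINE:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/ysg/zookeeper</value>
  </property>
</configuration>
```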
Navis
Thanks for the issue link. Currently, read queries will start MR
jobs as usual for reading from HBase. Correct? Is there any plan for
supporting non-MR reads?
-Anoop-
On Thu, May 2, 2013 at 7:09 AM, Navis류승우 navis@nexr.com wrote:
Currently, hive storage handler reads rows one by
Hmm... Did you actually use exactly version 0.94.4, or the latest, 0.94.7?
I would be very curious to see profiling data.
-- Lars
- Original Message -
From: Bryan Keller brya...@gmail.com
To: user@hbase.apache.org user@hbase.apache.org
Cc:
Sent: Wednesday, May 1, 2013 6:01 PM
Subject:
I used exactly 0.94.4, pulled from the tag in subversion.
On May 1, 2013, at 9:41 PM, lars hofhansl la...@apache.org wrote:
Hmm... Did you actually use exactly version 0.94.4, or the latest, 0.94.7?
I would be very curious to see profiling data.
-- Lars
- Original Message -