Row key design is sometimes a trade-off between load balance and query
speed: if you design the row key so that queries are fast and convenient,
the records may not be spread evenly across the nodes; if you design the
row key so that the records are spread evenly across the nodes, maybe i
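
To make the trade-off above concrete, here is a minimal sketch of one common compromise, prefixing ("salting") the row key with a bucket derived from the key itself. It is not taken from this thread: the class name, the bucket count of 16, and the key format are all hypothetical, and it uses only the Java standard library rather than the HBase API.

```java
import java.util.Locale;

final class SaltedRowKey {
    // Hypothetical bucket count; in practice it would be chosen relative to the number of regions.
    static final int BUCKETS = 16;

    // The bucket is derived deterministically from the key, so a reader can recompute it.
    static int bucketOf(String naturalKey) {
        return Math.floorMod(naturalKey.hashCode(), BUCKETS);
    }

    // Prepend a two-digit bucket prefix so that adjacent natural keys can land in different key ranges.
    static String salt(String naturalKey) {
        return String.format(Locale.ROOT, "%02d-%s", bucketOf(naturalKey), naturalKey);
    }

    public static void main(String[] args) {
        // Hypothetical sequential keys that would otherwise sort next to each other.
        for (String k : new String[] {"20131216-0001", "20131216-0002", "20131216-0003"}) {
            System.out.println(salt(k));
        }
    }
}
```

With a scheme like this, a point lookup stays cheap because the prefix can be recomputed from the natural key, but a range scan over natural keys has to be issued once per bucket, which is the query-side cost of the more even spread.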
Should the bulk load task clean up the partitions_ files upon completion?
Cheers
On Mon, Dec 16, 2013 at 6:53 PM, Bijieshan wrote:
> > I think I should delete these files immediately after I have finished
> bulk loading data into HBase since they are useless at that time, right ?
>
> Ya. I think so. T
> I think I should delete these files immediately after I have finished bulk
> loading data into HBase since they are useless at that time, right ?
Ya. I think so. They are useless once the bulk load task has finished.
Jieshan.
-----Original Message-----
From: Tao Xiao [mailto:xiaotao.cs@gmail.com]
In the first step, the files are read correctly and regionGroups is
created as it should be.
Did you notice the number of reducers? Was it equal to 2000 (before you
extended HFileOutputFormat)?
>>> RegionServer logs in the RegionServer that the files are moved to
>>> indeed shows that all
Indeed these files are produced by
org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles in the directory
that job.getWorkingDirectory() returns, and I think I should delete these
files immediately after I have finished bulk loading data into HBase since
they are useless at that tim
This is kinda tangential, but for very very common dependencies such as
guava, jackson, etc. would it make sense to use a shaded jar so as not to
affect user dependencies?
On Mon, Dec 16, 2013 at 7:47 PM, Ted Yu wrote:
> Please try out patch v2 from HBASE-10174
>
> Thanks
>
> On Dec 16, 2013, a
Please try out patch v2 from HBASE-10174
Thanks
On Dec 16, 2013, at 11:42 AM, Kristoffer Sjögren wrote:
> Oh thank you very much Ted! :-)
>
> I'll give it a try tomorrow.
>
> Cheers!
>
>
> On Mon, Dec 16, 2013 at 6:05 PM, Ted Yu wrote:
>
>> I created HBASE-10174 and attached a patch there.
Oh thank you very much Ted! :-)
I'll give it a try tomorrow.
Cheers!
On Mon, Dec 16, 2013 at 6:05 PM, Ted Yu wrote:
> I created HBASE-10174 and attached a patch there.
>
> Running 0.94 test suite now.
>
>
> On Mon, Dec 16, 2013 at 7:05 AM, Nicolas Liochon
> wrote:
>
> > That means more or les
hbase-0.96.1 is now available for download. Get it at your
nearest Apache mirror [1] or maven repository.
157 bug fixes and performance fixes [2] have been
made since 0.96.0. Please use this release rather than
0.96.0 going forward.
Yours,
Your HBase Team
1. http://www.apache.org/dyn/closer.cg
Thanks for the tips. I'll play around with this this week and try to get a
script that won't affect our performance too badly. I imagine most people do
this at off-peak times, but we don't have that so we'll have to figure out
how to spread out the load as much as possible.
On Fri, Dec 13, 2013 at
I created HBASE-10174 and attached a patch there.
Running 0.94 test suite now.
On Mon, Dec 16, 2013 at 7:05 AM, Nicolas Liochon wrote:
> That means more or less backporting the patch to the 0.94, no?
> It should work imho.
>
>
>
>
> On Mon, Dec 16, 2013 at 3:16 PM, Kristoffer Sjögren wrote:
I've managed to isolate the problem.
I implemented an extension of HFileOutputFormat - because each bulk load
will import data to the newly created regions only, I pass the prefix
(MMdd) to MyHFileOutputFormat.configureIncrementalLoad() so
that getRegionStartKeys returns only the corresponding
Hi,
I'm a newbie to HBase and have a question on rowkey design, and I
hope this question isn't too newbie-like for this list. I have a question
which cannot be answered by knowledge of the code but by experience with
large databases, thus this mail.
For the sake of explanation I create a small exam
That means more or less backporting the patch to the 0.94, no?
It should work imho.
On Mon, Dec 16, 2013 at 3:16 PM, Kristoffer Sjögren wrote:
> Thanks! But we can't really upgrade to HBase 0.96 right now, but we need to
> go to Guava 15 :-(
>
> I was thinking of overriding the classes fixed in
Loaded regions are listed in .META. table and the ENCODED field in the
table points to an existing directory. But all family directories in this
region are empty...
On Mon, Dec 16, 2013 at 4:29 PM, Amit Sela wrote:
> I ran the hbck tool, and while I do have some inconsistencies they are not
> i
I ran the hbck tool, and while I do have some inconsistencies they are not
in the table that has the bulk load issues.
On Mon, Dec 16, 2013 at 4:22 PM, Amit Sela wrote:
> RegionServer logs in the RegionServer that the files are moved to indeed
> shows that all files are moved to that region (w
The logs of the RegionServer that the files are moved to do indeed
show that all files are moved to that region (when it doesn't happen they
show only 1 file per family moved to a RegionServer)
On Mon, Dec 16, 2013 at 4:21 PM, Amit Sela wrote:
> In the first step, the files are read corre
In the first step, the files are read correctly and regionGroups is created
as it should be.
When debugging, in LoadIncrementalHFiles.tryAtomicRegionLoad() I notice
that ServerCallable's regionName returned from server is the wrong region
(the pre-split last region).
The previous last region is not su
Thanks! But we can't really upgrade to HBase 0.96 right now, but we need to
go to Guava 15 :-(
I was thinking of overriding the classes fixed in the patch in our test
environment.
Could this work maybe?
On Mon, Dec 16, 2013 at 11:01 AM, Kristoffer Sjögren wrote:
> Hi
>
> At the moment HFileWrit
As we know, bulk load has two steps:
1. Create HFiles by MapReduce.
2. Load HFiles into HBase.
I wonder whether it read the right partition information during the first
step. Have you run the hbck tool to check the cluster's health?
You mentioned you see the new regions in the webapp. The files were
The reduce partition information is stored in this partitions_ file. See the
code below:
HFileOutputFormat#configureIncrementalLoad:
.
Path partitionsPath = new Path(job.getWorkingDirectory(),
"partitions_" + UUID.randomUUID())
Hi,
It's fixed in HBase 0.96 (by HBASE-9667).
Cheers,
Nicolas
On Mon, Dec 16, 2013 at 11:01 AM, Kristoffer Sjögren wrote:
> Hi
>
> At the moment HFileWriterV2.close breaks at startup when using Guava 15.
> This is not a client problem - it happens because we start a master node to
> do integr
I imported data into HBase via bulk load, but afterwards I found that
many unexpected files had been created in the HDFS directory
/user/root/, like these:
/user/root/partitions_fd74866b-6588-468d-8463-474e202db070
/user/root/partitions_fd867cd2-d9c9-48f5-9eec-185b2e57788d
/user/r
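
For anyone who wants to clean up such leftovers, here is a minimal sketch of how the files could be picked out of a directory listing by name. It assumes the "partitions_" plus UUID naming shown in the configureIncrementalLoad snippet quoted in this thread; the class and method names are hypothetical, and it works on plain path strings only, so the actual listing and deletion on HDFS are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class PartitionsFileFilter {
    // "partitions_" followed by a UUID in its canonical 8-4-4-4-12 hex form.
    static final Pattern PARTITIONS_NAME = Pattern.compile(
        "partitions_[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}");

    // True if the last path component looks like a partitions file.
    static boolean isPartitionsFile(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return PARTITIONS_NAME.matcher(name).matches();
    }

    // Keep only the paths that match, preserving their order.
    static List<String> selectPartitionsFiles(List<String> paths) {
        List<String> selected = new ArrayList<>();
        for (String p : paths) {
            if (isPartitionsFile(p)) {
                selected.add(p);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> listing = Arrays.asList(
            "/user/root/partitions_fd74866b-6588-468d-8463-474e202db070",
            "/user/root/partitions_fd867cd2-d9c9-48f5-9eec-185b2e57788d",
            "/user/root/some_other_file"); // hypothetical unrelated file
        for (String p : selectPartitionsFiles(listing)) {
            System.out.println(p);
        }
    }
}
```

Matching on the full UUID shape rather than just the "partitions_" prefix is a deliberate choice here, so that an unrelated file that merely starts with the same prefix is not selected for deletion.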
Hi
At the moment HFileWriterV2.close breaks at startup when using Guava 15.
This is not a client problem - it happens because we start a master node to
do integration tests.
This is a bit precarious, and I wonder if there are any plans to support
Guava 15, or if there is a clever way around this?
Cheers,
-K
24 matches