I've managed to isolate the problem. I implemented an extension of HFileOutputFormat. Because each bulk load imports data only into the newly created regions, I pass the prefix (yyyyMMdd) to MyHFileOutputFormat.configureIncrementalLoad() so that getRegionStartKeys returns only the corresponding keys. I did this to avoid having 2000 reducers when my target is 15 regions...
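For anyone following along, here is a rough sketch of the kind of prefix filtering described above. It is not the actual MyHFileOutputFormat code (that is not shown in this thread) and it does not call any HBase API. The class and method names (PrefixStartKeys, filterByPrefix, hasPrefix) and the sample keys are made up for illustration, and start keys are assumed to be plain byte arrays encoded as yyyyMMdd_something.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: keeps only the region start keys
// that begin with a given day prefix (yyyyMMdd).
class PrefixStartKeys {

    /** Returns true when key begins with every byte of prefix. */
    static boolean hasPrefix(byte[] key, byte[] prefix) {
        if (key.length < prefix.length) {
            return false; // also rejects the empty start key of the first region
        }
        for (int i = 0; i < prefix.length; i++) {
            if (key[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Filters the table's start keys down to those of a single day. */
    static List<byte[]> filterByPrefix(List<byte[]> startKeys, String prefix) {
        byte[] p = prefix.getBytes(StandardCharsets.UTF_8);
        List<byte[]> kept = new ArrayList<>();
        for (byte[] key : startKeys) {
            if (hasPrefix(key, p)) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample start keys (assumed values): first region, previous day, new day.
        List<byte[]> startKeys = new ArrayList<>();
        startKeys.add(new byte[0]);
        startKeys.add("20131215_m".getBytes(StandardCharsets.UTF_8));
        startKeys.add("20131216_a".getBytes(StandardCharsets.UTF_8));
        startKeys.add("20131216_m".getBytes(StandardCharsets.UTF_8));

        List<byte[]> kept = filterByPrefix(startKeys, "20131216");
        System.out.println("start keys kept: " + kept.size());
    }
}
```

Since configureIncrementalLoad() sets one reduce task per start key it is given, the size of the filtered list is what brings the reducer count down from the whole table to just the new day's regions.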
When I use the unmodified HFileOutputFormat it seems to work. But I don't understand why the problem doesn't happen in other tables (some smaller and some much, much bigger), or why even in that table it happens only every once in a while. Any ideas?

On Mon, Dec 16, 2013 at 4:37 PM, Amit Sela <[email protected]> wrote:

> Loaded regions are listed in .META. table and the ENCODED field in the
> table points to an existing directory. But all family directories in this
> region are empty...
>
> On Mon, Dec 16, 2013 at 4:29 PM, Amit Sela <[email protected]> wrote:
>
>> I ran the hbck tool, and while I do have some inconsistencies they are
>> not in the table that has the bulk load issues.
>>
>> On Mon, Dec 16, 2013 at 4:22 PM, Amit Sela <[email protected]> wrote:
>>
>>> RegionServer logs in the RegionServer that the files are moved to indeed
>>> shows that all files are moved to that region (when it doesn't happen it
>>> shows only 1 file per family moved to a RegionServer)
>>>
>>> On Mon, Dec 16, 2013 at 4:21 PM, Amit Sela <[email protected]> wrote:
>>>
>>>> In the first step, the files are read correctly and regionGroups is
>>>> created as it should.
>>>> When debugging, in LoadIncrementalHFiles.tryAtomicRegionLoad() I notice
>>>> that ServerCallable's regionName returned from server is the wrong region
>>>> (the pre-split last region).
>>>> The previous last region is not supposed to be deleted. I'm just adding new
>>>> regions (always following lexicographically) so that the last region before
>>>> the pre-split is not the last anymore.
>>>> It seems that wherever the ServerCallable is running, it is not updated
>>>> with the new regions... I tried major compacting (the new regions) after
>>>> pre-split and before the bulkload, but that didn't help.
>>>>
>>>> On Mon, Dec 16, 2013 at 3:07 PM, Bijieshan <[email protected]> wrote:
>>>>
>>>>> As we know, bulk load has two steps:
>>>>> 1. Create HFiles by MapReduce.
>>>>> 2. Load HFiles into HBase.
>>>>>
>>>>> I wonder whether it read the right partitions information during the
>>>>> first step. Have you run hbck tool to check the cluster healthy?
>>>>> You mentioned you see the new regions in the webapp. The files were
>>>>> moved to the previous old region indicated the old region directory was
>>>>> still there. So you started bulk load just after region split? (Old region
>>>>> directory will be deleted soon by CatalogJanitor after region-split once
>>>>> compaction finished)
>>>>>
>>>>> I suggest to check the regionserver logs.
>>>>>
>>>>> Jieshan.
>>>>> -----Original Message-----
>>>>> From: Amit Sela [mailto:[email protected]]
>>>>> Sent: Monday, December 16, 2013 2:29 PM
>>>>> To: [email protected]
>>>>> Subject: RE: Bulk load moving HFiles to the wrong region
>>>>>
>>>>> Every split executed is a new day. The row key design is yyyyMMdd_URL.
>>>>> And the split points are yyyyMMdd_x, yyyyMMdd_y etc. In a way that the
>>>>> entire load is (almost) evenly spread.
>>>>> The problem I described causes the bulk load to load all files to
>>>>> the last region of the previous day.
>>>>> Thanks.
>>>>> On Dec 16, 2013 3:43 AM, "Bijieshan" <[email protected]> wrote:
>>>>>
>>>>> > Hi Amit:
>>>>> > Can you provide the split-keys of the new regions and your row-key design?
>>>>> >
>>>>> > Thank you.
>>>>> > Jieshan.
>>>>> > -----Original Message-----
>>>>> > From: Amit Sela [mailto:[email protected]]
>>>>> > Sent: Monday, December 16, 2013 7:09 AM
>>>>> > To: [email protected]
>>>>> > Subject: Bulk load moving HFiles to the wrong region
>>>>> >
>>>>> > Hi all,
>>>>> > I'm using Hadoop 1.0.4 and HBase 0.94.12.
>>>>> > When trying to bulk load using the Java API I sometimes get the HFiles
>>>>> > moved to the wrong directory.
>>>>> > I'm pre-splitting regions and the new regions are always the last
>>>>> > (lexicographically), so when this happens all files move to the last
>>>>> > region pre-split. But the split does work. I see the new regions in
>>>>> > the webapp before bulk load executes. Once a table has this problem
>>>>> > (not all the time) it keeps on until I restart HBase.
>>>>> >
>>>>> > Anyone seen something similar?
>>>>> >
>>>>> > Thanks,
>>>>> > Amit.
