The logs on the RegionServer that the files are moved to do show that all files are moved to that one region. (When the problem doesn't happen, the log shows only 1 file per family moved to a RegionServer.)
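
To illustrate the symptom, here is a minimal sketch in plain Java. It is not HBase code, and the class, method and region names are all hypothetical. It assumes the client picks the region whose start key is the greatest one not exceeding the row key, and that the view it consults may predate the pre-split. Whether a stale view is really the cause here is only a suspicion. Under those assumptions, every key of the new day lands in the previous last region:

```java
import java.util.Map;
import java.util.TreeMap;

// Illustration only, not HBase code. All names here are hypothetical.
class RegionLookupSketch {

    // Returns the region whose start key is the greatest key <= rowKey.
    // The last region has no upper bound, so it absorbs every later key.
    // String ordering matches byte ordering only for ASCII keys like these.
    static String locate(TreeMap<String, String> regionsByStartKey, String rowKey) {
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(rowKey);
        if (entry == null) {
            throw new IllegalStateException("no region covers key: " + rowKey);
        }
        return entry.getValue();
    }

    // Region boundaries as they were before the new day's pre-split.
    static TreeMap<String, String> staleView() {
        TreeMap<String, String> regions = new TreeMap<>();
        regions.put("", "first-region");              // empty start key
        regions.put("20131215_m", "20131215-last");  // previous last region
        return regions;
    }

    // The same table after adding the new day's regions at the end.
    static TreeMap<String, String> freshView() {
        TreeMap<String, String> regions = staleView();
        regions.put("20131216_", "20131216-part1");
        regions.put("20131216_m", "20131216-part2");
        return regions;
    }

    public static void main(String[] args) {
        String[] rowKeys = {"20131216_example.com/a", "20131216_zzz.org/b"};
        for (String key : rowKeys) {
            System.out.println(key
                + " stale=" + locate(staleView(), key)
                + " fresh=" + locate(freshView(), key));
        }
    }
}
```

With the stale view both sample keys resolve to the previous day's last region, while the fresh view spreads them over the two new regions.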

On Mon, Dec 16, 2013 at 4:21 PM, Amit Sela <[email protected]> wrote:

> In the first step, the files are read correctly and regionGroups is
> created as it should.
> When debugging, in LoadIncrementalHFiles.tryAtomicRegionLoad() I notice
> that ServerCallable's regionName returned from server is the wrong region
> (the pre-split last region).
> The previous last region is not supposed to be deleted; I'm just adding new
> regions (always following lexicographically) so that the last region before
> the pre-split is not the last anymore.
> It seems that wherever the ServerCallable is running, it is not updated
> with the new regions... I tried major compacting (the new regions) after
> pre-split and before the bulkload, but that didn't help.
>
> On Mon, Dec 16, 2013 at 3:07 PM, Bijieshan <[email protected]> wrote:
>
>> As we know, bulk load has two steps:
>> 1. Create HFiles by MapReduce.
>> 2. Load HFiles into HBase.
>>
>> I wonder whether it read the right partition information during the
>> first step. Have you run the hbck tool to check the cluster health?
>> You mentioned you see the new regions in the webapp. The files being moved
>> to the previous old region indicates the old region directory was still
>> there. So you started the bulk load just after the region split? (The old
>> region directory will be deleted soon by CatalogJanitor after the region
>> split, once compaction has finished.)
>>
>> I suggest checking the regionserver logs.
>>
>> Jieshan.
>> -----Original Message-----
>> From: Amit Sela [mailto:[email protected]]
>> Sent: Monday, December 16, 2013 2:29 PM
>> To: [email protected]
>> Subject: RE: Bulk load moving HFiles to the wrong region
>>
>> Every split executed is a new day. The row key design is yyyyMMdd_URL.
>> And the split points are yyyyMMdd_x, yyyyMMdd_y etc., in a way that the
>> entire load is (almost) evenly spread.
>> The problem I described causes the bulk load to load all files to the
>> last region of the previous day.
>> Thanks.
>>
>> On Dec 16, 2013 3:43 AM, "Bijieshan" <[email protected]> wrote:
>>
>> > Hi Amit:
>> > Can you provide the split-keys of the new regions and your row-key
>> > design?
>> >
>> > Thank you.
>> > Jieshan.
>> > -----Original Message-----
>> > From: Amit Sela [mailto:[email protected]]
>> > Sent: Monday, December 16, 2013 7:09 AM
>> > To: [email protected]
>> > Subject: Bulk load moving HFiles to the wrong region
>> >
>> > Hi all,
>> > I'm using Hadoop 1.0.4 and HBase 0.94.12.
>> > When trying to bulk load using the Java API I sometimes get the HFiles
>> > moved to the wrong directory.
>> > I'm pre-splitting regions and the new regions are always the last
>> > (lexicographically), so when this happens all files move to the last
>> > region pre-split. But the split does work. I see the new regions in
>> > the webapp before bulk load executes. Once a table has this problem
>> > (not all the time) it keeps on until I restart HBase.
>> >
>> > Has anyone seen something similar?
>> >
>> > Thanks,
>> > Amit.
