I ran the hbck tool. It does report some inconsistencies, but none of them are in the table that has the bulk load issue.
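
To make the symptom concrete, here is a minimal sketch of how a row key is matched to a region by start key. It is plain Java, not HBase code: the class `RegionLookupSketch`, the method `regionFor`, and the split points used with it are all made up for illustration, and it only assumes the yyyyMMdd_URL key design described in this thread. Whether the client really works from an outdated list of region boundaries is an assumption, not something confirmed here.

```java
import java.util.List;
import java.util.TreeSet;

// Illustration only, not HBase code. Models a table as a sorted set of
// region start keys and looks up the region that would hold a row key.
class RegionLookupSketch {

    // Returns the start key of the region containing rowKey, i.e. the
    // greatest start key that is <= rowKey. Java String ordering matches
    // byte ordering for the ASCII keys used in this example.
    // The first region of a table has the empty start key "", so a list
    // that includes "" always yields a non-null result.
    static String regionFor(String rowKey, List<String> startKeys) {
        TreeSet<String> sorted = new TreeSet<>(startKeys);
        return sorted.floor(rowKey);
    }
}
```

With hypothetical boundaries `["", "20131215_h", "20131215_p"]` (before the new day's pre-split), the key `20131216_http://example.com/a` resolves to the region starting at `20131215_p`, which is the last region of the previous day. With `20131216_h` and `20131216_p` added to the list, the same key resolves to `20131216_h`. That matches what I see: all files end up in the region that was last before the pre-split.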

On Mon, Dec 16, 2013 at 4:22 PM, Amit Sela <[email protected]> wrote:

> The logs on the RegionServer that the files are moved to do indeed
> show that all files are moved to that region (when the problem doesn't
> happen, they show only 1 file per family moved to a RegionServer).
>
>
> On Mon, Dec 16, 2013 at 4:21 PM, Amit Sela <[email protected]> wrote:
>
>> In the first step, the files are read correctly and regionGroups is
>> created as it should be.
>> When debugging, in LoadIncrementalHFiles.tryAtomicRegionLoad() I notice
>> that the ServerCallable's regionName returned from the server is the
>> wrong region (the last region from before the pre-split).
>> The previous last region is not supposed to be deleted. I'm just adding
>> new regions (always following lexicographically), so that the last region
>> before the pre-split is not the last anymore.
>> It seems that wherever the ServerCallable is running, it is not updated
>> with the new regions... I tried major compacting (the new regions) after
>> the pre-split and before the bulk load, but that didn't help.
>>
>>
>>
>> On Mon, Dec 16, 2013 at 3:07 PM, Bijieshan <[email protected]> wrote:
>>
>>> As we know, bulk load has two steps:
>>> 1. Create HFiles by MapReduce.
>>> 2. Load HFiles into HBase.
>>>
>>> I wonder whether it read the right partition information during the
>>> first step. Have you run the hbck tool to check the cluster health?
>>> You mentioned you see the new regions in the webapp. The files being
>>> moved to the previous old region indicates the old region directory was
>>> still there. So you started the bulk load just after the region split?
>>> (The old region directory will be deleted by CatalogJanitor soon after
>>> the region split, once compaction has finished.)
>>>
>>> I suggest checking the regionserver logs.
>>>
>>> Jieshan.
>>> -----Original Message-----
>>> From: Amit Sela [mailto:[email protected]]
>>> Sent: Monday, December 16, 2013 2:29 PM
>>> To: [email protected]
>>> Subject: RE: Bulk load moving HFiles to the wrong region
>>>
>>> Every split executed is a new day. The row key design is yyyyMMdd_URL.
>>> And the split points are yyyyMMdd_x, yyyyMMdd_y etc., chosen so that the
>>> entire load is (almost) evenly spread.
>>> The problem I described causes the bulk load to load all files to the
>>> last region of the previous day.
>>> Thanks.
>>> On Dec 16, 2013 3:43 AM, "Bijieshan" <[email protected]> wrote:
>>>
>>> > Hi Amit:
>>> > Can you provide the split-keys of the new regions and your row-key
>>> > design?
>>> >
>>> > Thank you.
>>> > Jieshan.
>>> > -----Original Message-----
>>> > From: Amit Sela [mailto:[email protected]]
>>> > Sent: Monday, December 16, 2013 7:09 AM
>>> > To: [email protected]
>>> > Subject: Bulk load moving HFiles to the wrong region
>>> >
>>> > Hi all,
>>> > I'm using Hadoop 1.0.4 and HBase 0.94.12.
>>> > When trying to bulk load using the Java API I sometimes get the HFiles
>>> > moved to the wrong directory.
>>> > I'm pre-splitting regions and the new regions are always the last
>>> > (lexicographically), so when this happens all files move to the region
>>> > that was last before the pre-split. But the split does work. I see the
>>> > new regions in the webapp before the bulk load executes. Once a table
>>> > has this problem (it does not happen every time) it keeps on until I
>>> > restart HBase.
>>> >
>>> > Has anyone seen something similar?
>>> >
>>> > Thanks,
>>> > Amit.
