I'm having a problem writing HFiles for a bulk load: the job always fails with an IOException saying "Wrong number of partitions in keyset".
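
For context, my driver boils down to roughly the sketch below. The mapper, the input/output paths and the column family/qualifier are simplified placeholders (my real mapper does more), but the configureIncrementalLoad() part is what I actually call:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class BulkLoadDriver {

        // Placeholder for my real mapper: emits row key -> KeyValue for each input line
        static class MyMapper
                extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
            @Override
            protected void map(LongWritable offset, Text line, Context ctx)
                    throws IOException, InterruptedException {
                byte[] row = Bytes.toBytes(line.toString().split(",")[0]);
                KeyValue kv = new KeyValue(row, Bytes.toBytes("cf"), Bytes.toBytes("col"),
                        Bytes.toBytes(line.toString()));
                ctx.write(new ImmutableBytesWritable(row), kv);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = new Job(conf, "hfile-bulk-load");
            job.setJarByClass(BulkLoadDriver.class);
            job.setMapperClass(MyMapper.class);
            job.setMapOutputKeyClass(ImmutableBytesWritable.class);
            job.setMapOutputValueClass(KeyValue.class);

            FileInputFormat.addInputPath(job, new Path("/user/david/input"));    // placeholder
            FileOutputFormat.setOutputPath(job, new Path("/user/david/hfiles")); // placeholder

            // Sets the reducer, the partitioner and numReduceTasks = number of regions
            HTable table = new HTable(conf, "myTable");
            HFileOutputFormat.configureIncrementalLoad(job, table);

            System.out.println("reducers: " + job.getNumReduceTasks()); // prints 2 here
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
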
I'm using configureIncrementalLoad() on my CDH4 VMware image: hbase-0.94.2+202-1.cdh4.2.0.p0.11.el6.x86_64 and hadoop-2.0.0+922-1.cdh4.2.0.p0.12.el6.x86_64. I've modified my configuration to run in pseudo-distributed mode, but I find it strange that mapred.job.tracker still returns "local". Is this an issue?

My table contains 2 regions according to http://localhost:60010/master-status/myTable. After I call configureIncrementalLoad(), job.getNumReduceTasks() returns 2, yet when I watch job.getNumReduceTasks() inside job.waitForCompletion(), it returns 1. Finally, splitPoints.length is set to 1 (since I have 2 regions, I would think a single split point is correct, right?). The check that throws the exception (it comes from TotalOrderPartitioner, as far as I can tell) is:

    K[] splitPoints = readPartitions(fs, partFile, keyClass, conf);
    if (splitPoints.length != job.getNumReduceTasks() - 1) {
      throw new IOException("Wrong number of partitions in keyset");
    }

I've tried to look into the partitions.lst file to see if I could spot something wrong, but it doesn't seem to be very human readable; the way I try to dump it is sketched below.
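
This is roughly how I try to read it. I'm assuming the partitions file is a SequenceFile of ImmutableBytesWritable split keys with NullWritable values (that's an assumption on my part, I haven't verified it in the source); the path is passed on the command line:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.SequenceFile;

    public class DumpPartitions {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Path partFile = new Path(args[0]); // path to the partitions file
            SequenceFile.Reader reader =
                    new SequenceFile.Reader(FileSystem.get(conf), partFile, conf);
            ImmutableBytesWritable key = new ImmutableBytesWritable();
            NullWritable value = NullWritable.get();
            // Each record should be one split point (a region boundary row key)
            while (reader.next(key, value)) {
                System.out.println(Bytes.toStringBinary(key.get()));
            }
            reader.close();
        }
    }

I'd appreciate any pointers!

Cheers,
David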
