Doh! Thanks Dave and JD. I'll update the RefGuide with this fact.
On 4/26/12 6:32 PM, "Jean-Daniel Cryans" <[email protected]> wrote:

>Yep, same old problem that has been asked a bunch of times on the user list :)
>
>On Thu, Apr 26, 2012 at 3:29 PM, Dave Revell <[email protected]> wrote:
>> Hi Doug,
>>
>> When I hit this problem, I concluded that HFileOutputFormat cannot be
>> used in standalone mode since it requires DistributedCache, which
>> doesn't work with the local job runner.
>>
>> So you're not the only one :(
>>
>> -Dave
>>
>> On Thu, Apr 26, 2012 at 1:52 PM, Doug Meil <[email protected]> wrote:
>>
>>> Hi Devs-
>>>
>>> I'm coding up a local bulk-loading example for the RefGuide but I've
>>> been banging my head on this…
>>>
>>> WARN [Thread-8] (LocalJobRunner.java:295) - job_local_0001
>>> java.lang.IllegalArgumentException: Can't read partitions file
>>>     at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
>>>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>>>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:552)
>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:315)
>>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>>> Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.
>>>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:372)
>>>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>>>     at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:751)
>>>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>>>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>>>     at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.readPartitions(TotalOrderPartitioner.java:296)
>>>     at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:82)
>>>
>>> … does bulk loading work with the local job runner? Obviously, you're
>>> not going to run a production cluster off your laptop, but it's nice
>>> to at least be able to test your code.
>>>
>>> I know the DistributedCache doesn't work with the LocalJobRunner (and
>>> TotalOrderPartitioner uses the DistributedCache), and then there's
>>> this log message..
>>>
>>> WARN [main] (LocalJobRunner.java:134) - LocalJobRunner does not
>>> support symlinking into current working dir.
>>>
>>> … so I'm wondering how this actually works, if it does work locally.
>>>
>>> Coincidentally, this exact error is in the troubleshooting chapter..
>>>
>>> http://hbase.apache.org/book.html#trouble.mapreduce
>>>
>>> … but it came up in a different context. In that case, the person
>>> asking the question thought he was running remotely, but he was really
>>> running locally.
>>>
>>> Doug Meil
>>> Chief Software Architect, Explorys
>>> [email protected]
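For context, a minimal sketch of the kind of bulk-load job setup the thread is discussing, against the 0.92-era HBase API. This is not code from the thread: the table name, paths, and mapper are illustrative placeholders, and the job needs a live HBase cluster (not LocalJobRunner) to run, for exactly the reason discussed above.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadJobSketch {

  // Placeholder mapper: a real one would parse each input line and emit
  // the row key plus a Put (or KeyValue) for the target table.
  static class BulkLoadMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // parse `value` and context.write(rowKey, put) here
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "bulk-load-example");
    job.setJarByClass(BulkLoadJobSketch.class);
    job.setMapperClass(BulkLoadMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));   // input data
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HFile output dir

    // configureIncrementalLoad() inspects the table's region boundaries,
    // writes the _partition.lst file, sets TotalOrderPartitioner, and
    // registers the partitions file via the DistributedCache -- the step
    // that fails under LocalJobRunner, per the stack trace above.
    HFileOutputFormat.configureIncrementalLoad(
        job, new HTable(conf, "myTable")); // "myTable" is a placeholder

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

After the job completes, the generated HFiles would typically be handed to the `completebulkload` tool (LoadIncrementalHFiles) to move them into the table's regions.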
