From this jira it was fixed in 0.21.0: https://issues.apache.org/jira/browse/MAPREDUCE-476
I know CDH has it patched in, not sure about the others.

J-D

On Thu, Aug 11, 2011 at 1:28 AM, Ophir Cohen <[email protected]> wrote:
> I did some more tests and found the problem: on a local run the distributed
> cache does not work.
>
> On a full cluster it works.
> Sorry for your time...
> Ophir
>
> PS
> Is there any way to use the distributed cache locally as well (i.e. when I'm
> running MR from IntelliJ IDEA)?
>
> On Thu, Aug 11, 2011 at 11:20, Ophir Cohen <[email protected]> wrote:
>
>> Now I see that it uses the distributed cache - but for some reason
>> the TotalOrderPartitioner does not pick it up.
>> Ophir
>>
>>
>> On Thu, Aug 11, 2011 at 11:08, Ophir Cohen <[email protected]> wrote:
>>
>>> Hi,
>>> I started to use bulk upload and encountered a strange problem.
>>> I'm using Cloudera cdh3-u1.
>>>
>>> I'm using HFileOutputFormat.configureIncrementalLoad() to configure my
>>> job. This method creates a partition file for the TotalOrderPartitioner
>>> and saves it to HDFS.
>>>
>>> When the TotalOrderPartitioner is initiated, it tries to find the path
>>> to the file in the configuration:
>>>
>>> public static String getPartitionFile(Configuration conf) {
>>>   return conf.get(PARTITIONER_PATH, DEFAULT_PATH);
>>> }
>>>
>>> The strange thing is that this parameter is never assigned!
>>> It looks to me like it should have been configured
>>> in HFileOutputFormat.configureIncrementalLoad(), but it is not.
>>>
>>> Then it takes the default ("_part" or something similar) and (of course)
>>> does not find it...
>>>
>>> BTW
>>> When I manually add this parameter it works great.
>>>
>>> Is that a bug, or am I missing something?
>>> Thanks,
>>> Ophir
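For anyone hitting the same symptom: the lookup Ophir quotes is just a `conf.get(key, default)` call, so if nothing ever sets the key, the partitioner silently falls back to a default relative path and fails to find the real partition file. A minimal plain-Java sketch of that failure mode and of the manual workaround Ophir describes (`java.util.Properties` stands in for Hadoop's `Configuration` here, and the key/default values mirror stock Hadoop's `TotalOrderPartitioner` — your CDH3 copy may use slightly different constants, so treat them as assumptions):

```java
import java.util.Properties;

public class PartitionPathLookup {
    // These mirror the constants behind the quoted getPartitionFile();
    // the exact strings may differ in the CDH3 / HBase-bundled copy.
    static final String PARTITIONER_PATH = "total.order.partitioner.path";
    static final String DEFAULT_PATH = "_partition.lst";

    // Same shape as the quoted snippet: return the configured path,
    // or fall back to the default if the key was never set.
    static String getPartitionFile(Properties conf) {
        return conf.getProperty(PARTITIONER_PATH, DEFAULT_PATH);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();

        // Failure mode: nobody set the key, so the partitioner looks
        // for the default relative path and never finds the file.
        System.out.println(getPartitionFile(conf)); // falls back to default

        // Ophir's workaround: set the parameter explicitly, pointing at
        // the partition file that configureIncrementalLoad() wrote.
        conf.setProperty(PARTITIONER_PATH, "/tmp/partitions_12345");
        System.out.println(getPartitionFile(conf)); // now returns the real path
    }
}
```

In real job-setup code the equivalent of the second step is setting the partitioner's path property on the job's `Configuration` (or calling the partitioner's `setPartitionFile`-style helper) before the job is submitted.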
