I'm trying to access a file that I passed with the -files option in my hadoop jar command.
In my output format, I am doing something like:

Path[] cacheFiles = DistributedCache.getLocalCacheFiles(conf);
String file1 = "";
String file2 = "";
Path pt = null;
for (Path p : cacheFiles) {
    if (p != null) {
        if (p.getName().endsWith(".ryp")) {
            file1 = p.getName();
        } else if (p.getName().endsWith(".cpt")) {
            file2 = p.getName();
            pt = p;
        }
    }
}

// then read the file, which gives a "file does not exist" exception:
Path pat = new Path(file2);
BufferedReader reader = null;
try {
    FileSystem fs = FileSystem.get(conf);
    reader = new BufferedReader(new InputStreamReader(fs.open(pat)));
    String line = null;
    while ((line = reader.readLine()) != null) {
        System.out.println("Now parsing the line: " + line);
    }
} catch (Exception e) {
    System.out.println("exception" + e.getMessage());
}

On Fri, Jul 29, 2011 at 10:50 AM, Alejandro Abdelnur <t...@cloudera.com> wrote:

> Where are you getting the error, in the client submitting the job or in the
> MR tasks?
>
> Are you trying to access a file or trying to set a JAR in the
> DistributedCache?
> How/when are you adding the file/JAR to the DC?
> How are you retrieving the file/JAR from your outputformat code?
>
> Thxs.
>
> Alejandro
>
>
> On Fri, Jul 29, 2011 at 10:43 AM, Mapred Learn <mapred.le...@gmail.com> wrote:
>
>> I am trying to create a custom text outputformat where I want to access a
>> distributed cache file.
>>
>>
>>
>> On Fri, Jul 29, 2011 at 10:42 AM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> Mapred,
>>>
>>> By outputformat, do you mean the frontend, submit-time run of
>>> OutputFormat? Then no, it cannot access the distributed cache because
>>> it's not really set up at that point, and the front end doesn't really
>>> need the distributed cache when it can access those files directly.
>>>
>>> Could you describe in a bit more depth what you're attempting to do?
>>>
>>> On Fri, Jul 29, 2011 at 10:57 PM, Mapred Learn <mapred.le...@gmail.com>
>>> wrote:
>>> > Hi,
>>> > I am trying to access distributed cache in my custom output format but
>>> > it does not work and file open in custom output format fails with file
>>> > does not exist even though it physically does.
>>> >
>>> > Looks like distributed cache only works for Mappers and Reducers ?
>>> >
>>> > Is there a way I can read Distributed Cache in my custom output format ?
>>> >
>>> > Thanks,
>>> > -JJ
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>
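
For comparison, here is a minimal sketch of reading a localized cache file straight from the local disk. It rests on two things about the Hadoop API: DistributedCache.getLocalCacheFiles(conf) returns paths on the task node's local disk, while FileSystem.get(conf) returns the default file system, which on a cluster is usually HDFS (FileSystem.getLocal(conf) is the local counterpart). Whether that mismatch is the cause of the "file does not exist" error in this thread is an assumption, not something confirmed here. The sketch uses plain java.nio so it runs without Hadoop on the classpath; the class name LocalCacheFileReader and its two methods are hypothetical, and the temp directory only stands in for the localized cache directory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: picks a localized cache file by
// suffix and reads it from the local disk.
class LocalCacheFileReader {

    // Returns the first non-null path whose file name ends with the suffix,
    // or null if there is none. The full path is returned (not just the file
    // name) so the caller can still locate the file when opening it.
    public static Path findBySuffix(List<Path> localCacheFiles, String suffix) {
        for (Path p : localCacheFiles) {
            if (p != null && p.getFileName() != null
                    && p.getFileName().toString().endsWith(suffix)) {
                return p;
            }
        }
        return null;
    }

    // Reads every line of a file on the local file system.
    public static List<String> readLines(Path localFile) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(localFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temp directory stands in for the task's cache directory.
        Path dir = Files.createTempDirectory("cache");
        Path ryp = Files.write(dir.resolve("sample.ryp"), Arrays.asList("ignored"));
        Path cpt = Files.write(dir.resolve("sample.cpt"), Arrays.asList("first", "second"));

        Path found = findBySuffix(Arrays.asList(ryp, cpt), ".cpt");
        for (String line : readLines(found)) {
            System.out.println("Now parsing the line: " + line);
        }
    }
}
```

In the Hadoop code above, the analogous change would be to keep the full Path (pt) rather than only p.getName(), and to open it through FileSystem.getLocal(conf) instead of FileSystem.get(conf). This only applies where the cache has been localized, that is, inside a running task.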