I am trying to download binary files stored in Hadoop, but there is roughly a
2 minute wait on a 20 MB file when I execute in.read(buf).

Is there a better way to do this?

    private void pipe(InputStream in, OutputStream out) throws IOException
    {
        System.out.println(System.currentTimeMillis()+" Starting to Pipe Data");
        byte[] buf = new byte[1024];
        int read = 0;
        while ((read = in.read(buf)) >= 0)
        {
            out.write(buf, 0, read);
            System.out.println(System.currentTimeMillis()+" Piping Data");
        }
        out.flush();
        System.out.println(System.currentTimeMillis()+" Finished Piping Data");
    }

    public void readFile(String fileToRead, OutputStream out)
            throws IOException
    {
        System.out.println(System.currentTimeMillis()+" Start Read File");
        Path inFile = new Path(fileToRead);
        System.out.println(System.currentTimeMillis()+" Set Path");
        // Validate the input/output paths before reading/writing.

        if (!fs.exists(inFile))
        {
            throw new HadoopFileException("Specified file " + fileToRead
                    + " not found.");
        }
        if (!fs.isFile(inFile))
        {
            throw new HadoopFileException("Specified path " + fileToRead
                    + " is not a file.");
        }
        // Open inFile for reading.
        System.out.println(System.currentTimeMillis()+" Opening Data Stream");
        FSDataInputStream in = fs.open(inFile);

        System.out.println(System.currentTimeMillis()+" Opened Data Stream");
        // Open outFile for writing.

        // Read from input stream and write to output stream until EOF.
        pipe(in, out);

        // Close the streams when done.
        out.close();
        in.close();
    }
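
One variation that may be worth testing is the same copy loop with a larger buffer and no println per chunk. With a 1024-byte buffer, a 20 MB file takes on the order of 20,000 loop iterations, each with its own console write. This is only a minimal sketch under assumptions: the class name PipeSketch and the 64 KB buffer size are hypothetical, it uses plain java.io streams instead of an FSDataInputStream so it runs on its own, and it is not confirmed that the buffer size or the logging is what causes the delay (the wait could just as well come from the cluster or the network).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

public class PipeSketch {
    // Hypothetical buffer size: 64 KB is a starting point to tune, not a measured optimum.
    static final int BUFFER_SIZE = 64 * 1024;

    // Copies everything from in to out until EOF and returns the number of bytes copied.
    // Logging happens in the caller, once, instead of once per chunk.
    public static long pipe(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = in.read(buf)) >= 0) {
            out.write(buf, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data; in the real code the input would come from fs.open(inFile).
        byte[] data = new byte[200000];
        Arrays.fill(data, (byte) 7);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        long start = System.currentTimeMillis();
        long copied = pipe(new ByteArrayInputStream(data), sink);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("Copied " + copied + " bytes in " + elapsed + " ms");
    }
}
```

Timing the copy once, around the whole loop, should show whether the time is spent in the reads themselves or in the per-chunk printing. Hadoop also ships org.apache.hadoop.io.IOUtils.copyBytes(in, out, bufferSize, close), which could replace the hand-written loop.
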
Ananth T Sarathy
