It looks like this will do the trick...
_____
FileSystem fs = etc.
FSDataInputStream dataInputStream = fs.open(firstInputPath);
DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
DataFileStream<GenericRecord> dataFileStream =
    new DataFileStream<GenericRecord>(dataInputStream, reader);
Schema s = dataFileStream.getSchema();
_____
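
For completeness, here is the same idea spelled out as a self-contained sketch. The AvroSchemaUtil class and readSchema method are just placeholder names, and I'm assuming you can get the FileSystem from the Path itself via Path.getFileSystem(conf) rather than hard-coding the default filesystem:

_____
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AvroSchemaUtil {

    // Reads the writer's schema from the header of an Avro data file on HDFS.
    public static Schema readSchema(Path avroPath, Configuration conf) throws IOException {
        // Resolve the FileSystem that owns this Path (hdfs://, file://, etc.)
        FileSystem fs = avroPath.getFileSystem(conf);
        FSDataInputStream dataInputStream = fs.open(avroPath);
        DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
        DataFileStream<GenericRecord> dataFileStream =
            new DataFileStream<GenericRecord>(dataInputStream, reader);
        try {
            // The schema lives in the file header, so no records need to be read.
            return dataFileStream.getSchema();
        } finally {
            dataFileStream.close();
            dataInputStream.close();
        }
    }
}
_____

Then something like Schema s = AvroSchemaUtil.readSchema(firstInputPath, new Configuration()); should do it from driver code.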
On Mon, Feb 27, 2012 at 3:41 PM, David B. Martin <[email protected]> wrote:
> I've been getting my feet wet writing pseudo-distributed code. In
> that environment:
>
> _____
>
> File file = new File(input);
> DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
> DataFileReader<GenericRecord> dataFileReader =
>     new DataFileReader<GenericRecord>(file, reader);
>
> Schema s = dataFileReader.getSchema();
> _____
>
> Something like this works just fine. Now I have the schema of
> my input and am ready for real action.
>
> But on HDFS, I have to work in terms of Path instances instead of
> File instances. Right? I can't figure out how to perform the above
> operation when my inputs are of type org.apache.hadoop.fs.Path and
> not java.io.File.
>
> Dave