On Apr 12, 2010, at 6:13 PM, Lurga wrote:

> Hello,
> I created a "Person" record (3 fields: first, last, age) and an "Extract" 
> record (2 fields: first, last). I then used "Person" to write some objects 
> to a file. When I use "Extract" to read the data back from the file, I get 
> an exception: 
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -14.
> It seems that GenericDatumReader.readRecord doesn't skip the trailing 
> fields. How can I read the data correctly?
> 
> My code is below:
> public void browseName() throws IOException {
>  List<Field> fields = new ArrayList<Field>();
>  fields.add(new Field("First", Schema.create(Type.STRING), null, null));
>  fields.add(new Field("Last", Schema.create(Type.STRING), null, null)); 
>  Schema extractSchema = Schema.createRecord(fields);
>  DataFileReader<Record> reader = new DataFileReader<Record>(new File(
>    fileName), new GenericDatumReader<Record>(extractSchema));
>  try {
>    while (reader.hasNext()) {
>      Record person = reader.next();
>      System.out.print(person.get("First").toString() + " "
>          + person.get("Last").toString() + "\t");
>    }
>  } finally {
>    reader.close();
>  }
> }
> 

Try configuring the 'expected' schema.

The schema you create above is the expected (reader's) schema, but by passing 
it to the GenericDatumReader constructor you are setting the actual 'data' 
(writer's) schema.

See
GenericDatumReader.setExpected(Schema expected);

It looks like the above needs javadoc improvement.

setSchema() sets the schema of the data being read (what is in the file).  The 
DataFileReader calls setSchema() itself with the schema it finds in the file 
(overwriting what you passed in).  The expected schema, however, is something 
you have to set yourself.

Try something like:

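// setExpected() is declared on GenericDatumReader, so declare the variable
// with that type (and with the Record type parameter instead of a raw type).
GenericDatumReader<Record> dr = new GenericDatumReader<Record>();
dr.setExpected(extractSchema);  // reader's schema; the file supplies the writer's
DataFileReader<Record> reader = new DataFileReader<Record>(new File(
  fileName), dr);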
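
To illustrate why the reader needs both schemas, here is a toy sketch in plain 
Java. It is not Avro code and does not use the Avro API: the class 
ProjectionSketch, its write() and read() helpers, and the field name "Age" are 
all made up for illustration. It assumes only that records are stored as values 
in the writer's field order with no per-field tags, so the reader has to walk 
the actual (writer's) field list and skip whatever the expected list does not 
ask for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProjectionSketch {

    // Toy "encoding": values only, in the writer's field order, no field names.
    static List<Object> write(List<String> writerFields, Map<String, Object> record) {
        List<Object> out = new ArrayList<>();
        for (String field : writerFields) {
            out.add(record.get(field));
        }
        return out;
    }

    // Walk the actual (writer's) field order, consume every value, and keep
    // only the fields that also appear in the expected (reader's) list.
    static Map<String, Object> read(List<Object> data,
                                    List<String> actual,
                                    List<String> expected) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < actual.size(); i++) {
            String name = actual.get(i);
            Object value = data.get(i);  // consumed whether wanted or not
            if (expected.contains(name)) {
                result.put(name, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> person = Arrays.asList("First", "Last", "Age");  // actual
        List<String> extract = Arrays.asList("First", "Last");        // expected

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("First", "Ada");
        record.put("Last", "Lovelace");
        record.put("Age", 36);

        List<Object> data = write(person, record);
        System.out.println(read(data, person, extract));
    }
}
```

If read() were handed only the two-field list as the actual layout, it would 
stop after two values and leave the third unconsumed, so the next record would 
start in the wrong place. That is roughly the situation when the extract schema 
is treated as the data schema.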


> Regards,
> 
> 2010-04-13 
> 
> 
> 
> Lurga 
> 
