Thanks for the reply, Doug.

Out of curiosity, is recording sync markers while writing the file and
then passing those markers to the readers not a good way to achieve
random access in Avro? At least that was my understanding from reading
the javadoc [1], which could be flawed.

[1]
http://avro.apache.org/docs/1.3.3/api/java/org/apache/avro/file/DataFileWriter.html#sync()
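
To make the idea concrete, here is a minimal sketch of what I have in
mind. It deliberately uses plain java.io rather than Avro, and the names
PositionIndexSketch, append and lookup are made up for illustration. In
Avro itself I would expect the saved position to come from
DataFileWriter.sync() and to be handed to DataFileReader.seek(long), if I
am reading the javadoc correctly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a data file plus a side table of positions.
// This is NOT the Avro API; it only illustrates "remember the position
// at write time, seek to it at read time".
class PositionIndexSketch {
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(bytes);
    // employee id -> byte offset where that record starts
    private final Map<Integer, Long> positions = new HashMap<>();

    // Append one record and remember the offset at which it begins.
    void append(int employeeId, String name) {
        try {
            positions.put(employeeId, (long) out.size());
            out.writeInt(employeeId);
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Jump straight to the saved offset and read a single record.
    // Returns null when no position was recorded for the id.
    String lookup(int employeeId) {
        Long pos = positions.get(employeeId);
        if (pos == null) {
            return null;
        }
        byte[] data = bytes.toByteArray();
        int offset = pos.intValue();
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(data, offset, data.length - offset))) {
            in.readInt();          // skip the stored employee id
            return in.readUTF();   // the record's name field
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PositionIndexSketch file = new PositionIndexSketch();
        file.append(7, "Ann");
        file.append(100, "Bob");
        System.out.println(file.lookup(100));
    }
}
```

The obvious cost is that the position table has to be stored somewhere
alongside the data and kept consistent with it.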


On Mon, Jul 1, 2013 at 12:05 PM, Doug Cutting <[email protected]> wrote:

> Avro data files do not generally support random access.
>
> SortedKeyValueFile supports random access by key.
>
>
> http://avro.apache.org/docs/current/api/java/org/apache/avro/hadoop/file/SortedKeyValueFile.Reader.html
>
> From the documentation:
>
> "The SortedKeyValueFile is a directory with two files, named 'data'
> and 'index'. The 'data' file is an ordinary Avro container file with
> records. Each record has exactly two fields, 'key' and 'value'. The
> keys are sorted lexicographically. The 'index' file is a small Avro
> container file mapping keys in the 'data' file to their byte
> positions. The index file is intended to fit in memory, so it should
> remain small. There is one entry in the index file for each data block
> in the Avro container file."
>
> Doug
>
> On Mon, Jul 1, 2013 at 8:37 AM, [email protected]
> <[email protected]> wrote:
> > Hello,
> >
> > Is it possible to have random access to a record in an Avro file? For
> > instance, say I have an Avro file with a schema containing four fields:
> > employee id, name, address and phone. While reading the file, is there
> > any way to jump directly to the record with employee id 100 instead of
> > having to scan the whole file every time and filter out records?
> >
> > Thanks for the help.
> >
> > --
> > Swarnim
>



-- 
Swarnim
