Please provide any feedback on the start of the writer interface described
in the attached document. It should become a more formalized interface in
the next few days.
-Jason
---------- Forwarded message ----------
From: Jason Altekruse <[email protected]>
Date: Wed, Oct 2, 2013 at 1:12 PM
Subject: Fwd: Writer interface start
To: [email protected]
---------- Forwarded message ----------
From: Jason Altekruse <[email protected]>
Date: Wed, Oct 2, 2013 at 12:31 PM
Subject: Writer interface start
To: Jacques Nadeau <[email protected]>, Ben Becker <[email protected]>,
Steven Phillips <[email protected]>
A quick update on the status of the writer interface. I haven't written it
formally yet, but I have put together a document describing the important
design considerations for the various formats, trying to be as general as
possible. It should be fleshed out in more detail in the next few days.
See attached
-Jason
Considerations for Drill writer interface:
- two major types of output formats
  - column major
    - ORC and RCFile
    - Parquet
    - block-compressed sequence files
  - row major
    - CSV
    - basic sequence files
    - JSON
Considerations for column major:
- much more state heavy for reading/writing
  - two states to manage: the fill level of the Value Vectors, and the
    section of the file already on disk (or in a buffer, ready to be
    written to disk when a complete section of the file is full)
    - at some level the file will be buffered in sections; keeping track
      of when new buffering needs to take place has to happen alongside
      management of the fill level of the Value Vectors
  - keeping track of the number of records processed
  - handling cutoffs for less frequent split points in files
    - with Parquet there is a range of acceptable sizes for each
      sub-component of the file
    - we have to stay within these ranges while working within the Drill
      model of not knowing how much data will come in the next batch
- need to build up larger in-memory structures to be written to disk all
  at once (see the sketch after this list)
  - sizes of the different parts of the file need to be recorded in the
    file-level metadata, as well as the schema (Parquet; might be
    applicable for others)
  - schema changes can result in new file creation, or a refactor of
    previously written batches to include nulls for columns newly
    discovered in later batches
    - this would involve parsing the written part of the file, adding a
      column or columns full of nulls, and re-writing
  - need to balance good file sizes with minimizing re-processing time
- for efficient processing we should always define a translation between
  Drill's columnar in-memory Value Vector structures and the various
  formats
  - we do not want to pull values out of one format and place them
    individually into the other
  - this applies to reading as well as writing
- various compression algorithms
  - value compression: Run Length Encoding (RLE), bit-packing (see the
    Parquet documentation for more specific information on these
    techniques)
  - general compression algorithms: Snappy, gzip, etc.
    - applied to blocks of values, which may be value-compressed already
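To make the buffering point above concrete, here is a minimal sketch (in
Java) of the two pieces of state a column-major writer would have to manage
together: the fill level of the in-memory buffer and the cutoff at which a
complete file section (e.g. a Parquet row group) gets flushed. All names
here (ColumnMajorWriterSketch, TARGET_SECTION_BYTES, flushSection) are made
up for illustration and are not part of any existing Drill or Parquet API.

/**
 * Hypothetical sketch of the state a column-major writer has to juggle:
 * the fill level of the buffered data and the point at which a complete
 * file section (e.g. a Parquet row group) is flushed to disk.
 */
public class ColumnMajorWriterSketch {

  // Illustrative target size for one buffered section of the file.
  private static final long TARGET_SECTION_BYTES = 128 * 1024 * 1024;

  private long bufferedBytes = 0;   // bytes accumulated since the last flush
  private long bufferedRecords = 0; // records accumulated since the last flush

  /**
   * Called once per incoming batch. Drill does not know in advance how much
   * data the next batch will hold, so the cutoff check happens after each one.
   */
  public void writeBatch(long batchBytes, long batchRecords) {
    // Appending the batch to the in-memory section buffer is omitted here.
    bufferedBytes += batchBytes;
    bufferedRecords += batchRecords;

    // Cut a new section once the buffered data reaches the target size.
    if (bufferedBytes >= TARGET_SECTION_BYTES) {
      flushSection();
    }
  }

  /** Write the buffered section to disk and record its size for the metadata. */
  private void flushSection() {
    // Encoding, compression, and footer bookkeeping would happen here.
    bufferedBytes = 0;
    bufferedRecords = 0;
  }
}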
Considerations for row major formats:
- will obviously have to pull individual values out of the Value Vectors
- depending on existing APIs, we might have to create objects to pass into
  existing writer code for a given format
  - should try to avoid new object creation for each record
  - try to reuse objects, or simply handle individual primitives (see the
    sketch below)
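As a small illustration of the object-reuse point above, a row-major writer
could keep one reusable buffer per writer instead of allocating a new object
per record. The class below is a hypothetical sketch (CSV is used only
because it is the simplest example), not existing Drill or library code.

/**
 * Hypothetical sketch of object reuse in a row-major writer: a single
 * StringBuilder is reused for every record rather than allocating a new
 * object per row.
 */
public class RowMajorWriterSketch {

  private final StringBuilder rowBuffer = new StringBuilder();

  /** Formats one record into the reused buffer and returns the CSV line. */
  public String writeRecord(Object[] values) {
    rowBuffer.setLength(0); // reset the buffer instead of allocating a new one
    for (int i = 0; i < values.length; i++) {
      if (i > 0) {
        rowBuffer.append(',');
      }
      rowBuffer.append(values[i]);
    }
    return rowBuffer.toString();
  }
}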
General considerations:
- try to use as much existing code as possible while keeping writing
  efficient
- need to define a syntax for specifying encodings, compression types,
  columns to be written, destination, etc. in the logical and physical
  plans (a rough sketch of what such options might look like follows)
  - this will be very format specific
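One possible shape for those format-specific options is a plain holder
object that the logical/physical plan could carry per writer. The fields
below are purely illustrative assumptions, not a proposed final syntax.

/**
 * Hypothetical holder for format-specific write options carried in the
 * plan; every field name here is illustrative only.
 */
public class WriterOptionsSketch {
  public String format;                   // e.g. "parquet", "csv", "json"
  public String destination;              // output path
  public String compressionCodec;         // e.g. "snappy", "gzip", or null
  public String valueEncoding;            // e.g. "rle", "bit-packed", or null
  public java.util.List<String> columns;  // columns to be written
}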
===============================
current record reader interface
===============================
- the two interfaces involved are RecordReader and SchemaProvider:
public interface SchemaProvider {
  static final org.slf4j.Logger logger =
      org.slf4j.LoggerFactory.getLogger(SchemaProvider.class);

  public Object getSelectionBaseOnName(String tableName);
}
public interface RecordReader {

  /**
   * Configure the RecordReader with the provided schema and the record batch
   * that should be written to.
   *
   * @param output
   *          The place where output for a particular scan should be written.
   *          The record reader is responsible for mutating the set of schema
   *          values for that particular record.
   * @throws ExecutionSetupException
   */
  public abstract void setup(OutputMutator output) throws
      ExecutionSetupException;

  /**
   * Increment record reader forward, writing into the provided output batch.
   *
   * @return The number of additional records added to the output.
   */
  public abstract int next();

  public abstract void cleanup();
}
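For comparison with the reader interface above, the writer interface might
end up looking roughly symmetric. The sketch below is only a guess at the
shape; RecordWriter, the setup arguments, and writeBatch are all assumed
names rather than a settled design (ExecutionSetupException is the same
exception type used by RecordReader above).

public interface RecordWriter {

  /**
   * Configure the RecordWriter with the destination and any format-specific
   * options carried in the plan.
   *
   * @throws ExecutionSetupException
   */
  public abstract void setup(/* destination and format options */) throws
      ExecutionSetupException;

  /**
   * Write the records currently available in the incoming batch.
   *
   * @return The number of records written from the batch.
   */
  public abstract int writeBatch();

  public abstract void cleanup();
}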