Hello,
I just started using Drill and would love to use it as a query engine for a
custom data format.
The data format is actually a set of SQLite files that additionally contain
hierarchical data in some fields (similar to JSON, but stored in a binary
format).
Now, I want to learn how to write a custom storage plugin for this.
Therefore I’d like to know:
1) Is there any documentation / tutorial out there covering this topic?
2) Which existing storage plugin might be a good candidate to look at and learn
from?
3) As SQLite is the underlying storage engine and can therefore handle
pushed-down selections and joins, can I implement partial pushdown, meaning
pushdown for just some fields?
SQLite can help with operations on regular SQLite columns, but it does not
understand the binary blobs that represent the nested structural data. So, can
I tell Drill that pushdown is possible only for some parts of the schema?
4) If starting from an existing plugin, would you recommend starting from the
JDBC plugin, a file-system/json plugin, or any other one?
5) Do I need to specify the schema fully (i.e. the full data types of the
nested blob columns), or is there something like a Drill data type "json" for
which Drill does the inspection itself?
I would appreciate your advice.
Best regards,
Thomas