Edmon,
I see you're working on adding documentation about creating storage
plug-ins.
I was looking into that myself a little while ago, but wasn't able to
continue.
Below are some notes on the detailed requirements I had extracted from
the code. Hopefully they'll be helpful in filling in your documentation
of what's required to create a storage plug-in.
Daniel
--------------------------------------------------------------------------------
Storage Plug-In Notes
Pieces needed for/aspects of a storage plug-in:
* Need storage plug-in configuration class:
- per StoragePluginConfig (abstract class (currently))
- (org.apache.drill.common.logical.StoragePluginConfig)
- public class
- public no-argument constructor? (modulo Jackson/@JsonTypeName?)
- one per storage plug-in type?
- What about Jackson serialization?
- TODO: REFINE: plug-in type name (for ?) defaults to simple class name; can
be specified by some kind of NAME property (caps? any?)
- what are requirements on serializability?
* Need storage plug-in class:
- per StoragePlugin (interface)
- (org.apache.drill.exec.store.StoragePlugin)
- public class (not clearly needed)
- public constructor ...(SomeStoragePluginConfig, DrillbitContext, String),
where:
- SomeStoragePluginConfig is a _specific_ implementation class of
StoragePluginConfig
- multiple storage plug-ins can share one storage plug-in class--one
constructor per StoragePluginConfig class
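
(Sketch of the constructor requirement above, using only the JDK. The
nested types StubPluginConfig, StubContext, ExamplePluginConfig and
ExamplePlugin are hypothetical stand-ins defined here, not Drill classes,
and the reflective lookup is an assumption about how such a constructor
could be located, not a copy of Drill's registry code.)

```java
import java.lang.reflect.Constructor;

class PluginConstructorSketch {

    // Hypothetical stand-ins for Drill's config base class and context type.
    abstract static class StubPluginConfig {}
    static class StubContext {}

    // A specific config implementation, as the notes require.
    static class ExamplePluginConfig extends StubPluginConfig {
        final String connection;
        ExamplePluginConfig(String connection) { this.connection = connection; }
    }

    // A plug-in class exposing the (config, context, name) public constructor.
    static class ExamplePlugin {
        final ExamplePluginConfig config;
        final String name;

        public ExamplePlugin(ExamplePluginConfig config, StubContext context, String name) {
            this.config = config;
            this.name = name;
        }
    }

    // Finds the public three-argument constructor whose first parameter is
    // exactly the config's class, then invokes it.
    static Object instantiate(Class<?> pluginClass, StubPluginConfig config,
                              StubContext context, String name) {
        for (Constructor<?> c : pluginClass.getConstructors()) {
            Class<?>[] p = c.getParameterTypes();
            if (p.length == 3
                    && p[0] == config.getClass()
                    && p[1] == StubContext.class
                    && p[2] == String.class) {
                try {
                    return c.newInstance(config, context, name);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(
                            "Could not construct " + pluginClass.getName(), e);
                }
            }
        }
        throw new IllegalArgumentException(
                "No (config, context, name) constructor on " + pluginClass.getName()
                        + " for " + config.getClass().getName());
    }

    public static void main(String[] args) {
        Object plugin = instantiate(ExamplePlugin.class,
                new ExamplePluginConfig("example://host"), new StubContext(), "example");
        System.out.println(plugin.getClass().getSimpleName());
    }
}
```

(Matching on the exact config class is what would let one plug-in class
serve several config classes: each config class selects its own
constructor.)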
* Classpath scanning requirement:
- StoragePluginConfig and StoragePlugin implementation classes are found by
classpath scanning
- Need drill-module.conf file in root of classpath subtree containing classes
to be found.
- Normally need to append name of each package (immediately?) containing
implementation classes to configuration property
drill.exec.storage.packages.
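
(Sketch of the marker-file idea above, using only the JDK: it lists every
copy of a named resource visible on the classpath, which is one way the
classpath roots containing drill-module.conf could be discovered. The
class and method names are hypothetical, and I have not confirmed that
Drill's scanner works exactly this way.)

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

class MarkerFileScanSketch {

    // Returns the URL of every copy of the named resource that the class
    // loader can see; each URL identifies one classpath root (directory or jar).
    static List<URL> findMarkerFiles(String resourceName) {
        ClassLoader loader = MarkerFileScanSketch.class.getClassLoader();
        List<URL> found = new ArrayList<>();
        try {
            Enumeration<URL> urls = loader.getResources(resourceName);
            while (urls.hasMoreElements()) {
                found.add(urls.nextElement());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints nothing unless some classpath root carries the marker file.
        for (URL url : findMarkerFiles("drill-module.conf")) {
            System.out.println(url);
        }
    }
}
```

(Running this with a plug-in jar on the classpath is a quick way to check
that the jar's drill-module.conf is actually at the root where it can be
found.)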
* bootstrap-storage-plugins.json
- Normally need to have bootstrap-storage-plugins.json file in same classpath
root.
- Normally have default configuration for plug-in in same classpath root's
bootstrap-storage-plugins.json file.
- Format seems to be Jackson's serialization of some kind of list of
StoragePluginConfig:
- Jackson seems to follow JavaBeans getter/setter mapping rules
(verified only for simple values--String, boolean)
- (What else?)
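
(Sketch of the JavaBeans getter/setter naming rule mentioned above, using
the JDK's own java.beans.Introspector rather than Jackson. SampleConfig is
a hypothetical bean, not a Drill class. Jackson's naming rules appear to
agree with these for simple String and boolean properties, but I would not
assume they agree in every case.)

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Set;
import java.util.TreeSet;

class BeanPropertySketch {

    // Hypothetical config bean with one String and one boolean property.
    public static class SampleConfig {
        private String connection;
        private boolean enabled;

        public String getConnection() { return connection; }
        public void setConnection(String connection) { this.connection = connection; }

        // Boolean getters may use the "is" prefix under the JavaBeans rules.
        public boolean isEnabled() { return enabled; }
        public void setEnabled(boolean enabled) { this.enabled = enabled; }
    }

    // Returns the JavaBeans property names of a class in sorted order,
    // excluding the "class" property inherited from Object.
    static Set<String> propertyNames(Class<?> beanClass) {
        try {
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            Set<String> names = new TreeSet<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                names.add(pd.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(SampleConfig.class));
    }
}
```

(Under these rules getConnection/setConnection map to a property named
"connection" and isEnabled/setEnabled to "enabled", which is the kind of
key one would expect to see in the JSON.)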
* Schema, ROUGH:
- Calcite's Schema
- Drill's AbstractSchema
- implementations of interface Calcite's Table must be subclasses of Drill's
DrillTable
(Document that old code doesn't follow recently clarified terminology in (most
of) user documentation:
- "storage plug-in" refers to the code itself (what plugs into Drill)
- "storage plug-in configuration" refers to the configuration associated with
names such as "cp" and "dfs"--different configurations of the file-system
plug-in
- "storage plug-in configuration name" refers to names such as "cp" and "dfs"
- "storage plug-in type name" refers to ... (e.g., "file", "hive")
- (old terms in code: "storage engine" (sometimes) means storage plug-in
configuration name)
)
Pending questions:
- Q: What does the @JsonTypeInfo annotation on StoragePluginConfig do?
Specifically, how exactly does it relate to "type" in 'type: "file"' and to
"NAME" and "name" fields (JavaBeans/Jackson properties?) on plug-in classes?
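Partial answer: @JsonTypeInfo(use = JsonTypeInfo.Id.NAME,
include = JsonTypeInfo.As.PROPERTY, property="type") on StoragePluginConfig
specifies that Jackson identifies each subclass by a logical type name carried
in a JSON property named "type". How that name relates to the "NAME" and
"name" fields is still open.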
- Q: What exactly does SystemTablePluginConfig's _public_ NAME field do?
- Q: What exactly does SystemTablePluginConfig's _public_ INSTANCE field do?
--------------------------------------------------------------------------------
--
Daniel Barclay
MapR Technologies