neoremind created CALCITE-4034:
----------------------------------
Summary: Implement a MySQL InnoDB adapter
Key: CALCITE-4034
URL: https://issues.apache.org/jira/browse/CALCITE-4034
Project: Calcite
Issue Type: New Feature
Components: csv-adapter
Affects Versions: 1.23.0
Reporter: neoremind
I would like to introduce a new adapter to Calcite.
Calcite’s InnoDB adapter allows you to query data based on InnoDB data files
directly; the data files are also known as .ibd files. This adapter is
different from the JDBC adapter, which requires a MySQL server to serve
responses. With .ibd files and the corresponding DDL, the InnoDB adapter is
able to work like a simple "MySQL server": it accepts SQL queries and attempts
to compile them based on the InnoDB file-accessing APIs provided by
[innodb-java-reader|https://github.com/alibaba/innodb-java-reader] (I am the
author of the library, which is well-tested and production-ready). It exploits
projecting, filtering and sorting directly in the InnoDB data file where
possible. What’s more, with the DDL, the adapter is "index aware": it leverages
rules to choose the right index to scan, for example using the primary key or a
secondary key to look up data, and then tries to push some conditions down into
the storage engine. The adapter also leaves the option to provide a hint, so
that the user can direct the optimizer to force the use of one specific index.
The InnoDB adapter can:
1) Full scan a table.
2) Scan by index such as primary key or secondary key, single column or
composite keys.
3) Push down filtering condition to storage engine, support both point query
and range query.
4) Push down projection.
5) Push down ordering if it matches the natural collation of an index.
6) Support almost all the commonly used data types.
7) Work as a simple "MySQL server" that reads .ibd data files directly through
the file system; under some conditions this can serve as a tool to query or
dump table data while offloading work from the MySQL process.
Before I describe the adapter and its design, I was wondering whether it is
appropriate to deliver the work by contributing the codebase to Calcite, or
whether it would be better to keep it in a separate project, perhaps referenced
from the Calcite adapter page. Looking forward to the Calcite developers' advice.
Below is the first version of the usage.
----------
[MySQL](https://www.mysql.com/) is the most popular open-source SQL database
management system; it is developed, distributed, and supported by Oracle
Corporation. InnoDB is a general-purpose storage engine that balances high
reliability and high performance in MySQL; since 5.6, InnoDB has been the
default MySQL storage engine.
Calcite’s InnoDB adapter allows you to query data based on InnoDB data files
directly; the data files are also known as .ibd files. It leverages
[innodb-java-reader](https://github.com/alibaba/innodb-java-reader). This
adapter is different from the JDBC adapter, which maps a schema in a JDBC data
source and requires a MySQL server to serve responses. With .ibd files and the
corresponding DDL, the InnoDB adapter is able to work like a simple "MySQL
server": it accepts SQL queries and attempts to compile them based on the
InnoDB file-accessing APIs provided by
[innodb-java-reader](https://github.com/alibaba/innodb-java-reader),
exploiting projecting, filtering and sorting directly in the InnoDB data file
where possible.
What’s more, with the DDL, the adapter is "index aware": it leverages rules to
choose the right index to scan, for example using the primary key or a
secondary key to look up data, and then tries to push some conditions down into
the storage engine. The adapter also leaves the option to provide a hint, so
that the user can direct the optimizer to force the use of one specific index.
A basic example of a model file is given below; this schema reads from a MySQL
"scott" database:
{code:json}
{
  "version": "1.0",
  "defaultSchema": "scott",
  "schemas": [
    {
      "name": "scott",
      "type": "custom",
      "factory": "org.apache.calcite.adapter.innodb.InnodbSchemaFactory",
      "operand": {
        "sqlFilePath": [ "/Users/xu/sql/scott.sql" ],
        "ibdDataFileBasePath": "/usr/local/mysql/data/scott"
      }
    }
  ]
}
{code}
`sqlFilePath` is a list of DDL files; you can generate the table definitions by
executing `mysqldump -d -u<username> -p<password> -h <hostname> <dbname>` on
the command line.
The file content of `/Users/xu/sql/scott.sql` is given below:
{code:sql}
CREATE TABLE `DEPT`(
  `DEPTNO` TINYINT NOT NULL,
  `DNAME` VARCHAR(50) NOT NULL,
  `LOC` VARCHAR(20),
  UNIQUE KEY `DEPT_PK` (`DEPTNO`)
)ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `EMP`(
  `EMPNO` INT(11) NOT NULL,
  `ENAME` VARCHAR(100) NOT NULL,
  `JOB` VARCHAR(15) NOT NULL,
  `AGE` SMALLINT,
  `MGR` BIGINT,
  `HIREDATE` DATE,
  `SAL` DECIMAL(8,2) NOT NULL,
  `COMM` DECIMAL(6,2),
  `DEPTNO` TINYINT,
  `EMAIL` VARCHAR(100) DEFAULT NULL,
  `CREATE_DATETIME` DATETIME,
  `CREATE_TIME` TIME,
  `UPSERT_TIME` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`EMPNO`),
  KEY `ENAME_KEY` (`ENAME`),
  KEY `HIREDATE_KEY` (`HIREDATE`),
  KEY `CREATE_DATETIME_JOB_KEY` (`CREATE_DATETIME`, `JOB`),
  KEY `CREATE_TIME_KEY` (`CREATE_TIME`),
  KEY `UPSERT_TIME_KEY` (`UPSERT_TIME`),
  KEY `DEPTNO_JOB_KEY` (`DEPTNO`, `JOB`),
  KEY `DEPTNO_SAL_COMM_KEY` (`DEPTNO`, `SAL`, `COMM`),
  KEY `DEPTNO_MGR_KEY` (`DEPTNO`, `MGR`),
  KEY `AGE_KEY` (`AGE`)
)ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
{code}
`ibdDataFileBasePath` is the parent directory of the .ibd files.
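The DDL above is what makes the adapter "index aware": it has to read the `KEY` definitions out of each `CREATE TABLE` statement to know which indexes exist. A rough, hypothetical sketch of that extraction step (not the adapter's actual parser, which would use a real SQL parser) might look like:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the adapter's actual code: pull the names of
// index definitions ("KEY `...` (...)") out of a CREATE TABLE statement.
public class DdlIndexSketch {
  private static final Pattern KEY_DEF =
      Pattern.compile("KEY `([A-Za-z0-9_]+)` \\(([^)]*)\\)");

  public static List<String> indexNames(String ddl) {
    List<String> names = new ArrayList<>();
    Matcher m = KEY_DEF.matcher(ddl);
    while (m.find()) {
      names.add(m.group(1));  // group 1 is the index name
    }
    return names;
  }
}
```

A real implementation would also record the column list of each index in declaration order, since that ordering drives the left-prefix matching described later.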
Assuming the model file is stored as `model.json`, you can connect to the
InnoDB data files and perform queries via
[sqlline](https://github.com/julianhyde/sqlline) as follows:
{% highlight bash %}
sqlline> !connect jdbc:calcite:model=model.json admin admin
{% endhighlight %}
We can issue a simple query to fetch all employees in standard SQL:
{% highlight bash %}
sqlline> select empno,ename,job,age,mgr from "EMP";
+-------+--------+-----------+-----+------+
| EMPNO | ENAME | JOB | AGE | MGR |
+-------+--------+-----------+-----+------+
| 7369 | SMITH | CLERK | 30 | 7902 |
| 7499 | ALLEN | SALESMAN | 24 | 7698 |
| 7521 | WARD | SALESMAN | 41 | 7698 |
| 7566 | JONES | MANAGER | 28 | 7839 |
| 7654 | MARTIN | SALESMAN | 27 | 7698 |
| 7698 | BLAKE | MANAGER | 38 | 7839 |
| 7782 | CLARK | MANAGER | 32 | 7839 |
| 7788 | SCOTT | ANALYST | 45 | 7566 |
| 7839 | KING | PRESIDENT | 22 | null |
| 7844 | TURNER | SALESMAN | 54 | 7698 |
| 7876 | ADAMS | CLERK | 35 | 7788 |
| 7900 | JAMES | CLERK | 40 | 7698 |
| 7902 | FORD | ANALYST | 28 | 7566 |
| 7934 | MILLER | CLERK | 32 | 7782 |
+-------+--------+-----------+-----+------+
{% endhighlight %}
While executing this query, the InnoDB adapter scans the InnoDB data file
`EMP.ibd` using the primary key, also known as the clustered B+ tree index in
MySQL, and is able to push the projection down to the underlying storage
engine. Projection can reduce the size of the data fetched from the storage
engine.
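The shape of such a projection pushdown can be sketched as follows; the array of column ordinals mirrors the `projects` argument in Calcite's `ProjectableFilterableTable` contract, though the names here are illustrative:

```java
// Illustrative sketch of column pruning: given the ordinals of the columns
// the query needs ("projects"), the storage layer materializes only those
// fields of each record instead of the full row.
public class ProjectionSketch {
  public static Object[] project(Object[] fullRow, int[] projects) {
    Object[] out = new Object[projects.length];
    for (int i = 0; i < projects.length; i++) {
      out[i] = fullRow[projects[i]];
    }
    return out;
  }
}
```

For example, a query asking only for `EMPNO` and `ENAME` would pass ordinals `{0, 1}`, so the other columns of each record are never copied out of the data file.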
We can look up one employee by filtering. The InnoDB adapter retrieves all
indexes from the DDL file provided in `model.json`.
{% highlight bash %}
sqlline> select empno,ename,job,age,mgr from "EMP" where empno = 7782;
+-------+-------+---------+-----+------+
| EMPNO | ENAME | JOB | AGE | MGR |
+-------+-------+---------+-----+------+
| 7782 | CLARK | MANAGER | 32 | 7839 |
+-------+-------+---------+-----+------+
{% endhighlight %}
The InnoDB adapter is able to recognize that `empno` is the primary key and
performs a point lookup using the clustered index instead of a full table scan.
We can do a range query on the primary key as well.
{% highlight bash %}
sqlline> select empno,ename,job,age,mgr from "EMP" where empno > 7782 and empno < 7900;
{% endhighlight %}
Note that such a query is usually efficient in MySQL with the InnoDB storage
engine, because in a clustered B+ tree index, records that are close in the
index are close in the data file, which is good for scanning.
We can look up one employee by a secondary key as well. For example, here the
filtering condition is on the `VARCHAR` field `ename`.
{% highlight bash %}
sqlline> select empno,ename,job,age,mgr from "EMP" where ename = 'smith';
+-------+-------+-------+-----+------+
| EMPNO | ENAME | JOB | AGE | MGR |
+-------+-------+-------+-----+------+
| 7369 | SMITH | CLERK | 30 | 7902 |
+-------+-------+-------+-----+------+
{% endhighlight %}
The InnoDB adapter works well on almost all of the commonly used data types in
MySQL; for more information on supported data types, please refer to
[innodb-java-reader](https://github.com/alibaba/innodb-java-reader#3-features).
We can query by composite key. For example, given the secondary index
`DEPTNO_MGR_KEY`, we can issue the query below:
{% highlight bash %}
sqlline> select empno,ename,job,age,mgr from "EMP" where deptno = 20 and mgr = 7566;
+-------+-------+---------+-----+------+
| EMPNO | ENAME | JOB | AGE | MGR |
+-------+-------+---------+-----+------+
| 7788 | SCOTT | ANALYST | 45 | 7566 |
| 7902 | FORD | ANALYST | 28 | 7566 |
+-------+-------+---------+-----+------+
{% endhighlight %}
The InnoDB adapter will leverage the right key to push down both filtering
conditions.
In some cases, only part of the conditions can be pushed down, because of a
limitation in the underlying storage engine API; the unpushed conditions are
left to the rest of the plan. Given the SQL below, only the `DEPTNO` condition
is pushed down:
{% highlight bash %}
sqlline> select empno,ename,job,age,mgr from "EMP" where deptno = 20 and upsert_time > '2018-01-01 00:00:00';
{% endhighlight %}
You can think of this as similar to `Index Condition Pushdown (ICP)` in the
MySQL server: `innodb-java-reader` only supports range queries with lower and
upper bounds using an index, so the storage engine returns a range of rows and
Calcite evaluates the rest of the `WHERE` condition on the rows fetched.
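That split between a pushed-down range and a residual predicate can be sketched as below; the method names are illustrative, not the adapter's API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hedged sketch of the split described above: the storage engine only
// understands a [lower, upper] range on the indexed column (row[0] here),
// and every other predicate is a "residual" applied by the execution
// engine to the rows the range scan returns.
public class ResidualFilterSketch {
  /** Simulated index range scan over the indexed column, row[0]. */
  public static List<int[]> rangeScan(List<int[]> rows, int lower, int upper) {
    List<int[]> out = new ArrayList<>();
    for (int[] row : rows) {
      if (row[0] >= lower && row[0] <= upper) {
        out.add(row);
      }
    }
    return out;
  }

  /** Residual predicate evaluated outside the storage engine. */
  public static List<int[]> residual(List<int[]> rows, Predicate<int[]> pred) {
    List<int[]> out = new ArrayList<>();
    for (int[] row : rows) {
      if (pred.test(row)) {
        out.add(row);
      }
    }
    return out;
  }
}
```

In the `DEPTNO = 20 AND upsert_time > ...` example, the `DEPTNO` equality becomes the range `[20, 20]` handed to the storage engine, and the `upsert_time` comparison is the residual predicate.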
For the SQL below, multiple indexes satisfy the left-prefix rule: the
candidates are `DEPTNO_JOB_KEY`, `DEPTNO_SAL_COMM_KEY` and `DEPTNO_MGR_KEY`.
The InnoDB adapter will choose one of them according to the ordinal defined in
the DDL; only the `DEPTNO` condition is pushed down, leaving the `SAL`
condition to be handled by Calcite's built-in execution engine.
{% highlight bash %}
sqlline> select empno,deptno,sal from "EMP" where deptno = 20 and sal > 2000;
+-------+--------+---------+
| EMPNO | DEPTNO | SAL |
+-------+--------+---------+
| 7788 | 20 | 3000.00 |
| 7902 | 20 | 3000.00 |
| 7566 | 20 | 2975.00 |
+-------+--------+---------+
{% endhighlight %}
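A simplified, hypothetical sketch of that left-prefix selection (the adapter's actual rules are richer) could be:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of the left-prefix rule, not the adapter's planner
// code: an index is a candidate if its leading column is bound by an
// equality condition; among candidates, take the one declared first in DDL.
public class IndexChoiceSketch {
  /** indexes: index name -> its columns, in DDL declaration order. */
  public static String chooseIndex(Map<String, List<String>> indexes,
                                   Set<String> equalityColumns) {
    for (Map.Entry<String, List<String>> e : indexes.entrySet()) {
      if (equalityColumns.contains(e.getValue().get(0))) {
        return e.getKey();
      }
    }
    return null;  // no usable index: fall back to a clustered-index full scan
  }
}
```

With the `EMP` indexes above and only `DEPTNO` bound by equality, this sketch would pick `DEPTNO_JOB_KEY` simply because it is declared first, which matches the "ordinal defined in DDL" behavior described.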
For a secondary key, execution scans the secondary index and then looks up the
full record in the clustered index in InnoDB; for a "big" scan, that introduces
many random IOs, so the performance is usually not good enough. Note that the
query above can be made more performant by using the `DEPTNO_SAL_COMM_KEY`
index, because a covering index does not need to look back into the clustered
index. We can force the use of the `DEPTNO_SAL_COMM_KEY` index with a hint.
{% highlight bash %}
sqlline> select empno,ename,job,age,sal from "EMP" /*+ index(DEPTNO_SAL_COMM_KEY) */ where deptno = 20 and sal > 2000;
{% endhighlight %}
The hint strategy can be configured in `SqlToRelConverter`; you should register
the `index` hint on the `TableScan` strategy beforehand in order to use this
feature. The index hint takes effect when issuing a SQL query with filtering;
for a query without any filtering, a full scan on the clustered index will be
used.
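A sketch of that registration, assuming the hint API of recent Calcite releases (`HintStrategyTable`, `HintPredicates`); the exact method names have changed between versions, so check them against the release you build with:

```java
// Planner configuration sketch: register the "index" hint so that it is
// accepted on TableScan nodes, then hand the strategy table to the
// SqlToRelConverter configuration. Method names are from Calcite's hint
// API and should be verified against the Calcite version in use.
import org.apache.calcite.rel.hint.HintPredicates;
import org.apache.calcite.rel.hint.HintStrategyTable;
import org.apache.calcite.sql2rel.SqlToRelConverter;

public class HintConfigSketch {
  public static SqlToRelConverter.Config config() {
    HintStrategyTable hintStrategies = HintStrategyTable.builder()
        .hintStrategy("index", HintPredicates.TABLE_SCAN)
        .build();
    return SqlToRelConverter.config()
        .withHintStrategyTable(hintStrategies);
  }
}
```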
Ordering can be pushed down if it matches the natural collation of the index
used.
{% highlight bash %}
sqlline> select deptno,ename,hiredate from "EMP" where hiredate < '2020-01-01' order by hiredate desc;
+--------+--------+------------+
| DEPTNO | ENAME | HIREDATE |
+--------+--------+------------+
| 20 | ADAMS | 1987-05-23 |
| 20 | SCOTT | 1987-04-19 |
| 10 | MILLER | 1982-01-23 |
| 20 | FORD | 1981-12-03 |
| 30 | JAMES | 1981-12-03 |
| 10 | KING | 1981-11-17 |
| 30 | MARTIN | 1981-09-28 |
| 30 | TURNER | 1981-09-08 |
| 10 | CLARK | 1981-06-09 |
| 30 | WARD | 1981-02-22 |
| 30 | ALLEN | 1981-02-20 |
| 20 | JONES | 1981-02-04 |
| 30 | BLAKE | 1981-01-05 |
| 20 | SMITH | 1980-12-17 |
+--------+--------+------------+
{% endhighlight %}
## Limitations
`innodb-java-reader` has some prerequisites for .ibd files; please refer to
[Prerequisites](https://github.com/alibaba/innodb-java-reader#2-prerequisites).
Currently the InnoDB adapter is not aware of the row count and cardinality of a
.ibd data file, so it relies only on simple rules to perform optimization. Once
the underlying storage engine can provide such metrics and metadata, they can
be integrated into Calcite by leveraging cost-based optimization in the future.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)