[
https://issues.apache.org/jira/browse/CALCITE-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211077#comment-17211077
]
neoremind edited comment on CALCITE-4034 at 10/10/20, 11:09 AM:
----------------------------------------------------------------
[~julianhyde] -I have rebased to
[https://github.com/julianhyde/calcite/tree/4034-innodb], try to find the
reason of the failed testcases. The input parameter - _hints_ is empty when
creating InnodbTableScan, the hint is registered when optimizing (see below). I
think it is better to consult committers about the finding, do you have any
idea how the behavior changes and affects after 1.25?-
{code:java}
public class InnodbTableScan extends TableScan implements InnodbRel {
...
@Override public void register(RelOptPlanner planner) {
HintStrategyTable strategies = HintStrategyTable.builder()
.hintStrategy("index", HintPredicates.TABLE_SCAN)
.build();
getCluster().setHintStrategies(strategies);
planner.addRule(InnodbRules.TO_ENUMERABLE);
for (RelOptRule rule : InnodbRules.RULES) {
planner.addRule(rule);
}
}
{code}
was (Author: neoremind):
[~julianhyde] I have rebased to
[https://github.com/julianhyde/calcite/tree/4034-innodb], try to find the
reason of the failed testcases. The input parameter - _hints_ is empty when
creating InnodbTableScan, the hint is registered when optimizing (see below). I
think it is better to consult committers about the finding, do you have any
idea how the behavior changes and affects after 1.25?
{code:java}
public class InnodbTableScan extends TableScan implements InnodbRel {
...
@Override public void register(RelOptPlanner planner) {
HintStrategyTable strategies = HintStrategyTable.builder()
.hintStrategy("index", HintPredicates.TABLE_SCAN)
.build();
getCluster().setHintStrategies(strategies);
planner.addRule(InnodbRules.TO_ENUMERABLE);
for (RelOptRule rule : InnodbRules.RULES) {
planner.addRule(rule);
}
}
{code}
> Implement a MySQL InnoDB adapter
> --------------------------------
>
> Key: CALCITE-4034
> URL: https://issues.apache.org/jira/browse/CALCITE-4034
> Project: Calcite
> Issue Type: New Feature
> Components: csv-adapter
> Affects Versions: 1.23.0
> Reporter: neoremind
> Assignee: Julian Hyde
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> Calcite’s InnoDB adapter allows you to query the data based on InnoDB data
> files directy, data files are also known as .ibd files, this adapter is
> different from JDBC adapter which requires a MySQL server to serve response.
> With .ibd files and the corresponding DDLs, InnoDB adapter is able to work
> like a simple "MySQL server", it accepts SQL query and attempts to compile
> the query based on InnoDB file accessing APIs provided by
> [innodb-java-reader|[https://github.com/alibaba/innodb-java-reader]] (I am
> the author of the library, which is well-tested and production-ready), it
> will exploit projecting, filtering and sorting directly in InnoDB data file
> where possible. What’s more, with DDLs, the adapter is "index aware", it
> leverages rules to choose the right index to scan, for example, using primary
> key or secondary keys to look up data, then it tries to push down some
> conditions into storage engine. Also, the adapter leaves option to provide
> hint as well, so that user can indicate the optimizer to force use one
> specific index.
> The InnoDB adapter can,
> 1) Full scan a table.
> 2) Scan by index such as primary key or secondary key, single column or
> composite keys.
> 3) Push down filtering condition to storage engine, support both point query
> and range query.
> 4) Push down projection.
> 5) Push down ordering if it matches the natural collation of an index.
> 6) Support almost all the commonly used data types.
> 7) Work as a simple MySQL server being able to read .ibd data files directly
> through file system, this can be a tool to query or dump table data by
> offloading from MySQL process under some conditions.
> Before I describe the adapter and its design, I was wondering if it is
> appropriate to deliver the work by contributing the codebase back to Calcite,
> or if it would be better to stay in another project, probably being referred
> from Calcite adapter link. Looking forward to Calcite developer's advice.
>
> Below is the first version of the usage.
> ----------
> [MySQL]([https://www.mysql.com/]) is the most popular Open Source SQL
> database management system, is developed, distributed, and supported by
> Oracle Corporation. InnoDB is a general-purpose storage engine that balances
> high reliability and high performance in MySQL, since 5.6 InnoDB has become
> the default MySQL storage engine.
> Calcite’s InnoDB adapter allows you to query the data based on InnoDB data
> files directy, data files are also known as .ibd files. It leverages
> [innodb-java-reader]([https://github.com/alibaba/innodb-java-reader]). This
> adapter is different from JDBC adapter which maps a schema in a JDBC data
> source and requires a MySQL server to serve response. With .ibd files and the
> corresponding DDLs, InnoDB adapter is able to work like a simple "MySQL
> server", it accepts SQL query and attempts to compile the query based on
> InnoDB file accessing APIs provided by innodb-java-reader, it will exploit
> projecting, filtering and sorting directly in InnoDB data file where possible.
> What’s more, with DDLs, the adapter is "index aware", it leverages rules to
> choose the right index to scan, for example, using primary key or secondary
> keys to look up data, then it tries to push down some conditions into storage
> engine. Also, the adapter leaves option to provide hint as well, so that user
> can indicate the optimizer to force use one specific index.
> A basic example of a model file is given below, this schema reads from a
> MySQL "scott" database:
> {code:java}
> {
> "version": "1.0",
> "defaultSchema": "scott",
> "schemas": [
> {
> "name": "scott",
> "type": "custom",
> "factory": "org.apache.calcite.adapter.innodb.InnodbSchemaFactory",
> "operand": {
> "sqlFilePath": [ "/path/scott.sql" ],
> "ibdDataFileBasePath": "/usr/local/mysql/data/scott"
> }
> }
> ]
> }
> {code}
> `sqlFilePath` is a list of DDL files, you can generate table definitions by
> executing `_mysqldump -d -u<username> -p<password> -h <hostname> <dbname>_`
> in command-line.
> The file content of `/path/scott.sql` is given below:
> {code:java}
> CREATE TABLE `DEPT`(
> `DEPTNO` TINYINT NOT NULL,
> `DNAME` VARCHAR(50) NOT NULL,
> `LOC` VARCHAR(20),
> UNIQUE KEY `DEPT_PK` (`DEPTNO`)
> )ENGINE=InnoDB DEFAULT CHARSET=latin1;
> CREATE TABLE `EMP`(
> `EMPNO` INT(11) NOT NULL,
> `ENAME` VARCHAR(100) NOT NULL,
> `JOB` VARCHAR(15) NOT NULL,
> `AGE` SMALLINT,
> `MGR` BIGINT,
> `HIREDATE` DATE,
> `SAL` DECIMAL(8,2) NOT NULL,
> `COMM` DECIMAL(6,2),
> `DEPTNO` TINYINT,
> `EMAIL` VARCHAR(100) DEFAULT NULL,
> `CREATE_DATETIME` DATETIME,
> `CREATE_TIME` TIME,
> `UPSERT_TIME` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE
> CURRENT_TIMESTAMP,
> PRIMARY KEY (`EMPNO`),
> KEY `ENAME_KEY` (`ENAME`),
> KEY `HIREDATE_KEY` (`HIREDATE`),
> KEY `CREATE_DATETIME_JOB_KEY` (`CREATE_DATETIME`, `JOB`),
> KEY `CREATE_TIME_KEY` (`CREATE_TIME`),
> KEY `UPSERT_TIME_KEY` (`UPSERT_TIME`),
> KEY `DEPTNO_JOB_KEY` (`DEPTNO`, `JOB`),
> KEY `DEPTNO_SAL_COMM_KEY` (`DEPTNO`, `SAL`, `COMM`),
> KEY `DEPTNO_MGR_KEY` (`DEPTNO`, `MGR`),
> KEY `AGE_KEY` (`AGE`)
> )ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
> {code}
> ibdDataFileBasePath is the parent file path of .ibd files.
> Assuming the model file is stored as `model.json`, you can connect to InnoDB
> data file to performa query via
> [sqlline]([https://github.com/julianhyde/sqlline]) as follows:
> {code:java}
> sqlline> !connect jdbc:calcite:model=model.json admin admin{code}
> We can issue a simple query to query all employees by writing standard SQL:
> {code:java}
> sqlline> select empno,ename,job,age,mgr from "EMP";
> +-------+--------+-----------+-----+------+
> | EMPNO | ENAME | JOB | AGE | MGR |
> +-------+--------+-----------+-----+------+
> | 7369 | SMITH | CLERK | 30 | 7902 |
> | 7499 | ALLEN | SALESMAN | 24 | 7698 |
> | 7521 | WARD | SALESMAN | 41 | 7698 |
> | 7566 | JONES | MANAGER | 28 | 7839 |
> | 7654 | MARTIN | SALESMAN | 27 | 7698 |
> | 7698 | BLAKE | MANAGER | 38 | 7839 |
> | 7782 | CLARK | MANAGER | 32 | 7839 |
> | 7788 | SCOTT | ANALYST | 45 | 7566 |
> | 7839 | KING | PRESIDENT | 22 | null |
> | 7844 | TURNER | SALESMAN | 54 | 7698 |
> | 7876 | ADAMS | CLERK | 35 | 7788 |
> | 7900 | JAMES | CLERK | 40 | 7698 |
> | 7902 | FORD | ANALYST | 28 | 7566 |
> | 7934 | MILLER | CLERK | 32 | 7782 |
> +-------+--------+-----------+-----+------+
> {code}
> While executing this query, the InnoDB adapter scans the InnoDB data file
> `EMP.ibd` using primary key, also known as clustering B+ tree index in MySQL,
> and is able to
> push down projection to underlying storage engine. Projection can reduce the
> size of data fetched from the storage engine.
> We can look up one employee by filtering. The InnoDB adapter retrieves all
> indexes through DDL file provided in `model.json`.
> {code:java}
> sqlline> select empno,ename,job,age,mgr from "EMP" where empno = 7782;
> +-------+-------+---------+-----+------+
> | EMPNO | ENAME | JOB | AGE | MGR |
> +-------+-------+---------+-----+------+
> | 7782 | CLARK | MANAGER | 32 | 7839 |
> +-------+-------+---------+-----+------+
> {code}
> The InnoDB adapter is able to recognize that `empno` is the primary key and
> do a point query lookup by using clustering index instead of full table scan.
> We can do range query on primary key as well.
> {code:java}
> sqlline> select empno,ename,job,age,mgr from "EMP" where empno > 7782 and
> empno < 7900;{code}
> Note that such query with acceptable range is usually efficient in MySQL with
> InnoDB storage engine, because for clustering B+ tree index, records close in
> index are close in data file, which is good for scanning.
> We can look up employee by secondary key. For example, the filtering
> condition will be on a `VARCHAR` field `ename`.
> {code:java}
> sqlline> select empno,ename,job,age,mgr from "EMP" where ename = 'smith';
> +-------+-------+-------+-----+------+
> | EMPNO | ENAME | JOB | AGE | MGR |
> +-------+-------+-------+-----+------+
> | 7369 | SMITH | CLERK | 30 | 7902 |
> +-------+-------+-------+-----+------+
> {code}
> The InnoDB adapter works well on almost all the commonly used data types in
> MySQL, for more information on supported data types, please refer to
> [innodb-java-reader](https://github.com/alibaba/innodb-java-reader#3-features).
> We can query by composite key. For example, given secondary index of
> `DEPTNO_MGR_KEY`.
> {code:java}
> sqlline> select empno,ename,job,age,mgr from "EMP" where deptno = 20 and mgr
> = 7566;
> +-------+-------+---------+-----+------+
> | EMPNO | ENAME | JOB | AGE | MGR |
> +-------+-------+---------+-----+------+
> | 7788 | SCOTT | ANALYST | 45 | 7566 |
> | 7902 | FORD | ANALYST | 28 | 7566 |
> +-------+-------+---------+-----+------+
> {code}
> The InnoDB adapter will leverage the matched key `DEPTNO_MGR_KEY` to push
> down filtering condition of `deptno = 20 and mgr = 7566`.
> In some cases, only part of the conditions can be pushed down since there is
> a limitation in the underlying storage engine API, leaving unpushed remainder
> conditions in the rest of the plan. Given the below SQL, only `deptno = 20`
> is pushed down.
> {code:java}
> select empno,ename,job,age,mgr from "EMP" where deptno = 20 and upsert_time >
> '2018-01-01 00:00:00';
> {code}
> `innodb-java-reader` only supports range query with lower and upper bound
> using an index, not fully `Index Condition Pushdown (ICP)`. The storage
> engine returns a range of rows and Calcite will evaluates the rest of `WHERE`
> condition from the rows fetched.
> For the below SQL, there are multiple indexes satisfying the left-prefix
> index rule, the possible indexes are `DEPTNO_JOB_KEY`, `DEPTNO_SAL_COMM_KEY`
> and `DEPTNO_MGR_KEY`, the Innod adapter will choose one of them according to
> the ordinal defined in DDL, only `deptno = 20` condition is pushed down,
> leaving the rest of `WHERE` condition handled by Calcite built-in execution
> engine.
> {code:java}
> sqlline> select empno,deptno,sal from "EMP" where deptno = 20 and sal > 2000;
> +-------+--------+---------+
> | EMPNO | DEPTNO | SAL |
> +-------+--------+---------+
> | 7788 | 20 | 3000.00 |
> | 7902 | 20 | 3000.00 |
> | 7566 | 20 | 2975.00 |
> +-------+--------+---------+
> {code}
> Accessing rows through secondary key requires scanning by secondary index and
> retrieving records back to clustering index in InnoDB, for a "big" scan, that
> would introduce many random I/O operations, so performance is usually not
> good enough. Note that the query above can be more performant by using
> `EPTNO_SAL_COMM_KEY` index, because covering index does not need to retrieve
> back to clustering index. We can force using `DEPTNO_SAL_COMM_KEY` index by
> hint as below.
> {code:java}
> sqlline> select empno,ename,job,age,sal from "emp"/*+
> index(DEPTNO_SAL_COMM_KEY) */ where deptno = 20 and sal > 2000;
> {code}
> Hint can be configured in `SqlToRelConverter`, to enable hint, you should
> register `index` HintStrategy on `TableScan` in
> `SqlToRelConverter.ConfigBuilder`. Index hint takes effect on the base
> `TableScan` relational node, if there are conditions matching the index,
> index condition can be pushed down as well. For the below SQL, although none
> of the indexes can be used, but by leveraging covering index, the performance
> is better than full table scan, we can force to use `DEPTNO_MGR_KEY` to scan
> in secondary index.
> {code:java}
> sqlline> select empno,mgr from "EMP"/*+ index(DEPTNO_MGR_KEY) */ where mgr =
> 7839;
> {code}
> Ordering can be pushed down if it matches the natural collation of the index
> used.
> {code:java}
> sqlline> select deptno,ename,hiredate from "EMP" where hiredate <
> '2020-01-01' order by hiredate desc;
> +--------+--------+------------+
> | DEPTNO | ENAME | HIREDATE |
> +--------+--------+------------+
> | 20 | ADAMS | 1987-05-23 |
> | 20 | SCOTT | 1987-04-19 |
> | 10 | MILLER | 1982-01-23 |
> | 20 | FORD | 1981-12-03 |
> | 30 | JAMES | 1981-12-03 |
> | 10 | KING | 1981-11-17 |
> | 30 | MARTIN | 1981-09-28 |
> | 30 | TURNER | 1981-09-08 |
> | 10 | CLARK | 1981-06-09 |
> | 30 | WARD | 1981-02-22 |
> | 30 | ALLEN | 1981-02-20 |
> | 20 | JONES | 1981-02-04 |
> | 30 | BLAKE | 1981-01-05 |
> | 20 | SMITH | 1980-12-17 |
> +--------+--------+------------+
> {code}
> *Limitations*
> `innodb-java-reader` has some prerequisites for `.ibd` files, please refer to
> [Prerequisites](https://github.com/alibaba/innodb-java-reader#2-prerequisites).
> You can think of the adapter as a simple MySQL server, with the ability to
> query, dump data by offloading from MySQL process under some conditions. If
> pages are not flushed from InnoDB Buffer Pool to disk, then the result may be
> inconsistent (the LSN in `.ibd` file might smaller than in-memory pages).
> InnoDB leverages write ahead log in terms of performance, so there is no
> command available to flush all dirty pages. Only internal mechanism manages
> when and where to persist pages to disk, like Page Cleaner thread, adaptive
> flushing, etc.
> Currently the InnoDB adapter does not aware row count and cardinality of a
> `.ibd` data file, so it will only rely on simple rules to perform
> optimization, once underlying storage engine could provide such metrics and
> metadata, this can be integrated in Calcite by leveraging cost based
> optimization in the future.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)