This is an automated email from the ASF dual-hosted git repository.
twice pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/kvrocks-website.git
The following commit(s) were added to refs/heads/main by this push:
new 38acf1d Add user documentation for Kvrocks Search (#257)
38acf1d is described below
commit 38acf1d15843fe9b47bdd81c8e5c3eb7b2de9721
Author: Twice <[email protected]>
AuthorDate: Sun Nov 10 17:36:02 2024 +0800
Add user documentation for Kvrocks Search (#257)
---
docs/kvrocks-search.md | 269 +++++++++++++++++++++++++++++++++++++++++++++++++
sidebars.js | 1 +
2 files changed, 270 insertions(+)
diff --git a/docs/kvrocks-search.md b/docs/kvrocks-search.md
new file mode 100644
index 0000000..300373b
--- /dev/null
+++ b/docs/kvrocks-search.md
@@ -0,0 +1,269 @@
+# Search
+
+**Apache Kvrocks™** Search, also known as **Kvrocks Search** (or KQIR, as a
technical term), is an internal component of Apache Kvrocks™. It functions as a
query engine that supports (secondary) indexing on structured data and complex
queries by effectively utilizing various indexes.
+
+In addition to being compatible with many commands and the query syntax of
[RediSearch](https://redis.io/docs/latest/develop/interact/search-and-query/)
(e.g. [FT.CREATE](#ftcreate) and [FT.SEARCH](#ftsearch)), Kvrocks Search also
offers support for SQL syntax to accommodate various scenarios (via
[FT.SEARCHSQL](#ftsearchsql-extension) and other related commands).
+
+Kvrocks Search is currently in the experimental stage and only available on
the `unstable` branch. We do not provide compatibility guarantees at this time.
If you encounter any problems, please submit them to [GitHub
issues](https://github.com/apache/kvrocks/issues).
+
+For its implementation details, please refer to [this blog
post](../blog/kqir-query-engine).
+
+## Supported Commands
+
+Currently, Kvrocks supports some of the main RediSearch commands. They are
mostly used for creating indexes, managing indexes (listing, showing details,
deleting), and querying.
+
+### FT.SEARCH
+
+```
+FT.SEARCH index query
+ [RETURN count identifier [ identifier ...]]
+ [SORTBY sortby [ ASC | DESC]]
+ [LIMIT offset num]
+ [PARAMS nargs name value [ name value ...]]
+```
+
+`FT.SEARCH` performs a `query` (in RediSearch query syntax) on a given
`index` (created by `FT.CREATE`).
+
+Additional parameters:
+- `RETURN` to control which fields will be presented in the output;
+- `SORTBY` to control the order of rows in the output (same as `ORDER BY` in
SQL);
+- `LIMIT` to control how many rows and the offset of actual results in the
output;
+- `PARAMS` to supply additional information to the parameterized query.
+
+See [RediSearch query syntax](#redisearch-query-syntax) for the supported
syntax of `query`.
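
As a rough illustration of how the optional clauses line up with the synopsis above, the following Python sketch assembles an `FT.SEARCH` argument list. The helper name `build_ft_search_args` and the index name `myidx` are hypothetical, and treating the `RETURN` count as the number of identifiers that follow is an assumption borrowed from RediSearch, not something stated here.

```python
from typing import List, Optional, Sequence, Tuple


def build_ft_search_args(
    index: str,
    query: str,
    return_fields: Optional[Sequence[str]] = None,
    sortby: Optional[Tuple[str, str]] = None,
    limit: Optional[Tuple[int, int]] = None,
) -> List[str]:
    """Assemble FT.SEARCH arguments in the order shown in the synopsis."""
    args = ["FT.SEARCH", index, query]
    if return_fields:
        # Assumption: count is the number of identifiers that follow.
        args += ["RETURN", str(len(return_fields)), *return_fields]
    if sortby is not None:
        field, order = sortby
        if order.upper() not in ("ASC", "DESC"):
            raise ValueError("order must be ASC or DESC")
        args += ["SORTBY", field, order.upper()]
    if limit is not None:
        offset, num = limit
        args += ["LIMIT", str(offset), str(num)]
    return args


# Hypothetical index name and fields, for illustration only.
args = build_ft_search_args(
    "myidx", "@a:[1 (3]", return_fields=["a", "b"], sortby=("a", "desc"), limit=(0, 10)
)
print(args)
```

A list like this could then be handed to a client's generic command call, for example `execute_command(*args)` in redis-py.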
+
+### FT.EXPLAIN
+
+```
+FT.EXPLAIN index query
+ [RETURN count identifier [ identifier ...]]
+ [SORTBY sortby [ ASC | DESC]]
+ [LIMIT offset num]
+ [PARAMS nargs name value [ name value ...]]
+```
+
+`FT.EXPLAIN` returns the plan that Kvrocks will use to execute the `query`
(a.k.a. the query plan).
+
+### FT.CREATE
+
+```
+FT.CREATE index
+ [ON HASH | JSON]
+ [PREFIX count prefix [prefix ...]]
+ SCHEMA field_name TAG | NUMERIC | VECTOR [FIELD PROPERTIES ...] [NOINDEX]
+ [ field_name TAG | NUMERIC | VECTOR [FIELD PROPERTIES ...] [NOINDEX]
+ ...]
+```
+
+`FT.CREATE` creates a new `index` with a given schema.
+
+Additional parameters:
+- `ON HASH | JSON`: the data type of keys to be indexed;
+- `PREFIX`: the prefix of keys to be indexed.
+
+Schema details:
+- `field_name`: name of a field; an index is composed of one or more fields;
+- `TAG | NUMERIC | VECTOR`: currently only these 3 types of fields are
supported;
+- `FIELD PROPERTIES`: additional properties of this field; depends on the
field type;
+- `NOINDEX`: do not index data on this field (it is used only for filtering
data in queries).
+
+### FT.DROPINDEX
+
+```
+FT.DROPINDEX index
+```
+
+`FT.DROPINDEX` drops the given `index`, deleting all of its indexing data and
index information.
+
+### FT._LIST
+
+```
+FT._LIST
+```
+
+`FT._LIST` lists the names of all indexes (in the current namespace).
+
+### FT.INFO
+
+```
+FT.INFO index
+```
+
+`FT.INFO` returns detailed information about the given `index`.
+
+The output of this command has the following format:
+
+```
+1) index_name
+2) ...
+3) index_definition
+4) 1) key_type
+   2) ...
+   3) prefixes
+   4) 1) ...
+      2) ...
+5) fields
+6) 1) 1) identifier
+      2) ...
+      3) type
+      4) "tag"
+      5) options
+      6) ...
+   2) 1) identifier
+      2) ...
+      3) type
+      4) "numeric"
+      5) options
+      6) ...
+   3) ...
+```
+
+Note that the output format may change as Kvrocks Search is currently
experimental.
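
Since the reply shown above is a nested array of alternating keys and values, a client might fold it into dictionaries before use. The Python sketch below does that under stated assumptions: the helper names are hypothetical, the sample reply is hand-written to mimic the shape above (it is not real server output), and a real client may return bytes rather than strings.

```python
from typing import Any, Dict, List


def pairs_to_dict(reply: List[Any]) -> Dict[str, Any]:
    """Turn a flat [key, value, key, value, ...] reply into a dict."""
    return {reply[i]: reply[i + 1] for i in range(0, len(reply), 2)}


def parse_ft_info(reply: List[Any]) -> Dict[str, Any]:
    """Fold the top level, the index definition, and each field entry."""
    info = pairs_to_dict(reply)
    info["index_definition"] = pairs_to_dict(info["index_definition"])
    info["fields"] = [pairs_to_dict(field) for field in info["fields"]]
    return info


# Hand-written sample mimicking the documented shape; values are made up.
sample = [
    "index_name", "myidx",
    "index_definition", ["key_type", "HASH", "prefixes", ["doc:"]],
    "fields", [
        ["identifier", "a", "type", "numeric", "options", []],
        ["identifier", "b", "type", "tag", "options", []],
    ],
]
info = parse_ft_info(sample)
print(info["index_definition"]["prefixes"])
print([field["identifier"] for field in info["fields"]])
```

Because the output format may change, code like this is best kept tolerant of unknown keys rather than relying on fixed positions.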
+
+### FT.SEARCHSQL (extension)
+
+```
+FT.SEARCHSQL sql
+ [PARAMS nargs name value [ name value ...]]
+```
+
+`FT.SEARCHSQL` performs a `sql` query on an index created by `FT.CREATE`.
+
+Additional parameters:
+- `PARAMS` to supply additional information to the parameterized query.
+
+### FT.EXPLAINSQL (extension)
+
+```
+FT.EXPLAINSQL sql
+ [PARAMS nargs name value [ name value ...]]
+ [SIMPLE | DOT]
+```
+
+`FT.EXPLAINSQL` returns the plan that Kvrocks will use to execute the `sql`
query (a.k.a. the query plan).
+
+Additional parameters:
+- `PARAMS`: same as in `FT.SEARCHSQL`;
+- `SIMPLE`: print a simple representation of the query plan;
+- `DOT`: print the query plan in Graphviz
[DOT](https://en.wikipedia.org/wiki/DOT_(graph_description_language)) format
(which can be used to generate a graphical representation of a directed graph).
+
+## SQL syntax
+
+Currently Kvrocks supports an extended subset of the MySQL query syntax, in
particular the `SELECT` statement:
+
+```
+SELECT
+ * | field [, field ...]
+FROM index_name
+WHERE query_expr
+ORDER BY
+ field_name [ASC | DESC] | vec_field <-> vec < range
+LIMIT [offset] count
+```
+
+where the query expression `query_expr` can be:
+
+```
+true | false |
+(query_expr) |
+query_expr AND query_expr |
+query_expr OR query_expr |
+NOT query_expr |
+tag_field HASTAG tag |
+num_atom NUM_OP num_atom |
+vec_field <-> vec < range
+```
+
+where the numeric operation `NUM_OP` can be:
+
+```
+< | <= | > | >= | !=
+```
+
+and the `num_atom` can be:
+
+```
+num_field | num_literal
+```
+
+Literals inside the query can also be replaced by parameters of the form
`@param_name`,
+e.g. `a < 233` can be written as `a < @num` with `PARAMS 1 num 233` supplied to
`FT.SEARCHSQL`.
+
+## RediSearch query syntax
+
+Currently Kvrocks also supports a subset of [the RediSearch query
syntax](https://redis.io/docs/latest/develop/interact/search-and-query/advanced-concepts/query_syntax/).
+
+RediSearch controls the evolution of the query syntax through [dialect
versioning](https://redis.io/docs/latest/develop/interact/search-and-query/advanced-concepts/dialects/).
+Currently, Kvrocks supports `DIALECT 2`.
+In the future, we may support higher dialect versions (currently 3 and 4),
but there are no plans to support `DIALECT 1`.
+
+The following query clauses are currently supported in Kvrocks, and you can
compose them via `clause | clause` (OR), `clause clause` (AND) and
`-clause` (NOT):
+- `*`, i.e. `true` in SQL;
+- `@num_field:[NUM_BOUND NUM_BOUND]`, e.g. `@a:[1 (3]` means `a >= 1 and a <
3`;
+- `@tag_field:{tag [|tag ...]}`, e.g. `@b:{x | y}` means `b hastag x or b
hastag y`;
+- `@vec_field:[VECTOR_RANGE range $vec]` for vector range query.
+
+where `NUM_BOUND` can be:
+```
+ num
+| (num
+| INF
+| +INF
+| -INF
+```
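
To make the bound notation concrete, here is a small Python sketch of the range semantics described above, where a leading `(` marks an exclusive bound. It only models the documented meaning (`@a:[1 (3]` is `a >= 1 and a < 3`) and is not how Kvrocks parses queries; the function names are hypothetical, and matching `inf` case-insensitively is an assumption.

```python
import math
from typing import Callable, Tuple


def parse_bound(token: str) -> Tuple[float, bool]:
    """Return (value, exclusive) for one NUM_BOUND token."""
    exclusive = token.startswith("(")
    text = token[1:] if exclusive else token
    lowered = text.lower()  # assumption: INF is matched case-insensitively
    if lowered in ("inf", "+inf"):
        return math.inf, exclusive
    if lowered == "-inf":
        return -math.inf, exclusive
    return float(text), exclusive


def range_predicate(lower: str, upper: str) -> Callable[[float], bool]:
    """Build a predicate equivalent to @field:[lower upper]."""
    lo, lo_exclusive = parse_bound(lower)
    hi, hi_exclusive = parse_bound(upper)

    def matches(x: float) -> bool:
        above = x > lo if lo_exclusive else x >= lo
        below = x < hi if hi_exclusive else x <= hi
        return above and below

    return matches


# @a:[1 (3] keeps values with a >= 1 and a < 3.
in_range = range_predicate("1", "(3")
print([x for x in (0, 1, 2, 3) if in_range(x)])  # [1, 2]
```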
+
+KNN queries without prefiltering are also supported:
+```
+* => [KNN n @vec_field $vec]
+```
+
+Literals inside the query can also be replaced by parameters of the form
`$param_name`,
+e.g. `@a:[-inf 233]` can be written as `@a:[-inf $num]` with `PARAMS 1 num 233`
supplied to `FT.SEARCH`.
+
+## Field types
+
+An index in RediSearch consists of multiple fields, and fields can be of
different types.
+Currently, Kvrocks supports three field types:
+- `TAG`: a tag field can hold a set of string tags, to filter rows by specific
tags in queries;
+- `NUMERIC`: a numeric field can hold a floating point number;
+- `VECTOR`: a vector field can hold a vector, for performing vector search.
+
+### Tag
+
+Field properties:
+```
+SCHEMA field_name TAG
+ [SEPARATOR sep]
+ [CASESENSITIVE]
+```
+
+By default, the `SEPARATOR` is `,` and `CASESENSITIVE` is not set.
+
+The only operation on a tag field in queries is checking whether a row is
labeled with a given tag, i.e. `tag_field HASTAG tag` in SQL.
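
The effect of `SEPARATOR` and `CASESENSITIVE` can be pictured with a short Python sketch that splits a raw field value into a tag set and then performs a `HASTAG`-style membership check. This is only a model of the documented defaults: the function names are hypothetical, and trimming whitespace and lowercasing for case-insensitive matching are assumptions rather than confirmed Kvrocks behavior.

```python
from typing import Set


def split_tags(value: str, separator: str = ",", case_sensitive: bool = False) -> Set[str]:
    """Split a raw field value into a set of tags."""
    # Assumption: surrounding whitespace is trimmed and empty tags are dropped.
    tags = {part.strip() for part in value.split(separator)}
    tags.discard("")
    if not case_sensitive:
        # Assumption: case-insensitive matching is modeled by lowercasing.
        tags = {tag.lower() for tag in tags}
    return tags


def has_tag(value: str, tag: str, separator: str = ",", case_sensitive: bool = False) -> bool:
    """Model `tag_field HASTAG tag` for a single row's field value."""
    needle = tag if case_sensitive else tag.lower()
    return needle in split_tags(value, separator, case_sensitive)


print(has_tag("Red,green", "red"))                       # True
print(has_tag("Red,green", "red", case_sensitive=True))  # False
print(has_tag("a;b", "b", separator=";"))                # True
```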
+
+### Numeric
+
+A numeric field has no field properties, i.e.
+```
+SCHEMA field_name NUMERIC
+```
+
+As shown in the query syntax, numeric fields can be used in numeric
comparisons to filter data.
+
+### Vector
+
+Field properties:
+```
+SCHEMA field_name VECTOR HNSW nargs
+ TYPE FLOAT64
+ DIM dim
+ DISTANCE_METRIC L2 | IP | COSINE
+ [M m]
+ [EF_CONSTRUCTION ef_construction]
+ [EF_RUNTIME ef_runtime]
+ [EPSILON epsilon]
+```
+
+Currently, the only indexing algorithm available for vector fields is `HNSW`,
+and the `TYPE` of an HNSW vector field can only be `FLOAT64`.
+We may extend it to more types like `FLOAT32` and `FLOAT16` in the future.
diff --git a/sidebars.js b/sidebars.js
index 8c62300..9f7ee9f 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -4,6 +4,7 @@ const sidebars = {
'getting-started',
'namespace',
'cluster',
+ 'kvrocks-search',
'replication',
{
"type": "category",