This is an automated email from the ASF dual-hosted git repository.
liuxiaocs pushed a commit to branch docusaurus
in repository https://gitbox.apache.org/repos/asf/incubator-hugegraph-doc.git
The following commit(s) were added to refs/heads/docusaurus by this push:
new f473564f chore: migrate guide pages
f473564f is described below
commit f473564f9bba44d579059fd40b69408b05568843
Author: liuxiao <[email protected]>
AuthorDate: Mon Apr 15 09:31:36 2024 +0800
chore: migrate guide pages
---
docs/guide/_category_.json | 4 +
docs/guide/architectural.md | 29 +++
docs/guide/backup-restore.md | 159 ++++++++++++++
docs/guide/custom-plugin.md | 320 +++++++++++++++++++++++++++++
docs/guide/desgin-concept.md | 218 ++++++++++++++++++++
docs/guide/faq.md | 101 +++++++++
docs/guide/security.md | 24 +++
static/img/guide/GraphCut.png | Bin 0 -> 37174 bytes
static/img/guide/PropertyGraph.png | Bin 0 -> 126622 bytes
static/img/guide/architectural-revised.png | Bin 0 -> 338402 bytes
10 files changed, 855 insertions(+)
diff --git a/docs/guide/_category_.json b/docs/guide/_category_.json
new file mode 100644
index 00000000..811adbcc
--- /dev/null
+++ b/docs/guide/_category_.json
@@ -0,0 +1,4 @@
+{
+ "label": "Guide",
+ "position": 5
+}
\ No newline at end of file
diff --git a/docs/guide/architectural.md b/docs/guide/architectural.md
new file mode 100644
index 00000000..559d94e6
--- /dev/null
+++ b/docs/guide/architectural.md
@@ -0,0 +1,29 @@
+---
+id: 'architecture-overview'
+title: 'HugeGraph Architecture Overview'
+sidebar_label: 'Architecture Overview'
+sidebar_position: 1
+---
+
+### 1 Overview
+
+As a general-purpose graph database product, HugeGraph needs to possess basic
graph database functionality. HugeGraph supports two types of graph
computation: OLTP and OLAP. For OLTP, it implements the [Apache
TinkerPop3](https://tinkerpop.apache.org) framework and supports the
[Gremlin](https://tinkerpop.apache.org/gremlin.html) and
[Cypher](https://en.wikipedia.org/wiki/Cypher) query languages. It comes with a
complete application toolchain and provides a plugin-based backend storage d
[...]
+
+Below is the overall architecture diagram of HugeGraph:
+
+![HugeGraph Architecture](/img/guide/architectural-revised.png)
+
+HugeGraph consists of three layers of functionality: the application layer,
the graph engine layer, and the storage layer.
+
+- Application Layer:
+ - [Hubble](/docs/quickstart/hugegraph-hubble/): An all-in-one visual
analytics platform that covers the entire process of data modeling, rapid data
import, online and offline analysis of data, and unified management of graphs.
It provides a guided workflow for operating graph applications.
+ - [Loader](/docs/quickstart/hugegraph-loader/): A data import component that
can transform data from various sources into vertices and edges and bulk import
them into the graph database.
+ - [Tools](/docs/quickstart/hugegraph-tools/): Command-line tools for
deploying, managing, and backing up/restoring data in HugeGraph.
+ - [Computer](/docs/quickstart/hugegraph-computer/): A distributed graph
processing system (OLAP) that implements
[Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf). It can run on
Kubernetes.
+ - [Client](/docs/quickstart/hugegraph-client/): HugeGraph client written in
Java. Users can use the client to operate HugeGraph using Java code. Support
for other languages such as Python, Go, and C++ may be provided in the future.
+- [Graph Engine Layer](/docs/quickstart/hugegraph-server/):
+ - REST Server: Provides a RESTful API for querying graph/schema information,
supports the [Gremlin](https://tinkerpop.apache.org/gremlin.html) and
[Cypher](https://en.wikipedia.org/wiki/Cypher) query languages, and offers APIs
for service monitoring and operations.
+ - Graph Engine: Supports both OLTP and OLAP graph computation types, with
OLTP implementing the [Apache TinkerPop3](https://tinkerpop.apache.org)
framework.
+  - Backend Interface: Implements the persistence of graph data to the backend storage.
+- Storage Layer:
+ - Storage Backend: Supports multiple built-in storage backends
(RocksDB/MySQL/HBase/...) and allows users to extend custom backends without
modifying the existing source code.
diff --git a/docs/guide/backup-restore.md b/docs/guide/backup-restore.md
new file mode 100644
index 00000000..d6abef12
--- /dev/null
+++ b/docs/guide/backup-restore.md
@@ -0,0 +1,159 @@
+---
+id: 'backup-restore'
+title: 'Backup and Restore'
+sidebar_label: 'Backup Restore'
+sidebar_position: 4
+---
+
+## Description
+
+Backup and Restore are functions for backing up and restoring a graph. The data backed up and restored includes the metadata (schema) and the graph data (vertices and edges).
+
+#### Backup
+
+Export the metadata and graph data of a graph in the HugeGraph system in JSON
format.
+
+#### Restore
+
+Re-import the data in JSON format exported by Backup to a graph in the
HugeGraph system.
+
+Restore has two modes:
+
+- In Restoring mode, the metadata and graph data exported by Backup are restored to the HugeGraph system intact. It can be used for graph backup and recovery, and the target graph is generally a new graph (without metadata or graph data). For example:
+  - System upgrade: back up the graph first, then upgrade the system, and finally restore the graph to the new system
+  - Graph migration: use the Backup function to export the graph from one HugeGraph system, and then use the Restore function to import it into another HugeGraph system
+- In Merging mode, the metadata and graph data exported by Backup are imported into another graph that already has metadata or graph data. During the process, the IDs of the metadata may change, and the IDs of vertices and edges will change accordingly.
+  - Can be used to merge graphs
+
+## Instructions
+
+You can use [hugegraph-tools](/docs/quickstart/hugegraph-tools) to backup and
restore graphs.
+
+#### Backup
+
+```bash
+bin/hugegraph backup -t all -d data
+```
+
+This command backs up all the metadata and graph data of the graph `hugegraph` at http://127.0.0.1 into the `data` directory.
+
+> Backup works fine in all three graph modes
+
+#### Restore
+
+Restore has two modes: RESTORING and MERGING. Before restoring, you must first set the graph mode according to your needs.
+
+##### Step 1: View and set graph mode
+
+```bash
+bin/hugegraph graph-mode-get
+```
+This command is used to view the current graph mode, including: NONE,
RESTORING, MERGING.
+
+```bash
+bin/hugegraph graph-mode-set -m RESTORING
+```
+
+This command is used to set the graph mode. Before restoring, set it to RESTORING or MERGING as needed; in this example it is set to RESTORING.
+
+##### Step 2: Restore data
+
+```bash
+bin/hugegraph restore -t all -d data
+```
+This command re-imports all the metadata and graph data in the `data` directory into the graph `hugegraph` at http://127.0.0.1.
+
+##### Step 3: Restoring Graph Mode
+
+```bash
+bin/hugegraph graph-mode-set -m NONE
+```
+This command is used to restore the graph mode to NONE.
+
+This completes a full graph backup and recovery cycle.
+
+#### Help
+
+For detailed usage of backup and restore commands, please refer to the
[hugegraph-tools documentation](/docs/quickstart/hugegraph-tools).
+
+## API description for Backup/Restore usage and implementation
+
+#### Backup
+
+Backup exports metadata and graph data using the corresponding list (GET) APIs; no new APIs are added.
+
+#### Restore
+
+Restore imports metadata and graph data using the corresponding create (POST) APIs; no new APIs are added.
+
+There are two different modes for Restore: Restoring and Merging. In addition, there is the regular NONE mode (the default). The differences are as follows:
+
+- NONE mode: metadata and graph data are written normally (please refer to the function description). In particular:
+  - Specifying an ID is not allowed when creating metadata (schema)
+  - Specifying an ID is not allowed when creating graph data (vertices) whose id strategy is AUTOMATIC
+- Restoring mode: restoring into a new graph. In particular:
+  - Specifying an ID is allowed when creating metadata (schema)
+  - Specifying an ID is allowed when creating graph data (vertices) whose id strategy is AUTOMATIC
+- Merging mode: merging into a graph with existing metadata and graph data. In particular:
+  - Specifying an ID is not allowed when creating metadata (schema)
+  - Specifying an ID is allowed when creating graph data (vertices) whose id strategy is AUTOMATIC
+
+
+Normally the graph mode is NONE. When you need to restore a graph, temporarily change the graph mode to RESTORING or
+MERGING as needed, and when the restore is complete, set the graph mode back to NONE.
+
+The implemented RESTful API for setting graph mode is as follows:
+
+##### View the mode of a graph. **This operation requires administrator privileges**
+
+###### Method & Url
+
+```
+GET http://localhost:8080/graphs/{graph}/mode
+```
+
+###### Response Status
+
+```json
+200
+```
+
+###### Response Body
+
+```json
+{
+ "mode": "NONE"
+}
+```
+
+> Legal graph modes include: NONE, RESTORING, MERGING
+
+##### Set the mode of a graph. **This operation requires administrator privileges**
+
+###### Method & Url
+
+```
+PUT http://localhost:8080/graphs/{graph}/mode
+```
+
+###### Request Body
+
+```
+"RESTORING"
+```
+
+> Legal graph modes include: NONE, RESTORING, MERGING
+
+###### Response Status
+
+```json
+200
+```
+
+###### Response Body
+
+```json
+{
+ "mode": "RESTORING"
+}
+```
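+
+Below is a minimal Java sketch of this mode-switching workflow, using JDK 11's built-in `java.net.http` client. It assumes HugeGraph-Server is reachable at `localhost:8080` without authentication and that the graph is named `hugegraph`; adjust both for your deployment.
+
+```java
+import java.net.URI;
+import java.net.http.HttpClient;
+import java.net.http.HttpRequest;
+import java.net.http.HttpResponse;
+
+public class GraphModeExample {
+
+    public static void main(String[] args) throws Exception {
+        // Assumed server address and graph name; adjust for your deployment
+        HttpClient client = HttpClient.newHttpClient();
+        URI modeUri = URI.create("http://localhost:8080/graphs/hugegraph/mode");
+
+        // Set the graph mode to RESTORING before running the restore
+        HttpRequest set = HttpRequest.newBuilder(modeUri)
+                .header("Content-Type", "application/json")
+                .PUT(HttpRequest.BodyPublishers.ofString("\"RESTORING\""))
+                .build();
+        System.out.println(client.send(set, HttpResponse.BodyHandlers.ofString()).body());
+
+        // Verify the current graph mode, expecting {"mode": "RESTORING"}
+        HttpRequest get = HttpRequest.newBuilder(modeUri).GET().build();
+        System.out.println(client.send(get, HttpResponse.BodyHandlers.ofString()).body());
+    }
+}
+```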
diff --git a/docs/guide/custom-plugin.md b/docs/guide/custom-plugin.md
new file mode 100644
index 00000000..fba40928
--- /dev/null
+++ b/docs/guide/custom-plugin.md
@@ -0,0 +1,320 @@
+---
+id: 'hugegraph-plugin'
+title: 'HugeGraph Plugin mechanism and plug-in extension process'
+sidebar_label: 'HugeGraph Plugin'
+sidebar_position: 3
+---
+
+### Background
+
+1. HugeGraph is not only open source but also designed to be simple and easy to use. General users can add extension functions through plug-ins without changing the source code.
+2. HugeGraph supports a variety of built-in storage backends, and also allows users to extend custom backends without changing the existing source code.
+3. HugeGraph supports full-text search. The full-text search function involves word segmentation for various languages. Currently there are 8 built-in Chinese tokenizers, and users can also extend custom tokenizers without changing the existing source code.
+
+### Extension dimensions
+
+Currently, the plug-in mechanism provides extensions in the following dimensions:
+
+- Backend storage
+- Serializer
+- Custom configuration options
+- Tokenizer
+
+### Plug-in implementation mechanism
+
+1. HugeGraph provides a plug-in interface, HugeGraphPlugin, which supports plug-ins through the Java SPI mechanism
+2. HugeGraph provides four extension registration functions: registerOptions(), registerBackend(), registerSerializer() and registerAnalyzer()
+3. The plug-in implementer implements the corresponding Options, Backend, Serializer or Analyzer interface
+4. The plug-in implementer implements the register() method of the HugeGraphPlugin interface, registers the concrete
+implementation classes from step 3 in this method, and packs everything into a jar package
+5. The plug-in user puts the jar package into the `plugins` directory of the HugeGraph-Server installation, sets the relevant
+configuration items to the plug-in's custom values, and restarts the server for the changes to take effect
+
+### Plug-in implementation process example
+
+#### 1 Create a new maven project
+
+##### 1.1 Name the project: hugegraph-plugin-demo
+
+##### 1.2 Add the `hugegraph-core` jar dependency
+
+The maven pom.xml is as follows:
+
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+ <modelVersion>4.0.0</modelVersion>
+ <groupId>org.apache.hugegraph</groupId>
+ <artifactId>hugegraph-plugin-demo</artifactId>
+ <version>1.0.0</version>
+ <packaging>jar</packaging>
+
+ <name>hugegraph-plugin-demo</name>
+
+ <dependencies>
+ <dependency>
+ <groupId>org.apache.hugegraph</groupId>
+ <artifactId>hugegraph-core</artifactId>
+            <version>${project.version}</version> <!-- replace with the hugegraph-core version in use -->
+ </dependency>
+ </dependencies>
+</project>
+
+```
+
+#### 2 Implement the extension functions
+
+##### 2.1 Extending a custom backend
+
+###### 2.1.1 Implement the interface BackendStoreProvider
+
+- Implement the interface: `org.apache.hugegraph.backend.store.BackendStoreProvider`
+- Or inherit the abstract class: `org.apache.hugegraph.backend.store.AbstractBackendStoreProvider`
+
+Take the RocksDB backend RocksDBStoreProvider as an example:
+
+```java
+public class RocksDBStoreProvider extends AbstractBackendStoreProvider {
+
+ protected String database() {
+ return this.graph().toLowerCase();
+ }
+
+ @Override
+ protected BackendStore newSchemaStore(String store) {
+ return new RocksDBSchemaStore(this, this.database(), store);
+ }
+
+ @Override
+ protected BackendStore newGraphStore(String store) {
+ return new RocksDBGraphStore(this, this.database(), store);
+ }
+
+ @Override
+ public String type() {
+ return "rocksdb";
+ }
+
+ @Override
+ public String version() {
+ return "1.0";
+ }
+}
+```
+
+###### 2.1.2 Implement the interface BackendStore
+
+The BackendStore interface is defined as follows:
+
+```java
+public interface BackendStore {
+ // Store name
+ public String store();
+
+ // Database name
+ public String database();
+
+ // Get the parent provider
+ public BackendStoreProvider provider();
+
+ // Open/close database
+ public void open(HugeConfig config);
+ public void close();
+
+ // Initialize/clear database
+ public void init();
+ public void clear();
+
+ // Add/delete data
+ public void mutate(BackendMutation mutation);
+
+ // Query data
+ public Iterator<BackendEntry> query(Query query);
+
+ // Transaction
+ public void beginTx();
+ public void commitTx();
+ public void rollbackTx();
+
+ // Get metadata by key
+ public <R> R metadata(HugeType type, String meta, Object[] args);
+
+ // Backend features
+ public BackendFeatures features();
+
+ // Generate an id for a specific type
+ public Id nextId(HugeType type);
+}
+```
+
+###### 2.1.3 Extending custom serializers
+
+The serializer must inherit the abstract class `org.apache.hugegraph.backend.serializer.AbstractSerializer` (`implements GraphSerializer, SchemaSerializer`). The main interfaces are defined as follows:
+
+```java
+public interface GraphSerializer {
+ public BackendEntry writeVertex(HugeVertex vertex);
+ public BackendEntry writeVertexProperty(HugeVertexProperty<?> prop);
+ public HugeVertex readVertex(HugeGraph graph, BackendEntry entry);
+ public BackendEntry writeEdge(HugeEdge edge);
+ public BackendEntry writeEdgeProperty(HugeEdgeProperty<?> prop);
+ public HugeEdge readEdge(HugeGraph graph, BackendEntry entry);
+ public BackendEntry writeIndex(HugeIndex index);
+ public HugeIndex readIndex(HugeGraph graph, ConditionQuery query,
BackendEntry entry);
+ public BackendEntry writeId(HugeType type, Id id);
+ public Query writeQuery(Query query);
+}
+
+public interface SchemaSerializer {
+ public BackendEntry writeVertexLabel(VertexLabel vertexLabel);
+ public VertexLabel readVertexLabel(HugeGraph graph, BackendEntry entry);
+ public BackendEntry writeEdgeLabel(EdgeLabel edgeLabel);
+ public EdgeLabel readEdgeLabel(HugeGraph graph, BackendEntry entry);
+ public BackendEntry writePropertyKey(PropertyKey propertyKey);
+ public PropertyKey readPropertyKey(HugeGraph graph, BackendEntry entry);
+ public BackendEntry writeIndexLabel(IndexLabel indexLabel);
+ public IndexLabel readIndexLabel(HugeGraph graph, BackendEntry entry);
+}
+```
+
+###### 2.1.4 Extend custom configuration items
+
+When adding a custom backend, it may be necessary to add new configuration options. The implementation process mainly includes:
+
+- Add a configuration option container class that implements the interface `org.apache.hugegraph.config.OptionHolder`
+- Provide a singleton method `public static OptionHolder instance()`, and call `OptionHolder.registerOptions()` when the object is initialized
+- Declare the configuration options: the type of a single-value option is `ConfigOption`, and the type of a multi-value option is `ConfigListOption`
+
+Take the RocksDB configuration item definition as an example:
+
+```java
+public class RocksDBOptions extends OptionHolder {
+
+ private RocksDBOptions() {
+ super();
+ }
+
+ private static volatile RocksDBOptions instance;
+
+ public static synchronized RocksDBOptions instance() {
+ if (instance == null) {
+ instance = new RocksDBOptions();
+ instance.registerOptions();
+ }
+ return instance;
+ }
+
+ public static final ConfigOption<String> DATA_PATH =
+ new ConfigOption<>(
+ "rocksdb.data_path",
+ "The path for storing data of RocksDB.",
+ disallowEmpty(),
+ "rocksdb-data"
+ );
+
+ public static final ConfigOption<String> WAL_PATH =
+ new ConfigOption<>(
+ "rocksdb.wal_path",
+ "The path for storing WAL of RocksDB.",
+ disallowEmpty(),
+ "rocksdb-data"
+ );
+
+ public static final ConfigListOption<String> DATA_DISKS =
+ new ConfigListOption<>(
+ "rocksdb.data_disks",
+ false,
+ "The optimized disks for storing data of RocksDB. " +
+ "The format of each element: `STORE/TABLE:
/path/to/disk`." +
+ "Allowed keys are [graph/vertex, graph/edge_out,
graph/edge_in, " +
+ "graph/secondary_index, graph/range_index]",
+ null,
+ String.class,
+ ImmutableList.of()
+ );
+}
+```
+
+##### 2.2 Extend custom tokenizer
+
+The tokenizer needs to implement the interface `org.apache.hugegraph.analyzer.Analyzer`. Take implementing a SpaceAnalyzer, which segments text by spaces, as an example.
+
+```java
+package org.apache.hugegraph.plugin;
+
+import java.util.Arrays;
+import java.util.HashSet;
+import java.util.Set;
+
+import org.apache.hugegraph.analyzer.Analyzer;
+
+public class SpaceAnalyzer implements Analyzer {
+
+ @Override
+ public Set<String> segment(String text) {
+ return new HashSet<>(Arrays.asList(text.split(" ")));
+ }
+}
+```
+
+#### 3. Implement the plug-in interface and register it
+
+The plug-in registration entry is `HugeGraphPlugin.register()`. A custom plug-in must implement this interface method and register the extension
+items defined above inside it. The interface `org.apache.hugegraph.plugin.HugeGraphPlugin` is defined as follows:
+
+```java
+public interface HugeGraphPlugin {
+
+ public String name();
+
+ public void register();
+
+ public String supportsMinVersion();
+
+ public String supportsMaxVersion();
+}
+```
+
+HugeGraphPlugin provides 4 static methods for registering extensions:
+
+- registerOptions(String name, String classPath): register configuration items
+- registerBackend(String name, String classPath): register backend
(BackendStoreProvider)
+- registerSerializer(String name, String classPath): register serializer
+- registerAnalyzer(String name, String classPath): register tokenizer
+
+
+The following is an example of registering the SpaceAnalyzer tokenizer:
+
+```java
+package org.apache.hugegraph.plugin;
+
+public class DemoPlugin implements HugeGraphPlugin {
+
+ @Override
+ public String name() {
+ return "demo";
+ }
+
+ @Override
+ public void register() {
+ HugeGraphPlugin.registerAnalyzer("demo",
SpaceAnalyzer.class.getName());
+ }
+}
+```
+
+#### 4. Configure the SPI entry
+
+1. Make sure the services directory exists: hugegraph-plugin-demo/src/main/resources/META-INF/services
+2. Create a text file in the services directory named: org.apache.hugegraph.plugin.HugeGraphPlugin
+3. The content of the file is the plug-in class name: org.apache.hugegraph.plugin.DemoPlugin
+
+#### 5. Build the jar package
+
+Package the project with maven by executing `mvn package` in the project directory; a jar file will be generated in the
+target directory. To use the plug-in, copy the jar into the `plugins` directory and restart the service for it to take effect.
diff --git a/docs/guide/desgin-concept.md b/docs/guide/desgin-concept.md
new file mode 100644
index 00000000..16069530
--- /dev/null
+++ b/docs/guide/desgin-concept.md
@@ -0,0 +1,218 @@
+---
+id: 'design-concepts'
+title: 'HugeGraph Design Concepts'
+sidebar_label: 'Design Concepts'
+sidebar_position: 2
+---
+
+
+### 1. Property Graph
+There are two common graph data representation models: the RDF (Resource Description Framework) model and the Property Graph model.
+Both are basic and well-known graph representation models that can express the entity-relationship modeling of various graphs.
+RDF is a W3C standard, while Property Graph is a de facto industry standard widely supported by graph database vendors. HugeGraph currently uses the Property Graph model.
+
+The storage concept model of HugeGraph is also designed with reference to the Property Graph model. See the figure below for a concrete example:
+(Note: this figure reflects an older version of the design and is outdated; it will be updated later.)
+
+![Property Graph](/img/guide/PropertyGraph.png)
+
+Inside HugeGraph, each vertex/edge is identified by a unique VertexId/EdgeId, and its properties are stored inside the corresponding vertex/edge.
+The relationships/mappings between vertices are stored as edges.
+
+Since vertex property values are stored inside the vertex, updating a vertex-specific property value can be done directly with an overwrite write.
+The disadvantage is that the VertexId is stored redundantly. Updating a property of a relationship requires a read-and-modify approach:
+read all the properties first, modify some of them, and then write them back to the storage system, so the update efficiency is low.
+Experience shows that vertex properties are modified far more often than edge properties; for example, computations such as PageRank and graph clustering require frequent
+modification of vertex property values.
+
+### 2. Graph Partition Scheme
+For distributed graph databases, there are two partition storage methods: Edge Cut and Vertex Cut, as shown in the following figure. With the
+Edge Cut method, any vertex appears on only one machine, while edges may be distributed across different machines; this method may store some edges
+more than once. With the Vertex Cut method, any edge appears on only one machine, while the same vertex may be distributed
+across different machines; this method may store some vertices more than once.
+
+![Edge Cut and Vertex Cut](/img/guide/GraphCut.png)
+
+The EdgeCut partition scheme supports high-performance insert and update operations, while the VertexCut partition scheme is more suitable for static graph query
+analysis; therefore EdgeCut is suitable for OLTP graph queries and VertexCut is more suitable for OLAP graph queries. HugeGraph currently adopts the EdgeCut partition scheme.
+
+### 3. VertexId Strategy
+
+HugeGraph vertices support three Id strategies. Different VertexLabels in the same graph database can use different Id strategies. The Id strategies
+currently supported by HugeGraph are:
+
+- Automatic generation (AUTOMATIC): Use the Snowflake algorithm to
automatically generate a globally unique Id, Long type;
+- Primary Key (PRIMARY_KEY): Generate Id through VertexLabel+PrimaryKeyValues,
String type;
+- Custom (CUSTOMIZE_STRING|CUSTOMIZE_NUMBER): User-defined Id, which is
divided into two types: String and Long, and you need to ensure the uniqueness
of the Id yourself;
+
+The default Id strategy is AUTOMATIC. If the user calls the primaryKeys() method and sets correct PrimaryKeys, the PRIMARY_KEY strategy is enabled automatically.
+Once the PRIMARY_KEY strategy is enabled, HugeGraph can deduplicate data based on the PrimaryKeys.
+
+ 1. AUTOMATIC ID Policy
+ ```java
+schema.vertexLabel("person")
+ .useAutomaticId()
+ .properties("name", "age", "city")
+ .create();
+graph.addVertex(T.label, "person","name", "marko", "age", 18, "city",
"Beijing");
+ ```
+
+ 2. PRIMARY_KEY ID Policy
+ ```java
+schema.vertexLabel("person")
+ .usePrimaryKeyId()
+ .properties("name", "age", "city")
+ .primaryKeys("name", "age")
+ .create();
+graph.addVertex(T.label, "person","name", "marko", "age", 18, "city",
"Beijing");
+ ```
+
+ 3. CUSTOMIZE_STRING ID Policy
+ ```java
+schema.vertexLabel("person")
+ .useCustomizeStringId()
+ .properties("name", "age", "city")
+ .create();
+graph.addVertex(T.label, "person", T.id, "123456", "name", "marko","age", 18,
"city", "Beijing");
+ ```
+
+ 4. CUSTOMIZE_NUMBER ID Policy
+ ```java
+schema.vertexLabel("person")
+ .useCustomizeNumberId()
+ .properties("name", "age", "city")
+ .create();
+graph.addVertex(T.label, "person", T.id, 123456, "name", "marko","age", 18,
"city", "Beijing");
+ ```
+
+If users need vertex deduplication, there are three options:
+
+1. Adopt the PRIMARY_KEY strategy with automatic overwriting; suitable for bulk insertion of large amounts of data, but users cannot tell whether overwriting has occurred
+2. Adopt the AUTOMATIC strategy with read-and-modify; suitable for small-volume insertion, and users can clearly know whether overwriting occurs
+3. Adopt the CUSTOMIZE_STRING or CUSTOMIZE_NUMBER strategy, where the user guarantees uniqueness
+
+### 4. EdgeId Strategy
+
+The EdgeId of HugeGraph is composed of `srcVertexId` + `edgeLabel` + `sortKey` + `tgtVertexId`. Among them, `sortKey` is an important concept in HugeGraph.
+There are two reasons for including `sortKey` in the unique ID of an Edge:
+
+1. If there are multiple edges of the same label between two vertices, they can be distinguished by `sortKey`
+2. For SuperNodes, edges can be sorted and truncated by `sortKey`
+
+Since EdgeId is composed of `srcVertexId` + `edgeLabel` + `sortKey` + `tgtVertexId`, HugeGraph automatically overwrites when the same Edge is inserted
+multiple times, thereby achieving deduplication. Note that the properties of the Edge are also overwritten in batch-insert mode. A sketch of this behavior follows.
+
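+A minimal sketch of this deduplication behavior, reusing the `author`/`language`/`created` schema from the transaction example in section 5 (the `weight` property is illustrative):
+
+```java
+Vertex james = graph.addVertex(T.label, "author", "id", 1,
+                               "name", "James Gosling", "age", 62);
+Vertex java = graph.addVertex(T.label, "language", "name", "java");
+
+// Both inserts produce the same EdgeId (srcVertexId + edgeLabel + sortKey + tgtVertexId),
+// so the second insert overwrites the first edge together with its properties
+james.addEdge("created", java, "weight", 0.8);
+james.addEdge("created", java, "weight", 0.9);
+graph.tx().commit();
+
+// Only one "created" edge remains, carrying weight 0.9
+```
+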
+In addition, because HugeGraph's EdgeId adopts this automatic deduplication strategy, HugeGraph considers a self-loop
+(a vertex with an edge pointing to itself) to be a single edge, although traversing both the OUT and IN directions visits that edge twice.
+
+> The edges of HugeGraph only support directed edges; undirected edges can be realized by creating two edges, Out and In.
+
+### 5. HugeGraph transaction overview
+
+##### TinkerPop transaction overview
+
+A TinkerPop transaction refers to a unit of work that performs operations against the database; the operations within a transaction either all succeed or all fail. For a detailed introduction, please refer to the official TinkerPop documentation: http://tinkerpop.apache.org/docs/current/reference/#transactions
+
+##### TinkerPop transaction operations
+
+- open: open a transaction
+- commit: commit a transaction
+- rollback: roll back a transaction
+- close: close a transaction
+
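+Below is a minimal sketch of these four operations, assuming manual transaction mode and the `author` schema used in the example further below:
+
+```java
+graph.tx().open();                  // open the transaction
+try {
+    graph.addVertex(T.label, "author", "id", 1, "name", "James Gosling");
+    graph.tx().commit();            // commit: make the changes visible to other transactions
+} catch (Exception e) {
+    graph.tx().rollback();          // rollback: discard all uncommitted changes
+} finally {
+    graph.tx().close();             // close the transaction
+}
+```
+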
+##### TinkerPop transaction specification
+
+- A transaction must be explicitly committed before it takes effect (uncommitted modifications are visible only to queries within the same transaction)
+- A transaction must be opened before it can be committed or rolled back
+- If the transaction is set to open automatically (the default), there is no need to open it explicitly; if it is set to open manually, it must be opened explicitly
+- On close, a transaction can be set to one of three modes: auto-commit, auto-rollback (the default), or manual (explicit close is prohibited)
+- A transaction must be closed after committing or rolling back
+- A transaction must be open before it can be queried
+- Transactions (non-threaded tx) must be thread-isolated; multi-threaded operations on the same transaction do not affect each other
+
+For more transaction specification use cases, see: [Transaction
Test](https://github.com/apache/tinkerpop/blob/master/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/structure/TransactionTest.java)
+
+##### HugeGraph transaction implementation
+
+- All operations in a transaction either all succeed or all fail
+- A transaction can only read what has been committed by other transactions (read committed)
+- All uncommitted operations can be queried within the same transaction, including:
+  - An added vertex can be queried
+  - A deleted vertex is filtered out of query results
+  - Deleting a vertex also filters out its related edges
+  - An added edge can be queried
+  - A deleted edge is filtered out of query results
+  - Added/modified properties (of vertices or edges) take effect in queries
+  - Deleted properties (of vertices or edges) take effect in queries
+- All uncommitted operations become invalid after the transaction is rolled back, including:
+  - Addition and deletion of vertices and edges
+  - Addition/modification and deletion of properties
+
+Example: One transaction cannot read another transaction's uncommitted content
+
+```java
+ static void testUncommittedTx(final HugeGraph graph) throws
InterruptedException {
+
+ final CountDownLatch latchUncommit = new CountDownLatch(1);
+ final CountDownLatch latchRollback = new CountDownLatch(1);
+
+ Thread thread = new Thread(() -> {
+ // this is a new transaction in the new thread
+ graph.tx().open();
+
+ System.out.println("current transaction operations");
+
+ Vertex james = graph.addVertex(T.label, "author",
+ "id", 1, "name", "James Gosling",
+ "age", 62, "lived", "Canadian");
+ Vertex java = graph.addVertex(T.label, "language", "name", "java",
+ "versions", Arrays.asList(6, 7, 8));
+ james.addEdge("created", java);
+
+ // we can query the uncommitted records in the current transaction
+ System.out.println("current transaction assert");
+ assert graph.vertices().hasNext() == true;
+ assert graph.edges().hasNext() == true;
+
+ latchUncommit.countDown();
+
+ try {
+ latchRollback.await();
+ } catch (InterruptedException e) {
+ throw new RuntimeException(e);
+ }
+
+ System.out.println("current transaction rollback");
+ graph.tx().rollback();
+ });
+
+ thread.start();
+
+ // query none result in other transaction when not commit()
+ latchUncommit.await();
+ System.out.println("other transaction assert for uncommitted");
+ assert !graph.vertices().hasNext();
+ assert !graph.edges().hasNext();
+
+ latchRollback.countDown();
+ thread.join();
+
+ // query none result in other transaction after rollback()
+ System.out.println("other transaction assert for rollback");
+ assert !graph.vertices().hasNext();
+ assert !graph.edges().hasNext();
+ }
+```
+
+##### Transaction implementation principle
+
+- The server realizes isolation internally by binding transactions to threads (ThreadLocal)
+- Within a transaction, uncommitted content overwrites older data in chronological order, so the transaction queries the latest version of the data
+- The bottom layer relies on the backend database to guarantee transaction atomicity (for example, the batch interfaces of Cassandra/RocksDB guarantee atomicity)
+
+###### Notice
+
+> The RESTful API does not expose a transaction interface for the time being
+
+> The TinkerPop API allows opening transactions, which are automatically closed when the request completes (Gremlin Server forces the close)
+
diff --git a/docs/guide/faq.md b/docs/guide/faq.md
new file mode 100644
index 00000000..6bde9a38
--- /dev/null
+++ b/docs/guide/faq.md
@@ -0,0 +1,101 @@
+---
+id: 'faq'
+title: 'FAQ'
+sidebar_label: 'FAQ'
+sidebar_position: 5
+---
+
+- How to choose the backend storage? RocksDB, Cassandra, HBase or MySQL?
+
+  Judge according to your specific needs. Generally, for a single-machine deployment or a data volume below 10 billion, RocksDB is recommended; for clusters, a backend that uses distributed storage (such as Cassandra or HBase) is recommended.
+
+- Prompt when starting the service: `xxx (core dumped) xxx`
+
+  Please check whether the JDK version is Java 11 (at least Java 8 is required)
+
+- The service starts successfully, but a prompt similar to "Unable to connect to the backend or the connection is not open" appears when operating the graph
+
+  Before starting the service for the first time, you need to initialize the backend with `init-store`; later versions will give a clearer and more direct prompt.
+
+- Does `init-store` need to be executed for all backends before use, and can the serializer option be filled in arbitrarily?
+
+  It is required for all backends except `memory`, e.g. `cassandra`, `hbase`, `rocksdb`, etc. The serializer must correspond one-to-one with the backend and cannot be filled in arbitrarily.
+
+- Executing `init-store` reports the error: ```Exception in thread "main" java.lang.UnsatisfiedLinkError: /tmp/librocksdbjni3226083071221514754.so: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.10' not found (required by /tmp/librocksdbjni3226083071221514754.so)```
+
+ RocksDB requires gcc 4.3.0 (GLIBCXX_3.4.10) and above
+
+- The error `NoHostAvailableException` occurred while executing
`init-store.sh`.
+
+ `NoHostAvailableException` means that the `Cassandra` service cannot be
connected to. If you are sure that you want to use the Cassandra backend,
please install and start this service first. As for the message itself, it may
not be clear enough, and we will update the documentation to provide further
explanation.
+
+- The `bin` directory contains `start-hugegraph.sh`, `start-restserver.sh` and
`start-gremlinserver.sh`. These scripts seem to be related to startup. Which
one should be used?
+
+  Since version 0.3.3, GremlinServer and RestServer have been merged into HugeGraphServer. Use `start-hugegraph.sh` to start; the other two scripts will be removed in future versions.
+
+- Two graphs are configured, named `hugegraph` and `hugegraph1`, and the service is started with `start-hugegraph.sh`. Is only the `hugegraph` graph opened?
+
+  `start-hugegraph.sh` opens all the graphs listed under `graphs` in `gremlin-server.yaml`; the script name and the graph names have no direct relationship.
+
+- After the service starts successfully, garbled characters are returned when using `curl` to query all vertices
+
+  Batches of vertices/edges returned by the server are compressed (gzip). The response can be piped to `gunzip` for decompression (`curl http://example | gunzip`), or the request can be sent with Firefox's `postman` or Chrome's `restlet` plug-in, which decompress the response automatically.
+
+- When using the vertex Id to query a vertex through the `RESTful API`, an empty result is returned, but the vertex does exist
+
+  Check the type of the vertex Id. If it is a string, the "id" part of the API URL needs to be enclosed in double quotes; for numeric Ids, quoting is not necessary.
+
+- The vertex Id has been double-quoted as required, but querying the vertex via the RESTful API still returns empty
+
+  Check whether the vertex Id contains URL-reserved characters such as `+`, `space`, `/`, `?`, `%`, `#`, `&`, or `=`. If so, they need to be URL-encoded. The following table gives the encoded values, and a short sketch after it shows one way to produce them:
+
+ ```
+ special character | encoded value
+ ------------------| -------------
+ + | %2B
+ space | %20
+ / | %2F
+ ? | %3F
+ % | %25
+ # | %23
+ & | %26
+ = | %3D
+ ```
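+
+  As a hedged sketch, the encoding can be produced in Java with `URLEncoder` (note that `URLEncoder` does form-encoding, turning spaces into `+`, so spaces are fixed up to `%20` afterwards; the id value is illustrative):
+
+  ```java
+  import java.net.URLEncoder;
+  import java.nio.charset.StandardCharsets;
+
+  // Illustrative string vertex id; string ids keep their surrounding double quotes
+  String id = "\"marko+fan/1\"";
+  String encoded = URLEncoder.encode(id, StandardCharsets.UTF_8)
+                             .replace("+", "%20"); // after encoding, '+' only stands for spaces
+  // encoded -> %22marko%2Bfan%2F1%22, usable as the "id" part of the URL
+  ```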
+
+- Timeout when querying vertices or edges of a certain label (`query by label`)
+
+  Since the amount of data under a single label may be large, please add a `limit` to the query.
+
+- The graph can be operated through the `RESTful API`, but sending `Gremlin` statements reports the error: `Request Failed(500)`
+
+  The `GremlinServer` configuration may be wrong. Check whether the `host` and `port` in `gremlin-server.yaml` match the `gremlinserver.url` in `rest-server.properties`; if they do not match, modify them and restart the service.
+
+- When using `Loader` to import data, a `Socket Timeout` exception occurs and then `Loader` is interrupted
+
+  Continuous data import puts heavy pressure on the `Server`, which causes some requests to time out. The pressure on the `Server` can be relieved by adjusting `Loader` parameters (such as the number of retries, the retry interval, and the error tolerance), which reduces the frequency of this problem.
+
+- How to delete all vertices and edges? There is no such interface in the RESTful API, and calling `g.V().drop()` via `gremlin` reports the error `Vertices in transaction have reached capacity xxx`
+
+  At present there is no good way to delete all the data. If you deploy the `Server` and the backend yourself, you can directly clear the database and restart the `Server`. Alternatively, you can use the paging API or scan API to fetch all the data first and then delete it one item at a time.
+
+- The database has been cleared and `init-store` has been executed, but adding a schema prompts "xxx has existed"
+
+  `HugeGraphServer` keeps a cache, so the `Server` must be restarted after the database is cleared; otherwise the residual cache will be inconsistent.
+
+- An error is reported during the process of inserting vertices or edges: `Id
max length is 128, but got xxx {yyy}` or `Big id max length is 32768, but got
xxx`
+
+  To ensure query performance, the current backend storage limits the length of the id column: a vertex id cannot exceed 128 bytes, an edge id cannot exceed 32768 bytes, and an index id cannot exceed 128 bytes.
+
+- Are nested attributes supported? If not, are there any alternatives?
+
+  Nested attributes are currently not supported. As an alternative, nested attributes can be extracted as separate vertices and connected with edges.
+
+- Can an `EdgeLabel` connect multiple pairs of `VertexLabels`? For example, an "investment" relationship could be an "individual" investing in an "enterprise", or an "enterprise" investing in an "enterprise"
+
+  An `EdgeLabel` does not support connecting multiple pairs of `VertexLabels`. Users need to split the `EdgeLabel` into finer-grained ones, such as "personal investment" and "enterprise investment", as sketched below.
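+
+  A minimal schema sketch of such a split (the label and property names are illustrative):
+
+  ```java
+  schema.vertexLabel("person").properties("name").create();
+  schema.vertexLabel("enterprise").properties("name").create();
+
+  // One EdgeLabel per VertexLabel pair instead of a single generic "invest" label
+  schema.edgeLabel("personInvest")
+        .sourceLabel("person").targetLabel("enterprise")
+        .create();
+  schema.edgeLabel("enterpriseInvest")
+        .sourceLabel("enterprise").targetLabel("enterprise")
+        .create();
+  ```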
+
+- `HTTP 415 Unsupported Media Type` is returned when sending a request through the `RestAPI`
+
+  `Content-Type: application/json` needs to be specified in the request header
+
+Other issues can be searched in the issue area of the corresponding project,
such as [Server-Issues](https://github.com/apache/hugegraph/issues) / [Loader
Issues](https://github.com/apache/hugegraph-loader/issues)
diff --git a/docs/guide/security.md b/docs/guide/security.md
new file mode 100644
index 00000000..8b60c14e
--- /dev/null
+++ b/docs/guide/security.md
@@ -0,0 +1,24 @@
+---
+id: 'security'
+title: 'Security Report'
+sidebar_label: 'Security'
+sidebar_position: 6
+---
+
+## Reporting New Security Problems with Apache HugeGraph
+
+Adhering to the specifications of the ASF, the HugeGraph community maintains a highly proactive and open attitude towards addressing and **remediating** security issues in its projects.
+
+We strongly recommend that users first report such issues to our dedicated security email list; the detailed procedure is specified in the [ASF SEC](https://www.apache.org/security/committers.html) code of conduct.
+
+Please note that the security email group is reserved for reporting
**undisclosed** security vulnerabilities and following up on the vulnerability
resolution process. Regular software `Bug/Error` reports should be directed to
`Github Issue/Discussion` or the `HugeGraph-Dev` email group. Emails sent to
the security list that are unrelated to security issues will be ignored.
+
+The independent security email (group) address is:
`[email protected]`
+
+The general process for handling security vulnerabilities is as follows:
+
+- The reporter privately reports the vulnerability to the Apache HugeGraph SEC
email group (including as much information as possible, such as reproducible
versions, relevant descriptions, reproduction methods, and the scope of impact)
+- The HugeGraph project security team collaborates privately with the reporter
to discuss the vulnerability resolution (after preliminary confirmation, a
`CVE` number can be requested for registration)
+- The project creates a new release of the software packages affected by the vulnerability to provide a fix
+- At an appropriate time, a general description of the vulnerability and how
to apply the fix will be publicly disclosed (in compliance with ASF standards,
the announcement should not disclose sensitive information such as reproduction
details)
+- Official CVE release and related procedures follow the ASF-SEC page
\ No newline at end of file
diff --git a/static/img/guide/GraphCut.png b/static/img/guide/GraphCut.png
new file mode 100644
index 00000000..adf3852d
Binary files /dev/null and b/static/img/guide/GraphCut.png differ
diff --git a/static/img/guide/PropertyGraph.png
b/static/img/guide/PropertyGraph.png
new file mode 100644
index 00000000..12dbde1d
Binary files /dev/null and b/static/img/guide/PropertyGraph.png differ
diff --git a/static/img/guide/architectural-revised.png
b/static/img/guide/architectural-revised.png
new file mode 100644
index 00000000..2316ebf9
Binary files /dev/null and b/static/img/guide/architectural-revised.png differ