tristaZero commented on a change in pull request #101: Sharding-JDBC manual modification
URL: 
https://github.com/apache/incubator-shardingsphere-doc/pull/101#discussion_r267273391
 
 

 ##########
 File path: document/current/content/features/orchestration/encrypt.en.md
 ##########
 @@ -1,15 +1,42 @@
 +++
 pre = "<b>3.3.5. </b>"
 toc = true
-title = "Data Masking"
+title = "Data Desensitization"
 weight = 5
+
 +++
 
 ## Background
-TODO
 
-## Solutions
+Security control has always been a crucial part of data orchestration, and data desensitization falls into this category. For both Internet enterprises and traditional sectors, data security is a closely watched and sensitive topic. Data desensitization means transforming sensitive information through desensitization rules in order to protect private data. Data that involves client security or business sensitivity, such as ID numbers, phone numbers, card numbers, client numbers and other personal information, must be desensitized according to relevant regulations.
+
+For this reason, ShardingSphere provides a data desensitization function, which encrypts users' sensitive information before storing it in the database. When users query it, the data is decrypted and returned in its original form. The encryption and decryption processes are completely transparent to users, who can store desensitized data and retrieve original data without being aware of it. In addition, ShardingSphere provides built-in desensitization algorithms, which users can use directly. At the same time, it also provides desensitization algorithm interfaces, which users can implement themselves. Then, after simple configuration, ShardingSphere can use the user-provided algorithms to perform encryption, decryption and desensitization.
+
+## Solution
+
+ShardingSphere provides two data desensitization solutions, corresponding to its two encryption and decryption interfaces: `ShardingEncryptor` and `ShardingQueryAssistedEncryptor`.
+
+On the one hand, ShardingSphere provides built-in encryption and decryption implementations, which users can use after simple configuration. On the other hand, to satisfy requirements in different scenarios, it also exposes the relevant encryption and decryption interfaces, so that users can provide their own implementations. Then, after simple configuration, ShardingSphere can use the user-defined encryption and decryption solutions to desensitize data.
 
 ### ShardingEncryptor
 
+This solution provides two methods, `encrypt()` and `decrypt()`, to encrypt and decrypt the data to be desensitized.
+
+When users perform `INSERT`, `DELETE` and `UPDATE` operations, ShardingSphere parses, rewrites and routes the SQL, and uses `encrypt()` to encrypt the data before it is stored in the database. For `SELECT`, it uses `decrypt()` to decrypt the sensitive data read from the database and then returns it to users.
+Currently, ShardingSphere provides two implementations of this desensitization solution, MD5 (irreversible) and AES (reversible), which can be used once they are configured.
+
 ### ShardingQueryAssistedEncryptor
+
+Compared with the first desensitization scheme, this one is more secure and more complex. Its idea is that even identical data, two identical user passwords for example, should not be stored in the same desensitized form in the database. This helps to protect user information and to avoid credential stuffing.
+
+This scheme provides three methods to implement: `encrypt()`, `decrypt()` and `queryAssistedEncrypt()`.
+In the `encrypt()` phase, users can choose a variable, a timestamp for example, and encrypt the combination of the original data and the variable. Because of the variable, the encrypted forms of the same original data differ from one another. In the `decrypt()` phase, users can use the variable to decrypt the data according to the encryption algorithm chosen earlier.
+
+Though this method does increase data security, it introduces another problem: because the same data is stored in the database with different content, an equality query on the encrypted column (`SELECT * FROM table WHERE encryptedColumn = ?`) may not find all rows that hold the same original data.
+
+For this reason, we have introduced the concept of an assistant query column, which is generated by `queryAssistedEncrypt()`. Unlike `encrypt()`, this method encrypts the original data in a different way, one that always produces the same encrypted data for the same original data. Users can store the data processed by `queryAssistedEncrypt()` to assist queries on the original data, so the table may contain one more column, the assistant query column.
+
+`queryAssistedEncrypt()` and `encrypt()` generate and store different encrypted data; `decrypt()` is reversible and `queryAssistedEncrypt()` is irreversible. So when querying the original data, ShardingSphere parses, rewrites and routes the SQL automatically. It also uses the assistant query column for the `WHERE` condition and uses `decrypt()` to decrypt the data produced by `encrypt()` before returning it to users. All of this is transparent to users.
 
 Review comment:
   "`decrypt()` is reversible and `queryAssistedEncrypt()` is irreversible." ---> "The data generated by `encrypt()` can be decrypted by `decrypt()`, but there is no function to decrypt the encrypted data from `queryAssistedEncrypt()`."
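
   To make the distinction concrete, here is a minimal Java sketch of the three methods. It is an illustration under stated assumptions, not ShardingSphere's actual implementation: the class name, the method signatures and the demo key are hypothetical, a random IV stands in for the "variable" (the document suggests a timestamp), and AES-GCM and HMAC-SHA256 are simply one plausible choice of algorithms.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch; names and signatures are assumptions, not the real interface.
final class QueryAssistedSketch {
    // Demo key only; a real deployment would load the key from configuration.
    private static final byte[] KEY = "0123456789abcdef".getBytes(StandardCharsets.UTF_8);
    private static final int IV_LEN = 12;
    private static final SecureRandom RANDOM = new SecureRandom();

    // Reversible: a fresh random IV (the "variable") makes every ciphertext differ.
    static String encrypt(String plain) {
        try {
            byte[] iv = new byte[IV_LEN];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(KEY, "AES"), new GCMParameterSpec(128, iv));
            byte[] body = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            // Store the IV in front of the ciphertext so decrypt() can recover it.
            byte[] out = new byte[IV_LEN + body.length];
            System.arraycopy(iv, 0, out, 0, IV_LEN);
            System.arraycopy(body, 0, out, IV_LEN, body.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Reverses encrypt(): reads the IV back from the stored value, then decrypts.
    static String decrypt(String stored) {
        try {
            byte[] in = Base64.getDecoder().decode(stored);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(KEY, "AES"),
                    new GCMParameterSpec(128, in, 0, IV_LEN));
            return new String(cipher.doFinal(in, IV_LEN, in.length - IV_LEN), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // One-way and deterministic: the same input always yields the same value,
    // so it can back an equality WHERE condition, but nothing decrypts it.
    static String queryAssistedEncrypt(String plain) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(KEY, "HmacSHA256"));
            return Base64.getEncoder().encodeToString(mac.doFinal(plain.getBytes(StandardCharsets.UTF_8)));
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

   In this sketch, two calls to `encrypt()` on the same password give different stored values that both decrypt back to the original, while `queryAssistedEncrypt()` gives the same value each time and has no inverse, which is the point the suggested wording makes.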
