morningman commented on a change in pull request #7497:
URL: https://github.com/apache/incubator-doris/pull/7497#discussion_r775819294
##########
File path: docs/zh-CN/administrator-guide/outfile.md
##########
@@ -138,167 +138,8 @@ explain select xxx from xxx where xxx into outfile "s3://xxx" format as csv pro
## 使用示例
-1. 示例1
+具体参阅[OUTFILE 文档](../sql-reference/sql-statements/Data Manipulation/OUTFILE.html)。
Review comment:
```suggestion
具体参阅[OUTFILE 文档](../sql-reference/sql-statements/Data%20Manipulation/OUTFILE.md)。
```
##########
File path: docs/en/administrator-guide/outfile.md
##########
@@ -139,166 +139,7 @@ Planning example for concurrent export:
## Usage example
-1. Example 1
-
- Export simple query results to the file `hdfs://path/to/result.txt`. Specify the export format as CSV. Use `my_broker` and set kerberos authentication information. Specify the column separator as `,` and the line delimiter as `\n`.
-
- ```
- SELECT * FROM tbl
- INTO OUTFILE "hdfs://path/to/result_"
- FORMAT AS CSV
- PROPERTIES
- (
- "broker.name" = "my_broker",
- "broker.hadoop.security.authentication" = "kerberos",
- "broker.kerberos_principal" = "[email protected]",
- "broker.kerberos_keytab" = "/home/doris/my.keytab",
- "column_separator" = ",",
- "line_delimiter" = "\n",
- "max_file_size" = "100MB"
- );
- ```
-
- If the result is less than 100MB, the file will be: `result_0.csv`.
-
- If larger than 100MB, the files may be: `result_0.csv, result_1.csv, ...`.
-
-2. Example 2
-
- Export simple query results to the file `hdfs://path/to/result.parquet`. Specify the export format as PARQUET. Use `my_broker` and set kerberos authentication information.
-
- ```
- SELECT c1, c2, c3 FROM tbl
- INTO OUTFILE "hdfs://path/to/result_"
- FORMAT AS PARQUET
- PROPERTIES
- (
- "broker.name" = "my_broker",
- "broker.hadoop.security.authentication" = "kerberos",
- "broker.kerberos_principal" = "[email protected]",
- "broker.kerberos_keytab" = "/home/doris/my.keytab",
- "schema"="required,int32,c1;required,byte_array,c2;required,byte_array,c3"
- );
- ```
-
- If the exported file format is PARQUET, `schema` must be specified.
-
-3. Example 3
-
- Export the query result of the CTE statement to the file `hdfs://path/to/result.txt`. The default export format is CSV. Use `my_broker` and set HDFS high availability information. Use the default column separator and line delimiter.
-
- ```
- WITH
- x1 AS
- (SELECT k1, k2 FROM tbl1),
- x2 AS
- (SELECT k3 FROM tbl2)
- SELECT k1 FROM x1 UNION SELECT k3 FROM x2
- INTO OUTFILE "hdfs://path/to/result_"
- PROPERTIES
- (
- "broker.name" = "my_broker",
- "broker.username"="user",
- "broker.password"="passwd",
- "broker.dfs.nameservices" = "my_ha",
- "broker.dfs.ha.namenodes.my_ha" = "my_namenode1, my_namenode2",
- "broker.dfs.namenode.rpc-address.my_ha.my_namenode1" = "nn1_host:rpc_port",
- "broker.dfs.namenode.rpc-address.my_ha.my_namenode2" = "nn2_host:rpc_port",
- "broker.dfs.client.failover.proxy.provider" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
- );
- ```
-
- If the result is less than 1GB, the file will be: `result_0.csv`.
-
- If larger than 1GB, the files may be: `result_0.csv, result_1.csv, ...`.
-
-4. Example 4
-
- Export the query results of the UNION statement to the file `bos://bucket/result.parquet`. Specify the export format as PARQUET. Use `my_broker` and set HDFS high availability information. The PARQUET format does not require a column separator or line delimiter.
-
- ```
- SELECT k1 FROM tbl1 UNION SELECT k2 FROM tbl1
- INTO OUTFILE "bos://bucket/result_"
- FORMAT AS PARQUET
- PROPERTIES
- (
- "broker.name" = "my_broker",
- "broker.bos_endpoint" = "http://bj.bcebos.com",
- "broker.bos_accesskey" = "xxxxxxxxxxxxxxxxxxxxxxxxxx",
- "broker.bos_secret_accesskey" = "yyyyyyyyyyyyyyyyyyyyyyyyyy",
- "schema"="required,int32,k1;required,byte_array,k2"
- );
- ```
-
-5. Example 5
-
- Export simple query results to the file `cos://${bucket_name}/path/result.txt`. Specify the export format as CSV.
- And create a marker file after the export finishes.
-
- ```
- select k1,k2,v1 from tbl1 limit 100000
- into outfile "s3a://my_bucket/export/my_file_"
- FORMAT AS CSV
- PROPERTIES
- (
- "broker.name" = "hdfs_broker",
- "broker.fs.s3a.access.key" = "xxx",
- "broker.fs.s3a.secret.key" = "xxxx",
- "broker.fs.s3a.endpoint" = "https://cos.xxxxxx.myqcloud.com/",
- "column_separator" = ",",
- "line_delimiter" = "\n",
- "max_file_size" = "1024MB",
- "success_file_name" = "SUCCESS"
- )
- ```
-
- If the result is less than 1GB, the file will be: `my_file_0.csv`.
-
- If larger than 1GB, the files may be: `my_file_0.csv, my_file_1.csv, ...`.
-
- Please note:
- 1. Paths that do not exist are created automatically.
- 2. These parameters (access.key / secret.key / endpoint) need to be confirmed with `Tencent Cloud COS`. In particular, the endpoint value should not include the bucket_name.
-
-6. Example 6
-
- Use the S3 protocol to export to BOS, with concurrent export enabled.
-
- ```
- set enable_parallel_outfile = true;
- select k1 from tb1 limit 1000
- into outfile "s3://my_bucket/export/my_file_"
- format as csv
- properties
- (
- "AWS_ENDPOINT" = "http://s3.bd.bcebos.com",
- "AWS_ACCESS_KEY" = "xxxx",
- "AWS_SECRET_KEY" = "xxx",
- "AWS_REGION" = "bd"
- )
- ```
-
- The final generated file prefix is `my_file_{fragment_instance_id}_`.
-
-7. Example 7
-
- Use the S3 protocol to export to BOS, with the concurrent-export session variable enabled.
-
- ```
- set enable_parallel_outfile = true;
- select k1 from tb1 order by k1 limit 1000
- into outfile "s3://my_bucket/export/my_file_"
- format as csv
- properties
- (
- "AWS_ENDPOINT" = "http://s3.bd.bcebos.com",
- "AWS_ACCESS_KEY" = "xxxx",
- "AWS_SECRET_KEY" = "xxx",
- "AWS_REGION" = "bd"
- )
- ```
-
- **However, because the query statement has a top-level sorting node, it cannot be exported concurrently even though the concurrent-export session variable is enabled.**
+For details, please refer to [OUTFILE Document](../sql-reference/sql-statements/Data Manipulation/OUTFILE.html).
Review comment:
```suggestion
For details, please refer to [OUTFILE Document](../sql-reference/sql-statements/Data%20Manipulation/OUTFILE.md).
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]