dosubot[bot] commented on issue #7877:
URL:
https://github.com/apache/incubator-devlake/issues/7877#issuecomment-2284966048
To use the `customize` plugin in Apache DevLake, follow these steps:
1. **Define the Transformation Rules**: Create a JSON configuration that
specifies the transformation rules for extracting data from the raw layer
tables and populating the `commit_files` table.
2. **Wrap the Rules in a Pipeline Plan**: Place the rules in the `options` of
a `customize` task so that a pipeline can run them.
Here is an example plan that extracts file changes from a raw layer table
(e.g., `_raw_gitlab_api_commit_files`) and populates the `commit_files` table:
```json
[
  [
    {
      "plugin": "customize",
      "options": {
        "transformationRules": [
          {
            "table": "commit_files",
            "rawDataTable": "_raw_gitlab_api_commit_files",
            "rawDataParams": "{\"ConnectionId\":1,\"ProjectId\":123}",
            "mapping": {
              "x_file_path": "file_path",
              "x_commit_sha": "commit_sha",
              "x_change_type": "change_type"
            }
          }
        ]
      }
    }
  ]
]
```
In this example:
- `table`: Specifies the domain layer table name (`commit_files`).
- `rawDataTable`: Specifies the raw layer table from which data is extracted
(`_raw_gitlab_api_commit_files`).
- `rawDataParams`: Specifies the filter to select records from the raw layer
table.
- `mapping`: Specifies the extraction rules, where the key is the extension
field name, and the value is the JSON path to extract the value from the raw
data.
3. **Trigger Data Extraction via API**: You can trigger the data extraction
by making a POST request to the `/pipelines` endpoint with the JSON
configuration.
Here is an example curl command to trigger the data extraction:
```shell
curl 'http://localhost:8080/pipelines' \
  --header 'Content-Type: application/json' \
  --data-raw '
{
  "name": "extract commit files",
  "plan": [
    [
      {
        "plugin": "customize",
        "options": {
          "transformationRules": [
            {
              "table": "commit_files",
              "rawDataTable": "_raw_gitlab_api_commit_files",
              "rawDataParams": "{\"ConnectionId\":1,\"ProjectId\":123}",
              "mapping": {
                "x_file_path": "file_path",
                "x_commit_sha": "commit_sha",
                "x_change_type": "change_type"
              }
            }
          ]
        }
      }
    ]
  ]
}
'
```
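The request above escapes `rawDataParams` by hand, which is easy to get wrong. As a minimal sketch, the same payload can be built in Python so that `json.dumps` handles the escaping. The helper names `build_pipeline_payload` and `build_request` are hypothetical, and reading the nested arrays in `plan` as stages containing tasks is an assumption; the field names, table names, and URL are the ones from the example.

```python
import json
import urllib.request


def build_pipeline_payload(name, table, raw_table, raw_params, mapping):
    """Build the body for POST /pipelines with a single customize task."""
    rule = {
        "table": table,
        "rawDataTable": raw_table,
        # rawDataParams is itself a JSON *string*, so it is serialized
        # separately; compact separators reproduce the form used above.
        "rawDataParams": json.dumps(raw_params, separators=(",", ":")),
        "mapping": mapping,
    }
    task = {"plugin": "customize", "options": {"transformationRules": [rule]}}
    # Assumption: the outer list holds stages, each stage holds tasks.
    return {"name": name, "plan": [[task]]}


def build_request(base_url, payload):
    """Prepare (but do not send) the POST request."""
    return urllib.request.Request(
        url=f"{base_url}/pipelines",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_pipeline_payload(
    name="extract commit files",
    table="commit_files",
    raw_table="_raw_gitlab_api_commit_files",
    raw_params={"ConnectionId": 1, "ProjectId": 123},
    mapping={
        "x_file_path": "file_path",
        "x_commit_sha": "commit_sha",
        "x_change_type": "change_type",
    },
)
request = build_request("http://localhost:8080", payload)
# To actually send it: urllib.request.urlopen(request)
```

Nothing is sent until `urlopen` is called, so the payload can be printed and compared against the curl body first.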
With this configuration, the pipeline should populate the `x_` columns of the
`commit_files` table with the file-change data extracted from the raw layer
table
<sup>[[1]](https://github.com/apache/incubator-devlake-website/blob/main/docs/Plugins/customize.md)</sup>.
**Note:** The names of columns added via the `customize` plugin must start
with the prefix `x_`. The supported data types for the columns are
`varchar(255)`, `text`, `bigint`, `float`, `timestamp`, and `array`.
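
The naming and type rules in the note can be checked before a request is sent. The following is a small sketch with hypothetical helper names (`invalid_columns`, `is_supported_type`); it only encodes the prefix and type list stated above and is not part of DevLake itself.

```python
# Data types listed in the note above.
SUPPORTED_TYPES = {"varchar(255)", "text", "bigint", "float", "timestamp", "array"}


def invalid_columns(mapping):
    """Return the mapping keys that lack the required x_ prefix."""
    return [name for name in mapping if not name.startswith("x_")]


def is_supported_type(data_type):
    """Return True if data_type is one of the types listed in the note."""
    return data_type in SUPPORTED_TYPES


mapping = {
    "x_file_path": "file_path",
    "commit_sha": "commit_sha",  # missing prefix, so it is reported
}
bad = invalid_columns(mapping)
```

Here `bad` contains only `commit_sha`, the one key without the `x_` prefix.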