liujiayi771 opened a new issue, #7999:
URL: https://github.com/apache/incubator-gluten/issues/7999
### Description
Currently, Parquet files written by Gluten get names such as
`Gluten_Stage_3_TID_2124_VTID_257_0_3_0946dfb5-f773-42c9-ac8e-d4e70bede02b.parquet`,
which come from the default behavior in Velox's `HiveDataSink.cpp`:
```cpp
targetFileName = fmt::format(
    "{}_{}_{}_{}",
    connectorQueryCtx_->taskId(),
    connectorQueryCtx_->driverId(),
    connectorQueryCtx_->planNodeId(),
    makeUuid());
```
https://github.com/facebookincubator/velox/pull/10903 adds a new
`targetFileName` to `LocationHandle`, so Gluten can specify a `targetFileName`
that includes the compression-codec suffix, which makes the file name more
consistent with the Parquet file names generated by vanilla Spark.
The Parquet files generated by Spark are named
`part-uuid.<codec-extension>.parquet`. I have defined the name of the Parquet
file written by Gluten as `gluten-part-uuid.<codec-extension>.parquet`, where
the `gluten` prefix indicates that the file was generated by Gluten.
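
As a rough illustration of the proposed naming scheme, the sketch below builds a `gluten-part-uuid.<codec-extension>.parquet` name from a UUID and a codec name. The helpers `codecExtension` and `makeGlutenParquetFileName` are hypothetical names introduced only for this example and are not part of Gluten or Velox. The codec-to-extension mapping is an assumption covering a few common codecs, and the real implementation may build the name elsewhere (for example on the Spark side) before passing it as `targetFileName`.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not a Gluten or Velox API): maps a codec name to the
// file-name extension assumed for it. Only a few common codecs are listed;
// anything else is treated as having no codec extension.
inline std::string codecExtension(std::string_view codec) {
  if (codec == "snappy") {
    return ".snappy";
  }
  if (codec == "gzip") {
    return ".gz";
  }
  if (codec == "zstd") {
    return ".zstd";
  }
  return "";
}

// Hypothetical helper: builds "gluten-part-<uuid><codec-extension>.parquet".
// The result is what would be handed over as the target file name.
inline std::string makeGlutenParquetFileName(
    std::string_view uuid,
    std::string_view codec) {
  std::string name = "gluten-part-";
  name.append(uuid);
  name.append(codecExtension(codec));
  name.append(".parquet");
  return name;
}
```

With these assumptions, calling `makeGlutenParquetFileName("0946dfb5-f773-42c9-ac8e-d4e70bede02b", "snappy")` yields `gluten-part-0946dfb5-f773-42c9-ac8e-d4e70bede02b.snappy.parquet`, and an unrecognized codec name yields a name with no codec extension.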
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]