ajeydudhe opened a new issue, #14083:
URL: https://github.com/apache/pinot/issues/14083
### Steps to reproduce
- Create the schema for the realtime table and define the table config.
- Use the attached job spec with the spark-submit command.
- Note: there was an issue with using the HTTP endpoint to fetch the table config, since it seems to expect the config to be returned only for an OFFLINE table; hence a local file path is used for the realtime table config. This is a separate issue.
- The following **_segmentNameGeneratorSpec_** was used.
- The input files have the format: uploaded__myTable__0__20220101T0000Z__suffix
- Tried using the type as inputFile and as uploadedRealtime:
- If type = uploadedRealtime, segment generation fails with the error "Creation time must be
set for uploaded realtime segment name generator".
- If type = inputFile, the generated segment has the same name format and the upload
succeeds, but the server then fails to load the segment.
```yaml
segmentNameGeneratorSpec:
  # type: Currently supported types are 'simple' and 'normalizedDate'.
  type: uploadedRealtime
  #type: inputFile
  # configs: Configs to init SegmentNameGenerator.
  configs:
    #segment.name.prefix: 'uploaded__myTable__0__20220101T0000Z__suffix'
    #exclude.sequence.id: true
    # Below is for using the file name as the segment name
    file.path.pattern: '.+/(.+)\.json'
    segment.name.template: '${filePathPattern:\1}'
```
- Please confirm which segmentNameGeneratorSpec.type should be used to
generate segments from JSON files for a realtime table using Spark.
[sparkIngestionJobSpec_myTable.yaml.txt](https://github.com/user-attachments/files/17132166/sparkIngestionJobSpec_myTable.yaml.txt)
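For reference, the two naming behaviours described above can be sketched outside of Pinot. The snippet below is an illustration only, not Pinot's actual generator code: it shows the `__`-delimited layout of the uploaded realtime segment name, and the regex capture-and-substitute semantics that `file.path.pattern` / `segment.name.template` imply for the inputFile type. The file path used is a made-up example.

```python
import re

# 1) Uploaded realtime segment name convention: fields joined by '__'.
#    The fourth field is the creation time, which is what the
#    "Creation time must be set ..." error refers to.
name = "uploaded__myTable__0__20220101T0000Z__suffix"
prefix, table, partition_id, creation_time, suffix = name.split("__")

# 2) inputFile type: file.path.pattern is a regex with a capture group,
#    and segment.name.template substitutes that group (here group 1,
#    standing in for ${filePathPattern:\1}).
file_path_pattern = r".+/(.+)\.json"
file_path = "/data/input/uploaded__myTable__0__20220101T0000Z__suffix.json"
match = re.match(file_path_pattern, file_path)
segment_name = match.group(1)
print(segment_name)  # uploaded__myTable__0__20220101T0000Z__suffix
```

With an input file named after the uploaded-segment convention, the inputFile type reproduces that exact name as the segment name, which matches the behaviour reported above (the segment uploads, but whether the server accepts it is the open question).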
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]