suneet-s commented on a change in pull request #10413:
URL: https://github.com/apache/druid/pull/10413#discussion_r493031245
##########
File path: core/src/main/java/org/apache/druid/timeline/DataSegment.java
##########
@@ -100,7 +100,7 @@
/**
* Stores some configurations of the compaction task which created this
segment.
* This field is filled in the metadata store only when
"storeCompactionState" is set true in the context of the
- * compaction task which is false by default.
+ * compaction task (true by default).
Review comment:
Super nit: link to where the default is defined.
```suggestion
* compaction task (true by default). {@link
org.apache.druid.indexing.common.task.Tasks#DEFAULT_STORE_COMPACTION_STATE}.
```
##########
File path: docs/querying/sql.md
##########
@@ -1086,6 +1086,7 @@ Segments table provides details on all Druid segments,
whether they are publishe
|shardSpec|STRING|The toString of specific `ShardSpec`|
|dimensions|STRING|The dimensions of the segment|
|metrics|STRING|The metrics of the segment|
+|last_compaction_state|STRING|The configurations of the compaction task which
created this segment. May be null if segment was not created by compaction
task.|
For example to retrieve all segments for datasource "wikipedia", use the query:
Review comment:
The PR description talks about being able to identify performance issues
where looking at `last_compaction_state` can help. Should we add some example
SQL statements below to explain how to identify the three issues outlined in
the PR description?
##########
File path:
sql/src/main/java/org/apache/druid/sql/calcite/schema/SystemSchema.java
##########
@@ -311,7 +312,8 @@ public TableType getJdbcTableType()
val.isOvershadowed() ? IS_OVERSHADOWED_TRUE :
IS_OVERSHADOWED_FALSE,
segment.getShardSpec(),
segment.getDimensions(),
- segment.getMetrics()
+ segment.getMetrics(),
+ segment.getLastCompactionState()
Review comment:
`CompactionState` includes the `PartitionsSpec` and `indexSpec`.
What should a caller of the API do with these?
The compaction state appears to be an unbounded JSON object. Do we foresee
any issues with serializing the entire JSON blob every time we need to call
the sys table?
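To make the serialization concern concrete, here is a minimal sketch of one possible mitigation: serialize the state once per segment and reuse the string on later sys table reads. `MemoizedJson` is a hypothetical helper, not an existing Druid class, and the `Supplier` stands in for whatever code turns a `CompactionState` into JSON. It only addresses the repeated cost, not the unbounded size of the blob.

```java
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Druid): runs the given serializer at most
 * once and returns the cached string on every later call.
 */
final class MemoizedJson
{
  // Stand-in for the code that serializes a CompactionState to JSON.
  private final Supplier<String> serializer;

  // Cached result; volatile so the double-checked locking below is safe.
  private volatile String cached;

  MemoizedJson(Supplier<String> serializer)
  {
    this.serializer = serializer;
  }

  String get()
  {
    String result = cached;
    if (result == null) {
      synchronized (this) {
        result = cached;
        if (result == null) {
          // First read: pay the serialization cost once.
          // Note: a serializer that returns null is never cached.
          result = serializer.get();
          cached = result;
        }
      }
    }
    return result;
  }
}
```

The trade-off is that the serialized string stays in memory for as long as the wrapper does, so this swaps repeated CPU work for heap usage.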