clintropolis opened a new pull request, #18901:
URL: https://github.com/apache/druid/pull/18901
### Description
Since there is no longer a `meta.smoosh`, this PR adds an option to the dump
segment tool to show v10 metadata. For convenience, I have also added a
`bin/dump-segment` script to the Druid packaging, so it's easy to run against
segments if you have a Druid installation handy, for example:
```
$ ./bin/dump-segment --dump metadata_v10 \
    -d ~/workspace/data/druid/segmentsCache/wikipedia-v10-no-rollup_2016-06-27T00\:00\:00.000Z_2016-06-28T00\:00\:00.000Z_2025-12-31T22\:19\:27.478Z/druid.segment \
    | jq .
{
  "containers": [
    {
      "startOffset": 0,
      "size": 7069778
    }
  ],
  "files": {
    "__base/__time": {
      "container": 0,
      "startOffset": 0,
      "size": 120208
    },
    "__base/added": {
      "container": 0,
      "startOffset": 4241057,
      "size": 14
    },
    ...
```
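The output above maps each internal file name to a container index, a start offset, and a size. As a rough sketch of how that JSON could be sanity-checked from Python: the helper name `check_layout` is hypothetical, and it assumes (this PR does not confirm it) that each file's `startOffset` is relative to the start of its container. The sample only reproduces the two entries visible in the truncated output.

```python
import json

# Sample copied from the truncated output above; a real dump has many more files.
SAMPLE = json.loads("""
{
  "containers": [{"startOffset": 0, "size": 7069778}],
  "files": {
    "__base/__time": {"container": 0, "startOffset": 0, "size": 120208},
    "__base/added": {"container": 0, "startOffset": 4241057, "size": 14}
  }
}
""")


def check_layout(metadata):
    """Return a list of problems found; an empty list means every file fits.

    Assumption (not confirmed by the PR): a file's startOffset is relative
    to the start of the container it belongs to.
    """
    problems = []
    containers = metadata["containers"]
    for name, entry in metadata["files"].items():
        index = entry["container"]
        if not 0 <= index < len(containers):
            problems.append(f"{name}: unknown container {index}")
            continue
        end = entry["startOffset"] + entry["size"]
        if end > containers[index]["size"]:
            problems.append(f"{name}: ends at {end}, past container size")
    return problems


if __name__ == "__main__":
    print(check_layout(SAMPLE) or "layout looks consistent")
```

To run this against a real segment, `SAMPLE` could be replaced with `json.load(sys.stdin)` and the tool's output piped in, in place of `jq`.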
You can also pipe the output through more elaborate jq filters, for example to
list the internal files from largest to smallest:
```
$ ./bin/dump-segment --dump metadata_v10 \
    -d ~/workspace/data/druid/segmentsCache/wikip10-no-rollup_2016-06-27T00\:00\:00.000Z_2016-06-28T00\:00\:00.000Z_2025-12-31T22\:19\:27.478Z/druid.segment \
    | jq '.files | to_entries | map({file:.key, size:.value.size}) | sort_by(.size) | reverse'
[
  {
    "file": "__base/diffUrl.__stringDictionary",
    "size": 1908699
  },
  {
    "file": "__base/comment.__stringDictionary",
    "size": 1001152
  },
  {
    "file": "__base/page.__stringDictionary",
    "size": 732776
  },
  {
    "file": "__base/diffUrl.__valueIndexes",
    "size": 635276
  },
  {
    "file": "__base/page.__valueIndexes",
    "size": 586076
  },
  {
  ...
```
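For anyone who prefers a script over a jq one-liner, the same ranking can be sketched in Python. The function names here are hypothetical, the `FILES` dict only repeats the five entries shown above, and the per-column grouping assumes internal file names follow a `<prefix>/<column>.<part>` pattern, which is inferred from this output, not documented in the PR.

```python
from collections import defaultdict

# Entries copied from the truncated output above.
FILES = {
    "__base/diffUrl.__stringDictionary": {"size": 1908699},
    "__base/comment.__stringDictionary": {"size": 1001152},
    "__base/page.__stringDictionary": {"size": 732776},
    "__base/diffUrl.__valueIndexes": {"size": 635276},
    "__base/page.__valueIndexes": {"size": 586076},
}


def largest_files(files):
    """Mirror the jq filter: (file, size) pairs sorted largest first."""
    pairs = [(name, entry["size"]) for name, entry in files.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def size_by_column(files):
    """Sum sizes per column, assuming names look like '<prefix>/<column>.<part>'."""
    totals = defaultdict(int)
    for name, entry in files.items():
        # Drop everything up to the last '/', then anything after the first '.'.
        column = name.rsplit("/", 1)[-1].split(".", 1)[0]
        totals[column] += entry["size"]
    return dict(totals)


if __name__ == "__main__":
    for name, size in largest_files(FILES):
        print(f"{size:>10}  {name}")
    print(size_by_column(FILES))
```

Grouping by column gives a quick view of which columns dominate the segment, which the flat per-file listing does not show directly.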