Re: [PR] HDDS-11190. Add --fields option to ldb scan command [ozone]

via GitHub Wed, 31 Jul 2024 13:49:15 -0700


errose28 commented on PR #6976:
URL: https://github.com/apache/ozone/pull/6976#issuecomment-2261435551

I also suggested using `jq`.

> but for larger dbs it will be reading the data twice. With adding an
> option to our code, we will be reading the data only once and filtering it
> simultaneously.

I don't think this is how it would work. This seems to describe jq as blocking
until the whole DB is read, and only then filtering all the objects before
giving the final output. jq actually works on streams. Our ldb process would
read the DB and print lines to stdout. After a line is printed, our process
moves on to read and print more of the DB while jq filters the lines that
were just printed.
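To illustrate the streaming behaviour described above, here is a toy Python model. It is only a sketch: `scan_db` and `select_fields` are hypothetical stand-ins for the ldb scan and the jq filter, not code from either project, and generators interleave cooperatively whereas two real processes joined by a pipe run concurrently and exchange data in buffered chunks.

```python
import json


def scan_db(rows, log):
    """Hypothetical stand-in for the ldb scan: emit one JSON line per record."""
    for row in rows:
        log.append(("scan", row["key"]))
        yield json.dumps(row)


def select_fields(lines, fields, log):
    """Hypothetical stand-in for jq: filter each line as soon as it arrives."""
    for line in lines:
        record = json.loads(line)
        log.append(("filter", record["key"]))
        yield {name: record[name] for name in fields if name in record}


def run_pipeline(rows, fields):
    """Chain the two stages and return the filtered records plus an event log."""
    log = []
    filtered = list(select_fields(scan_db(rows, log), fields, log))
    return filtered, log


if __name__ == "__main__":
    rows = [{"key": f"k{i}", "size": i * 10, "owner": "test"} for i in range(3)]
    filtered, log = run_pipeline(rows, ["key", "size"])
    print(log)       # scan and filter events alternate record by record
    print(filtered)
```

The event log alternates between `scan` and `filter` entries, which is the point: the data is read once, and filtering starts before the scan has finished.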

If there is a speedup, it would probably be because we are reducing the
amount of data that gets converted to JSON and printed. However, this benefit
might be negated because this filter is implemented with [Java
reflection](https://github.com/apache/ozone/blob/9b29eae46ad19ba648765f22c30a1c294f403243/hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/debug/ValueSchema.java#L159)
while jq is implemented [in C](https://github.com/jqlang/jq).

Can we get benchmarks of various filtering queries using jq vs this method?
Ideally on larger DBs with at least thousands of keys. Based on these results
we can decide whether this option is something we should support.
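One possible shape for such a benchmark is sketched below. It is a minimal harness under stated assumptions: the helper names (`time_pipeline`, `summarize`, `compare`) are made up for illustration, and the commands to compare are passed in by the caller rather than spelled out here. It also assumes a POSIX shell is available.

```python
import statistics
import subprocess
import time


def time_pipeline(command, runs=5):
    """Run a shell pipeline `runs` times and return wall-clock seconds per run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so terminal rendering does not dominate the timing.
        subprocess.run(command, shell=True, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a list of timings to median, min and max."""
    return {
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
    }


def compare(candidates, runs=5):
    """Time each labelled command; `candidates` maps a label to a shell command."""
    return {
        label: summarize(time_pipeline(command, runs))
        for label, command in candidates.items()
    }
```

`compare` would be called with two labelled commands for each query: the scan piped through jq, and the same scan using the option from this PR. Running each several times on the same DB and comparing medians helps separate real differences from page-cache warm-up. Note that with `shell=True` only the exit status of the last command in a pipeline is checked.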

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
