FrankChen021 opened a new pull request #10980:
URL: https://github.com/apache/druid/pull/10980
This PR fixes #10945
### Description
/druid/v2 endpoint supports jackson smile encoding both for request and
response. But there's a bug there when processing a request with
'Accept:application/x-jackson-smile' header.
Current implementation takes 'Accept' parameter, which indicates server that
smile encoding is accepted by client to decode the HTTP response, to decode a
JSON request. According to HTTP protocol on 'Content-Type' and 'Accept', server
should alway decode the body according to 'Content-Type' header, while encode
response by 'Accept' header( if not provided, 'Accept' defaults to
'Content-Type')
The reason why current unit tests fails to detect this bug is because test
cases in `QueryResourceTest` wrongly inject a Json ObjectMapper instead of
Smile ObjectMapper for Smile decoding. All related test cases are fixed in this
PR too.
### Smile encoding test
I have never heard of Smile encoding before, and there's little doc in Druid
to describe if we should exploit it. So I also make some investigation how it
saves response sizes in Druid to see if this feature could be used in our
environment.
### Test data
Wikipedia data shipped by Druid
### Tests
Sending following scan query by CURL with following parameters
- `limit` parameter range in [100,200,300,400,500]
- Accept:application/x-jackson-smile turned on and off
- Content-Encoding:gzip turned on and off
observe 'size_download' for each response outputted by CURL
```
{
"queryType": "scan",
"dataSource": {
"type": "table",
"name": "wikipedia"
},
"intervals": {
"type": "intervals",
"intervals": [
"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
]
},
"virtualColumns": [],
"batchSize": 20480,
"limit": 100,
"legacy": false,
"descending": false,
"granularity": {
"type": "all"
}
}
```
#### Result
Limit |Response Bytes in JSON(GZip) | Response Bytes in JSON(No-GZip) |
Response Bytes in Smile(GZip) | Response Bytes in Smile(No-GZip)
-- | -- | -- | -- | --
100 | 6753 | 43084 | 6550 | 17874
200 | 12485 | 86427 | 11646 | 35781
300 | 18879 | 129798 | 17316 | 53797
400 | 24494 | 172290 | 22208 | 70896
500 | 30886 | 215331 | 27920 | 88595
The tests above do not cover all queries in druid, but the data in the table
generally reflects advantages of Smile encoding in one aspect.
From the table we can see that:
- when gzip is not enabled in response, Smile encoding(column 5) takes about
only 41% space over JSON(column 3)
- when gzip is enabled, Smile encoding performs a little improvement(column
4 vs columns 2)
In contrast to gzip which requires more resources on server side, Smile
saves serialization time compared to JSON according to [some
tests](http://www.theotherian.com/2013/12/kryo-vs-smile-vs-json-shootout-part-1.html).
And there's no too much work to do expect replacing default object mapper to
smile object mapper which usually is one line of work in many dependency
injection system, I think Smile encoding is worthy to try.
### Note
SQL endpoint does not support Smile encoding yet(#10931), this PR does not
add Smile encoding to SQL endpoint. I think we could have a discussion. Based
on the tests shown above, IMO, it's reasonable to provide such support because
SQL query is more popular than native query, at least in our environment.
<hr>
##### Key changed/added classes in this PR
* `ResourceIOWriter` is extracted from `ResourceIOReaderWriter` to hold
response content-type and right ObjectMapper while original
ResourceIOReaderWriter holds right ObjectMapper to deserialize input request.
<hr>
This PR has:
- [X] been self-reviewed.
- [ ] using the [concurrency
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
(Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [X] added comments explaining the "why" and the intent of the code
wherever would not be obvious for an unfamiliar reader.
- [X] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [ ] added integration tests.
- [X] been tested in a test Druid cluster.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]