[
https://issues.apache.org/jira/browse/HIVE-23819?focusedWorklogId=457044&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-457044
]
ASF GitHub Bot logged work on HIVE-23819:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 10/Jul/20 07:34
Start Date: 10/Jul/20 07:34
Worklog Time Spent: 10m
Work Description: pvargacl commented on pull request #1230:
URL: https://github.com/apache/hive/pull/1230#issuecomment-656531478
Benchmark results are below. The 6000 sample is the one Rajesh provided; the
others have 300 open/aborted txns. After the first run I modified the code to
not use ranges if the range length is < 5, which is more efficient. So I have
cases for:
- one single range
- singles
- ranges with length 2
- ranges with length 5
- many large ranges
Benchmark                                (txnString)      (useRange)  Mode  Cnt    Score    Error  Units
ValidReadTxnListBenchmark.writeAndParse  MANY_RANGE_6000  true        avgt   15   95.017 ±  6.665  us/op
ValidReadTxnListBenchmark.writeAndParse  MANY_RANGE_6000  false       avgt   15  742.845 ± 40.087  us/op
ValidReadTxnListBenchmark.writeAndParse  RANGE_300        true        avgt   15    3.475 ±  0.243  us/op
ValidReadTxnListBenchmark.writeAndParse  RANGE_300        false       avgt   15   16.132 ±  0.422  us/op
ValidReadTxnListBenchmark.writeAndParse  SINGLES_300      true        avgt   15   18.861 ±  0.712  us/op
ValidReadTxnListBenchmark.writeAndParse  SINGLES_300      false       avgt   15   20.286 ±  1.183  us/op
ValidReadTxnListBenchmark.writeAndParse  DOUBLES_300      true        avgt   15   19.415 ±  0.902  us/op
ValidReadTxnListBenchmark.writeAndParse  DOUBLES_300      false       avgt   15   17.444 ±  0.298  us/op
ValidReadTxnListBenchmark.writeAndParse  FIVES_300        true        avgt   15   15.242 ±  0.247  us/op
ValidReadTxnListBenchmark.writeAndParse  FIVES_300        false       avgt   15   17.342 ±  0.475  us/op
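For readers who want a concrete picture of the range threshold mentioned in the comment, here is a minimal sketch of the idea, not the code from the pull request. The class name TxnRangeCodec, the constant MIN_RANGE_LENGTH, and the "start-end" token syntax with comma separators are all assumptions made for illustration; the actual ValidReadTxnList string format may differ. It assumes the ids are sorted, distinct, and non-negative.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names and the "start-end" syntax are hypothetical,
// not taken from Hive's ValidReadTxnList.
class TxnRangeCodec {
    // Assumed threshold: runs shorter than this are written id by id.
    public static final int MIN_RANGE_LENGTH = 5;

    /** Encodes a sorted array of distinct, non-negative transaction ids. */
    public static String encode(long[] sortedIds) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < sortedIds.length) {
            int j = i;
            // Extend the run while the next id is exactly one larger.
            while (j + 1 < sortedIds.length && sortedIds[j + 1] == sortedIds[j] + 1) {
                j++;
            }
            int runLength = j - i + 1;
            if (runLength >= MIN_RANGE_LENGTH) {
                // Long run: one "start-end" token replaces runLength ids.
                append(sb, sortedIds[i] + "-" + sortedIds[j]);
            } else {
                // Short run: a range token would not pay off, list each id.
                for (int k = i; k <= j; k++) {
                    append(sb, Long.toString(sortedIds[k]));
                }
            }
            i = j + 1;
        }
        return sb.toString();
    }

    private static void append(StringBuilder sb, String token) {
        if (sb.length() > 0) {
            sb.append(',');
        }
        sb.append(token);
    }

    /** Expands the string produced by encode back into the id array. */
    public static long[] decode(String encoded) {
        if (encoded.isEmpty()) {
            return new long[0];
        }
        List<Long> ids = new ArrayList<>();
        for (String token : encoded.split(",")) {
            int dash = token.indexOf('-');
            if (dash < 0) {
                ids.add(Long.parseLong(token));
            } else {
                long start = Long.parseLong(token.substring(0, dash));
                long end = Long.parseLong(token.substring(dash + 1));
                for (long id = start; id <= end; id++) {
                    ids.add(id);
                }
            }
        }
        long[] result = new long[ids.size()];
        for (int k = 0; k < result.length; k++) {
            result[k] = ids.get(k);
        }
        return result;
    }
}
```

With this sketch, the ids 1, 2, 3, 4, 5, 7, 8, 10 encode to "1-5,7,8,10": the run of five collapses into one token, while the run of two (7, 8) stays as single ids because it is below the assumed threshold.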
Issue Time Tracking
-------------------
Worklog Id: (was: 457044)
Time Spent: 0.5h (was: 20m)
> Use ranges in ValidReadTxnList serialization
> --------------------------------------------
>
> Key: HIVE-23819
> URL: https://issues.apache.org/jira/browse/HIVE-23819
> Project: Hive
> Issue Type: Improvement
> Reporter: Peter Varga
> Assignee: Peter Varga
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> From time to time we see cases where the open / aborted transaction count is
> high and the aborted transactions often come in continuous ranges.
> When the transaction count gets high, serialization / deserialization to the
> hive.txn.valid.txns conf gets slower and produces a large config value.
> Using ranges in the string representation can mitigate the issue somewhat.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)