From Ritik Raj <[email protected]>:

Ritik Raj has posted comments on this change by Ritik Raj. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20863?usp=email )

Change subject: [ASTERIXDB-3702][RT][STO] LSM Sampling
......................................................................

Patch Set 3:

(1 comment)

Patchset:

PS3:
Unit test proving the error rate is within bounds for cardinality estimation using Theta Sketches.

====================================================================================================
TEST: Full Matrix - N Indexes x Upsert% x Delete%
====================================================================================================
Configuration: 5,000 keys per index, 4 flushes per index

N     Upsert %    Delete %    Expected    Estimated   Error %
--------------------------------------------------------------------------------
1     0           0           3125        3148        0.74
1     0           10          2635        2652        0.65
1     0           25          2089        2112        1.10
1     25          0           3125        3148        0.74
1     25          10          2723        2737        0.51
1     25          25          2265        2290        1.10
1     50          0           3125        3132        0.22
1     50          10          2804        2811        0.25
1     50          25          2428        2423        0.21
4     0           0           12500       12454       0.37
4     0           10          10521       10485       0.34
4     0           25          8332        8369        0.44
4     25          0           12500       12454       0.37
4     25          10          10960       10914       0.42
4     25          25          9133        9119        0.15
4     50          0           12500       12411       0.71
4     50          10          11226       11181       0.40
4     50          25          9785        9747        0.39
16    0           0           50000       49970       0.06
16    0           10          42063       42034       0.07
16    0           25          33289       33222       0.20
16    25          0           50000       49970       0.06
16    25          10          43748       43717       0.07
16    25          25          36496       36428       0.19
16    50          0           50000       49791       0.42
16    50          10          45091       44885       0.46
16    50          25          39260       39143       0.30
64    0           0           200000      199783      0.11
64    0           10          168224      167964      0.15
64    0           25          133278      133008      0.20
64    25          0           200000      199783      0.11
64    25          10          174800      174606      0.11
64    25          25          146073      145858      0.15
64    50          0           200000      199664      0.17
64    50          10          180104      179949      0.09
64    50          25          156918      156548      0.24

================================================================================
TEST: Error Rate vs Combined Upsert + Delete Percentage
================================================================================
Configuration: 16 indexes, 15,000 keys per index, 6 flushes

Upsert %    Delete %    Expected    Estimated   Error %
----------------------------------------------------------------------
0           0           106640      106491      0.14
10          5           92964       92903       0.07
20          10          83651       83561       0.11
30          15          77456       77446       0.01
40          20          73303       73168       0.18
50          25          70459       70482       0.03
60          30          68686       68679       0.01
70          35          67817       67292       0.77
80          40          67359       67391       0.05

================================================================================
TEST: Error Rate vs Number of LSM Indexes (N)
================================================================================
Configuration: 10,000 keys per index, 3 flushes per index, 0% upsert, 0% delete

N     Expected    Estimated   Error %     Components
----------------------------------------------------------------------
1     9999        9897        1.02        3
2     19998       19934       0.32        6
4     39996       40058       0.16        12
8     79992       79624       0.46        24
16    159984      159792      0.12        48
32    319968      319348      0.19        96
64    639936      640281      0.05        192
128   1279872     1279870     0.00        384

================================================================================
TEST: Error Rate vs Upsert Percentage
================================================================================
Configuration: 8 indexes, 20,000 keys per index, 5 flushes, 0% delete

Upsert %    Expected    Estimated   Error %
------------------------------------------------------------
0           96000       95855       0.15
10          96000       95855       0.15
20          96000       95855       0.15
30          96000       95842       0.16
40          96000       96088       0.09
50          96000       96076       0.08
60          96000       95996       0.00
70          96000       96081       0.08
80          96000       95750       0.26
90          96000       95906       0.10

================================================================================
TEST: Accuracy Bounds Verification
================================================================================
Verifying error rates stay within expected bounds for various configurations

Simple 3-flush, no overlap: Expected=30000, Estimated=30368, Error=1.23%
30% upsert scenario: Expected=20000, Estimated=20054, Error=0.27%
With deletes scenario: Expected=21188, Estimated=22043, Error=4.04%

All accuracy bounds verified successfully!
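
For anyone re-checking these numbers: the Error % values appear consistent with plain relative error, |Estimated - Expected| / Expected x 100. The sketch below is a minimal check under that assumption. The names error_pct and check_rows are hypothetical and are not taken from the unit test in this change; the rows are the three Accuracy Bounds Verification results reported above.

```python
def error_pct(expected: int, estimated: int) -> float:
    """Relative error of the estimate as a percentage of the expected count.

    Assumption: this is how the Error % column was derived; the change
    itself does not state the formula.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    return abs(estimated - expected) / expected * 100.0


# (label, expected, estimated, reported error %) copied from the test output.
ROWS = [
    ("Simple 3-flush, no overlap", 30000, 30368, 1.23),
    ("30% upsert scenario", 20000, 20054, 0.27),
    ("With deletes scenario", 21188, 22043, 4.04),
]


def check_rows(rows):
    """Return (label, recomputed error %, matches reported value) per row."""
    results = []
    for label, expected, estimated, reported in rows:
        recomputed = round(error_pct(expected, estimated), 2)
        results.append((label, recomputed, abs(recomputed - reported) < 1e-9))
    return results


if __name__ == "__main__":
    for label, recomputed, ok in check_rows(ROWS):
        status = "matches" if ok else "differs from"
        print(f"{label}: {recomputed:.2f}% ({status} the reported value)")
```

Rounding to two decimals before comparing mirrors the precision of the printed output, so the check does not depend on digits the log never showed.
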
================================================================================
TEST: Large Scale LSM Simulation
================================================================================
Configuration: 32 indexes, 50,000 keys each, 10 flushes, mixed workload

Results:
----------------------------------------
Total Indexes:      32
Total Components:   320
Expected Keys:      780,929
Estimated Keys:     780,120
Error Rate:         0.10%
Time Elapsed:       393 ms

================================================================================
TEST: Error Rate vs Delete Percentage
================================================================================
Configuration: 8 indexes, 20,000 keys per index, 5 flushes, 0% upsert

Delete %    Expected    Estimated   Error %
------------------------------------------------------------
0           96000       95855       0.15
5           85671       85641       0.04
10          76821       76819       0.00
15          69337       69302       0.05
20          62938       62785       0.24
25          57458       57258       0.35
30          52704       52698       0.01
40          45195       44745       1.00
50          39483       39463       0.05

================================================================================
TEST: Error Rate vs Number of Components per Index
================================================================================
Configuration: 8 indexes, 30,000 total keys per index, varying flush count

Flushes     Keys/Flush  Expected    Estimated   Error %
---------------------------------------------------------------------------
1           30000       240000      243312      1.38
2           15000       234716      236380      0.71
3           10000       229582      229802      0.10
5           6000        219797      218656      0.52
10          3000        197433      196504      0.47
15          2000        178376      177878      0.28
20          1500        161850      161293      0.34
30          1000        134360      134057      0.23

Process finished with exit code 0

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/20863?usp=email
To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings?usp=email

Gerrit-MessageType: comment
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: Ieaeb919c3b058955860385012b4d1bb738fc1cfa
Gerrit-Change-Number: 20863
Gerrit-PatchSet: 3
Gerrit-Owner: Ritik Raj <[email protected]>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Comment-Date: Sat, 07 Feb 2026 14:28:15 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
