[
https://issues.apache.org/jira/browse/PHOENIX-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mujtaba Chohan updated PHOENIX-2929:
------------------------------------
Description:
Write to transactional table get up to ~3X slower compared to non-transactional
table.
Details:
* Transaction Manager (TM) is *always* kept up and running between all data
loads
* JVM for data loader is stopped after each 1M rows upsert
* Data loader uses 10 threads to write in parallel with 10K batch size
* Data load time is comparable for first few runs and it gets ~3X slow after
transaction manager is up for 5-10 hours
Data loader client details for txn-table: ~100sec GC pause with transactional
table. Total time to load 1M rows ~4mins. ~3.2G physical mem used. Java XMX
param was set to 3G.
{code}
JSTAT for data loader - 10 sec interval
S0C S1C S0U S1U EC EU OC OU MC MU
CCSC CCSU YGC YGCT FGC FGCT GCT
40448.0 40960.0 0.0 21920.0 571904.0 0.0 349696.0 87959.7 42024.0
41469.1 4904.0 4739.9 13 0.174 2 0.072 0.246
333824.0 333824.0 0.0 121135.1 380928.0 84826.0 2097152.0 2009035.3
42280.0 41652.1 4904.0 4740.4 29 1.172 6 5.231 6.403
349184.0 349184.0 0.0 0.0 350208.0 350205.8 2097152.0 1952578.1 42536.0
41758.6 4904.0 4742.5 36 1.353 12 9.882 11.234
322560.0 336896.0 214566.6 104273.9 344576.0 344561.2 2097152.0 1859305.5
42536.0 41774.5 4904.0 4742.5 45 1.643 17 14.684 16.327
349184.0 349184.0 41351.1 0.0 350208.0 48147.1 2097152.0 1935605.0
42536.0 41811.3 4904.0 4742.5 54 2.081 24 19.862 21.943
349184.0 349184.0 156685.4 0.0 350208.0 0.0 2097152.0 2073856.9
42536.0 41817.5 4904.0 4742.5 68 2.632 30 22.649 25.281
344576.0 52224.0 0.0 51789.6 350208.0 0.0 2097152.0 1871862.2 42536.0
41823.4 4904.0 4742.5 79 2.991 37 27.764 30.756
349184.0 349184.0 302724.6 0.0 350208.0 0.0 2097152.0 2023258.1
42536.0 41830.3 4904.0 4742.5 92 3.559 43 30.363 33.922
305664.0 104448.0 0.0 104434.1 358400.0 114713.9 2097152.0 1857545.7
42536.0 41846.0 4904.0 4742.5 101 3.804 50 36.314 40.118
335872.0 66048.0 0.0 65680.0 350208.0 247000.5 2097152.0 1923473.1 42536.0
41852.3 4904.0 4742.5 115 4.364 56 39.622 43.986
349184.0 349184.0 0.0 349163.6 350208.0 0.0 2097152.0 1975419.5
42536.0 41861.6 4904.0 4742.5 125 4.863 60 44.782 49.645
349184.0 349184.0 0.0 17375.8 350208.0 0.0 2097152.0 1999279.6
42536.0 41863.4 4904.0 4742.5 133 5.091 66 50.602 55.693
340992.0 345088.0 0.0 0.0 350208.0 350208.0 2097152.0 1977601.3 42536.0
41894.1 4904.0 4742.5 146 5.566 73 54.388 59.953
349184.0 349184.0 0.0 0.0 350208.0 350208.0 2097152.0 1963150.5 42536.0
41906.5 4904.0 4742.5 158 6.036 80 58.024 64.060
349184.0 349184.0 0.0 63008.0 350208.0 56631.7 2097152.0 1942708.7
42536.0 41916.8 4904.0 4742.5 177 6.751 84 60.799 67.550
345600.0 349184.0 263005.5 0.0 350208.0 55075.8 2097152.0 1883022.2
42536.0 41925.8 4904.0 4742.5 192 7.295 88 65.144 72.439
349184.0 344064.0 139656.9 0.0 355328.0 0.0 2097152.0 2070558.8
42536.0 41935.4 4904.0 4742.5 204 7.803 95 68.618 76.421
349184.0 349184.0 63008.0 0.0 350208.0 0.0 2097152.0 1963035.3
42536.0 41971.9 4904.0 4742.5 218 8.339 102 72.534 80.873
349184.0 349184.0 0.0 17984.0 350208.0 253707.1 2097152.0 1848396.0
42536.0 41996.1 4904.0 4742.5 229 8.686 108 76.526 85.212
349184.0 324608.0 0.0 284036.3 351744.0 0.0 2097152.0 1776761.8
42536.0 42021.6 4904.0 4742.5 247 9.320 113 79.669 88.989
349184.0 349184.0 38510.2 0.0 350208.0 160454.7 2097152.0 1896405.6
42536.0 42036.0 4904.0 4742.5 256 9.555 115 80.723 90.278
342528.0 333824.0 147271.1 0.0 358912.0 312219.3 2097152.0 1882525.2
42792.0 42065.0 4904.0 4742.5 278 10.620 116 81.143 91.763
349184.0 283648.0 0.0 283645.6 370176.0 249240.7 2097152.0 1893727.0
42792.0 42066.5 4904.0 4742.5 303 11.738 117 81.684 93.422
349184.0 349184.0 49417.0 0.0 350208.0 172262.6 2097152.0 1127855.3
42792.0 42071.0 4904.0 4742.5 326 12.787 119 82.895 95.682
349184.0 349184.0 0.0 131245.0 350208.0 133019.3 2097152.0 1074752.6
42792.0 42081.4 4904.0 4742.5 349 13.943 120 83.448 97.391
{code}
Data loader client details for non-txn table: ~12sec GC pause with
non-transactional table. Total time to load 1M rows ~1mins. ~2G physical mem
used. Java XMX param was set to 3G.
{code}
JSTAT for data loader - 10 sec interval
S0C S1C S0U S1U EC EU OC OU MC MU
CCSC CCSU YGC YGCT FGC FGCT GCT
24576.0 28160.0 0.0 0.0 195584.0 28008.1 349696.0 25257.3 35496.0
35020.3 4264.0 4037.5 6 0.056 2 0.071 0.128
79872.0 79872.0 0.0 79860.8 888832.0 0.0 753664.0 753092.4 41000.0
40361.6 4648.0 4563.6 17 0.616 3 0.071 0.688
259584.0 260096.0 129075.4 161535.6 528384.0 528384.0 937472.0 767859.8
41000.0 40458.7 4648.0 4564.6 31 1.319 4 1.622 2.941
283136.0 278016.0 23716.8 0.0 487424.0 143734.6 1498624.0 662740.5
41000.0 40479.8 4648.0 4564.6 48 2.459 7 3.122 5.582
240640.0 251392.0 162230.5 31674.1 545792.0 545792.0 1530368.0 642312.2
41000.0 40531.4 4648.0 4567.6 69 3.733 9 4.069 7.802
289280.0 282624.0 19730.4 0.0 468480.0 403700.7 1786880.0 672352.5
41256.0 40645.2 4648.0 4567.6 86 4.974 11 5.160 10.134
282624.0 290816.0 163586.1 0.0 466944.0 268299.4 1950208.0 1491203.8
41256.0 40667.0 4648.0 4567.6 108 6.288 12 5.723 12.011
{code}
* TM uses ~4G physical memory (there's no GC pause for TM)
* After restarting TM, TM memory usage drops to 0.5G and data loader GC pauses
decrease to ~15sec for transactional table and 1M gets loaded in ~1min.
This hints to possible leak in TM. Since this test was on a remote secure
cluster, I haven't been able to profile it so I'll try to repro it locally to
get more info.
was:
Write to transactional table get up to ~3X slower compared to non-transactional
table.
Details:
* Transaction Manager (TM) is *always* kept up and running between all data
loads
* JVM for data loader is stopped after each 1M rows upsert
* Data loader uses 10 threads to write in parallel with 10K batch size
* Data load time is comparable for first few runs and it gets ~3X slow after
transaction manager is up for 5-10 hours
Data loader client details for txn-table: ~100sec GC pause with transactional
table. Total time to load 1M rows ~4mins. ~3.2G physical mem used. Java XMX
param was set to 3G.
{code}
JSTAT for data loader - 10 sec interval
S0C S1C S0U S1U EC EU OC OU MC MU
CCSC CCSU YGC YGCT FGC FGCT GCT
40448.0 40960.0 0.0 21920.0 571904.0 0.0 349696.0 87959.7 42024.0
41469.1 4904.0 4739.9 13 0.174 2 0.072 0.246
333824.0 333824.0 0.0 121135.1 380928.0 84826.0 2097152.0 2009035.3
42280.0 41652.1 4904.0 4740.4 29 1.172 6 5.231 6.403
349184.0 349184.0 0.0 0.0 350208.0 350205.8 2097152.0 1952578.1 42536.0
41758.6 4904.0 4742.5 36 1.353 12 9.882 11.234
322560.0 336896.0 214566.6 104273.9 344576.0 344561.2 2097152.0 1859305.5
42536.0 41774.5 4904.0 4742.5 45 1.643 17 14.684 16.327
349184.0 349184.0 41351.1 0.0 350208.0 48147.1 2097152.0 1935605.0
42536.0 41811.3 4904.0 4742.5 54 2.081 24 19.862 21.943
349184.0 349184.0 156685.4 0.0 350208.0 0.0 2097152.0 2073856.9
42536.0 41817.5 4904.0 4742.5 68 2.632 30 22.649 25.281
344576.0 52224.0 0.0 51789.6 350208.0 0.0 2097152.0 1871862.2 42536.0
41823.4 4904.0 4742.5 79 2.991 37 27.764 30.756
349184.0 349184.0 302724.6 0.0 350208.0 0.0 2097152.0 2023258.1
42536.0 41830.3 4904.0 4742.5 92 3.559 43 30.363 33.922
305664.0 104448.0 0.0 104434.1 358400.0 114713.9 2097152.0 1857545.7
42536.0 41846.0 4904.0 4742.5 101 3.804 50 36.314 40.118
335872.0 66048.0 0.0 65680.0 350208.0 247000.5 2097152.0 1923473.1 42536.0
41852.3 4904.0 4742.5 115 4.364 56 39.622 43.986
349184.0 349184.0 0.0 349163.6 350208.0 0.0 2097152.0 1975419.5
42536.0 41861.6 4904.0 4742.5 125 4.863 60 44.782 49.645
349184.0 349184.0 0.0 17375.8 350208.0 0.0 2097152.0 1999279.6
42536.0 41863.4 4904.0 4742.5 133 5.091 66 50.602 55.693
340992.0 345088.0 0.0 0.0 350208.0 350208.0 2097152.0 1977601.3 42536.0
41894.1 4904.0 4742.5 146 5.566 73 54.388 59.953
349184.0 349184.0 0.0 0.0 350208.0 350208.0 2097152.0 1963150.5 42536.0
41906.5 4904.0 4742.5 158 6.036 80 58.024 64.060
349184.0 349184.0 0.0 63008.0 350208.0 56631.7 2097152.0 1942708.7
42536.0 41916.8 4904.0 4742.5 177 6.751 84 60.799 67.550
345600.0 349184.0 263005.5 0.0 350208.0 55075.8 2097152.0 1883022.2
42536.0 41925.8 4904.0 4742.5 192 7.295 88 65.144 72.439
349184.0 344064.0 139656.9 0.0 355328.0 0.0 2097152.0 2070558.8
42536.0 41935.4 4904.0 4742.5 204 7.803 95 68.618 76.421
349184.0 349184.0 63008.0 0.0 350208.0 0.0 2097152.0 1963035.3
42536.0 41971.9 4904.0 4742.5 218 8.339 102 72.534 80.873
349184.0 349184.0 0.0 17984.0 350208.0 253707.1 2097152.0 1848396.0
42536.0 41996.1 4904.0 4742.5 229 8.686 108 76.526 85.212
349184.0 324608.0 0.0 284036.3 351744.0 0.0 2097152.0 1776761.8
42536.0 42021.6 4904.0 4742.5 247 9.320 113 79.669 88.989
349184.0 349184.0 38510.2 0.0 350208.0 160454.7 2097152.0 1896405.6
42536.0 42036.0 4904.0 4742.5 256 9.555 115 80.723 90.278
342528.0 333824.0 147271.1 0.0 358912.0 312219.3 2097152.0 1882525.2
42792.0 42065.0 4904.0 4742.5 278 10.620 116 81.143 91.763
349184.0 283648.0 0.0 283645.6 370176.0 249240.7 2097152.0 1893727.0
42792.0 42066.5 4904.0 4742.5 303 11.738 117 81.684 93.422
349184.0 349184.0 49417.0 0.0 350208.0 172262.6 2097152.0 1127855.3
42792.0 42071.0 4904.0 4742.5 326 12.787 119 82.895 95.682
349184.0 349184.0 0.0 131245.0 350208.0 133019.3 2097152.0 1074752.6
42792.0 42081.4 4904.0 4742.5 349 13.943 120 83.448 97.391
{code}
Data loader client details for non-txn table: ~12sec GC pause with
non-transactional table. Total time to load 1M rows ~1mins. ~2G physical mem
used. Java XMX param was set to 3G.
{code}
JSTAT for data loader - 10 sec interval
S0C S1C S0U S1U EC EU OC OU MC MU
CCSC CCSU YGC YGCT FGC FGCT GCT
24576.0 28160.0 0.0 0.0 195584.0 28008.1 349696.0 25257.3 35496.0
35020.3 4264.0 4037.5 6 0.056 2 0.071 0.128
79872.0 79872.0 0.0 79860.8 888832.0 0.0 753664.0 753092.4 41000.0
40361.6 4648.0 4563.6 17 0.616 3 0.071 0.688
259584.0 260096.0 129075.4 161535.6 528384.0 528384.0 937472.0 767859.8
41000.0 40458.7 4648.0 4564.6 31 1.319 4 1.622 2.941
283136.0 278016.0 23716.8 0.0 487424.0 143734.6 1498624.0 662740.5
41000.0 40479.8 4648.0 4564.6 48 2.459 7 3.122 5.582
240640.0 251392.0 162230.5 31674.1 545792.0 545792.0 1530368.0 642312.2
41000.0 40531.4 4648.0 4567.6 69 3.733 9 4.069 7.802
289280.0 282624.0 19730.4 0.0 468480.0 403700.7 1786880.0 672352.5
41256.0 40645.2 4648.0 4567.6 86 4.974 11 5.160 10.134
282624.0 290816.0 163586.1 0.0 466944.0 268299.4 1950208.0 1491203.8
41256.0 40667.0 4648.0 4567.6 108 6.288 12 5.723 12.011
{code}
* TM uses ~4G physical memory (there's no GC pause for RM)
* After restarting TM, TM memory usage drops to 0.5G and data loader GC pauses
decrease to ~15sec for transactional table and 1M gets loaded in ~1min.
This hints to possible leak in TM. Since this test was on a remote secure
cluster, I haven't been able to profile it so I'll try to repro it locally to
get more info.
> Transactional writes get slow over time due to possible memory leak
> -------------------------------------------------------------------
>
> Key: PHOENIX-2929
> URL: https://issues.apache.org/jira/browse/PHOENIX-2929
> Project: Phoenix
> Issue Type: Bug
> Reporter: Mujtaba Chohan
> Attachments: txn-manager-log-during-slow-writes.log
>
>
> Write to transactional table get up to ~3X slower compared to
> non-transactional table.
> Details:
> * Transaction Manager (TM) is *always* kept up and running between all data
> loads
> * JVM for data loader is stopped after each 1M rows upsert
> * Data loader uses 10 threads to write in parallel with 10K batch size
> * Data load time is comparable for first few runs and it gets ~3X slow after
> transaction manager is up for 5-10 hours
> Data loader client details for txn-table: ~100sec GC pause with transactional
> table. Total time to load 1M rows ~4mins. ~3.2G physical mem used. Java XMX
> param was set to 3G.
> {code}
> JSTAT for data loader - 10 sec interval
> S0C S1C S0U S1U EC EU OC OU MC
> MU CCSC CCSU YGC YGCT FGC FGCT GCT
> 40448.0 40960.0 0.0 21920.0 571904.0 0.0 349696.0 87959.7
> 42024.0 41469.1 4904.0 4739.9 13 0.174 2 0.072 0.246
> 333824.0 333824.0 0.0 121135.1 380928.0 84826.0 2097152.0 2009035.3
> 42280.0 41652.1 4904.0 4740.4 29 1.172 6 5.231 6.403
> 349184.0 349184.0 0.0 0.0 350208.0 350205.8 2097152.0 1952578.1
> 42536.0 41758.6 4904.0 4742.5 36 1.353 12 9.882 11.234
> 322560.0 336896.0 214566.6 104273.9 344576.0 344561.2 2097152.0 1859305.5
> 42536.0 41774.5 4904.0 4742.5 45 1.643 17 14.684 16.327
> 349184.0 349184.0 41351.1 0.0 350208.0 48147.1 2097152.0 1935605.0
> 42536.0 41811.3 4904.0 4742.5 54 2.081 24 19.862 21.943
> 349184.0 349184.0 156685.4 0.0 350208.0 0.0 2097152.0 2073856.9
> 42536.0 41817.5 4904.0 4742.5 68 2.632 30 22.649 25.281
> 344576.0 52224.0 0.0 51789.6 350208.0 0.0 2097152.0 1871862.2
> 42536.0 41823.4 4904.0 4742.5 79 2.991 37 27.764 30.756
> 349184.0 349184.0 302724.6 0.0 350208.0 0.0 2097152.0 2023258.1
> 42536.0 41830.3 4904.0 4742.5 92 3.559 43 30.363 33.922
> 305664.0 104448.0 0.0 104434.1 358400.0 114713.9 2097152.0 1857545.7
> 42536.0 41846.0 4904.0 4742.5 101 3.804 50 36.314 40.118
> 335872.0 66048.0 0.0 65680.0 350208.0 247000.5 2097152.0 1923473.1
> 42536.0 41852.3 4904.0 4742.5 115 4.364 56 39.622 43.986
> 349184.0 349184.0 0.0 349163.6 350208.0 0.0 2097152.0 1975419.5
> 42536.0 41861.6 4904.0 4742.5 125 4.863 60 44.782 49.645
> 349184.0 349184.0 0.0 17375.8 350208.0 0.0 2097152.0 1999279.6
> 42536.0 41863.4 4904.0 4742.5 133 5.091 66 50.602 55.693
> 340992.0 345088.0 0.0 0.0 350208.0 350208.0 2097152.0 1977601.3
> 42536.0 41894.1 4904.0 4742.5 146 5.566 73 54.388 59.953
> 349184.0 349184.0 0.0 0.0 350208.0 350208.0 2097152.0 1963150.5
> 42536.0 41906.5 4904.0 4742.5 158 6.036 80 58.024 64.060
> 349184.0 349184.0 0.0 63008.0 350208.0 56631.7 2097152.0 1942708.7
> 42536.0 41916.8 4904.0 4742.5 177 6.751 84 60.799 67.550
> 345600.0 349184.0 263005.5 0.0 350208.0 55075.8 2097152.0 1883022.2
> 42536.0 41925.8 4904.0 4742.5 192 7.295 88 65.144 72.439
> 349184.0 344064.0 139656.9 0.0 355328.0 0.0 2097152.0 2070558.8
> 42536.0 41935.4 4904.0 4742.5 204 7.803 95 68.618 76.421
> 349184.0 349184.0 63008.0 0.0 350208.0 0.0 2097152.0 1963035.3
> 42536.0 41971.9 4904.0 4742.5 218 8.339 102 72.534 80.873
> 349184.0 349184.0 0.0 17984.0 350208.0 253707.1 2097152.0 1848396.0
> 42536.0 41996.1 4904.0 4742.5 229 8.686 108 76.526 85.212
> 349184.0 324608.0 0.0 284036.3 351744.0 0.0 2097152.0 1776761.8
> 42536.0 42021.6 4904.0 4742.5 247 9.320 113 79.669 88.989
> 349184.0 349184.0 38510.2 0.0 350208.0 160454.7 2097152.0 1896405.6
> 42536.0 42036.0 4904.0 4742.5 256 9.555 115 80.723 90.278
> 342528.0 333824.0 147271.1 0.0 358912.0 312219.3 2097152.0 1882525.2
> 42792.0 42065.0 4904.0 4742.5 278 10.620 116 81.143 91.763
> 349184.0 283648.0 0.0 283645.6 370176.0 249240.7 2097152.0 1893727.0
> 42792.0 42066.5 4904.0 4742.5 303 11.738 117 81.684 93.422
> 349184.0 349184.0 49417.0 0.0 350208.0 172262.6 2097152.0 1127855.3
> 42792.0 42071.0 4904.0 4742.5 326 12.787 119 82.895 95.682
> 349184.0 349184.0 0.0 131245.0 350208.0 133019.3 2097152.0 1074752.6
> 42792.0 42081.4 4904.0 4742.5 349 13.943 120 83.448 97.391
> {code}
> Data loader client details for non-txn table: ~12sec GC pause with
> non-transactional table. Total time to load 1M rows ~1mins. ~2G physical mem
> used. Java XMX param was set to 3G.
> {code}
> JSTAT for data loader - 10 sec interval
> S0C S1C S0U S1U EC EU OC OU MC
> MU CCSC CCSU YGC YGCT FGC FGCT GCT
> 24576.0 28160.0 0.0 0.0 195584.0 28008.1 349696.0 25257.3 35496.0
> 35020.3 4264.0 4037.5 6 0.056 2 0.071 0.128
> 79872.0 79872.0 0.0 79860.8 888832.0 0.0 753664.0 753092.4
> 41000.0 40361.6 4648.0 4563.6 17 0.616 3 0.071 0.688
> 259584.0 260096.0 129075.4 161535.6 528384.0 528384.0 937472.0 767859.8
> 41000.0 40458.7 4648.0 4564.6 31 1.319 4 1.622 2.941
> 283136.0 278016.0 23716.8 0.0 487424.0 143734.6 1498624.0 662740.5
> 41000.0 40479.8 4648.0 4564.6 48 2.459 7 3.122 5.582
> 240640.0 251392.0 162230.5 31674.1 545792.0 545792.0 1530368.0 642312.2
> 41000.0 40531.4 4648.0 4567.6 69 3.733 9 4.069 7.802
> 289280.0 282624.0 19730.4 0.0 468480.0 403700.7 1786880.0 672352.5
> 41256.0 40645.2 4648.0 4567.6 86 4.974 11 5.160 10.134
> 282624.0 290816.0 163586.1 0.0 466944.0 268299.4 1950208.0 1491203.8
> 41256.0 40667.0 4648.0 4567.6 108 6.288 12 5.723 12.011
> {code}
> * TM uses ~4G physical memory (there's no GC pause for TM)
> * After restarting TM, TM memory usage drops to 0.5G and data loader GC
> pauses decrease to ~15sec for transactional table and 1M gets loaded in ~1min.
> This hints to possible leak in TM. Since this test was on a remote secure
> cluster, I haven't been able to profile it so I'll try to repro it locally to
> get more info.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)