[jira] [Assigned] (IGNITE-21051) Fix javadocs for IndexQuery
[ https://issues.apache.org/jira/browse/IGNITE-21051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleg Valuyskiy reassigned IGNITE-21051:
---
Assignee: Oleg Valuyskiy

> Fix javadocs for IndexQuery
> ---
> Key: IGNITE-21051
> URL: https://issues.apache.org/jira/browse/IGNITE-21051
> Project: Ignite
> Issue Type: Improvement
> Reporter: Maksim Timonin
> Assignee: Oleg Valuyskiy
> Priority: Major
> Labels: ise, newbie
>
> The javadoc formatting in the `IndexQuery` class needs to be fixed. Currently the algorithm list is rendered on a single line; it should use "ul" and "li" tags to render correctly.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
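The fix the ticket describes is a doc-comment markup change. A hedged sketch of the intended shape (the list items below are illustrative, not the actual `IndexQuery` javadoc text):

```java
/**
 * An index query runs over a sorted index and supports the following criteria:
 * <ul>
 *   <li>equality conditions;</li>
 *   <li>less-than / less-than-or-equal conditions;</li>
 *   <li>greater-than / greater-than-or-equal conditions;</li>
 *   <li>range (between) conditions.</li>
 * </ul>
 * Without the {@code <ul>}/{@code <li>} tags, the generated HTML collapses
 * such a list into a single line.
 */
```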
[jira] [Updated] (IGNITE-20992) ODBC 3.0: Propagate username to connection_info
[ https://issues.apache.org/jira/browse/IGNITE-20992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Sapego updated IGNITE-20992:
-
Fix Version/s: 3.0.0-beta2

> ODBC 3.0: Propagate username to connection_info
> ---
> Key: IGNITE-20992
> URL: https://issues.apache.org/jira/browse/IGNITE-20992
> Project: Ignite
> Issue Type: New Feature
> Components: odbc
> Reporter: Dmitrii Zabotlin
> Assignee: Dmitrii Zabotlin
> Priority: Major
> Labels: ignite-3, odbc
> Fix For: 3.0.0-beta2
>
> Time Spent: 20m
> Remaining Estimate: 0h
[jira] [Updated] (IGNITE-20992) ODBC 3.0: Propagate username to connection_info
[ https://issues.apache.org/jira/browse/IGNITE-20992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Sapego updated IGNITE-20992:
-
Ignite Flags: (was: Docs Required, Release Notes Required)
[jira] [Commented] (IGNITE-20992) ODBC 3.0: Propagate username to connection_info
[ https://issues.apache.org/jira/browse/IGNITE-20992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795904#comment-17795904 ] Igor Sapego commented on IGNITE-20992:
--
Looks good to me.
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795899#comment-17795899 ] Vipul Thakur commented on IGNITE-21059:
---
Thank you for your response, [~zstan]. I will make the above changes and let you know how it goes, and will also provide the logs from all nodes.

> We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
> ---
> Key: IGNITE-21059
> URL: https://issues.apache.org/jira/browse/IGNITE-21059
> Project: Ignite
> Issue Type: Bug
> Components: binary, clients
> Affects Versions: 2.14
> Reporter: Vipul Thakur
> Priority: Critical
> Attachments: cache-config-1.xml,
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1,
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2,
> digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3,
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1,
> digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2,
> ignite-server-nohup.out
>
> We have recently upgraded from 2.7.6 to 2.14 due to an issue observed in the production environment, where the cluster would hang due to partition map exchange.
> Please find below the ticket I created a while back for Ignite 2.7.6:
> https://issues.apache.org/jira/browse/IGNITE-13298
> We migrated Apache Ignite to 2.14 and the upgrade went smoothly, but on the third day we saw the cluster traffic dip again.
> We have 5 nodes in the cluster, each with 400 GB of RAM and more than 1 TB of SSD.
> The config is attached for review, along with the server logs from the time the issue happened.
> We have set a transaction timeout as well as socket timeouts at both the server and the client end for our write operations, but sometimes the cluster still hangs: all our get calls get stuck, our JMS listener threads slowly freeze, and after a while every thread is choked up.
> Because of this, our read services, which do not even use transactions to retrieve data, also start to choke, ultimately leading to an end-user traffic dip.
> We were hoping the product upgrade would help, but that has not been the case so far.
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795883#comment-17795883 ] Evgeny Stanilovsky commented on IGNITE-21059:
-
1. Yes, just erase or comment them out in the config.
2. OK here.
3. 30 seconds is too much. If you detect a transaction rollback by timeout you can rerun it (also check whether optimistic transactions are faster for you), but note the differences: transactional writes AND reads can throw exceptions! https://ignite.apache.org/docs/latest/key-value-api/transactions
4. Can't suggest anything here; it depends on the concrete usage.
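The rerun-on-timeout advice in point 3 above can be sketched as a bounded retry loop. This is a pure-JDK illustration, not the Ignite API: `doTransactionalWork` is a hypothetical stand-in for a transaction that may be rolled back by timeout, simulated here with a plain `RuntimeException`.

```java
public class TxRetrySketch {
    static int attempts = 0;

    // Hypothetical stand-in for transactional work that may be rolled back by timeout.
    static String doTransactionalWork() {
        attempts++;
        if (attempts == 1)
            throw new RuntimeException("tx rolled back by timeout"); // simulated first-attempt timeout
        return "committed";
    }

    // Rerun the whole transaction a bounded number of times when it is rolled back.
    static String runWithRetry(int maxAttempts) {
        RuntimeException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return doTransactionalWork();
            } catch (RuntimeException e) {
                last = e; // rolled back by timeout; safe to rerun the transaction from scratch
            }
        }
        throw last; // give up after maxAttempts
    }

    public static void main(String[] args) {
        System.out.println(runWithRetry(3)); // succeeds on the second attempt
    }
}
```

With the real API, the same loop would wrap a `txStart`/`commit` pair and catch the transaction-timeout exception instead of a bare `RuntimeException`.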
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795867#comment-17795867 ] Vipul Thakur commented on IGNITE-21059:
---
As per my understanding I will be doing the following; please correct me if I am wrong:
- failureDetectionTimeout and clientFailureDetectionTimeout will go back to their default values, which are 10 seconds and 30 seconds.
- walSegmentSize will be increased from the default 64 MB to a bigger value, maybe around 512 MB (the limit being 2 GB).
Any comments regarding the transaction timeout value, which is 30 seconds at the client?
TcpDiscoveryVmIpFinder: the socket timeout is 60 seconds at the server end and 5 seconds at the client end.
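The settings discussed above can be sketched in the usual Spring XML configuration format (matching the attached cache-config-1.xml). This is only an illustration of where the properties live; the values are the ones proposed in the thread, not recommendations:

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Defaults discussed in the thread: 10 s (server), 30 s (client). -->
    <property name="failureDetectionTimeout" value="10000"/>
    <property name="clientFailureDetectionTimeout" value="30000"/>

    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <!-- Proposed WAL segment size: 512 MB (default is 64 MB). -->
            <property name="walSegmentSize" value="#{512 * 1024 * 1024}"/>
        </bean>
    </property>
</bean>
```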
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795864#comment-17795864 ] Evgeny Stanilovsky commented on IGNITE-21059:
-
If you are talking about the TcpDiscoveryVmIpFinder socket timeout, these are not linked things. I suggest staying with the defaults for both failureDetectionTimeout and clientFailureDetectionTimeout, and tuning them only if you really find it helps. All failure issues need to be investigated: if the system detects a slow client (no matter where the problem is: IO, network, or a JVM pause), it seems you don't need such a client, and you need to fix the problem that leads to that situation first.
[jira] [Comment Edited] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795861#comment-17795861 ] Vipul Thakur edited comment on IGNITE-21059 at 12/12/23 5:33 PM:
-
We have a daily requirement of 90-120 million read requests and around 15-20 million write requests.
Current values: failureDetectionTimeout=12, clientFailureDetectionTimeout=12.
What would be the suggested values? Should we bring these closer to the socketTimeout, i.e. around 5 seconds, and should these settings be the same at both the server and the client end?

was (Author: vipul.thakur):
We have a daily requirement of 90-120 million read requests and around 15-20 million write requests.
Current values: failureDetectionTimeout=12, clientFailureDetectionTimeout=12.
What would be the suggested value? Should we bring this closer to the socketTimeout, i.e. around 5 seconds, and should these settings be the same at both the server and the client end?
[jira] [Comment Edited] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795861#comment-17795861 ] Vipul Thakur edited comment on IGNITE-21059 at 12/12/23 5:32 PM:
-
We have a daily requirement of 90-120 million read requests and around 15-20 million write requests.
Current values: failureDetectionTimeout=12, clientFailureDetectionTimeout=12.
What would be the suggested value? Should we bring this closer to the socketTimeout, i.e. around 5 seconds, and should these settings be the same at both the server and the client end?

was (Author: vipul.thakur):
We have a daily requirement of 90-120 million read requests and around 15-20 million.
Current values: failureDetectionTimeout=12, clientFailureDetectionTimeout=12.
What would be the suggested value? Should we bring this closer to the socketTimeout, i.e. around 5 seconds, and should these settings be the same at both the server and the client end?
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795861#comment-17795861 ] Vipul Thakur commented on IGNITE-21059:
---
We have a daily requirement of 90-120 million read requests and around 15-20 million.
Current values: failureDetectionTimeout=12, clientFailureDetectionTimeout=12.
What would be the suggested value? Should we bring this closer to the socketTimeout, i.e. around 5 seconds, and should these settings be the same at both the server and the client end?
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795860#comment-17795860 ] Evgeny Stanilovsky commented on IGNITE-21059:
-
failureDetectionTimeout: too large, in my opinion. If something hangs, the grid will wait for this whole timeout, so problems with transactions are expected here.
clientFailureDetectionTimeout: the same.
For rebalanceBatchSize and rebalanceThrottle I suggest the defaults.
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795859#comment-17795859 ] Vipul Thakur commented on IGNITE-21059:
---
We have also configured a socket timeout at the server and client end, but from the thread dumps it seems like all the transactions are stuck at the get call.
[jira] [Comment Edited] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795859#comment-17795859 ] Vipul Thakur edited comment on IGNITE-21059 at 12/12/23 5:12 PM:
-
We have also configured a socket timeout at the server and client end, but from the thread dumps it seems like all the transactions are stuck at the get call.

was (Author: vipul.thakur):
We also have configured socket timeout at server and client end but from thread dump is seems like its stuck at get call in all the txns.
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795858#comment-17795858 ] Vipul Thakur commented on IGNITE-21059:
---
In 2.7.6 we used to observe the long-JVM-pause logger in the read services, and not that much in the write services. Such behavior has not been observed in 2.14.
We have another such setup, with the same number of nodes in the cluster and the same number of clients, serving as another datacenter for our API endpoint. It has been running with no problems for over a month now, but when we upgraded our other datacenter this issue occurred after just 3 days.
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795855#comment-17795855 ] Evgeny Stanilovsky commented on IGNITE-21059: - I suppose you do not need such a huge number of readers and writers; client nodes are not the bottleneck here at all (but that is not the root cause, of course). A long JVM pause on a *client* node can lead to your problem, I think. > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 5 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB SDD. > PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. 
> We have set txn timeout as well as socket timeout both at server and client > end for our write operations but seems like sometimes cluster goes into hang > state and all our get calls are stuck and slowly everything starts to freeze > our jms listener threads and every thread reaches a choked up state in > sometime. > Due to which our read services which does not even use txn to retrieve data > also starts to choke. Ultimately leading to end user traffic dip. > We were hoping product upgrade will help but that has not been the case till > now. > > > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
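The "long JVM pause" logger discussed in this thread is, in essence, a watchdog thread that sleeps for a short fixed interval and flags iterations where the measured wall-clock time far exceeds the sleep time (typically a GC pause or OS-level stall). A minimal self-contained sketch of that technique follows; the class name and threshold are illustrative, not Ignite's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal JVM pause watchdog: sleeps briefly in a loop and records
 * iterations where wall-clock time overshoots the sleep interval by
 * more than a threshold, which usually indicates a GC or OS stall.
 */
class PauseWatchdog {
    private final long intervalMs;
    private final long thresholdMs;
    private final List<Long> pausesMs = new ArrayList<>();

    PauseWatchdog(long intervalMs, long thresholdMs) {
        this.intervalMs = intervalMs;
        this.thresholdMs = thresholdMs;
    }

    /** Runs the given number of iterations, recording detected pauses. */
    void run(int iterations) {
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            try {
                Thread.sleep(intervalMs);
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            long overshootMs = elapsedMs - intervalMs;
            if (overshootMs > thresholdMs)
                pausesMs.add(overshootMs); // likely GC pause or OS stall
        }
    }

    List<Long> pauses() {
        return pausesMs;
    }
}
```

In practice it is usually simpler to enable GC and safepoint logging (`-Xlog:gc*,safepoint` on JDK 9+) on both server and client JVMs, since, as the comment above notes, a long pause on a client node can stall the whole request path.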
[jira] [Comment Edited] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795851#comment-17795851 ] Vipul Thakur edited comment on IGNITE-21059 at 12/12/23 4:59 PM: - We have two k8s clusters connected to that datacenter; in each k8s cluster 10 clients are read, 10 are write and 2 are a kind of admin service, so 44 client nodes in total. I have also updated our cluster spec: it is 5 nodes, 400 GB RAM and 1 TB SSD. Long JVM pauses were observed in 2.7.6. was (Author: vipul.thakur): we have two k8s cluster connected to that datacenter where in each k8s cluster 10 are read , 10 are write and 2 are kind of admin service. So in total of 44 client nodes. And i have also updated our cluster spec its 5 nodes , 400GB RAM and 1 Tb SDD. > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. 
> We have 5 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB SDD. > PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. > We have set txn timeout as well as socket timeout both at server and client > end for our write operations but seems like sometimes cluster goes into hang > state and all our get calls are stuck and slowly everything starts to freeze > our jms listener threads and every thread reaches a choked up state in > sometime. > Due to which our read services which does not even use txn to retrieve data > also starts to choke. Ultimately leading to end user traffic dip. > We were hoping product upgrade will help but that has not been the case till > now. > > > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795851#comment-17795851 ] Vipul Thakur commented on IGNITE-21059: --- We have two k8s clusters connected to that datacenter; in each k8s cluster 10 clients are read, 10 are write and 2 are a kind of admin service, so 44 client nodes in total. I have also updated our cluster spec: it is 5 nodes, 400 GB RAM and 1 TB SSD. > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 5 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB SDD. > PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. 
> We have set txn timeout as well as socket timeout both at server and client > end for our write operations but seems like sometimes cluster goes into hang > state and all our get calls are stuck and slowly everything starts to freeze > our jms listener threads and every thread reaches a choked up state in > sometime. > Due to which our read services which does not even use txn to retrieve data > also starts to choke. Ultimately leading to end user traffic dip. > We were hoping product upgrade will help but that has not been the case till > now. > > > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vipul Thakur updated IGNITE-21059: -- Description: We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in production environment where cluster would go in hang state due to partition map exchange. Please find the below ticket which i created a while back for ignite 2.7.6 https://issues.apache.org/jira/browse/IGNITE-13298 So we migrated the apache ignite version to 2.14 and upgrade happened smoothly but on the third day we could see cluster traffic dip again. We have 5 nodes in a cluster where we provide 400 GB of RAM and more than 1 TB SDD. PFB for the attached config.[I have added it as attachment for review] I have also added the server logs from the same time when issue happened. We have set txn timeout as well as socket timeout both at server and client end for our write operations but seems like sometimes cluster goes into hang state and all our get calls are stuck and slowly everything starts to freeze our jms listener threads and every thread reaches a choked up state in sometime. Due to which our read services which does not even use txn to retrieve data also starts to choke. Ultimately leading to end user traffic dip. We were hoping product upgrade will help but that has not been the case till now. was: We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in production environment where cluster would go in hang state due to partition map exchange. Please find the below ticket which i created a while back for ignite 2.7.6 https://issues.apache.org/jira/browse/IGNITE-13298 So we migrated the apache ignite version to 2.14 and upgrade happened smoothly but on the third day we could see cluster traffic dip again. We have 4 nodes in a cluster where we provide 400 GB of RAM and more than 1 TB HDD. 
PFB for the attached config.[I have added it as attachment for review] I have also added the server logs from the same time when issue happened. We have set txn timeout as well as socket timeout both at server and client end for our write operations but seems like sometimes cluster goes into hang state and all our get calls are stuck and slowly everything starts to freeze our jms listener threads and every thread reaches a choked up state in sometime. Due to which our read services which does not even use txn to retrieve data also starts to choke. Ultimately leading to end user traffic dip. We were hoping product upgrade will help but that has not been the case till now. > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 5 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB SDD. 
> PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. > We have set txn timeout as well as socket timeout both at server and client > end for our write operations but seems like sometimes cluster goes into hang > state and all our get calls are stuck and slowly everything starts to freeze > our jms listener threads and every thread reaches a choked up state in > sometime. > Due to which our read services which does not even use txn to retrieve data > also starts to choke. Ultimately leading to end user traffic dip. > We were hoping product upgrade will help but that has not been the case till > now. > > > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795846#comment-17795846 ] Evgeny Stanilovsky commented on IGNITE-21059: - Do you really need 44 client nodes? It seems that restarting the client nodes helps here? Is everything OK with the client nodes, no long JVM pauses? > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 4 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB HDD. > PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. 
> We have set txn timeout as well as socket timeout both at server and client > end for our write operations but seems like sometimes cluster goes into hang > state and all our get calls are stuck and slowly everything starts to freeze > our jms listener threads and every thread reaches a choked up state in > sometime. > Due to which our read services which does not even use txn to retrieve data > also starts to choke. Ultimately leading to end user traffic dip. > We were hoping product upgrade will help but that has not been the case till > now. > > > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795838#comment-17795838 ] Vipul Thakur commented on IGNITE-21059: --- OK, please give me some time; we will change the WAL size and let you know. > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 4 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB HDD. > PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. > We have set txn timeout as well as socket timeout both at server and client > end for our write operations but seems like sometimes cluster goes into hang > state and all our get calls are stuck and slowly everything starts to freeze > our jms listener threads and every thread reaches a choked up state in > sometime. 
> Due to which our read services which does not even use txn to retrieve data > also starts to choke. Ultimately leading to end user traffic dip. > We were hoping product upgrade will help but that has not been the case till > now. > > > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795836#comment-17795836 ] Evgeny Stanilovsky commented on IGNITE-21059: - We need all the logs from all nodes for further analysis. Also check https://ignite.apache.org/docs/latest/tools/control-script#transaction-management and change the WAL size. > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 4 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB HDD. > PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. 
> We have set txn timeout as well as socket timeout both at server and client > end for our write operations but seems like sometimes cluster goes into hang > state and all our get calls are stuck and slowly everything starts to freeze > our jms listener threads and every thread reaches a choked up state in > sometime. > Due to which our read services which does not even use txn to retrieve data > also starts to choke. Ultimately leading to end user traffic dip. > We were hoping product upgrade will help but that has not been the case till > now. > > > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
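For reference, the WAL segment size suggested above is normally set through `DataStorageConfiguration` on the server node. The sketch below uses the standard Ignite 2.x configuration API; the 512 MB value is only an illustrative choice, not a recommendation from this thread, and the snippet requires `ignite-core` on the classpath:

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class WalConfigExample {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration()
            // Larger WAL segments reduce segment-rollover frequency under heavy write load.
            .setWalSegmentSize(512 * 1024 * 1024);

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storageCfg);

        Ignition.start(cfg);
    }
}
```

The same property can be set in the Spring XML config attached to this ticket as `walSegmentSize` on the `DataStorageConfiguration` bean.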
[jira] [Commented] (IGNITE-20652) .NET: Thin 3.0: add SQL script execution API
[ https://issues.apache.org/jira/browse/IGNITE-20652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795818#comment-17795818 ] Pavel Tupitsyn commented on IGNITE-20652: - Comment addressed. Merged to main: 5bc3d2ccd22e2493231dfa857d24bed27a8373bd > .NET: Thin 3.0: add SQL script execution API > > > Key: IGNITE-20652 > URL: https://issues.apache.org/jira/browse/IGNITE-20652 > Project: Ignite > Issue Type: Improvement > Components: platforms, thin client >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Pereslegin >Assignee: Pavel Tupitsyn >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 40m > Remaining Estimate: 0h > > Support SQL script execution in dotnet thin client -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-19215) ODBC 3.0: Implement DML data batching
[ https://issues.apache.org/jira/browse/IGNITE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Sapego reassigned IGNITE-19215: Assignee: Dmitrii Zabotlin (was: Igor Sapego) > ODBC 3.0: Implement DML data batching > - > > Key: IGNITE-19215 > URL: https://issues.apache.org/jira/browse/IGNITE-19215 > Project: Ignite > Issue Type: Improvement > Components: odbc >Reporter: Igor Sapego >Assignee: Dmitrii Zabotlin >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > Scope: > - Implement server side request handling; > - Port client side functionality; > - Port applicable tests; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-19720) ODBC 3.0: Implement retrieval of Ignite version on handshake
[ https://issues.apache.org/jira/browse/IGNITE-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Sapego reassigned IGNITE-19720: Assignee: Dmitrii Zabotlin (was: Igor Sapego) > ODBC 3.0: Implement retrieval of Ignite version on handshake > > > Key: IGNITE-19720 > URL: https://issues.apache.org/jira/browse/IGNITE-19720 > Project: Ignite > Issue Type: New Feature > Components: odbc >Affects Versions: 3.0.0-beta1 >Reporter: Igor Sapego >Assignee: Dmitrii Zabotlin >Priority: Major > Labels: ignite-3 > > SQLGetInfo(SQL_DBMS_VER) should return the current version of the cluster. > Currently, the ODBC driver does not have this information. Need to implement > retrieval of this information on handshake. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-19969) ODBC 3.0: Add support for period, duration and big_integer types
[ https://issues.apache.org/jira/browse/IGNITE-19969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Sapego reassigned IGNITE-19969: Assignee: Dmitrii Zabotlin > ODBC 3.0: Add support for period, duration and big_integer types > > > Key: IGNITE-19969 > URL: https://issues.apache.org/jira/browse/IGNITE-19969 > Project: Ignite > Issue Type: Improvement > Components: odbc >Reporter: Igor Sapego >Assignee: Dmitrii Zabotlin >Priority: Major > Labels: ignite-3 > > We didn't have support for such types in Ignite 2, so need to implement it > from scratch and add some tests for them as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21032) ReadOnlyDynamicMBean.getAttributes may return a list of attribute values instead of Attribute instances
[ https://issues.apache.org/jira/browse/IGNITE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795811#comment-17795811 ] Simon Greatrix commented on IGNITE-21032: - Fixed checkstyle issues and updated the pull request. Assigning back to [~slava.koptilin] for review and merging > ReadOnlyDynamicMBean.getAttributes may return a list of attribute values > instead of Attribute instances > --- > > Key: IGNITE-21032 > URL: https://issues.apache.org/jira/browse/IGNITE-21032 > Project: Ignite > Issue Type: Bug >Reporter: Vyacheslav Koptilin >Assignee: Simon Greatrix >Priority: Major > Fix For: 2.16 > > Time Spent: 40m > Remaining Estimate: 0h > > When supplying JMX information, the AttributeList class should contain > Attributes, however the existing code returns attribute values. This can > cause ClassCastExceptions in code that attempts to read an AttributeList. > > [GitHub Issue #11045|https://github.com/apache/ignite/issues/11045] -- This message was sent by Atlassian Jira (v8.20.10#820010)
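The contract at issue in IGNITE-21032 is part of the JDK itself: `DynamicMBean.getAttributes` must return an `AttributeList` whose elements are `javax.management.Attribute` name/value pairs, not the raw values. A small self-contained illustration of the difference follows; the method and attribute names are illustrative, this is not the Ignite code being patched:

```java
import javax.management.Attribute;
import javax.management.AttributeList;

class AttributeListDemo {
    /** Wrong: raw values in the list break callers that cast elements to Attribute. */
    static AttributeList wrongList() {
        AttributeList list = new AttributeList();
        list.add("some-value"); // compiles, because AttributeList extends ArrayList
        return list;
    }

    /** Correct: wrap each name/value pair in an Attribute instance. */
    static AttributeList correctList() {
        AttributeList list = new AttributeList();
        list.add(new Attribute("CacheSize", 42));
        return list;
    }
}
```

Callers typically iterate the list with `((Attribute) obj).getValue()`, which is exactly where the ClassCastException reported in the ticket comes from when raw values are stored.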
[jira] [Assigned] (IGNITE-21032) ReadOnlyDynamicMBean.getAttributes may return a list of attribute values instead of Attribute instances
[ https://issues.apache.org/jira/browse/IGNITE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Greatrix reassigned IGNITE-21032: --- Assignee: Vyacheslav Koptilin (was: Simon Greatrix) > ReadOnlyDynamicMBean.getAttributes may return a list of attribute values > instead of Attribute instances > --- > > Key: IGNITE-21032 > URL: https://issues.apache.org/jira/browse/IGNITE-21032 > Project: Ignite > Issue Type: Bug >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > Fix For: 2.16 > > Time Spent: 40m > Remaining Estimate: 0h > > When supplying JMX information, the AttributeList class should contain > Attributes, however the existing code returns attribute values. This can > cause ClassCastExceptions in code that attempts to read an AttributeList. > > [GitHub Issue #11045|https://github.com/apache/ignite/issues/11045] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20652) .NET: Thin 3.0: add SQL script execution API
[ https://issues.apache.org/jira/browse/IGNITE-20652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Tupitsyn updated IGNITE-20652: Summary: .NET: Thin 3.0: add SQL script execution API (was: .NET: Thin 3.0: support SQL script execution in dotnet thin client) > .NET: Thin 3.0: add SQL script execution API > > > Key: IGNITE-20652 > URL: https://issues.apache.org/jira/browse/IGNITE-20652 > Project: Ignite > Issue Type: Improvement > Components: platforms, thin client >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Pereslegin >Assignee: Pavel Tupitsyn >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Support SQL script execution in dotnet thin client -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795714#comment-17795714 ] Vipul Thakur edited comment on IGNITE-21059 at 12/12/23 2:42 PM: - Hi, thank you for the quick response. We have configured the tx timeout at the client end (our clients are written in Spring Boot and Java); is any config needed in the server's config.xml as well? We will also read about changing the WAL segment size and make the changes accordingly. was (Author: vipul.thakur): Hi Thank you for quick response, we have configured tx timeout at client end our clients are written in spring boot and java , is it needed at server's config.xml also ? We will also read about chaning-wal-segment-size and make the changes accordingly > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 4 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB HDD. 
> PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. > We have set txn timeout as well as socket timeout both at server and client > end for our write operations but seems like sometimes cluster goes into hang > state and all our get calls are stuck and slowly everything starts to freeze > our jms listener threads and every thread reaches a choked up state in > sometime. > Due to which our read services which does not even use txn to retrieve data > also starts to choke. Ultimately leading to end user traffic dip. > We were hoping product upgrade will help but that has not been the case till > now. > > > > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
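On the server-side question raised in the comment above: in Ignite 2.x a cluster-wide default transaction timeout can also be set in the server's config.xml through TransactionConfiguration. A minimal sketch of that fragment (the 30-second value is illustrative, not a recommendation for this cluster):

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Default timeout for transactions started on this node (milliseconds). -->
    <property name="transactionConfiguration">
        <bean class="org.apache.ignite.configuration.TransactionConfiguration">
            <property name="defaultTxTimeout" value="30000"/>
        </bean>
    </property>
</bean>
```

A timeout passed explicitly to txStart() on the client still takes precedence over this default.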
[jira] [Updated] (IGNITE-21072) NamedListConfiguration#get(java.util.UUID) should cast to polymorphic type
[ https://issues.apache.org/jira/browse/IGNITE-21072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vadim Pakhnushev updated IGNITE-21072: -- Summary: NamedListConfiguration#get(java.util.UUID) should cast to polymorphic type (was: NamedListConfiguration#get(java.util.UUID) doesn't work with polymorphic configurations) > NamedListConfiguration#get(java.util.UUID) should cast to polymorphic type > -- > > Key: IGNITE-21072 > URL: https://issues.apache.org/jira/browse/IGNITE-21072 > Project: Ignite > Issue Type: Bug >Reporter: Vadim Pakhnushev >Assignee: Vadim Pakhnushev >Priority: Major > Labels: ignite-3 > > {{NamedListConfiguration#get(java.util.UUID)}} doesn't call > {{specificConfigTree}} so it doesn't cast the value to the specific type. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-20652) .NET: Thin 3.0: support SQL script execution in dotnet thin client
[ https://issues.apache.org/jira/browse/IGNITE-20652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795773#comment-17795773 ] Igor Sapego commented on IGNITE-20652: -- Approved with a little comment. > .NET: Thin 3.0: support SQL script execution in dotnet thin client > -- > > Key: IGNITE-20652 > URL: https://issues.apache.org/jira/browse/IGNITE-20652 > Project: Ignite > Issue Type: Improvement > Components: platforms, thin client >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Pereslegin >Assignee: Pavel Tupitsyn >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 20m > Remaining Estimate: 0h > > Support SQL script execution in dotnet thin client -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21072) NamedListConfiguration#get(java.util.UUID) doesn't work with polymorphic configurations
Vadim Pakhnushev created IGNITE-21072: - Summary: NamedListConfiguration#get(java.util.UUID) doesn't work with polymorphic configurations Key: IGNITE-21072 URL: https://issues.apache.org/jira/browse/IGNITE-21072 Project: Ignite Issue Type: Bug Reporter: Vadim Pakhnushev Assignee: Vadim Pakhnushev {{NamedListConfiguration#get(java.util.UUID)}} doesn't call {{specificConfigTree}} so it doesn't cast the value to the specific type. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21071) Rollback the transaction on primary failure if replication is not finished
[ https://issues.apache.org/jira/browse/IGNITE-21071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Lapin updated IGNITE-21071: - Description: h3. Motivation Despite the fact that it's not always necessary within the initial implementation, it's required to roll back the transaction if a node that hosts the primary replica failed prior to replication finalization. h3. Definition of Done Transaction should be eventually rolled back in case of primary replica host failure while the transaction is in the inflights-awaiting state. h3. Implementation Notes Primary replica host failure will end with corresponding primary replica expiration, thus within the initial implementation it's required to listen for primary replica expirations on tx finish while waiting for inflights to complete. Corresponding SQL-based case should also be checked. > Rollback the transaction on primary failure if replication is not finished > -- > > Key: IGNITE-21071 > URL: https://issues.apache.org/jira/browse/IGNITE-21071 > Project: Ignite > Issue Type: New Feature >Reporter: Alexander Lapin >Priority: Major > Labels: ignite-3 > > h3. Motivation > Despite the fact that it's not always necessary within the initial implementation, > it's required to roll back the transaction if a node that hosts the primary replica > failed prior to replication finalization. > h3. Definition of Done > Transaction should be eventually rolled back in case of primary replica host > failure while the transaction is in the inflights-awaiting state. > h3. Implementation Notes > Primary replica host failure will end with corresponding primary replica > expiration, thus within the initial implementation it's required to listen for > primary replica expirations on tx finish while waiting for inflights to complete. > Corresponding SQL-based case should also be checked. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-20994) Adjust writeIntentResolution logic in order to initiate the recovery if coordinator is dead
[ https://issues.apache.org/jira/browse/IGNITE-20994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795748#comment-17795748 ] Vladislav Pyatkov commented on IGNITE-20994: Merged 64d3bcb345a20e1dbcc554e40d8e709111b02110 > Adjust writeIntentResolution logic in order to initiate the recovery if > coordinator is dead > --- > > Key: IGNITE-20994 > URL: https://issues.apache.org/jira/browse/IGNITE-20994 > Project: Ignite > Issue Type: Improvement >Reporter: Alexander Lapin >Assignee: Denis Chudov >Priority: Major > Labels: ignite-3 > Time Spent: 3.5h > Remaining Estimate: 0h > > h3. Motivation > Besides lock conflict, writeIntent resolution may also detect abandoned > transactions, thus it should also trigger the recovery initiation logic. Probably > with some write intent resolution specifics. > h3. Definition of Done > Write intent resolution will initiate the recovery in case of dead tx > coordinator within commit partition path. > h3. Implementation Notes > Basically within commit partition path in case of pending state write intent > should check whether coordinator is dead (that part might be tricky because > we may lose volatile state of where coordinator is) and if it is: rollback > the transaction, and send rolled back state backwards, the one that should > change local txn state to ABORTED on initial trigger. It's not clear whether > it's required to send unlock or special sort of cleanup message to the > trigger node. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21071) Rollback the transaction on primary failure if replication is not finished
Alexander Lapin created IGNITE-21071: Summary: Rollback the transaction on primary failure if replication is not finished Key: IGNITE-21071 URL: https://issues.apache.org/jira/browse/IGNITE-21071 Project: Ignite Issue Type: New Feature Reporter: Alexander Lapin -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21070) Ensure that data node's primary replica expiration properly handled
[ https://issues.apache.org/jira/browse/IGNITE-21070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Lapin updated IGNITE-21070: - Labels: ignite-3 (was: ) > Ensure that data node's primary replica expiration properly handled > --- > > Key: IGNITE-21070 > URL: https://issues.apache.org/jira/browse/IGNITE-21070 > Project: Ignite > Issue Type: New Feature >Reporter: Alexander Lapin >Priority: Major > Labels: ignite-3 > > h3. Motivation > Corresponding primary replica expiration logic should already be implemented, > mainly within coordinator recovery, thus within a given ticket it's only > required to extend test coverage. Primary replica expiration handling logic > differs depending on whether the expiration itself occurred before or after the following flow > splitters: > * Replication finishing. (Inflights == 0) > * Commit timestamp evolution. // Not sure actually, maybe there's no > difference between replication finishing and commit timestamp evolution > splitters. > * Finish request handling. > * Cleanup request handling. > * WriteIntent switch request handling. > Specific test scenarios will be specified during ticket implementation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20652) .NET: Thin 3.0: support SQL script execution in dotnet thin client
[ https://issues.apache.org/jira/browse/IGNITE-20652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Tupitsyn updated IGNITE-20652: Ignite Flags: (was: Docs Required,Release Notes Required) > .NET: Thin 3.0: support SQL script execution in dotnet thin client > -- > > Key: IGNITE-20652 > URL: https://issues.apache.org/jira/browse/IGNITE-20652 > Project: Ignite > Issue Type: Improvement > Components: platforms, thin client >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Pereslegin >Assignee: Pavel Tupitsyn >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 10m > Remaining Estimate: 0h > > Support SQL script execution in dotnet thin client -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21070) Ensure that data node's primary replica expiration properly handled
[ https://issues.apache.org/jira/browse/IGNITE-21070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Lapin updated IGNITE-21070: - Description: h3. Motivation Corresponding primary replica expiration logic should already be implemented, mainly within coordinator recovery, thus within a given ticket it's only required to extend test coverage. Primary replica expiration handling logic differs depending on whether the expiration itself occurred before or after the following flow splitters: * Replication finishing. (Inflights == 0) * Commit timestamp evolution. // Not sure actually, maybe there's no difference between replication finishing and commit timestamp evolution splitters. * Finish request handling. * Cleanup request handling. * WriteIntent switch request handling. Specific test scenarios will be specified during ticket implementation. > Ensure that data node's primary replica expiration properly handled > --- > > Key: IGNITE-21070 > URL: https://issues.apache.org/jira/browse/IGNITE-21070 > Project: Ignite > Issue Type: New Feature >Reporter: Alexander Lapin >Priority: Major > > h3. Motivation > Corresponding primary replica expiration logic should already be implemented, > mainly within coordinator recovery, thus within a given ticket it's only > required to extend test coverage. Primary replica expiration handling logic > differs depending on whether the expiration itself occurred before or after the following flow > splitters: > * Replication finishing. (Inflights == 0) > * Commit timestamp evolution. // Not sure actually, maybe there's no > difference between replication finishing and commit timestamp evolution > splitters. > * Finish request handling. > * Cleanup request handling. > * WriteIntent switch request handling. > Specific test scenarios will be specified during ticket implementation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20994) Adjust writeIntentResolution logic in order to initiate the recovery if coordinator is dead
[ https://issues.apache.org/jira/browse/IGNITE-20994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denis Chudov updated IGNITE-20994: -- Reviewer: Vladislav Pyatkov > Adjust writeIntentResolution logic in order to initiate the recovery if > coordinator is dead > --- > > Key: IGNITE-20994 > URL: https://issues.apache.org/jira/browse/IGNITE-20994 > Project: Ignite > Issue Type: Improvement >Reporter: Alexander Lapin >Assignee: Denis Chudov >Priority: Major > Labels: ignite-3 > Time Spent: 3h 20m > Remaining Estimate: 0h > > h3. Motivation > Besides lock conflict, writeIntent resolution may also detect abandoned > transactions, thus it should also trigger the recovery initiation logic. Probably > with some write intent resolution specifics. > h3. Definition of Done > Write intent resolution will initiate the recovery in case of dead tx > coordinator within commit partition path. > h3. Implementation Notes > Basically within commit partition path in case of pending state write intent > should check whether coordinator is dead (that part might be tricky because > we may lose volatile state of where coordinator is) and if it is: rollback > the transaction, and send rolled back state backwards, the one that should > change local txn state to ABORTED on initial trigger. It's not clear whether > it's required to send unlock or special sort of cleanup message to the > trigger node. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21070) Ensure that data node's primary replica expiration properly handled
Alexander Lapin created IGNITE-21070: Summary: Ensure that data node's primary replica expiration properly handled Key: IGNITE-21070 URL: https://issues.apache.org/jira/browse/IGNITE-21070 Project: Ignite Issue Type: New Feature Reporter: Alexander Lapin -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21069) Tx on unstable topology: data node recovery
Alexander Lapin created IGNITE-21069: Summary: Tx on unstable topology: data node recovery Key: IGNITE-21069 URL: https://issues.apache.org/jira/browse/IGNITE-21069 Project: Ignite Issue Type: Epic Reporter: Alexander Lapin -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795735#comment-17795735 ] Vipul Thakur edited comment on IGNITE-21059 at 12/12/23 1:08 PM: - Evidence that txn timeout is enabled at client end : Below are the server logs: 2023-11-30T14:19:01,783][ERROR][grid-timeout-worker-#326%EVENT_PROCESSING%|#326%EVENT_PROCESSING%][GridDhtColocatedCache] Failed to acquire lock for request: GridNearLockRequest [topVer=AffinityTopologyVersion [topVer=93, minorTopVer=0], miniId=1, dhtVers=GridCacheVersion[] [null], taskNameHash=0, createTtl=-1, accessTtl=-1, flags=3, txLbl=null, filter=null, super=GridDistributedLockRequest [nodeId=62fdf256-6130-4ef3-842c-b2078f6e6c07, nearXidVer=GridCacheVersion [topVer=312674007, order=1701333641101, nodeOrder=53, dataCenterId=0], threadId=372, futId=9c4a6212c81-c17f568a-3419-42a6-9042-7a1f3281301c, timeout=3, isInTx=true, isInvalidate=false, isRead=true, isolation=REPEATABLE_READ, retVals=[true], txSize=0, flags=0, keysCnt=1, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=312674007, order=1701333641101, nodeOrder=53, dataCenterId=0], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=-885490198, super=GridCacheMessage [msgId=55444220, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=53, minorTopVer=0], err=null, skipPrepare=false] [2023-11-30T14:19:44,579][ERROR][grid-timeout-worker-#326%EVENT_PROCESSING%|#326%EVENT_PROCESSING%][GridDhtColocatedCache] Failed to acquire lock for request: GridNearLockRequest [topVer=AffinityTopologyVersion [topVer=93, minorTopVer=0], miniId=1, dhtVers=GridCacheVersion[] [null], taskNameHash=0, createTtl=-1, accessTtl=-1, flags=3, txLbl=null, filter=null, super=GridDistributedLockRequest [nodeId=62fdf256-6130-4ef3-842c-b2078f6e6c07, nearXidVer=GridCacheVersion [topVer=312674007, order=1701333641190, nodeOrder=53, dataCenterId=0], threadId=897, 
futId=a3ba6212c81-c17f568a-3419-42a6-9042-7a1f3281301c, *timeout=3, isInTx=true, isInvalidate=false, isRead=true, isolation=REPEATABLE_READ,* retVals=[true], txSize=0, flags=0, keysCnt=1, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=312674007, order=1701333641190, nodeOrder=53, dataCenterId=0], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=-885490198, super=GridCacheMessage [msgId=55444392, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=53, minorTopVer=0], err=null, skipPrepare=false] org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException: Failed to acquire lock within provided timeout for transaction [timeout=3, tx=GridDhtTxLocal[xid=c8a166f1c81--12a3-06d7--0001, xidVersion=GridCacheVersion [topVer=312674007, order=1701333834380, nodeOrder=1, dataCenterId=0], nearXidVersion=GridCacheVersion [topVer=312674007, order=1701333641190, nodeOrder=53, dataCenterId=0], concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, state=MARKED_ROLLBACK, invalidate=false, rollbackOnly=true, nodeId=f751efe5-c44c-4b3c-bcd3-dd5866ec0bdd, timeout=3, startTime=1701334154571, {*}duration=30003]{*}] was (Author: vipul.thakur): Evidence that txn timeout is enabled at client end : 2023-11-30T14:19:01,783][ERROR][grid-timeout-worker-#326%EVENT_PROCESSING%][GridDhtColocatedCache] Failed to acquire lock for request: GridNearLockRequest [topVer=AffinityTopologyVersion [topVer=93, minorTopVer=0], miniId=1, dhtVers=GridCacheVersion[] [null], taskNameHash=0, createTtl=-1, accessTtl=-1, flags=3, txLbl=null, filter=null, super=GridDistributedLockRequest [nodeId=62fdf256-6130-4ef3-842c-b2078f6e6c07, nearXidVer=GridCacheVersion [topVer=312674007, order=1701333641101, nodeOrder=53, dataCenterId=0], threadId=372, futId=9c4a6212c81-c17f568a-3419-42a6-9042-7a1f3281301c, timeout=3, isInTx=true, isInvalidate=false, isRead=true, isolation=REPEATABLE_READ, retVals=[true], txSize=0, flags=0, keysCnt=1, 
super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=312674007, order=1701333641101, nodeOrder=53, dataCenterId=0], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=-885490198, super=GridCacheMessage [msgId=55444220, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=53, minorTopVer=0], err=null, skipPrepare=false] [2023-11-30T14:19:44,579][ERROR][grid-timeout-worker-#326%EVENT_PROCESSING%][GridDhtColocatedCache] Failed to acquire lock for request: GridNearLockRequest [topVer=AffinityTopologyVersion [topVer=93, minorTopVer=0], miniId=1, dhtVers=GridCacheVersion[] [null], taskNameHash=0, createTtl=-1, accessTtl=-1, flags=3, txLbl=null, filter=null, super=GridDistributedLockRequest [nodeId=62fdf256-6130-4ef3-842c-b2078f6e6c07, nearXidVer=GridCacheVersion [topVer=312674007,
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795735#comment-17795735 ] Vipul Thakur commented on IGNITE-21059: --- Evidence that txn timeout is enabled at client end : 2023-11-30T14:19:01,783][ERROR][grid-timeout-worker-#326%EVENT_PROCESSING%][GridDhtColocatedCache] Failed to acquire lock for request: GridNearLockRequest [topVer=AffinityTopologyVersion [topVer=93, minorTopVer=0], miniId=1, dhtVers=GridCacheVersion[] [null], taskNameHash=0, createTtl=-1, accessTtl=-1, flags=3, txLbl=null, filter=null, super=GridDistributedLockRequest [nodeId=62fdf256-6130-4ef3-842c-b2078f6e6c07, nearXidVer=GridCacheVersion [topVer=312674007, order=1701333641101, nodeOrder=53, dataCenterId=0], threadId=372, futId=9c4a6212c81-c17f568a-3419-42a6-9042-7a1f3281301c, timeout=3, isInTx=true, isInvalidate=false, isRead=true, isolation=REPEATABLE_READ, retVals=[true], txSize=0, flags=0, keysCnt=1, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=312674007, order=1701333641101, nodeOrder=53, dataCenterId=0], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=-885490198, super=GridCacheMessage [msgId=55444220, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=53, minorTopVer=0], err=null, skipPrepare=false] [2023-11-30T14:19:44,579][ERROR][grid-timeout-worker-#326%EVENT_PROCESSING%][GridDhtColocatedCache] Failed to acquire lock for request: GridNearLockRequest [topVer=AffinityTopologyVersion [topVer=93, minorTopVer=0], miniId=1, dhtVers=GridCacheVersion[] [null], taskNameHash=0, createTtl=-1, accessTtl=-1, flags=3, txLbl=null, filter=null, super=GridDistributedLockRequest [nodeId=62fdf256-6130-4ef3-842c-b2078f6e6c07, nearXidVer=GridCacheVersion [topVer=312674007, order=1701333641190, nodeOrder=53, dataCenterId=0], threadId=897, futId=a3ba6212c81-c17f568a-3419-42a6-9042-7a1f3281301c, *timeout=3, isInTx=true, isInvalidate=false, isRead=true, 
isolation=REPEATABLE_READ,* retVals=[true], txSize=0, flags=0, keysCnt=1, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=312674007, order=1701333641190, nodeOrder=53, dataCenterId=0], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=-885490198, super=GridCacheMessage [msgId=55444392, depInfo=null, lastAffChangedTopVer=AffinityTopologyVersion [topVer=53, minorTopVer=0], err=null, skipPrepare=false] org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException: Failed to acquire lock within provided timeout for transaction [timeout=3, tx=GridDhtTxLocal[xid=c8a166f1c81--12a3-06d7--0001, xidVersion=GridCacheVersion [topVer=312674007, order=1701333834380, nodeOrder=1, dataCenterId=0], nearXidVersion=GridCacheVersion [topVer=312674007, order=1701333641190, nodeOrder=53, dataCenterId=0], concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, state=MARKED_ROLLBACK, invalidate=false, rollbackOnly=true, nodeId=f751efe5-c44c-4b3c-bcd3-dd5866ec0bdd, timeout=3, startTime=1701334154571, {*}duration=30003]{*}] > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. 
> Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 4 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB HDD. > PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. > We have set txn timeout as well as socket timeout both at server and client > end for our write operations but seems like sometimes cluster goes into hang > state and all our get calls are stuck and slowly everything
[jira] [Created] (IGNITE-21068) Ignite node must not communicate with a node removed from the Physical Topology
Roman Puchkovskiy created IGNITE-21068: -- Summary: Ignite node must not communicate with a node removed from the Physical Topology Key: IGNITE-21068 URL: https://issues.apache.org/jira/browse/IGNITE-21068 Project: Ignite Issue Type: Improvement Reporter: Roman Puchkovskiy Fix For: 3.0.0-beta2 It is possible for a node to be considered DEAD due to a timeout (because it did not respond to a series of pings in a timely manner) even though the network channel is still operational. Currently, even after a node is removed from the Physical Topology, it can still send/receive messages. This breaks an invariant that such a node must not be able to communicate with the cluster. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (IGNITE-20745) TableManager.tableAsync(int tableId) is slowing down thin clients
[ https://issues.apache.org/jira/browse/IGNITE-20745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795722#comment-17795722 ] Igor Sapego edited comment on IGNITE-20745 at 12/12/23 12:44 PM: - [~ibessonov] suggested to try the following quick fix: Let's make the following call synchronous, and check if this will help: At org.apache.ignite.internal.table.distributed.TableManager#tableAsyncInternal: {code:java} return orStopManagerFuture(schemaSyncService.waitForMetadataCompleteness(now)) .thenComposeAsync(unused -> inBusyLockAsync(busyLock, () -> { ... {code} Replace this with: {code:java} if (fut.isDone()) ... {code} In most cases, the future that was returned from {{waitForMetadataCompleteness}} has already completed, and using async is a waste of time. There is only a read from a map inside, and probably nothing more. This should be the heaviest place in terms of execution time. was (Author: isapego): [~ibessonov] suggested to try the following quick fix: Let's make the following call synchronous, and check if this will help: At org.apache.ignite.internal.table.distributed.TableManager#tableAsyncInternal: {code:java} return orStopManagerFuture(schemaSyncService.waitForMetadataCompleteness(now)) .thenComposeAsync(unused -> inBusyLockAsync(busyLock, () -> { ... {code} Replace this with: {code:java} if (fut.isDone()) ... {code} > TableManager.tableAsync(int tableId) is slowing down thin clients > - > > Key: IGNITE-20745 > URL: https://issues.apache.org/jira/browse/IGNITE-20745 > Project: Ignite > Issue Type: Improvement >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Tupitsyn >Assignee: Igor Sapego >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Attachments: ItThinClientPutGetBenchmark.java > > > Performance difference between embedded and client modes is affected > considerably by the call to *IgniteTablesInternal#tableAsync(int id)*. This > call has to be performed on every individual table operation. 
> We should make it as fast as possible. Something like a dictionary lookup + > quick check for deleted table. > ||Part||Duration, us|| > |Network & msgpack|19.30| > |Get table|14.29| > |Get tuple & serialize|12.86| -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (IGNITE-20745) TableManager.tableAsync(int tableId) is slowing down thin clients
[ https://issues.apache.org/jira/browse/IGNITE-20745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795722#comment-17795722 ] Igor Sapego edited comment on IGNITE-20745 at 12/12/23 12:42 PM: - [~ibessonov] suggested to try the following quick fix: Let's make the following call synchronous, and check if this will help: At org.apache.ignite.internal.table.distributed.TableManager#tableAsyncInternal: {code:java} return orStopManagerFuture(schemaSyncService.waitForMetadataCompleteness(now)) .thenComposeAsync(unused -> inBusyLockAsync(busyLock, () -> { ... {code} Replace this with: {code:java} if (fut.isDone()) ... {code} was (Author: isapego): [~ibessonov] suggested to try the following quick fix: Let's make the following call synchronous, and check if this will help: At org.apache.ignite.internal.table.distributed.TableManager#tableAsyncInternal: {code:java} return orStopManagerFuture(schemaSyncService.waitForMetadataCompleteness(now)) .thenComposeAsync(unused -> inBusyLockAsync(busyLock, () -> { ... {code} > TableManager.tableAsync(int tableId) is slowing down thin clients > - > > Key: IGNITE-20745 > URL: https://issues.apache.org/jira/browse/IGNITE-20745 > Project: Ignite > Issue Type: Improvement >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Tupitsyn >Assignee: Igor Sapego >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Attachments: ItThinClientPutGetBenchmark.java > > > Performance difference between embedded and client modes is affected > considerably by the call to *IgniteTablesInternal#tableAsync(int id)*. This > call has to be performed on every individual table operation. > We should make it as fast as possible. Something like a dictionary lookup + > quick check for deleted table. > ||Part||Duration, us|| > |Network & msgpack|19.30| > |Get table|14.29| > |Get tuple & serialize|12.86| -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-20745) TableManager.tableAsync(int tableId) is slowing down thin clients
[ https://issues.apache.org/jira/browse/IGNITE-20745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795722#comment-17795722 ] Igor Sapego commented on IGNITE-20745: -- [~ibessonov] suggested to try the following quick fix: Let's make the following call synchronous, and check if this will help: At org.apache.ignite.internal.table.distributed.TableManager#tableAsyncInternal: {code:java} return orStopManagerFuture(schemaSyncService.waitForMetadataCompleteness(now)) .thenComposeAsync(unused -> inBusyLockAsync(busyLock, () -> { ... {code} > TableManager.tableAsync(int tableId) is slowing down thin clients > - > > Key: IGNITE-20745 > URL: https://issues.apache.org/jira/browse/IGNITE-20745 > Project: Ignite > Issue Type: Improvement >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Tupitsyn >Assignee: Igor Sapego >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Attachments: ItThinClientPutGetBenchmark.java > > > Performance difference between embedded and client modes is affected > considerably by the call to *IgniteTablesInternal#tableAsync(int id)*. This > call has to be performed on every individual table operation. > We should make it as fast as possible. Something like a dictionary lookup + > quick check for deleted table. > ||Part||Duration, us|| > |Network & msgpack|19.30| > |Get table|14.29| > |Get tuple & serialize|12.86| -- This message was sent by Atlassian Jira (v8.20.10#820010)
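The quick fix discussed in the comments above amounts to a completed-future fast path: when the metadata future is already done, skip the thenComposeAsync hop to another thread pool and run the cheap lookup inline. A minimal sketch of that pattern, with hypothetical stand-ins for the Ignite internals (the real code lives in TableManager#tableAsyncInternal):

```java
import java.util.concurrent.CompletableFuture;

public class FastPathSketch {
    /** Stand-in for schemaSyncService.waitForMetadataCompleteness(now). */
    static CompletableFuture<Void> waitForMetadataCompleteness() {
        // In the common case the metadata is already in sync,
        // so the future comes back already completed.
        return CompletableFuture.completedFuture(null);
    }

    static CompletableFuture<String> tableAsyncInternal() {
        CompletableFuture<Void> fut = waitForMetadataCompleteness();

        if (fut.isDone() && !fut.isCompletedExceptionally()) {
            // Fast path: the future is complete, so avoid the hop to
            // another executor and do the cheap map lookup inline.
            return CompletableFuture.completedFuture(lookupTable());
        }

        // Slow path: keep the original asynchronous continuation.
        return fut.thenComposeAsync(unused -> CompletableFuture.completedFuture(lookupTable()));
    }

    /** Stand-in for the actual table lookup (roughly a dictionary read). */
    static String lookupTable() {
        return "table";
    }

    public static void main(String[] args) {
        System.out.println(tableAsyncInternal().join());
    }
}
```

The exception check matters: a failed-but-done future must still go through the normal completion path so the error propagates as before.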
[jira] [Assigned] (IGNITE-21032) ReadOnlyDynamicMBean.getAttributes may return a list of attribute values instead of Attribute instances
[ https://issues.apache.org/jira/browse/IGNITE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Greatrix reassigned IGNITE-21032: --- Assignee: Simon Greatrix Taking ownership to fix checkstyle issues. > ReadOnlyDynamicMBean.getAttributes may return a list of attribute values > instead of Attribute instances > --- > > Key: IGNITE-21032 > URL: https://issues.apache.org/jira/browse/IGNITE-21032 > Project: Ignite > Issue Type: Bug >Reporter: Vyacheslav Koptilin >Assignee: Simon Greatrix >Priority: Major > Fix For: 2.16 > > Time Spent: 40m > Remaining Estimate: 0h > > When supplying JMX information, the AttributeList class should contain > Attributes, however the existing code returns attribute values. This can > cause ClassCastExceptions in code that attempts to read an AttributeList. > > [GitHub Issue #11045|https://github.com/apache/ignite/issues/11045] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21032) ReadOnlyDynamicMBean.getAttributes may return a list of attribute values instead of Attribute instances
[ https://issues.apache.org/jira/browse/IGNITE-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795719#comment-17795719 ] Simon Greatrix commented on IGNITE-21032: - Yes, it is. However, the supplementary issue identified in the comment from [~sato_eiichi] is not covered by this. > ReadOnlyDynamicMBean.getAttributes may return a list of attribute values > instead of Attribute instances > --- > > Key: IGNITE-21032 > URL: https://issues.apache.org/jira/browse/IGNITE-21032 > Project: Ignite > Issue Type: Bug >Reporter: Vyacheslav Koptilin >Priority: Major > Fix For: 2.16 > > Time Spent: 40m > Remaining Estimate: 0h > > When supplying JMX information, the AttributeList class should contain > Attributes, however the existing code returns attribute values. This can > cause ClassCastExceptions in code that attempts to read an AttributeList. > > [GitHub Issue #11045|https://github.com/apache/ignite/issues/11045] -- This message was sent by Atlassian Jira (v8.20.10#820010)
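The bug class described in IGNITE-21032 is easy to reproduce: AttributeList extends ArrayList, so adding raw values compiles fine but violates the JMX contract that every element is an Attribute. A sketch of the buggy and fixed shapes (method names are illustrative, not Ignite's actual code):

```java
import javax.management.Attribute;
import javax.management.AttributeList;

public class AttributeListSketch {
    /** Buggy shape: raw values are added, so readers casting to Attribute fail. */
    static AttributeList buggyGetAttributes(String[] names, Object[] values) {
        AttributeList list = new AttributeList();
        for (int i = 0; i < names.length; i++)
            list.add(values[i]); // compiles, but breaks downstream (Attribute) casts
        return list;
    }

    /** Fixed shape: each value is wrapped in an Attribute, per the JMX contract. */
    static AttributeList fixedGetAttributes(String[] names, Object[] values) {
        AttributeList list = new AttributeList();
        for (int i = 0; i < names.length; i++)
            list.add(new Attribute(names[i], values[i]));
        return list;
    }

    public static void main(String[] args) {
        AttributeList list = fixedGetAttributes(new String[] {"uptime"}, new Object[] {42L});
        Attribute attr = (Attribute) list.get(0); // safe cast with the fixed shape
        System.out.println(attr.getName() + "=" + attr.getValue());
    }
}
```

A generic JMX client iterating `getAttributes(...)` results does exactly that cast, which is where the reported ClassCastException surfaces.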
[jira] [Assigned] (IGNITE-21063) Cannot create 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21063: -- Assignee: Ivan Bessonov > Cannot create 1000 tables > - > > Key: IGNITE-21063 > URL: https://issues.apache.org/jira/browse/IGNITE-21063 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Fails with OOM after a while, managing to create only about 500 tables locally. We need to research why this happens: is there a leak, or do we simply use too much memory? > Main candidate: thread-local marshallers. We seem to use too many threads, and meta-storage entries may be up to several megabytes in the current implementation. > We should limit the size of cached buffers and the number of threads in general. A shared pool (priority queue) of pre-allocated buffers would solve the issue; the buffers don't have to be thread-local. It's a bit slower, but that's not a problem until proven otherwise. -- This message was sent by Atlassian Jira (v8.20.10#820010)
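The shared pool the ticket proposes could look roughly like the sketch below. All names and the eviction policy are assumptions for illustration, not Ignite's implementation; the point is that buffers are pooled globally with caps on pooled-buffer size and count, instead of growing without bound per thread:

```java
import java.nio.ByteBuffer;
import java.util.PriorityQueue;

// Hypothetical sketch of a shared, bounded pool of pre-allocated buffers
// replacing unbounded thread-local marshaller buffers. Illustrative only.
final class SharedBufferPool {
    // Largest buffer at the head, so a borrow(size) request succeeds
    // whenever any pooled buffer is big enough.
    private final PriorityQueue<ByteBuffer> pool =
        new PriorityQueue<>((a, b) -> b.capacity() - a.capacity());

    private final int maxPooled;     // cap on the number of cached buffers
    private final int maxBufferSize; // cap on the size of a cached buffer

    SharedBufferPool(int maxPooled, int maxBufferSize) {
        this.maxPooled = maxPooled;
        this.maxBufferSize = maxBufferSize;
    }

    synchronized ByteBuffer borrow(int size) {
        ByteBuffer head = pool.peek();
        if (head != null && head.capacity() >= size) {
            pool.poll();
            head.clear();
            return head; // reuse a pooled buffer
        }
        return ByteBuffer.allocate(size); // pool miss: allocate fresh
    }

    synchronized void release(ByteBuffer buf) {
        // Oversized or surplus buffers are dropped and left to the GC,
        // which bounds the memory the pool can retain.
        if (buf.capacity() <= maxBufferSize && pool.size() < maxPooled)
            pool.offer(buf);
    }
}
```

The pool is a bit slower than thread-local caching because of the lock, but, as the ticket says, that is not a problem until proven otherwise.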
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795714#comment-17795714 ] Vipul Thakur commented on IGNITE-21059: --- Hi, thank you for the quick response. We have configured the tx timeout at the client end; our clients are written in Spring Boot and Java. Is it also needed in the server's config.xml? We will also read about changing-wal-segment-size and make the changes accordingly. > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which I created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 4 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB HDD. > PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. 
> We have set a txn timeout as well as a socket timeout, at both the server and client end, for our write operations, but it seems that sometimes the cluster goes into a hang state: all our get calls are stuck, everything slowly starts to freeze our JMS listener threads, and every thread reaches a choked-up state after some time. > As a result, our read services, which do not even use txns to retrieve data, also start to choke, ultimately leading to an end-user traffic dip. > We were hoping the product upgrade would help, but that has not been the case so far. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21060) Extract ClusterNodeResolver as a separate entity
[ https://issues.apache.org/jira/browse/IGNITE-21060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795710#comment-17795710 ] Vladislav Pyatkov commented on IGNITE-21060: Merged 492f01b6423c3514cbe4b3ac45b3d84e70080318 > Extract ClusterNodeResolver as a separate entity > > > Key: IGNITE-21060 > URL: https://issues.apache.org/jira/browse/IGNITE-21060 > Project: Ignite > Issue Type: Task >Reporter: Kirill Sizov >Assignee: Kirill Sizov >Priority: Major > Labels: ignite-3 > Time Spent: 20m > Remaining Estimate: 0h > > *Motivation* > There are many places in the code that have a parameter and/or a field like > {code} > Function clusterNodeResolver > {code} > Instead of a generic function, we want to have a specific ClusterNodeResolver entity for better code readability. -- This message was sent by Atlassian Jira (v8.20.10#820010)
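The refactoring described above amounts to giving the function a name. A minimal sketch, assuming the resolver maps a consistent id to a `ClusterNode` (the generic signature of the original `Function` was lost in the Jira rendering above, and the `ClusterNode` record here is a stand-in for Ignite's class, not its real shape):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for Ignite's ClusterNode; fields are illustrative.
record ClusterNode(String name, String address) {}

// Named abstraction replacing a bare "Function<..., ClusterNode>
// clusterNodeResolver" parameter (assumed signature). A dedicated type
// documents intent at every call site and gives the operation a name.
interface ClusterNodeResolver {
    /** Returns the node with the given consistent id, or null if unknown. */
    ClusterNode getByConsistentId(String consistentId);
}

// Trivial map-backed implementation, e.g. for wiring and tests.
final class MapClusterNodeResolver implements ClusterNodeResolver {
    private final Map<String, ClusterNode> nodes = new ConcurrentHashMap<>();

    void put(ClusterNode node) {
        nodes.put(node.name(), node);
    }

    @Override public ClusterNode getByConsistentId(String consistentId) {
        return nodes.get(consistentId);
    }
}
```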
[jira] [Updated] (IGNITE-21067) Clean up documents related to MVCC
[ https://issues.apache.org/jira/browse/IGNITE-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YuJue Li updated IGNITE-21067: -- Parent: IGNITE-13871 Issue Type: Sub-task (was: Improvement) > Clean up documents related to MVCC > -- > > Key: IGNITE-21067 > URL: https://issues.apache.org/jira/browse/IGNITE-21067 > Project: Ignite > Issue Type: Sub-task > Components: documentation >Affects Versions: 2.15 >Reporter: YuJue Li >Assignee: YuJue Li >Priority: Minor > Fix For: 2.17 > > > Clean up documents related to MVCC -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21016) ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand is flaky
[ https://issues.apache.org/jira/browse/IGNITE-21016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795708#comment-17795708 ] Konstantin Orlov commented on IGNITE-21016: --- [~xtern], [~vpyatkov], folks, could you do a review, please? > ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand is flaky > --- > > Key: IGNITE-21016 > URL: https://issues.apache.org/jira/browse/IGNITE-21016 > Project: Ignite > Issue Type: Bug > Components: sql >Reporter: Yury Gerzhedovich >Assignee: Konstantin Orlov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > The test org.apache.ignite.internal.sql.engine.ItMixedQueriesTest#testIgniteSchemaAwaresAlterTableCommand is flaky. > The issue periodically appears on TC and is also reproducible in a local environment.
> {code:java}
> org.opentest4j.AssertionFailedError: Column metadata doesn't match ==> expected: <3> but was: <2>
> at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
> at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
> at app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
> at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
> at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:560)
> at app//org.apache.ignite.internal.sql.engine.util.QueryCheckerImpl.check(QueryCheckerImpl.java:322)
> at app//org.apache.ignite.internal.sql.engine.util.QueryCheckerFactoryImpl$1.check(QueryCheckerFactoryImpl.java:90)
> at app//org.apache.ignite.internal.sql.engine.ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand(ItMixedQueriesTest.java:221)
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21039) Network performance optimization
[ https://issues.apache.org/jira/browse/IGNITE-21039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin updated IGNITE-21039: - Labels: ignite-3 (was: ) > Network performance optimization > > > Key: IGNITE-21039 > URL: https://issues.apache.org/jira/browse/IGNITE-21039 > Project: Ignite > Issue Type: Improvement > Components: networking >Affects Versions: 3.0 >Reporter: Alexander Belyak >Priority: Major > Labels: ignite-3 > > I've run several tests to find out the MessagingService performance metrics, and this is what I've found:
> {noformat}
> TestBoolaMessage 139MB/sec WARD
> TestByteaMessage 132MB/sec WARD
> TestDoubleaMessage 102MB/sec WARD
> TestFloataMessage 132MB/sec WARD
> TestDoubleaMessage 130MB/sec WARD
> TestLongaMessage 131MB/sec WARD
> TestDoubleaMessage 131MB/sec WARD
> TestStringaMessage 280MB/sec WARD
> TestBoolMessage 11MB/sec WARD
> TestByteMessage 12MB/sec WARD
> TestDoubleMessage 12MB/sec WARD
> TestFloatMessage 13MB/sec WARD
> TestIntMessage 12MB/sec WARD
> TestLongMessage 11MB/sec WARD
> TestShortMessage 12MB/sec WARD
> TestStringMessage 18MB/sec WARD
> TestBool20Message 15MB/sec WARD
> TestByte20Message 12MB/sec WARD
> TestDouble20Message 32MB/sec WARD
> TestFloat20Message 22MB/sec WARD
> TestInt20Message 13MB/sec WARD
> TestLong20Message 14MB/sec WARD
> TestShort20Message 14MB/sec WARD
> TestString20Message 65MB/sec WARD
> {noformat}
> All messages were sent in the same setup: 2 server nodes, connected with a *10GBit* interface. *Iperf3* (iperf3 --time 30 --zerocopy --client 192.168.1.126 --omit 3 --interval 1 --length 16384 --window 131072 --parallel 2 --json --version4) shows about *850MB/sec* network throughput. But the *best AI3* result was only {*}280MB/sec{*}. The results above use 3 types of messages:
> 1. {*}TestaMessage{*}: an array of 163840 elements (primitive, except String) of type .
> 2. {*}TestMessage{*}: a single property (primitive, except String) of type
> 3. 
{*}Test20Message{*}: 20 properties (primitive, except String) of type > > All the messages were sent in parallel from a single thread with a window of 100 messages (right after receiving the first ack, the next message was sent). > It was expected that network utilization would be low for very short messages (like 1 int or 20 int fields), but in comparison with the iperf3 results, the performance of MessagingService for 163-KByte messages was very low. It became significantly better only when sending a huge array of strings (the same string "{color:#067d17}Test string to check message service performance.{color}"). > I've run another batch of tests with a 1KB byte[] property in the message, in 1 and 8 threads, and without a send window at all (each thread sends the next message after getting the ack for the previous one):
> * *1 thread*: *37 MBytes/sec*
> * *8 threads*: *63 MBytes/sec*
> So I suppose there is quite a lot of contention. > All messages were sent in the following manner:
> {code:java}
> private void send(ClusterNode target, NetworkMessage msg) {
>     messagingService.send(target, msg).handle((v, t) -> {
>         if (t != null) {
>             LOG.info("Error while sending huge message", t);
>         }
>         if (time() < timeout) {
>             send(target, msg);
>         }
>         // handle() takes a BiFunction, so the lambda must return a value
>         return null;
>     });
> }
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21067) Clean up documents related to MVCC
YuJue Li created IGNITE-21067: - Summary: Clean up documents related to MVCC Key: IGNITE-21067 URL: https://issues.apache.org/jira/browse/IGNITE-21067 Project: Ignite Issue Type: Improvement Components: documentation Affects Versions: 2.15 Reporter: YuJue Li Assignee: YuJue Li Fix For: 2.17 Clean up documents related to MVCC -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-20506) CacheAtomicityMode#TRANSACTIONAL_SNAPSHOT removal
[ https://issues.apache.org/jira/browse/IGNITE-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YuJue Li reassigned IGNITE-20506: - Assignee: Anton Vinogradov (was: YuJue Li) > CacheAtomicityMode#TRANSACTIONAL_SNAPSHOT removal > - > > Key: IGNITE-20506 > URL: https://issues.apache.org/jira/browse/IGNITE-20506 > Project: Ignite > Issue Type: Sub-task >Reporter: Anton Vinogradov >Assignee: Anton Vinogradov >Priority: Major > Labels: important > Fix For: 2.16 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21063) Cannot create 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21063: --- Description: Fails with OOM after a while, managing to create only about 500 tables locally. We need to research why this happens: is there a leak, or do we simply use too much memory? Main candidate: thread-local marshallers. We seem to use too many threads, and meta-storage entries may be up to several megabytes in the current implementation. We should limit the size of cached buffers and the number of threads in general. A shared pool (priority queue) of pre-allocated buffers would solve the issue; the buffers don't have to be thread-local. It's a bit slower, but that's not a problem until proven otherwise. was: Fails with OOM after a while, managing to create only about 500 tables locally. We need to research why this happens: is there a leak, or do we simply use too much memory? > Cannot create 1000 tables > - > > Key: IGNITE-21063 > URL: https://issues.apache.org/jira/browse/IGNITE-21063 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Fails with OOM after a while, managing to create only about 500 tables locally. We need to research why this happens: is there a leak, or do we simply use too much memory? > Main candidate: thread-local marshallers. We seem to use too many threads, and meta-storage entries may be up to several megabytes in the current implementation. > We should limit the size of cached buffers and the number of threads in general. A shared pool (priority queue) of pre-allocated buffers would solve the issue; the buffers don't have to be thread-local. It's a bit slower, but that's not a problem until proven otherwise. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21066) Create job priority change API
[ https://issues.apache.org/jira/browse/IGNITE-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pochatkin updated IGNITE-21066: --- Summary: Create job priority change API (was: Change job priority API) > Create job priority change API > -- > > Key: IGNITE-21066 > URL: https://issues.apache.org/jira/browse/IGNITE-21066 > Project: Ignite > Issue Type: Improvement > Components: compute >Reporter: Mikhail Pochatkin >Priority: Major > Labels: ignite-3 > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21066) Create job priority change API
[ https://issues.apache.org/jira/browse/IGNITE-21066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pochatkin updated IGNITE-21066: --- Description: Once a job has been accepted for execution and is in the queue, we should be able to dynamically change its priority in order to move it up or down in the execution queue. > Create job priority change API > -- > > Key: IGNITE-21066 > URL: https://issues.apache.org/jira/browse/IGNITE-21066 > Project: Ignite > Issue Type: Improvement > Components: compute >Reporter: Mikhail Pochatkin >Priority: Major > Labels: ignite-3 > > Once a job has been accepted for execution and is in the queue, we should be > able to dynamically change its priority in order to move it up or down in the > execution queue. -- This message was sent by Atlassian Jira (v8.20.10#820010)
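One way to read "dynamically change its priority in order to move it up or down in the execution queue" is remove-and-reinsert. The sketch below is a hypothetical illustration of that mechanic, not the Ignite 3 compute API; it keeps FIFO order among jobs of equal priority:

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical queue of pending compute jobs whose priority can be changed
// while the job is still queued. All names are illustrative.
final class ReprioritizableJobQueue {
    private record Entry(String jobId, int priority, long seq) {}

    private long seq;
    private final Map<String, Entry> byId = new HashMap<>();
    // Higher priority first; FIFO (submission order) among equal priorities.
    private final PriorityQueue<Entry> queue = new PriorityQueue<>(
        Comparator.comparingInt((Entry e) -> -e.priority())
                  .thenComparingLong(Entry::seq));

    synchronized void submit(String jobId, int priority) {
        Entry e = new Entry(jobId, priority, seq++);
        byId.put(jobId, e);
        queue.add(e);
    }

    /** Moves a queued job up or down; returns false if it is no longer queued. */
    synchronized boolean changePriority(String jobId, int newPriority) {
        Entry e = byId.remove(jobId);
        if (e == null)
            return false; // already executing or completed
        queue.remove(e);            // O(n) removal, acceptable for a sketch
        submit(jobId, newPriority); // re-enter the queue at the new priority
        return true;
    }

    /** Takes the next job to execute, or null if the queue is empty. */
    synchronized String poll() {
        Entry e = queue.poll();
        if (e == null)
            return null;
        byId.remove(e.jobId());
        return e.jobId();
    }
}
```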
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795690#comment-17795690 ] Evgeny Stanilovsky commented on IGNITE-21059: - also you have infinite tx timeouts, plz configure : https://ignite.apache.org/docs/latest/key-value-api/transactions#deadlock-detection > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 4 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB HDD. > PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. 
> We have set a txn timeout as well as a socket timeout, at both the server and client end, for our write operations, but it seems that sometimes the cluster goes into a hang state: all our get calls are stuck, everything slowly starts to freeze our JMS listener threads, and every thread reaches a choked-up state after some time. > As a result, our read services, which do not even use txns to retrieve data, also start to choke, ultimately leading to an end-user traffic dip. > We were hoping the product upgrade would help, but that has not been the case so far. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795688#comment-17795688 ] Evgeny Stanilovsky commented on IGNITE-21059: - also plz increase https://ignite.apache.org/docs/latest/persistence/native-persistence#changing-wal-segment-size numerous "Starting to clean WAL archive" in log > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We have recently upgraded from 2.7.6 to 2.14 due to the issue observed in > production environment where cluster would go in hang state due to partition > map exchange. > Please find the below ticket which i created a while back for ignite 2.7.6 > https://issues.apache.org/jira/browse/IGNITE-13298 > So we migrated the apache ignite version to 2.14 and upgrade happened > smoothly but on the third day we could see cluster traffic dip again. > We have 4 nodes in a cluster where we provide 400 GB of RAM and more than 1 > TB HDD. > PFB for the attached config.[I have added it as attachment for review] > I have also added the server logs from the same time when issue happened. 
> We have set a txn timeout as well as a socket timeout, at both the server and client end, for our write operations, but it seems that sometimes the cluster goes into a hang state: all our get calls are stuck, everything slowly starts to freeze our JMS listener threads, and every thread reaches a choked-up state after some time. > As a result, our read services, which do not even use txns to retrieve data, also start to choke, ultimately leading to an end-user traffic dip. > We were hoping the product upgrade would help, but that has not been the case so far. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21066) Change job priority API
Mikhail Pochatkin created IGNITE-21066: -- Summary: Change job priority API Key: IGNITE-21066 URL: https://issues.apache.org/jira/browse/IGNITE-21066 Project: Ignite Issue Type: Improvement Components: compute Reporter: Mikhail Pochatkin -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-20847) Ownership mechanism for Compute Jobs
[ https://issues.apache.org/jira/browse/IGNITE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pochatkin resolved IGNITE-20847. Resolution: Won't Fix > Ownership mechanism for Compute Jobs > > > Key: IGNITE-20847 > URL: https://issues.apache.org/jira/browse/IGNITE-20847 > Project: Ignite > Issue Type: Improvement > Components: compute >Reporter: Mikhail Pochatkin >Priority: Major > Labels: ignite-3 > > If authentication is enabled on AI3, each Compute job execution should store the user who started it as its owner. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (IGNITE-20463) Sql. Integration of TX-related statements into sql script processor
[ https://issues.apache.org/jira/browse/IGNITE-20463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795679#comment-17795679 ] Pavel Pereslegin edited comment on IGNITE-20463 at 12/12/23 10:58 AM: -- [~zstan], [~amashenkov], thanks for the review! Merged to the main branch ([8ed1326|https://github.com/apache/ignite-3/commit/8ed13261f523b2c432ccb3702599df2106388f3c]). was (Author: xtern): Merged to the main branch ([8ed1326|https://github.com/apache/ignite-3/commit/8ed13261f523b2c432ccb3702599df2106388f3c]). [~zstan], [~amashenkov], thanks for the review! > Sql. Integration of TX-related statements into sql script processor > --- > > Key: IGNITE-20463 > URL: https://issues.apache.org/jira/browse/IGNITE-20463 > Project: Ignite > Issue Type: New Feature > Components: sql >Affects Versions: 3.0.0-beta1 >Reporter: Evgeny Stanilovsky >Assignee: Pavel Pereslegin >Priority: Major > Labels: ignite-3 > Time Spent: 21h 20m > Remaining Estimate: 0h > > Script processor [1] need to process tx related statements. > After parsing appropriate transaction syntax it need to be retained and > processed in script processor. > [1] https://issues.apache.org/jira/browse/IGNITE-20443 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20905) Make it possible to add an explicitly NULL column via ADD COLUMN
[ https://issues.apache.org/jira/browse/IGNITE-20905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maksim Zhuravkov updated IGNITE-20905: -- Fix Version/s: 3.0.0-beta2 > Make it possible to add an explicitly NULL column via ADD COLUMN > > > Key: IGNITE-20905 > URL: https://issues.apache.org/jira/browse/IGNITE-20905 > Project: Ignite > Issue Type: Improvement > Components: sql >Reporter: Roman Puchkovskiy >Assignee: Maksim Zhuravkov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 10m > Remaining Estimate: 0h > > When creating a table, it's possible to specify that a column is nullable by explicitly using NULL: > CREATE TABLE t(id INT PRIMARY KEY, col1 INT NULL) > But if we add a column to an existing table, this does not work: > ALTER TABLE t ADD COLUMN col2 INT NULL > -> Failed to parse query: Encountered "NULL" at line 1, column X > It seems that, for consistency, ADD COLUMN should support the same syntax as CREATE TABLE does. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21065) Enhance granularity of authentication events
[ https://issues.apache.org/jira/browse/IGNITE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Gagarkin updated IGNITE-21065: --- Summary: Enhance granularity of authentication events (was: Extend AuthenticationEvent with USER_CREATED and USER_CHANGED) > Enhance granularity of authentication events > > > Key: IGNITE-21065 > URL: https://issues.apache.org/jira/browse/IGNITE-21065 > Project: Ignite > Issue Type: Bug > Components: security, thin client >Reporter: Ivan Gagarkin >Priority: Major > Labels: ignite-3 > > Since the basic authenticator stores a list of users, we need to extend > authentication events to improve granularity. This is to ensure that the > connection is not closed for all users if just one of them changes their > password. Update the tests in > {{org.apache.ignite.client.handler.ClientInboundMessageHandlerTest}} > accordingly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
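The per-user granularity the ticket asks for can be sketched as a connection registry keyed by username, so that a fine-grained user-updated event closes only that user's connections, while everyone else stays connected. Names and the event model are assumptions for illustration, not Ignite's actual handler code:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative registry of open client connections grouped by the user they
// authenticated as. A per-user auth event (e.g. a password change) closes
// only that user's connections instead of all of them.
final class ConnectionRegistry {
    private final Map<String, List<Runnable>> byUser = new ConcurrentHashMap<>();

    /** Registers a close action for a connection authenticated as {@code user}. */
    void register(String user, Runnable closeAction) {
        byUser.computeIfAbsent(user, u -> new CopyOnWriteArrayList<>())
              .add(closeAction);
    }

    /**
     * Handler for a user-updated/user-removed event: drops only this user's
     * connections. Returns how many connections were closed.
     */
    int onUserChanged(String user) {
        List<Runnable> closers = byUser.remove(user);
        if (closers == null)
            return 0;
        closers.forEach(Runnable::run);
        return closers.size();
    }
}
```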
[jira] [Commented] (IGNITE-20506) CacheAtomicityMode#TRANSACTIONAL_SNAPSHOT removal
[ https://issues.apache.org/jira/browse/IGNITE-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795678#comment-17795678 ] Anton Vinogradov commented on IGNITE-20506: --- [~liyuj], This issue is already closed in 2.16. If we need to change something, let's create a new linked issue. > CacheAtomicityMode#TRANSACTIONAL_SNAPSHOT removal > - > > Key: IGNITE-20506 > URL: https://issues.apache.org/jira/browse/IGNITE-20506 > Project: Ignite > Issue Type: Sub-task >Reporter: Anton Vinogradov >Assignee: YuJue Li >Priority: Major > Labels: important > Fix For: 2.16 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21065) Extend AuthenticationEvent with USER_CREATED and USER_CHANGED
Ivan Gagarkin created IGNITE-21065: -- Summary: Extend AuthenticationEvent with USER_CREATED and USER_CHANGED Key: IGNITE-21065 URL: https://issues.apache.org/jira/browse/IGNITE-21065 Project: Ignite Issue Type: Bug Components: security, thin client Reporter: Ivan Gagarkin Since the basic authenticator stores a list of users, we need to extend authentication events to improve granularity. This is to ensure that the connection is not closed for all users if just one of them changes their password. Update the tests in {{org.apache.ignite.client.handler.ClientInboundMessageHandlerTest}} accordingly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21064) Refactor authentication naming and enum in Thin Client for clarity
Ivan Gagarkin created IGNITE-21064: -- Summary: Refactor authentication naming and enum in Thin Client for clarity Key: IGNITE-21064 URL: https://issues.apache.org/jira/browse/IGNITE-21064 Project: Ignite Issue Type: Improvement Components: thin client Reporter: Ivan Gagarkin Currently, the Thin Client utilizes {{org.apache.ignite.security.AuthenticationType}} to specify the authentication method during the handshake process. This approach can be confusing due to its interaction with the type of authentication defined in the configuration. To resolve this, we propose creating a separate enumeration specifically for the client. Additionally, the 'BASIC' authentication type should be renamed to 'PASSWORD' for clearer understanding. -- This message was sent by Atlassian Jira (v8.20.10#820010)
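A sketch of what the proposed client-side enumeration might look like, under the assumptions stated in the ticket (a dedicated enum decoupled from the server configuration's `AuthenticationType`, with 'BASIC' renamed to 'PASSWORD'); the type name and the mapping helper are purely illustrative:

```java
// Illustrative client-side enum for the handshake, separate from the
// server configuration's org.apache.ignite.security.AuthenticationType.
// The exact name and wire representation are assumptions.
enum ClientAuthenticationType {
    /** Username/password authentication (formerly called BASIC). */
    PASSWORD;

    /** Maps the legacy configuration name onto the client enum. */
    static ClientAuthenticationType fromConfigName(String name) {
        return switch (name.toUpperCase()) {
            // Accept the old name for backward compatibility.
            case "BASIC", "PASSWORD" -> PASSWORD;
            default -> throw new IllegalArgumentException(
                "Unknown authentication type: " + name);
        };
    }
}
```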
[jira] [Assigned] (IGNITE-20995) Add more integration tests for tx recovery on unstable topology
[ https://issues.apache.org/jira/browse/IGNITE-20995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin reassigned IGNITE-20995: Assignee: Kirill Sizov > Add more integration tests for tx recovery on unstable topology > --- > > Key: IGNITE-20995 > URL: https://issues.apache.org/jira/browse/IGNITE-20995 > Project: Ignite > Issue Type: Improvement >Reporter: Alexander Lapin >Assignee: Kirill Sizov >Priority: Major > Labels: ignite-3 > > h3. Motivation > Surprisingly, it might be useful to check the tx recovery implementation with some tests. > h3. Definition of Done > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21063) Cannot create 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21063: --- Description: Fails with OOM after a while, managing to create only about 500 tables locally. We need to research why this happens: is there a leak, or do we simply use too much memory? (was: Fails with OOM on TC. We need to research why this happens: is there a leak, or do we simply use too much memory?) > Cannot create 1000 tables > - > > Key: IGNITE-21063 > URL: https://issues.apache.org/jira/browse/IGNITE-21063 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Fails with OOM after a while, managing to create only about 500 tables locally. We need to research why this happens: is there a leak, or do we simply use too much memory? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-20918) Leases expire after a node has been restarted
[ https://issues.apache.org/jira/browse/IGNITE-20918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin reassigned IGNITE-20918: Assignee: Alexander Lapin (was: Vladislav Pyatkov) > Leases expire after a node has been restarted > - > > Key: IGNITE-20918 > URL: https://issues.apache.org/jira/browse/IGNITE-20918 > Project: Ignite > Issue Type: Bug >Reporter: Aleksandr Polovtcev >Assignee: Alexander Lapin >Priority: Critical > Labels: ignite-3 > Time Spent: 1.5h > Remaining Estimate: 0h > > IGNITE-20910 introduces a test that inserts some data after restarting a node. For some reason, after some time, I can see the following messages in the log:
> {noformat}
> [2023-11-22T10:00:17,056][INFO ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] Primary replica expired [grp=5_part_19]
> [2023-11-22T10:00:17,057][INFO ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] Primary replica expired [grp=5_part_0]
> [2023-11-22T10:00:17,057][INFO ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] Primary replica expired [grp=5_part_9]
> [2023-11-22T10:00:17,057][INFO ][%isnt_tmpar_0%metastorage-watch-executor-3][PartitionReplicaListener] Primary replica expired [grp=5_part_10]
> {noformat}
> After that, the test fails with a {{PrimaryReplicaMissException}}. The problem here is that a single node is expected to never have expired leases; they should be prolonged automatically. I think this happens because the initial lease that was issued before the node was restarted is still accepted by the node after restart. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-20993) Make the tables recover on the same assignments on different nodes
[ https://issues.apache.org/jira/browse/IGNITE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin reassigned IGNITE-20993: Assignee: Denis Chudov > Make the tables recover on the same assignments on different nodes > -- > > Key: IGNITE-20993 > URL: https://issues.apache.org/jira/browse/IGNITE-20993 > Project: Ignite > Issue Type: Improvement >Reporter: Denis Chudov >Assignee: Denis Chudov >Priority: Major > Labels: ignite-3 > > *Motivation* > Currently the following is possible: > * node A performs the recovery on revision _x_ and, in the absence of stable assignments, calculates the assignments from the data nodes for this revision; > * node B does the same for recovery revision _y_, which is not equal to {_}x{_}. > As a result, they can start partitions for different assignments which are not consistent with each other, and this can lead to side effects like unavailability of the majority for some partitions; ambiguous, unpredictable assignments will be written to the meta storage due to the race between these nodes. > *Definition of done* > Multiple nodes performing recovery always start partitions for assignments calculated for the same revision. > *Implementation notes* > The common revision for the different nodes in this case can be the revision on which the table was created; this should be done under IGNITE-21014. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21063) Cannot create 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21063: --- Description: Fails with OOM on TC. We need to investigate why it happens: is there a leak, or do we simply use too much memory? (was: Fails with OOM on TC) > Cannot create 1000 tables > - > > Key: IGNITE-21063 > URL: https://issues.apache.org/jira/browse/IGNITE-21063 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Fails with OOM on TC. We need to investigate why it happens: is there a leak, > or do we simply use too much memory? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-21014) Table creation revision for table descriptor
[ https://issues.apache.org/jira/browse/IGNITE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mirza Aliev reassigned IGNITE-21014: Assignee: Mirza Aliev > Table creation revision for table descriptor > > > Key: IGNITE-21014 > URL: https://issues.apache.org/jira/browse/IGNITE-21014 > Project: Ignite > Issue Type: Improvement >Reporter: Denis Chudov >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > > *Motivation* > In order to be able to correctly recover data nodes for tables that don't > have stable assignments in meta storage (see IGNITE-20993 ) there should be > some special revision for the recovery. We can use the revision on which such > tables were created. Table creation revision should be added to the table > descriptor. > *Definition of done* > Creation revision is added to the table descriptor. > *Implementation notes* > This creation revision shouldn't change in new versions of the descriptor - > it should be taken from the previous version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21014) Table creation revision for table descriptor
[ https://issues.apache.org/jira/browse/IGNITE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin updated IGNITE-21014: - Epic Link: (was: IGNITE-19170) > Table creation revision for table descriptor > > > Key: IGNITE-21014 > URL: https://issues.apache.org/jira/browse/IGNITE-21014 > Project: Ignite > Issue Type: Improvement >Reporter: Denis Chudov >Priority: Major > Labels: ignite-3 > > *Motivation* > In order to be able to correctly recover data nodes for tables that don't > have stable assignments in meta storage (see IGNITE-20993 ) there should be > some special revision for the recovery. We can use the revision on which such > tables were created. Table creation revision should be added to the table > descriptor. > *Definition of done* > Creation revision is added to the table descriptor. > *Implementation notes* > This creation revision shouldn't change in new versions of the descriptor - > it should be taken from the previous version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21059) We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running cache operations
[ https://issues.apache.org/jira/browse/IGNITE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795675#comment-17795675 ] Evgeny Stanilovsky commented on IGNITE-21059: - [~vipul.thakur] can you attach logs covering the observed problem (some time before and some after the incident)? Thread dumps alone can't help here. If the logs are already rotated, please capture a fresh copy if the incident repeats. > We have upgraded our ignite instance from 2.7.6 to 2.14. Found long running > cache operations > > > Key: IGNITE-21059 > URL: https://issues.apache.org/jira/browse/IGNITE-21059 > Project: Ignite > Issue Type: Bug > Components: binary, clients >Affects Versions: 2.14 >Reporter: Vipul Thakur >Priority: Critical > Attachments: cache-config-1.xml, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt2, > digiapi-eventprocessing-app-zone1-696c8c4946-62jbx-jstck.txt3, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt1, > digiapi-eventprocessing-app-zone1-696c8c4946-7d57w-jstck.txt2, > ignite-server-nohup.out > > > We recently upgraded from 2.7.6 to 2.14 due to an issue observed in the > production environment where the cluster would hang due to partition map > exchange. > Please find below the ticket I created a while back for Ignite 2.7.6: > https://issues.apache.org/jira/browse/IGNITE-13298 > We migrated to Apache Ignite 2.14 and the upgrade went smoothly, but on the > third day we saw the cluster traffic dip again. > We have 4 nodes in a cluster, each provided with 400 GB of RAM and more than > 1 TB of HDD. > Please find the attached config [added as an attachment for review]. > I have also added the server logs from the time the issue happened. 
> We have set a transaction timeout as well as a socket timeout, on both the > server and client ends, for our write operations, but sometimes the cluster > still goes into a hang state: all our get calls get stuck, everything slowly > starts to freeze our JMS listener threads, and every thread reaches a choked > state after a while. > As a result, our read services, which do not even use transactions to > retrieve data, also start to choke, ultimately leading to a dip in end-user > traffic. > We were hoping the product upgrade would help, but that has not been the case > so far. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21063) Cannot create 1000 tables
[ https://issues.apache.org/jira/browse/IGNITE-21063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov updated IGNITE-21063: --- Ignite Flags: (was: Docs Required,Release Notes Required) > Cannot create 1000 tables > - > > Key: IGNITE-21063 > URL: https://issues.apache.org/jira/browse/IGNITE-21063 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > Fails with OOM on TC -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-20745) TableManager.tableAsync(int tableId) is slowing down thin clients
[ https://issues.apache.org/jira/browse/IGNITE-20745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vyacheslav Koptilin reassigned IGNITE-20745: Assignee: Igor Sapego (was: Kirill Gusakov) > TableManager.tableAsync(int tableId) is slowing down thin clients > - > > Key: IGNITE-20745 > URL: https://issues.apache.org/jira/browse/IGNITE-20745 > Project: Ignite > Issue Type: Improvement >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Tupitsyn >Assignee: Igor Sapego >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Attachments: ItThinClientPutGetBenchmark.java > > > Performance difference between embedded and client modes is affected > considerably by the call to *IgniteTablesInternal#tableAsync(int id)*. This > call has to be performed on every individual table operation. > We should make it as fast as possible. Something like a dictionary lookup + > quick check for deleted table. > ||Part||Duration, us|| > |Network & msgpack|19.30| > |Get table|14.29| > |Get tuple & serialize|12.86| -- This message was sent by Atlassian Jira (v8.20.10#820010)
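The "dictionary lookup + quick check for deleted table" idea from IGNITE-20745 could be sketched roughly as follows. This is only an illustrative sketch: the class and method names (TableCache, onTableCreated, etc.) are hypothetical and not the actual TableManager/IgniteTablesInternal code.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical fast-path cache: O(1) id-to-table lookup plus a dropped-table check. */
class TableCache {
    private final Map<Integer, String> tablesById = new ConcurrentHashMap<>();
    private final Set<Integer> droppedIds = ConcurrentHashMap.newKeySet();

    void onTableCreated(int id, String name) {
        tablesById.put(id, name);
    }

    void onTableDropped(int id) {
        tablesById.remove(id);
        droppedIds.add(id);
    }

    /** Fast path invoked on every table operation; avoids the full metadata round-trip. */
    CompletableFuture<String> tableAsync(int id) {
        if (droppedIds.contains(id)) {
            return CompletableFuture.failedFuture(
                    new IllegalStateException("Table was dropped: " + id));
        }
        String name = tablesById.get(id);
        // In real code, a cache miss would fall back to the slow metadata path.
        return CompletableFuture.completedFuture(name);
    }
}

public class TableCacheDemo {
    public static void main(String[] args) throws Exception {
        TableCache cache = new TableCache();
        cache.onTableCreated(5, "ACCOUNTS");
        System.out.println(cache.tableAsync(5).get());
        cache.onTableDropped(5);
        System.out.println(cache.tableAsync(5).isCompletedExceptionally());
    }
}
```

The point of the sketch is that the per-operation cost becomes a single concurrent-map read rather than a metadata lookup, which is what the timing table above suggests is needed.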
[jira] [Updated] (IGNITE-21016) ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand is flaky
[ https://issues.apache.org/jira/browse/IGNITE-21016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Orlov updated IGNITE-21016: -- Ignite Flags: (was: Docs Required,Release Notes Required) > ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand is flaky > --- > > Key: IGNITE-21016 > URL: https://issues.apache.org/jira/browse/IGNITE-21016 > Project: Ignite > Issue Type: Bug > Components: sql >Reporter: Yury Gerzhedovich >Assignee: Konstantin Orlov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > The test > org.apache.ignite.internal.sql.engine.ItMixedQueriesTest#testIgniteSchemaAwaresAlterTableCommand > is flaky. > The issue periodically appears on TC and is also reproducible in a local > environment. > {code:java} > org.opentest4j.AssertionFailedError: Column metadata doesn't match ==> > expected: <3> but was: <2> > at > app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > at > app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > at > app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197) > at > app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150) > at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:560) > at > app//org.apache.ignite.internal.sql.engine.util.QueryCheckerImpl.check(QueryCheckerImpl.java:322) > at > app//org.apache.ignite.internal.sql.engine.util.QueryCheckerFactoryImpl$1.check(QueryCheckerFactoryImpl.java:90) > at > app//org.apache.ignite.internal.sql.engine.ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand(ItMixedQueriesTest.java:221) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21016) ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand is flaky
[ https://issues.apache.org/jira/browse/IGNITE-21016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Orlov updated IGNITE-21016: -- Fix Version/s: 3.0.0-beta2 > ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand is flaky > --- > > Key: IGNITE-21016 > URL: https://issues.apache.org/jira/browse/IGNITE-21016 > Project: Ignite > Issue Type: Bug > Components: sql >Reporter: Yury Gerzhedovich >Assignee: Konstantin Orlov >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > The test > org.apache.ignite.internal.sql.engine.ItMixedQueriesTest#testIgniteSchemaAwaresAlterTableCommand > is flaky. > The issue periodically appears on TC and is also reproducible in a local > environment. > {code:java} > org.opentest4j.AssertionFailedError: Column metadata doesn't match ==> > expected: <3> but was: <2> > at > app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151) > at > app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132) > at > app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197) > at > app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150) > at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:560) > at > app//org.apache.ignite.internal.sql.engine.util.QueryCheckerImpl.check(QueryCheckerImpl.java:322) > at > app//org.apache.ignite.internal.sql.engine.util.QueryCheckerFactoryImpl$1.check(QueryCheckerFactoryImpl.java:90) > at > app//org.apache.ignite.internal.sql.engine.ItMixedQueriesTest.testIgniteSchemaAwaresAlterTableCommand(ItMixedQueriesTest.java:221) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-20361) Implement the table storage description
[ https://issues.apache.org/jira/browse/IGNITE-20361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mirza Aliev reassigned IGNITE-20361: Assignee: Mirza Aliev > Implement the table storage description > --- > > Key: IGNITE-20361 > URL: https://issues.apache.org/jira/browse/IGNITE-20361 > Project: Ignite > Issue Type: Task >Reporter: Kirill Gusakov >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > > *Motivation* > According to IGNITE-20357, we need an appropriate table storage > configuration that can be used on table create/alter to check whether the > chosen zone satisfies the storage requirements. > *Definition of done* > - The table has a storage configuration that can be used for early validation > of whether the table can be "deployed" in its zone correctly. > - The data storage configuration is removed from the zone configuration. > *Notes* > - Avoid altering this field for now; throw an appropriate exception. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21063) Cannot create 1000 tables
Ivan Bessonov created IGNITE-21063: -- Summary: Cannot create 1000 tables Key: IGNITE-21063 URL: https://issues.apache.org/jira/browse/IGNITE-21063 Project: Ignite Issue Type: Bug Reporter: Ivan Bessonov Fails with OOM on TC -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-21062) Safe time reordering in partitions
[ https://issues.apache.org/jira/browse/IGNITE-21062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Bessonov reassigned IGNITE-21062: -- Assignee: Ivan Bessonov > Safe time reordering in partitions > -- > > Key: IGNITE-21062 > URL: https://issues.apache.org/jira/browse/IGNITE-21062 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Bessonov >Assignee: Ivan Bessonov >Priority: Major > Labels: ignite-3 > > In the scenario of creating a lot of tables on a (presumably) slow system, > it's possible to notice a {{Safe time reordering detected [current=...}} > assertion error in the logs. > It happens with safe-time sync commands, in the absence of transactional load. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21062) Safe time reordering in partitions
Ivan Bessonov created IGNITE-21062: -- Summary: Safe time reordering in partitions Key: IGNITE-21062 URL: https://issues.apache.org/jira/browse/IGNITE-21062 Project: Ignite Issue Type: Bug Reporter: Ivan Bessonov In the scenario of creating a lot of tables on a (presumably) slow system, it's possible to notice a {{Safe time reordering detected [current=...}} assertion error in the logs. It happens with safe-time sync commands, in the absence of transactional load. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21061) Durable cleanup requires additional replication group command
Vladislav Pyatkov created IGNITE-21061: -- Summary: Durable cleanup requires additional replication group command Key: IGNITE-21061 URL: https://issues.apache.org/jira/browse/IGNITE-21061 Project: Ignite Issue Type: Improvement Reporter: Vladislav Pyatkov h3. Motivation After locks are released, the information has to be written to the transaction persistent storage and replicated to all commit-partition replication group nodes. That is performed by a replication command ({{MarkLocksReleasedCommand}}). As a result, we have an additional replication command in the entire transaction process. h3. Implementation notes In my opinion, we can resolve this situation in the transaction resolution procedure ({{OrphanDetector}}). It just needs to check the lease on the commit partition: if the lease has changed by the time we encounter a transaction lock, the replication process should start. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-21060) Extract ClusterNodeResolver as a separate entity
[ https://issues.apache.org/jira/browse/IGNITE-21060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill Sizov reassigned IGNITE-21060: -- Assignee: Kirill Sizov > Extract ClusterNodeResolver as a separate entity > > > Key: IGNITE-21060 > URL: https://issues.apache.org/jira/browse/IGNITE-21060 > Project: Ignite > Issue Type: Task >Reporter: Kirill Sizov >Assignee: Kirill Sizov >Priority: Major > Labels: ignite-3 > > *Motivation* > There are many places in the code that have a parameter and/or a field like > {code} > Function clusterNodeResolver > {code} > Instead of a generic function, we want a specific ClusterNodeResolver > entity for better code readability. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-21060) Extract ClusterNodeResolver as a separate entity
Kirill Sizov created IGNITE-21060: -- Summary: Extract ClusterNodeResolver as a separate entity Key: IGNITE-21060 URL: https://issues.apache.org/jira/browse/IGNITE-21060 Project: Ignite Issue Type: Task Reporter: Kirill Sizov *Motivation* There are many places in the code that have a parameter and/or a field like {code} Function clusterNodeResolver {code} Instead of a generic function, we want a specific ClusterNodeResolver entity for better code readability. -- This message was sent by Atlassian Jira (v8.20.10#820010)
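A minimal sketch of the extraction proposed in IGNITE-21060. The interface shape, the method name getByConsistentId, and the ClusterNode placeholder are all hypothetical; the real Ignite 3 types may differ.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Placeholder standing in for the real Ignite ClusterNode type. */
record ClusterNode(String consistentId, String address) {}

/** Named abstraction replacing a bare Function parameter, for readability. */
interface ClusterNodeResolver {
    ClusterNode getByConsistentId(String consistentId);
}

/** Example implementation backed by a map of known topology members. */
class MapClusterNodeResolver implements ClusterNodeResolver {
    private final Map<String, ClusterNode> nodes = new ConcurrentHashMap<>();

    void register(ClusterNode node) {
        nodes.put(node.consistentId(), node);
    }

    @Override
    public ClusterNode getByConsistentId(String consistentId) {
        return nodes.get(consistentId);
    }
}

public class ResolverDemo {
    public static void main(String[] args) {
        MapClusterNodeResolver resolver = new MapClusterNodeResolver();
        resolver.register(new ClusterNode("node-1", "127.0.0.1:3344"));
        // Call sites now read resolver.getByConsistentId(id) instead of fn.apply(id).
        System.out.println(resolver.getByConsistentId("node-1").address());
    }
}
```

The gain is purely in readability and discoverability: a named interface documents intent at every call site, where a generic Function parameter does not.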
[jira] [Updated] (IGNITE-20909) Thin 3.0: Compute jobs should use server notification to signal completion to the client
[ https://issues.apache.org/jira/browse/IGNITE-20909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Tupitsyn updated IGNITE-20909: Ignite Flags: (was: Docs Required,Release Notes Required) > Thin 3.0: Compute jobs should use server notification to signal completion to > the client > > > Key: IGNITE-20909 > URL: https://issues.apache.org/jira/browse/IGNITE-20909 > Project: Ignite > Issue Type: Improvement > Components: compute, thin client >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Tupitsyn >Assignee: Pavel Tupitsyn >Priority: Major > Labels: iep-42, ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 40m > Remaining Estimate: 0h > > Compute jobs can be long-lived and even out-live the client connection. New > Compute API is coming that will return some "execution" object immediately, > which can be used to monitor or cancel the job. Therefore, job startup and > completion should be separated - normal request-response approach is not > suitable. > * Use request-response to initiate the job execution and return an ID to the > client > * Use server -> client notification to signal about completion > This is a tried approach from Ignite 2.x, see linked > [IEP-42|https://cwiki.apache.org/confluence/display/IGNITE/IEP-42+Thin+Client+Compute] > and related discussion -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-20909) Thin 3.0: Compute jobs should use server notification to signal completion to the client
[ https://issues.apache.org/jira/browse/IGNITE-20909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795641#comment-17795641 ] Pavel Tupitsyn commented on IGNITE-20909: - Merged to main: 0f5434baf426f9c56fb1d511364f9d83a5dfe24d > Thin 3.0: Compute jobs should use server notification to signal completion to > the client > > > Key: IGNITE-20909 > URL: https://issues.apache.org/jira/browse/IGNITE-20909 > Project: Ignite > Issue Type: Improvement > Components: compute, thin client >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Tupitsyn >Assignee: Pavel Tupitsyn >Priority: Major > Labels: iep-42, ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 40m > Remaining Estimate: 0h > > Compute jobs can be long-lived and even out-live the client connection. New > Compute API is coming that will return some "execution" object immediately, > which can be used to monitor or cancel the job. Therefore, job startup and > completion should be separated - normal request-response approach is not > suitable. > * Use request-response to initiate the job execution and return an ID to the > client > * Use server -> client notification to signal about completion > This is a tried approach from Ignite 2.x, see linked > [IEP-42|https://cwiki.apache.org/confluence/display/IGNITE/IEP-42+Thin+Client+Compute] > and related discussion -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-21056) Use thread local buffer for encrypted dump
[ https://issues.apache.org/jira/browse/IGNITE-21056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikolay Izhikov resolved IGNITE-21056. -- Resolution: Fixed > Use thread local buffer for encrypted dump > -- > > Key: IGNITE-21056 > URL: https://issues.apache.org/jira/browse/IGNITE-21056 > Project: Ignite > Issue Type: Task >Reporter: Yuri Naryshkin >Assignee: Yuri Naryshkin >Priority: Minor > Labels: IEP-109, ise > Fix For: 2.16 > > Time Spent: 20m > Remaining Estimate: 0h > > When an encrypted dump is taken, the expanded byte buffer doesn't replace the > thread-local one. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-21056) Use thread local buffer for encrypted dump
[ https://issues.apache.org/jira/browse/IGNITE-21056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikolay Izhikov updated IGNITE-21056: - Fix Version/s: 2.16 > Use thread local buffer for encrypted dump > -- > > Key: IGNITE-21056 > URL: https://issues.apache.org/jira/browse/IGNITE-21056 > Project: Ignite > Issue Type: Task >Reporter: Yuri Naryshkin >Assignee: Yuri Naryshkin >Priority: Minor > Labels: IEP-109, ise > Fix For: 2.16 > > Time Spent: 10m > Remaining Estimate: 0h > > When an encrypted dump is taken, the expanded byte buffer doesn't replace the > thread-local one. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-21056) Use thread local buffer for encrypted dump
[ https://issues.apache.org/jira/browse/IGNITE-21056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795636#comment-17795636 ] Ignite TC Bot commented on IGNITE-21056: {panel:title=Branch: [pull/11086/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel} {panel:title=Branch: [pull/11086/head] Base: [master] : No new tests found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}{panel} [TeamCity *-- Run :: All* Results|https://ci2.ignite.apache.org/viewLog.html?buildId=7652539buildTypeId=IgniteTests24Java8_RunAll] > Use thread local buffer for encrypted dump > -- > > Key: IGNITE-21056 > URL: https://issues.apache.org/jira/browse/IGNITE-21056 > Project: Ignite > Issue Type: Task >Reporter: Yuri Naryshkin >Assignee: Yuri Naryshkin >Priority: Minor > Labels: IEP-109, ise > Time Spent: 10m > Remaining Estimate: 0h > > When an encrypted dump is taken, the expanded byte buffer doesn't replace the > thread-local one. -- This message was sent by Atlassian Jira (v8.20.10#820010)