Understood, thanks!

On Sun, 6 Nov 2022, 21:33 Jeremy McMillan, <[email protected]> wrote:
> Think of each AZ as being a massive piece of server hardware running VMs or workloads for you. When hardware (or an infrastructure maintenance process) fails, assume everything in one AZ is lost at the same time.
>
> On Sun, Nov 6, 2022, 09:58 Surinder Mehra <[email protected]> wrote:
>
>> That's partially true. The whole exercise of configuring AZ as a backup filter is because we want to handle AZ-level failure.
>>
>> Anyway, thanks for the inputs. Will figure out further steps.
>>
>> On Sun, 6 Nov 2022, 20:55 Jeremy McMillan, <[email protected]> wrote:
>>
>>> Don't configure 2 backups when you only have two failure domains.
>>>
>>> You're worried about node-level failure, but you're telling Ignite to worry about AZ-level failure.
>>>
>>> On Sat, Nov 5, 2022, 21:57 Surinder Mehra <[email protected]> wrote:
>>>
>>>> Yeah, I think there is a misunderstanding. Although I figured out my answers from our discussion, I will try one final attempt to clarify my point about 2X space for node 3.
>>>>
>>>> Node setup:
>>>> Node 1 and node 2 placed in AZ1
>>>> Node 3 placed in AZ2
>>>>
>>>> Since I am using AZ as the backup filter, as I mentioned in my first message, the backup of node 1 cannot be placed on node 2 and the backup of node 2 cannot be placed on node 1, because they are in the same AZ. This simply means their backups would go to node 3, which is in another AZ. Hence node 3 space = (node 3 primary partitions + node 1 backup partitions + node 2 backup partitions).
>>>>
>>>> Wouldn't this mean node 3 needs 2X space compared to node 1 and node 2? Assuming the backup partitions of node 3 would be equally distributed between the other two nodes, those two would need almost the same space.
>>>>
>>>> On Tue, 1 Nov 2022, 23:30 Jeremy McMillan, <[email protected]> wrote:
>>>>
>>>>> On Tue, Nov 1, 2022 at 10:02 AM Surinder Mehra <[email protected]> wrote:
>>>>>
>>>>>> Even if we have 2 copies of data, the primary and backup copy would be stored in different AZs. My question remains valid in this case as well.
>>>>>
>>>>> I think additional backup copies in the same AZ are superfluous if we start with the assumption that multiple concurrent failures are most likely to affect resources in the same AZ. A second node failure, if that's your failure budget, is likely to corrupt all the backup copies in the second AZ.
>>>>>
>>>>> If you only have two AZs available in some data centers/deployments, but you need 3-way redundancy on certain caches/tables, then using the AZ node attribute for backup filtering is too coarse-grained. Using AZ is a general-case best practice which gives your cluster the best chance of surviving multiple hardware failures in AWS, because AWS pools hardware resources in AZs. Maybe you just need three AZs? Maybe AZ isn't the correct failure domain for your use case?
>>>>>
>>>>>> Do we have to ensure nodes in two AZs are always present, or does Ignite have a way to indicate it couldn't create backups? Silently skipping backups is not a desirable state.
>>>>>
>>>>> Do you use synchronous or asynchronous backups?
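As a side note on that question, a minimal sketch of where the backup count and the synchronous/asynchronous behaviour are set on a cache follows; the cache name and the choice of FULL_SYNC are illustrative assumptions, not settings taken from this thread.

    import org.apache.ignite.cache.CacheWriteSynchronizationMode;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class BackupSyncConfigSketch {
        public static CacheConfiguration<Integer, String> cacheConfig() {
            CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("myCache");

            // One primary copy plus two backup copies of every partition.
            cacheCfg.setBackups(2);

            // FULL_SYNC: a write completes only after all backups are updated (synchronous backups).
            // PRIMARY_SYNC (the default): the write returns once the primary is updated; backups
            // catch up asynchronously.
            cacheCfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

            return cacheCfg;
        }
    }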
>>>>> https://ignite.apache.org/docs/2.11.1/configuring-caches/configuring-backups#synchronous-and-asynchronous-backups
>>>>>
>>>>> You can periodically poll the caches' configurations or hook a cluster state event, re-compare the cache backup configuration against the enumerated available AZs, and raise an exception or log a message or whatever to detect the issue as soon as the AZ count drops below the minimum. This might also serve as a fuzzy warning-condition detection point for proactive infrastructure operations. If you count all of the nodes in each AZ, you can detect and track AZ load imbalances as the ratio between the smallest AZ node count and the average AZ node count. (A sketch of such a check appears below, after this quoted message.)
>>>>>
>>>>>> 2. In my original message, with 2 nodes (node 1 and node 2) in AZ1 and the 3rd node in the second AZ, backups of node 1 and node 2 would be placed on node 3 in AZ2. It would mean it needs 2X space to store backups. Just trying to ensure my understanding is correct.
>>>>>
>>>>> If you have three nodes, you divide your total footprint by three to get the minimum node capacity.
>>>>>
>>>>> If you have 2 backups, that is one primary copy plus two more backup copies, so you multiply your total footprint by 3.
>>>>>
>>>>> If you multiply, say, 32GB by three for redundancy, that would be 96GB of total space needed for the sum of all nodes' footprints.
>>>>>
>>>>> If you divide the 96GB storage commitment among three nodes, then each node must have a minimum of 32GB. That's what we started with as a nominal data footprint, so 1x, not 2x. Node 1 will need to accommodate backups from node 2 and node 3. Node 2 will need to accommodate backups from node 1 and node 3. Each node has one primary and two backup partition copies for each partition of each cache with two backups.
>>>>>
>>>>>> Hope my queries are clear to you now.
>>>>>
>>>>> I still don't understand your operational goals, so I feel like we may be dancing around a misunderstanding.
>>>>>
>>>>>> On Tue, 1 Nov 2022, 20:19 Surinder Mehra, <[email protected]> wrote:
>>>>>>
>>>>>>> Thanks for your reply. Let me try to answer your 2 questions below.
>>>>>>>
>>>>>>> 1. I understand that it sacrifices the backups in case it can't place them appropriately. The question is, is it possible to fail the deployment rather than risk having only a single copy of the data present in the cluster? If this only copy goes down, we will have downtime, as the data won't be present in the cluster. We should rather throw an error when enough hardware is not present than risk a data-unavailability issue during business activity.
>>>>>>>
>>>>>>> 2. Why do we want 3 copies of data? It's a design choice. We want to ensure that even if 2 nodes go down, we still have a 3rd present to serve the data.
>>>>>>>
>>>>>>> Hope I answered your question.
>>>>>>>
>>>>>>> On Tue, 1 Nov 2022, 19:40 Jeremy McMillan, <[email protected]> wrote:
>>>>>>>
>>>>>>>> This question is a design question.
>>>>>>>>
>>>>>>>> What kinds of fault states do you expect to tolerate? What is your failure budget?
>>>>>>>>
>>>>>>>> Why are you trying to make more than 2 copies of the data distributed across only two failure domains?
>>>>>>>>
>>>>>>>> Also, "fail fast" means discovering your implementation defects faster than your release cycle, not how fast you can cause data loss.
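Referring to the monitoring idea quoted above, here is a minimal sketch of such a check: it counts server nodes per AZ, compares the distinct AZ count against backups + 1, and reports the min/average AZ node-count ratio. The attribute name "AVAILABILITY_ZONE", the cache name, and the plain stderr/stdout logging are assumptions for illustration, not anything prescribed in this thread.

    import java.util.Map;
    import java.util.stream.Collectors;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class AzCoverageCheckSketch {
        /** Warns when the distinct AZ count cannot hold one primary plus all configured backups. */
        public static void checkAzCoverage(Ignite ignite, String azAttr, String cacheName) {
            // Count server nodes per availability zone, using the user attribute each node advertises.
            Map<String, Long> nodesPerAz = ignite.cluster().forServers().nodes().stream()
                .collect(Collectors.groupingBy(n -> (String) n.attribute(azAttr), Collectors.counting()));

            // Copies required per partition: one primary plus the configured backups.
            int copies = ignite.cache(cacheName)
                .getConfiguration(CacheConfiguration.class)
                .getBackups() + 1;

            if (nodesPerAz.size() < copies)
                System.err.printf("WARNING: only %d AZ(s) present, but %d copies configured for cache %s%n",
                    nodesPerAz.size(), copies, cacheName);

            // AZ load imbalance: ratio of the smallest AZ node count to the average AZ node count.
            double avg = nodesPerAz.values().stream().mapToLong(Long::longValue).average().orElse(0);
            long min = nodesPerAz.values().stream().mapToLong(Long::longValue).min().orElse(0);
            if (avg > 0)
                System.out.printf("AZ balance ratio (min/avg): %.2f%n", min / avg);

            // Instead of polling, this could also be driven from a discovery event listener
            // (EVT_NODE_JOINED / EVT_NODE_LEFT), provided those event types are enabled via
            // IgniteConfiguration#setIncludeEventTypes.
        }
    }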
>>>>>>>> On Tue, Nov 1, 2022, 09:01 Surinder Mehra <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Gentle reminder.
>>>>>>>>>
>>>>>>>>> One additional question: we have observed that if the number of available AZs is less than the backup count, Ignite skips creating the backups. Is this the correct understanding? If yes, how can we fail fast if backups cannot be placed due to the AZ limitation?
>>>>>>>>>
>>>>>>>>> On Mon, Oct 31, 2022 at 6:30 PM Surinder Mehra <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> As per the link attached, to ensure primary and backup partitions are not stored on the same node, we used the AWS AZ as a backup filter, and now I can see that if I start two Ignite nodes on the same machine, primary partitions are evenly distributed but backups are always zero, which is expected. (A configuration sketch for this setup appears at the end of this thread.)
>>>>>>>>>>
>>>>>>>>>> https://www.gridgain.com/docs/latest/installation-guide/aws/multiple-availability-zone-aws
>>>>>>>>>>
>>>>>>>>>> My question is: what would happen if AZ-1 has 2 machines, AZ-2 has 1 machine, and the Ignite cluster has only 3 nodes, each machine having one Ignite node?
>>>>>>>>>>
>>>>>>>>>> Node1[AZ1] - keys 1-100
>>>>>>>>>> Node2[AZ1] - keys 101-200
>>>>>>>>>> Node3[AZ2] - keys 201-300
>>>>>>>>>>
>>>>>>>>>> In the above scenario, if the backup count is 2, how would the backup partitions be distributed?
>>>>>>>>>>
>>>>>>>>>> 1. Would it mean node 3 will have 2 backup copies of the primary partitions of nodes 1 and 2?
>>>>>>>>>> 2. If we have a 4-node cluster with 2 nodes in each AZ, would backup copies also be placed on different nodes? (In other words, does the backup filter also apply to how backup copies are placed on nodes?)
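For reference against the GridGain link quoted above, here is a minimal sketch of the AZ-based backup filter setup; the attribute name "AVAILABILITY_ZONE", the zone value, and the cache name are illustrative assumptions rather than values from this thread.

    import java.util.Collections;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.cache.affinity.rendezvous.ClusterNodeAttributeAffinityBackupFilter;
    import org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class AzBackupFilterSketch {
        public static void main(String[] args) {
            String azAttr = "AVAILABILITY_ZONE";

            IgniteConfiguration cfg = new IgniteConfiguration();
            // Each node sets its own zone, e.g. "us-east-1a" on the AZ1 machines and
            // "us-east-1b" on the AZ2 machine.
            cfg.setUserAttributes(Collections.singletonMap(azAttr, "us-east-1a"));

            // Backup copies of a partition are only placed on nodes whose AZ attribute differs
            // from the nodes already holding copies of that partition.
            RendezvousAffinityFunction aff = new RendezvousAffinityFunction();
            aff.setAffinityBackupFilter(new ClusterNodeAttributeAffinityBackupFilter(azAttr));

            CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("myCache");
            cacheCfg.setCacheMode(CacheMode.PARTITIONED);
            cacheCfg.setBackups(2); // 3 copies total: 1 primary + 2 backups
            cacheCfg.setAffinity(aff);

            cfg.setCacheConfiguration(cacheCfg);

            try (Ignite ignite = Ignition.start(cfg)) {
                ignite.getOrCreateCache("myCache");
            }
        }
    }

With a filter like this, a backup that cannot be placed in a different AZ simply ends up not being assigned, which matches the "backups are always zero" observation when both test nodes ran in the same zone.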
