Re: Preserving transactions accross Xenstored Live-Update
On 19.05.21 19:10, Julien Grall wrote:
> Hi Juergen,
>
> On 19/05/2021 13:50, Juergen Gross wrote:
>> On 19.05.21 14:33, Julien Grall wrote:
>>> On 19/05/2021 13:32, Julien Grall wrote:
>>>> Hi Juergen,
>>>>
>>>> On 19/05/2021 10:09, Juergen Gross wrote:
>>>>> On 18.05.21 20:11, Julien Grall wrote:
>>>>>> I have started to look at preserving transactions across Live-Update in C Xenstored. So far, I have managed to transfer transactions that read/write existing nodes. Now, I am running into trouble transferring new/deleted nodes within a transaction with the existing migration format.
>>>>>> C Xenstored will keep track of nodes accessed during the transaction but not the children (AFAICT for performance reasons).
>>>>>
>>>>> Not performance reasons, but because there isn't any need for that: the children are either unchanged (so the non-transaction node records apply), or they will be among the tracked nodes (transaction node records apply). So in both cases all children should be known.
>>>>
>>>> In theory, opening a new transaction means you will not see any modification in the global database until the transaction has been committed. What you describe would break that because a client would be able to see new nodes added outside of the transaction.
>>>> However, C Xenstored implements neither of the two. Currently, when a node is accessed within the transaction, we will also store the names of the current children.
>>>> To give an example with access to the global DB (prefixed with TID0) and within a transaction (TID1):
>>>> 1) TID0: MKDIR "data/bar"
>>>> 2) Start transaction TID1
>>>> 3) TID1: DIRECTORY "data" -> This will cache the node data
>>>> 4) TID0: MKDIR "data/foo" -> This will create "foo" in the global database
>>>> 5) TID1: MKDIR "data/fish" -> This will create "fish" in the transaction
>>>> 6) TID1: DIRECTORY "data" -> This will only return "bar" and "fish"
>>>> If we Live-Update between 4) and 5), then we should make sure that "bar" cannot be seen in the listing by TID1.
>>>
>>> I meant "foo" here. Sorry for the confusion.
>>>
>>>> Therefore, I don't think we can restore the children using the global node here. Instead we need to find a way to transfer the list of known children within the transaction.
>>>> As a fun fact, C Xenstored implements transactions in a weird way, so TID1 will be able to access "bar" if it knows the name but not list it.
>>
>> And this is the basic problem, I think. C Xenstored should be repaired by adding all (remaining) children of a node into the TID's database when the list of children is modified either globally or in a transaction. A child having been added globally needs to be added as "deleted" into the TID's database.
>
> IIUC, for every modification in the global database, we would need to walk every single transaction and check whether a parent was accessed. Am I correct?

Not really. When a node is being read during a transaction and it is found in the global database only, its gen-count can be tested for being older or newer than the transaction start. If it is newer we can traverse the path up to "/" and treat each parent the same way (so if a parent is found in the transaction database, the presence of the child can be verified, and if it is global only, the gen-count can be tested against the transaction again).

> If so, I don't think this is a workable solution because of the cost to execute a single command.

My variant will affect the transaction internal reads only, and the additional cost will be limited by the distance of the read node from the root node.

> Is it something you plan to address differently with your rework of the DB?

Yes. I want to have the transaction specific variants of nodes linked to the global ones, which solves this problem in an easy way.

Juergen
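The gen-count walk Juergen describes can be sketched roughly as follows. This is only an illustration in Python, not C Xenstored code: the `Node` class, the two dictionaries, the use of `<=` for "older", and the fallback when the walk reaches "/" without finding a transaction copy are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    generation: int                      # hypothetical gen-count of the last modification
    children: set[str] = field(default_factory=set)


def visible_in_transaction(path, txn_nodes, global_nodes, txn_start_gen):
    """Decide whether `path` may be seen by a transaction started at txn_start_gen."""
    current, child = path, None
    while True:
        if current in txn_nodes:
            # The transaction has its own copy: its child list is authoritative.
            return child is None or child in txn_nodes[current].children
        node = global_nodes.get(current)
        if node is None:
            return False
        if node.generation <= txn_start_gen:
            # Not modified since the transaction started: the global view applies.
            return True
        if current == "/":
            # Assumption of this sketch: nothing contradicts the global view.
            return True
        parent, _, child = current.rpartition("/")
        current = parent or "/"


# Julien's example: the transaction starts at generation 2, "foo" is created globally at 3.
global_nodes = {
    "/": Node(0, {"data"}),
    "/data": Node(3, {"bar", "foo"}),
    "/data/bar": Node(1),
    "/data/foo": Node(3),
}
txn_nodes = {"/data": Node(2, {"bar", "fish"}), "/data/fish": Node(2)}

print(visible_in_transaction("/data/foo", txn_nodes, global_nodes, 2))  # False
print(visible_in_transaction("/data/bar", txn_nodes, global_nodes, 2))  # True
```

The loop runs at most once per path component, which matches the remark that the extra cost is bounded by the distance of the read node from the root.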
Re: Preserving transactions accross Xenstored Live-Update
Hi Juergen,

On 19/05/2021 13:50, Juergen Gross wrote:
> On 19.05.21 14:33, Julien Grall wrote:
>> On 19/05/2021 13:32, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 19/05/2021 10:09, Juergen Gross wrote:
>>>> On 18.05.21 20:11, Julien Grall wrote:
>>>>> I have started to look at preserving transactions across Live-Update in C Xenstored. So far, I have managed to transfer transactions that read/write existing nodes. Now, I am running into trouble transferring new/deleted nodes within a transaction with the existing migration format.
>>>>> C Xenstored will keep track of nodes accessed during the transaction but not the children (AFAICT for performance reasons).
>>>>
>>>> Not performance reasons, but because there isn't any need for that: the children are either unchanged (so the non-transaction node records apply), or they will be among the tracked nodes (transaction node records apply). So in both cases all children should be known.
>>>
>>> In theory, opening a new transaction means you will not see any modification in the global database until the transaction has been committed. What you describe would break that because a client would be able to see new nodes added outside of the transaction.
>>> However, C Xenstored implements neither of the two. Currently, when a node is accessed within the transaction, we will also store the names of the current children.
>>> To give an example with access to the global DB (prefixed with TID0) and within a transaction (TID1):
>>> 1) TID0: MKDIR "data/bar"
>>> 2) Start transaction TID1
>>> 3) TID1: DIRECTORY "data" -> This will cache the node data
>>> 4) TID0: MKDIR "data/foo" -> This will create "foo" in the global database
>>> 5) TID1: MKDIR "data/fish" -> This will create "fish" in the transaction
>>> 6) TID1: DIRECTORY "data" -> This will only return "bar" and "fish"
>>> If we Live-Update between 4) and 5), then we should make sure that "bar" cannot be seen in the listing by TID1.
>>
>> I meant "foo" here. Sorry for the confusion.
>>
>>> Therefore, I don't think we can restore the children using the global node here. Instead we need to find a way to transfer the list of known children within the transaction.
>>> As a fun fact, C Xenstored implements transactions in a weird way, so TID1 will be able to access "bar" if it knows the name but not list it.
>
> And this is the basic problem, I think. C Xenstored should be repaired by adding all (remaining) children of a node into the TID's database when the list of children is modified either globally or in a transaction. A child having been added globally needs to be added as "deleted" into the TID's database.

IIUC, for every modification in the global database, we would need to walk every single transaction and check whether a parent was accessed. Am I correct?

If so, I don't think this is a workable solution because of the cost to execute a single command.

Is it something you plan to address differently with your rework of the DB?

Cheers,

--
Julien Grall
Re: Preserving transactions accross Xenstored Live-Update
On Wed, 2021-05-19 at 11:09 +0200, Juergen Gross wrote:
> On 18.05.21 20:11, Julien Grall wrote:
>> Hi Juergen,
>>
>> I have started to look at preserving transactions across Live-Update in C Xenstored. So far, I have managed to transfer transactions that read/write existing nodes. Now, I am running into trouble transferring new/deleted nodes within a transaction with the existing migration format.
>> C Xenstored will keep track of nodes accessed during the transaction but not the children (AFAICT for performance reasons).
>
> Not performance reasons, but because there isn't any need for that: the children are either unchanged (so the non-transaction node records apply), or they will be among the tracked nodes (transaction node records apply). So in both cases all children should be known.
>
> In case a child has been deleted in the transaction, the stream should contain a node record for that child with the transaction-id and the number of permissions being zero: see docs/designs/xenstore-migration.md

The problem for oxenstored is that you might've taken a snapshot in the past, your root has moved on, but you have in your snapshot a lot of nodes that have been deleted in the latest root.

A brute-force way might be to diff the transaction's state and the latest root state and dump the delta entries as adding/deleting nodes in the migration stream.

This could lead to dumping a lot of duplicate state and result in an explosion of file size (e.g. if you run 1000 domains, the current maximum supported limit, and each has one tiny transaction from the past, this will lead to a 1000x amplification of the xenstore size in the dump; in memory this is fine because OCaml will share common tree nodes that are unchanged).

This should correctly restore content but have a bad effect on conflict semantics: your migrated transactions will all then likely conflict at the root, or near the root, and fail anyway. Whereas without a live-update, as long as you do not modify any of the old state, you would get the conflict marker further down the tree and most of the time be able to avoid conflicts.

I tried implementing this last year:
https://github.com/edwintorok/xen/pull/2/commits/a9f057131b75e1bd2dcb49c795630ab5875b7f76#diff-0f4826471775d78bfc6922c63152e268ef386171ebd985208cb82e21c621e749R288-R365
(ignore the awful indentation; that code has been rebased with ignore_all_space so many times between different branches of Xen that whitespace correctness has been lost)

I've got a fuzzer/unit test for live-update (see xen-devel), but it has transactions turned off currently because I couldn't get it to work reliably: it always found examples where the transaction conflict state was not identical pre/post update.

If we abort all transactions after migration as discussed previously then it might be possible to get this to work if we accept the size explosion as a possibility and dump transaction state to /var/tmp, not to /tmp (which might be a tmpfs that gives you ENOSPC).

Live updates are a fairly niche use case and I'd like to see the current variant without transactions proven to work on an actual XSA (likely the next oxenstored XSA about queue limits if we find a solution to that), and only after that deploy live-update support with transactions.

We also completely lack any unit tests for transactions (aside from the fuzzer that I started writing, which does just some very minimal state comparisons), and we do not have a formal model of how transactions and transaction conflicts should be handled to check whether transactions behave correctly, though a fairly good approximation is: run two oxenstored instances, one with and one without live-update, and check that they produce equivalent (not necessarily identical, txid can change) answers. That works as long as we do not have to change the transaction semantics or code in any way to support live update.

Best regards,
--Edwin

> Juergen
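The brute-force diff mentioned above can be illustrated with a small sketch. It is written in Python rather than OCaml, models each store as a flat dictionary from path to value, and uses a made-up record shape (`("add" | "delete", path, value)`); none of this reflects the actual oxenstored data structures or the migration stream format.

```python
def delta_records(latest_root, txn_view):
    """Records that turn latest_root into the transaction's view of the store."""
    records = []
    for path in sorted(txn_view):
        # New or changed in the transaction's view: dump the full node again.
        if path not in latest_root or latest_root[path] != txn_view[path]:
            records.append(("add", path, txn_view[path]))
    for path in sorted(latest_root.keys() - txn_view.keys()):
        # Present in the latest root only: the transaction must not see it.
        records.append(("delete", path, None))
    return records


def replay(latest_root, records):
    """Rebuild the transaction's view from the latest root plus the delta."""
    view = dict(latest_root)
    for kind, path, value in records:
        if kind == "add":
            view[path] = value
        else:
            view.pop(path, None)
    return view


latest_root = {"/data": "", "/data/bar": "1", "/data/foo": "2"}
txn_view = {"/data": "", "/data/bar": "1", "/data/fish": "3"}

records = delta_records(latest_root, txn_view)
print(records)
```

Every transaction carries its own delta against the latest root, which is where the size amplification described above comes from: an old snapshot that differs widely from the current root repeats most of the store once per transaction.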
Re: Preserving transactions accross Xenstored Live-Update
On 19.05.21 14:33, Julien Grall wrote:
> On 19/05/2021 13:32, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 19/05/2021 10:09, Juergen Gross wrote:
>>> On 18.05.21 20:11, Julien Grall wrote:
>>>> I have started to look at preserving transactions across Live-Update in C Xenstored. So far, I have managed to transfer transactions that read/write existing nodes. Now, I am running into trouble transferring new/deleted nodes within a transaction with the existing migration format.
>>>> C Xenstored will keep track of nodes accessed during the transaction but not the children (AFAICT for performance reasons).
>>>
>>> Not performance reasons, but because there isn't any need for that: the children are either unchanged (so the non-transaction node records apply), or they will be among the tracked nodes (transaction node records apply). So in both cases all children should be known.
>>
>> In theory, opening a new transaction means you will not see any modification in the global database until the transaction has been committed. What you describe would break that because a client would be able to see new nodes added outside of the transaction.
>> However, C Xenstored implements neither of the two. Currently, when a node is accessed within the transaction, we will also store the names of the current children.
>> To give an example with access to the global DB (prefixed with TID0) and within a transaction (TID1):
>> 1) TID0: MKDIR "data/bar"
>> 2) Start transaction TID1
>> 3) TID1: DIRECTORY "data" -> This will cache the node data
>> 4) TID0: MKDIR "data/foo" -> This will create "foo" in the global database
>> 5) TID1: MKDIR "data/fish" -> This will create "fish" in the transaction
>> 6) TID1: DIRECTORY "data" -> This will only return "bar" and "fish"
>> If we Live-Update between 4) and 5), then we should make sure that "bar" cannot be seen in the listing by TID1.
>
> I meant "foo" here. Sorry for the confusion.
>
>> Therefore, I don't think we can restore the children using the global node here. Instead we need to find a way to transfer the list of known children within the transaction.
>> As a fun fact, C Xenstored implements transactions in a weird way, so TID1 will be able to access "bar" if it knows the name but not list it.

And this is the basic problem, I think. C Xenstored should be repaired by adding all (remaining) children of a node into the TID's database when the list of children is modified either globally or in a transaction. A child having been added globally needs to be added as "deleted" into the TID's database.

Juergen
Re: Preserving transactions accross Xenstored Live-Update
On 19/05/2021 13:32, Julien Grall wrote:
> Hi Juergen,
>
> On 19/05/2021 10:09, Juergen Gross wrote:
>> On 18.05.21 20:11, Julien Grall wrote:
>>> I have started to look at preserving transactions across Live-Update in C Xenstored. So far, I have managed to transfer transactions that read/write existing nodes. Now, I am running into trouble transferring new/deleted nodes within a transaction with the existing migration format.
>>> C Xenstored will keep track of nodes accessed during the transaction but not the children (AFAICT for performance reasons).
>>
>> Not performance reasons, but because there isn't any need for that: the children are either unchanged (so the non-transaction node records apply), or they will be among the tracked nodes (transaction node records apply). So in both cases all children should be known.
>
> In theory, opening a new transaction means you will not see any modification in the global database until the transaction has been committed. What you describe would break that because a client would be able to see new nodes added outside of the transaction.
> However, C Xenstored implements neither of the two. Currently, when a node is accessed within the transaction, we will also store the names of the current children.
> To give an example with access to the global DB (prefixed with TID0) and within a transaction (TID1):
> 1) TID0: MKDIR "data/bar"
> 2) Start transaction TID1
> 3) TID1: DIRECTORY "data" -> This will cache the node data
> 4) TID0: MKDIR "data/foo" -> This will create "foo" in the global database
> 5) TID1: MKDIR "data/fish" -> This will create "fish" in the transaction
> 6) TID1: DIRECTORY "data" -> This will only return "bar" and "fish"
> If we Live-Update between 4) and 5), then we should make sure that "bar" cannot be seen in the listing by TID1.

I meant "foo" here. Sorry for the confusion.

> Therefore, I don't think we can restore the children using the global node here. Instead we need to find a way to transfer the list of known children within the transaction.
> As a fun fact, C Xenstored implements transactions in a weird way, so TID1 will be able to access "bar" if it knows the name but not list it.
>
>> In case a child has been deleted in the transaction, the stream should contain a node record for that child with the transaction-id and the number of permissions being zero: see docs/designs/xenstore-migration.md
>
> See above why this is not sufficient.

Cheers,

--
Julien Grall
Re: Preserving transactions accross Xenstored Live-Update
Hi Juergen,

On 19/05/2021 10:09, Juergen Gross wrote:
> On 18.05.21 20:11, Julien Grall wrote:
>> I have started to look at preserving transactions across Live-Update in C Xenstored. So far, I have managed to transfer transactions that read/write existing nodes. Now, I am running into trouble transferring new/deleted nodes within a transaction with the existing migration format.
>> C Xenstored will keep track of nodes accessed during the transaction but not the children (AFAICT for performance reasons).
>
> Not performance reasons, but because there isn't any need for that: the children are either unchanged (so the non-transaction node records apply), or they will be among the tracked nodes (transaction node records apply). So in both cases all children should be known.

In theory, opening a new transaction means you will not see any modification in the global database until the transaction has been committed. What you describe would break that because a client would be able to see new nodes added outside of the transaction.

However, C Xenstored implements neither of the two. Currently, when a node is accessed within the transaction, we will also store the names of the current children.

To give an example with access to the global DB (prefixed with TID0) and within a transaction (TID1):

1) TID0: MKDIR "data/bar"
2) Start transaction TID1
3) TID1: DIRECTORY "data" -> This will cache the node data
4) TID0: MKDIR "data/foo" -> This will create "foo" in the global database
5) TID1: MKDIR "data/fish" -> This will create "fish" in the transaction
6) TID1: DIRECTORY "data" -> This will only return "bar" and "fish"

If we Live-Update between 4) and 5), then we should make sure that "bar" cannot be seen in the listing by TID1.

Therefore, I don't think we can restore the children using the global node here. Instead we need to find a way to transfer the list of known children within the transaction.

As a fun fact, C Xenstored implements transactions in a weird way, so TID1 will be able to access "bar" if it knows the name but not list it.

> In case a child has been deleted in the transaction, the stream should contain a node record for that child with the transaction-id and the number of permissions being zero: see docs/designs/xenstore-migration.md

See above why this is not sufficient.

Cheers,

--
Julien Grall
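The numbered example in this message can be replayed with a toy model to show why rebuilding the child list from the global node leaks a name into the transaction. This is a Python sketch with invented helper names; it only tracks child names per directory and takes the follow-up correction into account (the name that must stay hidden is "foo").

```python
def replay_example():
    """Replay steps 1) to 5) and return the state at the Live-Update point."""
    global_children = {"data": set()}
    global_children["data"].add("bar")                     # 1) TID0: MKDIR data/bar
    txn_children = {}                                      # 2) start transaction TID1
    txn_children["data"] = set(global_children["data"])    # 3) TID1: DIRECTORY caches names
    global_children["data"].add("foo")                     # 4) TID0: MKDIR data/foo
    txn_children["data"].add("fish")                       # 5) TID1: MKDIR data/fish
    txn_created = {"fish"}
    return global_children, txn_children, txn_created


def restore_from_global(global_children, txn_created):
    # Hypothetical restore that rebuilds the listing from the global node
    # plus the nodes the transaction created itself.
    return sorted(global_children["data"] | txn_created)


def restore_from_transferred_list(txn_children):
    # Hypothetical restore that uses the child list cached by the transaction.
    return sorted(txn_children["data"])


global_children, txn_children, txn_created = replay_example()
print(restore_from_global(global_children, txn_created))   # ['bar', 'fish', 'foo']
print(restore_from_transferred_list(txn_children))         # ['bar', 'fish']
```

Only the second variant reproduces the listing from step 6), which is the argument for transferring the known children with the transaction.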
Re: Preserving transactions accross Xenstored Live-Update
On 18.05.21 20:11, Julien Grall wrote:
> Hi Juergen,
>
> I have started to look at preserving transactions across Live-Update in C Xenstored. So far, I have managed to transfer transactions that read/write existing nodes. Now, I am running into trouble transferring new/deleted nodes within a transaction with the existing migration format.
> C Xenstored will keep track of nodes accessed during the transaction but not the children (AFAICT for performance reasons).

Not performance reasons, but because there isn't any need for that: the children are either unchanged (so the non-transaction node records apply), or they will be among the tracked nodes (transaction node records apply). So in both cases all children should be known.

In case a child has been deleted in the transaction, the stream should contain a node record for that child with the transaction-id and the number of permissions being zero: see docs/designs/xenstore-migration.md

Juergen
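A rough sketch of the rule stated in this reply: transaction node records take precedence over global ones, and a transaction record with zero permissions marks a deletion. The `NodeRecord` layout below is hypothetical and much simpler than the real record in docs/designs/xenstore-migration.md, and using transaction id 0 for the global database is an assumption borrowed from the "TID0" notation used elsewhere in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRecord:
    path: str
    tx_id: int             # assumption: 0 means the global database
    perms: tuple = ()       # empty in a transaction record means "deleted"
    value: bytes = b""


def restore(records):
    """Split a stream of node records into global and per-transaction tables."""
    global_db, txn_db = {}, {}
    for rec in records:
        if rec.tx_id == 0:
            global_db[rec.path] = rec
        else:
            txn_db.setdefault(rec.tx_id, {})[rec.path] = rec
    return global_db, txn_db


def lookup(path, tx_id, global_db, txn_db):
    """Return the record a reader in tx_id should see, or None if there is none."""
    rec = txn_db.get(tx_id, {}).get(path)
    if rec is not None:
        # Tracked by the transaction: zero permissions is the deletion marker.
        return rec if rec.perms else None
    # Not tracked: the non-transaction node record applies.
    return global_db.get(path)


stream = [
    NodeRecord("data/bar", 0, ("n0",), b"1"),
    NodeRecord("data/bar", 1),                    # deleted inside transaction 1
    NodeRecord("data/fish", 1, ("n0",), b"3"),    # created inside transaction 1
]
global_db, txn_db = restore(stream)
print(lookup("data/bar", 1, global_db, txn_db))   # None
```

Note that the fallback to the global table is exactly the step questioned later in the thread: a node created globally after the transaction started would be returned by `lookup` even though the transaction should not see it.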
Preserving transactions accross Xenstored Live-Update
Hi Juergen,

I have started to look at preserving transactions across Live-Update in C Xenstored. So far, I have managed to transfer transactions that read/write existing nodes. Now, I am running into trouble transferring new/deleted nodes within a transaction with the existing migration format.

C Xenstored will keep track of nodes accessed during the transaction but not the children (AFAICT for performance reasons). Therefore we have the names of the children but not their content (i.e. permissions, data...).

I have been exploring a couple of approaches:

1) Introduce a flag to indicate there is a child but no content.
   Pros:
     * Close to the existing stream.
     * Fairly implementation agnostic.
   Cons:
     * Memory overhead, as we need to transfer the full path (rather than the child name).
     * Checking for duplication (if the node was actually accessed) will introduce runtime overhead.

2) Extend XS_STATE_TYPE_NODE (or introduce a new record) to allow transferring the children's names for a transaction.
   Pros:
     * The implementation is more straightforward.
   Cons:
     * The stream becomes implementation specific.

Neither approach looks very appealing to me. So I would like to request some feedback: other proposals, or a preference between the two options.

Note that I haven't looked in much detail at how transactions work in OCaml Xenstored.

Cheers,

--
Julien Grall