[jira] [Commented] (OMID-90) Reducing begin/commit latency by distributing the write to the commit table
[ https://issues.apache.org/jira/browse/OMID-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652959#comment-16652959 ]

Yonatan Gottesman commented on OMID-90:
---
Hi, I don't want to mix in the low latency changes before all tests pass, so that we don't have trouble debugging. When the Phoenix branch is stable and 109 runs without errors, I will merge low latency after testing with Phoenix.

> Reducing begin/commit latency by distributing the write to the commit table
> ---
>
> Key: OMID-90
> URL: https://issues.apache.org/jira/browse/OMID-90
> Project: Apache Omid
> Issue Type: Sub-task
> Reporter: Ohad Shacham
> Assignee: Yonatan Gottesman
> Priority: Major
> Attachments: OmidCloud-VLDB.pdf, omid90.patch
>
>
> Today, Omid's commits are done by the transaction manager. In order to
> efficiently write to the commit table, the transaction manager batches these
> writes. Although this optimization reduces the write time to HBase, it
> significantly increases the begin and commit latency. The commit latency
> increases since a commit operation returns only after its commit timestamp
> has been persisted in the commit table. The begin latency increases since
> begin returns a transaction id that is also used by the transaction to
> identify its snapshot; therefore, begin returns only after all commits
> with a commit id smaller than the begin id have been persisted in the commit
> table. This is crucial, since a snapshot change during a transaction run may
> violate snapshot isolation.
>
> The idea of this feature is to distribute the commit by moving the write to
> the commit table from the server to the client. The transaction manager does
> conflict analysis and returns a commit timestamp, while the client atomically
> persists this commit in the commit table.
> This significantly reduces the begin and commit latency, since batching is
> no longer required. A begin operation can return immediately, and a commit
> operation returns after conflict detection.
> This can introduce a snapshot isolation violation, since a slow client can
> commit and change another transaction's snapshot. Therefore, we use an
> invalidation technique similar to the one Omid uses today to
> maintain snapshot isolation in high availability mode.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (OMID-117) Ensure timeouts are configured low for RPCs to commit table
[ https://issues.apache.org/jira/browse/OMID-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652958#comment-16652958 ]

Yonatan Gottesman commented on OMID-117:

Looks good.

> Ensure timeouts are configured low for RPCs to commit table
> ---
>
> Key: OMID-117
> URL: https://issues.apache.org/jira/browse/OMID-117
> Project: Apache Omid
> Issue Type: Bug
> Reporter: James Taylor
> Priority: Major
> Attachments: OMID-117.patch, OMID-117_addendum1.patch,
> OMID-117_hbase2.patch, OMID-117_v2.patch, OMID-117_v3.patch,
> OMID-117_v4.patch, OMID-117_v5.patch, OMID-117_v6.patch, OMID-117_v7.patch
[jira] [Commented] (OMID-117) Ensure timeouts are configured low for RPCs to commit table
[ https://issues.apache.org/jira/browse/OMID-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652692#comment-16652692 ]

James Taylor commented on OMID-117:
---
[~yonigo] - please review my addendum patch. I lowered the default number of read timeouts to one. This seems to have fixed the issue and wouldn't be a problem in production. In the worst case, we could always override the default with a larger value (without needing a release) if we encounter issues.
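For readers unfamiliar with the knobs involved, here is a minimal sketch of what "low timeouts, overridable without a release" could look like on the client side. The patch itself is not shown in this thread, so everything here is an assumption: the class and method names are hypothetical, the values are purely illustrative, and the three keys are standard HBase client settings that may or may not be the ones the OMID-117 patches touch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Omid: collects illustrative low-timeout
// overrides for the HBase connection used to reach the commit table.
final class CommitTableTimeoutsSketch {

    /** Returns overrides keyed by standard HBase client property names. */
    static Map<String, String> lowTimeoutOverrides() {
        Map<String, String> conf = new LinkedHashMap<>();
        // Per-RPC timeout in milliseconds (illustrative value).
        conf.put("hbase.rpc.timeout", "5000");
        // Upper bound for a whole client operation, retries included (illustrative value).
        conf.put("hbase.client.operation.timeout", "10000");
        // Number of client retries; "1" is an assumption for this sketch.
        conf.put("hbase.client.retries.number", "1");
        return conf;
    }
}
```

In a real deployment these pairs would typically be applied to a Hadoop `Configuration` via `set(key, value)` or placed in `hbase-site.xml`, which is what makes it possible to raise a default later without cutting a new release.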
[jira] [Updated] (OMID-117) Ensure timeouts are configured low for RPCs to commit table
[ https://issues.apache.org/jira/browse/OMID-117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated OMID-117:
--
Attachment: OMID-117_addendum1.patch
[jira] [Commented] (OMID-90) Reducing begin/commit latency by distributing the write to the commit table
[ https://issues.apache.org/jira/browse/OMID-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652519#comment-16652519 ]

James Taylor commented on OMID-90:
--
FYI, the test that inserts rows while the index is being initially populated is BaseIndexIT.testCreateIndexAfterUpsertStarted(). It will be exercised when GlobalMutableTxIndexIT is run. It would be good to try this with the low latency version of Omid. Even better would be a complete run with the low latency version.
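The commit path described in the OMID-90 issue text (the transaction manager does conflict analysis and returns a commit timestamp, the client atomically persists the commit, and invalidation guards snapshot isolation) can be sketched roughly as below. This is a toy model under stated assumptions, not Omid's code: the commit table is an in-memory map, conflict analysis is omitted, and all class, method, and constant names are hypothetical.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Toy model of the client-side commit idea; every name here is hypothetical.
final class LowLatencyCommitSketch {
    /** Assumed marker meaning "this transaction was invalidated by a reader". */
    static final long INVALIDATED = -1L;

    /** In-memory stand-in for the commit table: start timestamp -> commit timestamp. */
    static final ConcurrentHashMap<Long, Long> commitTable = new ConcurrentHashMap<>();

    /** Stand-in for the transaction manager's timestamp source. */
    static final AtomicLong oracle = new AtomicLong();

    /** Begin returns at once; it does not wait for batched commit-table writes. */
    static long begin() {
        return oracle.incrementAndGet();
    }

    /** Transaction manager side: conflict analysis is omitted in this sketch. */
    static long conflictCheckAndGetCommitTs(long startTs) {
        return oracle.incrementAndGet();
    }

    /**
     * Client side: persist the commit with an atomic put-if-absent.
     * Returns false if a reader invalidated the transaction first.
     */
    static boolean commit(long startTs) {
        long commitTs = conflictCheckAndGetCommitTs(startTs);
        return commitTable.putIfAbsent(startTs, commitTs) == null;
    }

    /**
     * Reader side: look up a writer's commit timestamp. If there is no entry,
     * atomically invalidate the writer so that a slow commit cannot later
     * change this reader's snapshot.
     */
    static Optional<Long> resolveCommitTs(long writerStartTs) {
        Long prev = commitTable.putIfAbsent(writerStartTs, INVALIDATED);
        long value = (prev == null) ? INVALIDATED : prev;
        return value == INVALIDATED ? Optional.empty() : Optional.of(value);
    }
}
```

The point of the put-if-absent on both sides is that exactly one of the two parties wins the race: either the client's commit lands first and readers see it, or a reader's invalidation lands first and the slow client's commit fails.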
[jira] [Commented] (OMID-109) Unable to build phoenix-integration branch through Jenkins job
[ https://issues.apache.org/jira/browse/OMID-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652287#comment-16652287 ]

James Taylor commented on OMID-109:
---
If [~ohads] is ok with committing Omid low latency, I am as well. Will that commit change the default to use low latency? It would be good to do a Phoenix omid2 Jenkins run using the low latency version.

Now that the build is working again, do any of the configuration properties that you had Mujtaba set need to be folded into the pom in some way (so it'll build once it gets committed to Phoenix)?

> Unable to build phoenix-integration branch through Jenkins job
> --
>
> Key: OMID-109
> URL: https://issues.apache.org/jira/browse/OMID-109
> Project: Apache Omid
> Issue Type: Sub-task
> Reporter: James Taylor
> Assignee: Yonatan Gottesman
> Priority: Blocker
> Attachments: OMID-109.patch, omid109.patch, omid109_v2.patch
>
>
> Based on Jenkins job failures
> (https://builds.apache.org/job/Phoenix-omid2/81/), the repo URL in the pom
> needs to be updated to
> https://raw.githubusercontent.com/synergian/wagon-git/releases.
[jira] [Commented] (OMID-90) Reducing begin/commit latency by distributing the write to the commit table
[ https://issues.apache.org/jira/browse/OMID-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651850#comment-16651850 ]

James Taylor commented on OMID-90:
--
In PHOENIX-4943, I write the shadow cells of the index rows when the index rows themselves are written. Does that solve the issue? The fence was more to guard against the case in which a write to the data table occurs before we start maintaining the index. Prior to PHOENIX-4943, we were writing the shadow cells in the standard way (i.e. after the commit succeeded).

There's a test for this; it would be good if you guys ran the Phoenix unit tests with the low latency version. I don't remember the particular test that covers writing to a table while the index is being created - [~tdsilva] - do you remember?
[jira] [Commented] (OMID-109) Unable to build phoenix-integration branch through Jenkins job
[ https://issues.apache.org/jira/browse/OMID-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651830#comment-16651830 ]

Yonatan Gottesman commented on OMID-109:

OK, so I'm waiting before committing the Omid low latency changes.
[jira] [Commented] (OMID-109) Unable to build phoenix-integration branch through Jenkins job
[ https://issues.apache.org/jira/browse/OMID-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651825#comment-16651825 ]

James Taylor commented on OMID-109:
---
Thanks for offering to help. I'm fixing some of the obvious ones. The main cause is that I've made it so that local indexes are not allowed to be used with Omid, but some tests still expect them to work. Other tests are failing because the shadow cells are throwing off the expected number of Cells and/or byte sizes. Let's see how things look after my next commit. Will let you know if I need help.
[jira] [Commented] (OMID-90) Reducing begin/commit latency by distributing the write to the commit table
[ https://issues.apache.org/jira/browse/OMID-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651230#comment-16651230 ]

Ohad Shacham commented on OMID-90:
--
[~jamestaylor], sorry for the late response.

In the low latency mode, the fence information is not written to the commit table, and the fence id is only returned to the client. This is why using auto commit in this case is important. You solved this issue in PHOENIX-4943; I just wanted to make sure it does.

In the non low latency mode, the fence info is written to the commit table and can hide an incorrect auto commit (with a runtime penalty, of course :)).