[jira] [Commented] (OMID-109) Unable to build phoenix-integration branch through Jenkins job
[ https://issues.apache.org/jira/browse/OMID-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651163#comment-16651163 ]

Yonatan Gottesman commented on OMID-109:

OK cool, looks like many tests fail. Are some of them expected to fail before I jump into debugging?

> Unable to build phoenix-integration branch through Jenkins job
>
> Key: OMID-109
> URL: https://issues.apache.org/jira/browse/OMID-109
> Project: Apache Omid
> Issue Type: Sub-task
> Reporter: James Taylor
> Assignee: Yonatan Gottesman
> Priority: Blocker
> Attachments: OMID-109.patch, omid109.patch, omid109_v2.patch
>
> Based on Jenkins job failures
> (https://builds.apache.org/job/Phoenix-omid2/81/), the repo URL in the pom
> needs to be updated to
> https://raw.githubusercontent.com/synergian/wagon-git/releases.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (OMID-109) Unable to build phoenix-integration branch through Jenkins job
[ https://issues.apache.org/jira/browse/OMID-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650588#comment-16650588 ]

Yonatan Gottesman commented on OMID-109:

https://issues.apache.org/jira/browse/INFRA-17149
Is this ok?
[jira] [Commented] (OMID-109) Unable to build phoenix-integration branch through Jenkins job
[ https://issues.apache.org/jira/browse/OMID-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650580#comment-16650580 ]

James Taylor commented on OMID-109:

[~yonigo] & [~ohads] - would you guys mind filing INFRA Jira tickets so that you can make changes to these Jenkins jobs?
[jira] [Commented] (OMID-90) Reducing begin/commit latency by distributing the write to the commit table
[ https://issues.apache.org/jira/browse/OMID-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650361#comment-16650361 ]

James Taylor commented on OMID-90:

Ping, [~ohads]. I didn't understand your question:
{quote}This commit removes the addition of the fence information to the commit table. Could you please check if it hurts correctness? It is related to the index population with auto commit that you have just added.
{quote}
We need the fence functionality in Phoenix so that when an index is created, we can guarantee that we don't miss creation of any data rows.

> Reducing begin/commit latency by distributing the write to the commit table
>
> Key: OMID-90
> URL: https://issues.apache.org/jira/browse/OMID-90
> Project: Apache Omid
> Issue Type: Sub-task
> Reporter: Ohad Shacham
> Assignee: Yonatan Gottesman
> Priority: Major
> Attachments: OmidCloud-VLDB.pdf, omid90.patch
>
> Today, Omid's commits are done by the transaction manager. In order to
> efficiently write to the commit table, the transaction manager batches these
> writes. This optimization, even though it reduces the write time to HBase,
> significantly increases the begin and commit latency. The commit latency
> increases since a commit operation returns only after its commit timestamp
> was persisted in the commit table. And the begin latency increases since
> begin returns a transaction id that is also used by the transaction to
> identify its snapshot and therefore, begin returns only after all commits
> with commit id smaller than the begin id were persisted in the commit table.
> This is crucial, since a snapshot change during a transaction run may violate
> snapshot isolation.
>
> The idea of this feature is to distribute the commit by moving the write to
> the commit table from the server to the client. The transaction manager does
> conflict analysis and returns a commit timestamp, while the client atomically
> persists this commit in the commit table.
> This significantly reduces the begin and commit latency, since batching is
> not required anymore. A begin operation can return immediately and a commit
> operation returns after conflict detection.
> This can introduce a snapshot isolation violation, since a slow client can
> commit and change another transaction's snapshot. Therefore, we use an
> invalidation technique which is similar to the one Omid uses today to
> maintain snapshot isolation in high availability mode.
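The race between a slow committing client and a reader, as described in the OMID-90 summary, could be modelled roughly as follows. This is only an in-memory sketch under stated assumptions: the names CommitTableSketch, tryCommit, tryInvalidate and the INVALIDATED marker are hypothetical and are not taken from the Omid code base or the attached patch. It assumes that both the committing client and a reader compete on one atomic put-if-absent per transaction, so exactly one of them wins. The real commit table lives in HBase and would need a conditional write there rather than a ConcurrentHashMap.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for the commit table (not Omid's actual class).
final class CommitTableSketch {
    // Hypothetical marker meaning "this transaction was invalidated by a reader".
    static final long INVALIDATED = -1L;

    // Maps a transaction's start timestamp to its commit timestamp or to INVALIDATED.
    private final ConcurrentMap<Long, Long> entries = new ConcurrentHashMap<>();

    // Client side: persist the commit timestamp returned by the transaction manager.
    // Returns false if a reader invalidated the transaction first.
    boolean tryCommit(long startTs, long commitTs) {
        Long previous = entries.putIfAbsent(startTs, commitTs);
        return previous == null || previous == commitTs;
    }

    // Reader side: on meeting a write whose transaction has no commit table entry,
    // try to invalidate it. Returns false if the transaction had already committed.
    boolean tryInvalidate(long startTs) {
        Long previous = entries.putIfAbsent(startTs, INVALIDATED);
        return previous == null || previous == INVALIDATED;
    }

    public static void main(String[] args) {
        CommitTableSketch table = new CommitTableSketch();
        // Fast client: the commit lands first, so the reader cannot invalidate it.
        System.out.println(table.tryCommit(10L, 15L));  // true
        System.out.println(table.tryInvalidate(10L));   // false
        // Slow client: the reader invalidates first, so the late commit is rejected.
        System.out.println(table.tryInvalidate(20L));   // true
        System.out.println(table.tryCommit(20L, 25L));  // false
    }
}
```

In this model a rejected tryCommit means the client has to abort, which keeps the reader's snapshot unchanged even though the transaction manager already handed out a commit timestamp.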
[jira] [Commented] (OMID-117) Ensure timeouts are configured low for RPCs to commit table
[ https://issues.apache.org/jira/browse/OMID-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650343#comment-16650343 ]

James Taylor commented on OMID-117:

{quote}About the retries, what's the worst thing that can happen with this? How bad is it to have it like this?
{quote}
Check out the comment I added to RegionConnectionFactory:
{code:java}
// This setting controls how many retries occur on the region server if an
// IOException occurs while trying to access the commit table. Because a
// handler thread will be in use while these retries occur and the client
// will be blocked waiting, it must not tie up the call for longer than
// the client RPC timeout. Otherwise, the client will initiate retries on its
// end, tying up yet another handler thread. It's best if the retries can be
// zero, as in that case the handler is released and the retries occur on the
// client side. In testing, we've seen NoServerForRegionException occur, which
// is a DoNotRetryIOException and so is not retried on the client. It's not
// clear if this is a real issue or a test-only issue.
private static final int DEFAULT_COMMIT_TABLE_ACCESS_ON_READ_RETRIES_NUMBER = 11;
private static final int DEFAULT_COMMIT_TABLE_ACCESS_ON_READ_RETRY_PAUSE = 100;
{code}
As it is with this patch, if retries are necessary to reach the RS hosting the commit table, they will occur from the RS handling the scan for 48 seconds. During this time, the handler thread will be tied up (i.e. it won't be able to be used by any other HBase client). If this occurs for all the handler threads on the RS, then all incoming requests would be queued. For example, non-transactional queries would potentially not be processed during this time. If the retries (and pauses) occur on the client side, then non-transactional workloads wouldn't be impacted.

Ideally, we'd have a test that reproduces this NoServerForRegionException and see if any changes are needed to handle this situation.
You might be able to repro this by manually splitting the commit table and then performing a read against a transactional table. It may also just occur the very first time an RS tries to reach the commit table after the commit table is created.

> Ensure timeouts are configured low for RPCs to commit table
>
> Key: OMID-117
> URL: https://issues.apache.org/jira/browse/OMID-117
> Project: Apache Omid
> Issue Type: Bug
> Reporter: James Taylor
> Priority: Major
> Attachments: OMID-117.patch, OMID-117_hbase2.patch,
> OMID-117_v2.patch, OMID-117_v3.patch, OMID-117_v4.patch, OMID-117_v5.patch,
> OMID-117_v6.patch, OMID-117_v7.patch
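The 48-second figure in the OMID-117 comment can be reproduced with a small calculation, sketched here in Java. The sketch rests on assumptions: the backoff multipliers are copied from memory of HBase's HConstants.RETRY_BACKOFF and should be checked against the HBase version in use, each of the 11 retries is assumed to be preceded by exactly one pause, and any jitter the client adds is ignored. RetryBudget and totalPauseMillis are hypothetical names used only for illustration.

```java
// Hypothetical helper that estimates how long a handler thread may be tied up by retries.
final class RetryBudget {
    // Assumed to mirror HBase's HConstants.RETRY_BACKOFF; verify against your HBase version.
    private static final int[] RETRY_BACKOFF =
            {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

    // Sums the pauses before each retry: basePause * multiplier for that attempt.
    // Attempts beyond the end of the table reuse the last multiplier.
    static long totalPauseMillis(long basePauseMillis, int retries) {
        long total = 0;
        for (int attempt = 0; attempt < retries; attempt++) {
            int index = Math.min(attempt, RETRY_BACKOFF.length - 1);
            total += basePauseMillis * RETRY_BACKOFF[index];
        }
        return total;
    }

    public static void main(String[] args) {
        // Values from the comment: 11 retries with a 100 ms base pause.
        System.out.println(totalPauseMillis(100, 11) + " ms"); // 48100 ms
        // With zero retries the handler is released immediately.
        System.out.println(totalPauseMillis(100, 0) + " ms");  // 0 ms
    }
}
```

Under these assumptions the pauses alone add up to about 48 seconds per blocked handler, before counting the time spent in the failed calls themselves. That is consistent with the comment's suggestion of zero server-side retries, so that the waiting happens on the client instead.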
[jira] [Commented] (OMID-117) Ensure timeouts are configured low for RPCs to commit table
[ https://issues.apache.org/jira/browse/OMID-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649798#comment-16649798 ]

Yonatan Gottesman commented on OMID-117:

Hi, it looks good. I don't understand what you said about changing the poms; all tests pass without changing anything (hbase2 too). About the retries, what's the worst thing that can happen with this? How bad is it to have it like this? If it's bad, I can investigate why tests don't pass. OK to commit.