[GitHub] incubator-tephra pull request #48: [TEPHRA-241] Introduce a way to limit the...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-tephra/pull/48 ---
[jira] [Commented] (TEPHRA-241) Introduce a way to limit the size of a transaction
[ https://issues.apache.org/jira/browse/TEPHRA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16159800#comment-16159800 ]

ASF GitHub Bot commented on TEPHRA-241:
---
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-tephra/pull/48

> Introduce a way to limit the size of a transaction
> --
>
> Key: TEPHRA-241
> URL: https://issues.apache.org/jira/browse/TEPHRA-241
> Project: Tephra
> Issue Type: Improvement
> Components: api, manager
> Affects Versions: 0.12.0-incubating
> Reporter: Andreas Neumann
> Assignee: Andreas Neumann
> Fix For: 0.13.0-incubating
>
>
> When clients perform a huge number of writes in a short transaction, that can
> result in huge change sets. For example, if a client performs 10M writes and
> sends that change set over, that can easily be 1GB large. The transaction
> manager will keep this in memory. It will also write this as an edit to the
> transaction log.
> Assume it runs out of memory because the change set is too large. It crashes
> and when it restarts, it will replay the log, load that huge change set
> again, and crash again.
> To prevent this kind of systemic failure, and to encourage developers to use
> long transactions when performing many writes, we can introduce two new
> properties in the configuration:
> - change set warn threshold: if a change set exceeds this size, a warning is
> logged.
> - change set reject threshold: if a change set exceeds this size, it is
> rejected (canCommit will throw an exception) and that will fail the
> transaction.
> Both thresholds should be Long.MAX_VALUE by default, to preserve existing
> behavior after upgrade.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
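The two thresholds proposed in the issue description can be illustrated with a small, self-contained Java sketch. This is not the code from the pull request: the class `ChangeSetLimiter`, the exception `ChangeSetTooLargeException`, and the method `check` are hypothetical names chosen for illustration, and the sketch assumes the change set size is measured in bytes. It only shows the intended behavior: warn above one limit, reject above the other, and do neither when both limits keep their default of Long.MAX_VALUE.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the warn/reject thresholds; not Tephra's actual API.
final class ChangeSetLimiter {

  // Hypothetical exception type, standing in for whatever canCommit would throw.
  static final class ChangeSetTooLargeException extends RuntimeException {
    ChangeSetTooLargeException(String message) {
      super(message);
    }
  }

  private final long warnThreshold;
  private final long rejectThreshold;
  // Warnings are collected in a list here; a real implementation would log them.
  private final List<String> warnings = new ArrayList<>();

  // Both thresholds default to Long.MAX_VALUE, so nothing is warned about or rejected.
  ChangeSetLimiter() {
    this(Long.MAX_VALUE, Long.MAX_VALUE);
  }

  ChangeSetLimiter(long warnThreshold, long rejectThreshold) {
    this.warnThreshold = warnThreshold;
    this.rejectThreshold = rejectThreshold;
  }

  // Rejects the change set if it exceeds the reject threshold,
  // otherwise records a warning if it exceeds the warn threshold.
  void check(long txId, long sizeInBytes) {
    if (sizeInBytes > rejectThreshold) {
      throw new ChangeSetTooLargeException(
          "Change set of tx " + txId + " has " + sizeInBytes
              + " bytes, exceeding the limit of " + rejectThreshold);
    }
    if (sizeInBytes > warnThreshold) {
      warnings.add("Change set of tx " + txId + " has " + sizeInBytes
          + " bytes, exceeding the warn threshold of " + warnThreshold);
    }
  }

  List<String> getWarnings() {
    return warnings;
  }

  public static void main(String[] args) {
    ChangeSetLimiter limiter = new ChangeSetLimiter(100L, 1000L);
    limiter.check(1L, 500L); // above warn, below reject: recorded as a warning
    System.out.println(limiter.getWarnings());
    try {
      limiter.check(2L, 5000L); // above reject: the transaction would fail
    } catch (ChangeSetTooLargeException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Checking the reject threshold first means an oversized change set is never both warned about and rejected. Because the comparison is strictly greater-than, the Long.MAX_VALUE defaults can never trigger, which matches the stated goal of preserving existing behavior after upgrade.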
[jira] [Resolved] (TEPHRA-241) Introduce a way to limit the size of a transaction
[ https://issues.apache.org/jira/browse/TEPHRA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Neumann resolved TEPHRA-241.
Resolution: Fixed
[jira] [Assigned] (TEPHRA-253) TransactionProcessorTest is sometimes flaky
[ https://issues.apache.org/jira/browse/TEPHRA-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Neumann reassigned TEPHRA-253:
--
Assignee: Andreas Neumann (was: Poorna Chandra)

> TransactionProcessorTest is sometimes flaky
> ---
>
> Key: TEPHRA-253
> URL: https://issues.apache.org/jira/browse/TEPHRA-253
> Project: Tephra
> Issue Type: Bug
> Affects Versions: 0.12.0-incubating
> Reporter: Andreas Neumann
> Assignee: Andreas Neumann
>
> The test sometimes fails as follows:
> {noformat}
> Running org.apache.tephra.hbase.coprocessor.TransactionProcessorTest
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.741 sec <<< FAILURE!
> testFamilyDeleteTimestamp(org.apache.tephra.hbase.coprocessor.TransactionProcessorTest) Time elapsed: 1.526 sec
> testTransactionStateCache(org.apache.tephra.hbase.coprocessor.TransactionProcessorTest) Time elapsed: 0.053 sec
> testDataJanitorRegionScanner(org.apache.tephra.hbase.coprocessor.TransactionProcessorTest) Time elapsed: 0.288 sec <<< FAILURE!
> org.junit.internal.ArrayComparisonFailure: arrays first differed at element [3]; expected:<4> but was:<1>
>   at org.junit.internal.ComparisonCriteria.arrayEquals(ComparisonCriteria.java:50)
>   at org.junit.Assert.internalArrayEquals(Assert.java:473)
>   at org.junit.Assert.assertArrayEquals(Assert.java:294)
>   at org.junit.Assert.assertArrayEquals(Assert.java:305)
>   at org.apache.tephra.hbase.coprocessor.TransactionProcessorTest.assertKeyValueMatches(TransactionProcessorTest.java:593)
>   at org.apache.tephra.hbase.coprocessor.TransactionProcessorTest.assertKeyValueMatches(TransactionProcessorTest.java:585)
>   at org.apache.tephra.hbase.coprocessor.TransactionProcessorTest.testDataJanitorRegionScanner(TransactionProcessorTest.java:190)
> {noformat}
> It is not clear what is causing this; most likely the region server did not
> have an up-to-date transaction state snapshot at the time of the flush (that
> might be due to TEPHRA-239 or TEPHRA-249, or it might be a condition where
> flush() has no effect because the region is already flushing).
> Let's observe this and gather more information when/if it happens again.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEPHRA-253) TransactionProcessorTest is sometimes flaky
[ https://issues.apache.org/jira/browse/TEPHRA-253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160097#comment-16160097 ]

Andreas Neumann commented on TEPHRA-253:

Fix is to wait with the flush until the transaction state is loaded. PR: https://github.com/apache/incubator-tephra/pull/54
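The idea stated in this comment, holding back the flush until the transaction state has been loaded, can be sketched as a generic polling helper. This is only an illustration under stated assumptions and not the content of the pull request: `StateWaiter` and `awaitNonNull` are hypothetical names, and the `Supplier` stands in for however the test reads the coprocessor's current transaction state.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper: poll until some state becomes available, or give up after a timeout.
final class StateWaiter {

  private StateWaiter() {
  }

  // Returns the first non-null value produced by the source, polling every pollMillis.
  // Throws TimeoutException if no value appears within timeoutMillis.
  static <T> T awaitNonNull(Supplier<T> source, long timeoutMillis, long pollMillis)
      throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      T value = source.get();
      if (value != null) {
        return value;
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("State not available after " + timeoutMillis + " ms");
      }
      Thread.sleep(pollMillis);
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for the transaction state that is loaded asynchronously.
    AtomicReference<String> state = new AtomicReference<>();
    Thread loader = new Thread(() -> {
      try {
        Thread.sleep(100);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      state.set("snapshot loaded");
    });
    loader.start();

    // Only proceed (for example, with a flush) once the state is present.
    String loaded = awaitNonNull(state::get, 5000, 10);
    System.out.println("Proceeding with state: " + loaded);
    loader.join();
  }
}
```

A test would call such a helper right before triggering the flush, so that a missing state fails fast with a timeout instead of producing a confusing assertion failure later.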
[jira] [Commented] (TEPHRA-253) TransactionProcessorTest is sometimes flaky
[ https://issues.apache.org/jira/browse/TEPHRA-253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160061#comment-16160061 ]

Andreas Neumann commented on TEPHRA-253:

Suspicion confirmed. After changing travis to dump the standard output of the test case, I see:
{noformat}
2017-09-09 19:18:16,851 - INFO [main:o.a.h.h.r.RegionCoprocessorHost@196] - Load coprocessor org.apache.tephra.hbase.coprocessor.TransactionProcessor from HTD of TestRegionScanner successfully.
2017-09-09 19:18:16,868 - INFO [StoreOpener-fc704aec719b675f06e5d7bd12da85f0-1:o.a.h.h.r.c.CompactionConfiguration@85] - size [134217728, 9223372036854775807); files [3, 10); ratio 1.20; off-peak ratio 5.00; throttle point 2684354560; delete expired; major period 60480, major jitter 0.50
2017-09-09 19:18:16,883 - INFO [main:o.a.h.h.r.HRegion@644] - Onlined fc704aec719b675f06e5d7bd12da85f0; next sequenceid=1
2017-09-09 19:18:16,883 - INFO [main:o.a.t.h.c.TransactionProcessorTest@178] - Coprocessor is using transaction state: null
2017-09-09 19:18:16,926 - INFO [main:o.a.t.h.c.TransactionProcessorTest@192] - Flushing region TestRegionScanner,,1504984696824.fc704aec719b675f06e5d7bd12da85f0.
2017-09-09 19:18:16,960 - INFO [HDFSTransactionStateStorage STARTING:o.a.t.p.HDFSTransactionStateStorage@109] - Using snapshot dir /home/travis/build/apache/incubator-tephra/tephra-hbase-compat-0.96/target/junit6493752557205114158/junit8165179254738335598
2017-09-09 19:18:16,981 - INFO [TransactionStateCache STARTING:o.a.t.p.HDFSTransactionStateStorage@185] - Read encoded transaction snapshot of 84 bytes
2017-09-09 19:18:16,984 - INFO [TransactionStateCache STARTING:o.a.t.c.TransactionStateCache@166] - Transaction state reloaded with snapshot from 1504984695267
2017-09-09 19:18:17,393 - INFO [main:o.a.h.h.r.DefaultStoreFlusher@88] - Flushed, sequenceid=37, memsize=5.9 K, hasBloomFilter=true, into tmp file hdfs://localhost:53322/home/travis/build/apache/incubator-tephra/tephra-hbase-compat-0.96/target/junit6493752557205114158/junit7077794411994061305/hbase/data/default/TestRegionScanner/fc704aec719b675f06e5d7bd12da85f0/.tmp/6e813e3b7af94e13afc9dc1303dda3f8
2017-09-09 19:18:17,415 - INFO [main:o.a.h.h.r.HStore@770] - Added hdfs://localhost:53322/home/travis/build/apache/incubator-tephra/tephra-hbase-compat-0.96/target/junit6493752557205114158/junit7077794411994061305/hbase/data/default/TestRegionScanner/fc704aec719b675f06e5d7bd12da85f0/f/6e813e3b7af94e13afc9dc1303dda3f8, entries=36, sequenceid=37, filesize=2.2 K
2017-09-09 19:18:17,416 - INFO [main:o.a.h.h.r.HRegion@1708] - Finished memstore flush of ~5.9 K/6048, currentsize=0/0 for region TestRegionScanner,,1504984696824.fc704aec719b675f06e5d7bd12da85f0. in 489ms, sequenceid=37, compaction requested=false
{noformat}
Clearly, the flush begins before the transaction state is loaded: the coprocessor reports a null transaction state at 19:18:16,883, the flush starts at 19:18:16,926, and the snapshot is only reloaded at 19:18:16,984.
[jira] [Updated] (TEPHRA-253) TransactionProcessorTest is sometimes flaky
[ https://issues.apache.org/jira/browse/TEPHRA-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Neumann updated TEPHRA-253:
---
Fix Version/s: 0.13.0-incubating
[jira] [Commented] (TEPHRA-241) Introduce a way to limit the size of a transaction
[ https://issues.apache.org/jira/browse/TEPHRA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16159773#comment-16159773 ]

ASF GitHub Bot commented on TEPHRA-241:
---
Github user anew commented on the issue:

    https://github.com/apache/incubator-tephra/pull/48

After rebasing on latest master, another travis build is running; waiting for that to finish.
[GitHub] incubator-tephra pull request #54: wip
GitHub user anew opened a pull request:

    https://github.com/apache/incubator-tephra/pull/54

    wip

    on hold

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/anew/incubator-tephra tephra-253

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tephra/pull/54.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #54

commit 140c1f0aa621fb7ab7a80c087b5be293d3b68035
Author: anew
Date: 2017-09-09T06:43:55Z

    wip

---
[jira] [Commented] (TEPHRA-241) Introduce a way to limit the size of a transaction
[ https://issues.apache.org/jira/browse/TEPHRA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16159796#comment-16159796 ]

ASF GitHub Bot commented on TEPHRA-241:
---
Github user anew commented on the issue:

    https://github.com/apache/incubator-tephra/pull/48

Same travis failure. This appears to happen a lot more frequently with Java 8. I will commit this now and try to fix the flaky test before 0.13 release.
[GitHub] incubator-tephra issue #48: [TEPHRA-241] Introduce a way to limit the size o...
Github user anew commented on the issue: https://github.com/apache/incubator-tephra/pull/48 Same travis failure. This appears to happen a lot more frequently with Java 8. I will commit this now and try to fix the flaky test before 0.13 release. ---