[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606904#comment-13606904 ]

nkeywal commented on HBASE-7590:
--------------------------------

OK, will do (as for rb) by the end of this week.

> Add a costless notifications mechanism from master to regionservers & clients
> -----------------------------------------------------------------------------
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
> Issue Type: Bug
> Components: Client, master, regionserver
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to the clients and regionservers. In particular, it would be useful to know globally (regionservers + client apps) that some regionservers are dead. This would allow:
> - lowering the load on the system, since clients would no longer use stale information and contact dead machines;
> - making the recovery faster from a client point of view. It's common to use large timeouts on the client side, so the client may need a lot of time before declaring a region server dead and trying another one. If the client receives the information about a region server's state separately, it can take the right decision and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down' to tell the client to increase the retry delay, and so on.
> Technically, the master could send this information. To lower the load on the system, we should:
> - use multicast communication (i.e. the master does not have to connect to all servers by TCP), with one packet every 10 seconds or so;
> - make sure receivers do not depend on it: if the information is available, great; if not, it should not break anything;
> - make it optional.
> So in the end we would have a thread in the master sending a protobuf message about the dead servers on a multicast socket. If the socket is not configured, it does not do anything. On the client side, when we receive the information that a node is dead, we refresh the cache for it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
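The mechanism described in HBASE-7590 can be sketched as follows. This is a minimal illustration under stated assumptions, not the committed patch: the class names DeadServerNotifier and LocationCache are hypothetical, the group address and port are placeholders, and a newline-separated UTF-8 payload stands in for the protobuf message mentioned in the issue.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical master-side publisher; not the HBase implementation. */
final class DeadServerNotifier {
  // Placeholder multicast group and port, chosen for illustration only.
  static final String GROUP = "226.1.1.3";
  static final int PORT = 16100;

  /** Encodes one server name per line (assumed stand-in for the protobuf message). */
  static byte[] encode(List<String> deadServers) {
    return String.join("\n", deadServers).getBytes(StandardCharsets.UTF_8);
  }

  /** Decodes the first {@code length} bytes of a received datagram. */
  static List<String> decode(byte[] data, int length) {
    List<String> servers = new ArrayList<>();
    String text = new String(data, 0, length, StandardCharsets.UTF_8);
    for (String name : text.split("\n")) {
      if (!name.isEmpty()) {
        servers.add(name);
      }
    }
    return servers;
  }

  /** Sends one fire-and-forget datagram; no TCP connection per receiver. */
  static void publish(List<String> deadServers) throws IOException {
    byte[] payload = encode(deadServers);
    try (DatagramSocket socket = new DatagramSocket()) {
      socket.send(new DatagramPacket(payload, payload.length,
          InetAddress.getByName(GROUP), PORT));
    }
  }
}

/** Hypothetical client-side cache that forgets locations on dead servers. */
final class LocationCache {
  private final ConcurrentHashMap<String, String> regionToServer = new ConcurrentHashMap<>();

  void put(String region, String server) {
    regionToServer.put(region, server);
  }

  String get(String region) {
    return regionToServer.get(region);
  }

  /** Removes every cached entry that points at a server reported dead. */
  void onDeadServers(List<String> deadServers) {
    Set<String> dead = new HashSet<>(deadServers);
    regionToServer.values().removeIf(dead::contains);
  }
}
```

Because the datagram is best effort, a client that never receives it simply keeps its normal timeout-based behaviour, which matches the requirement that receivers must not depend on the notification.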
[jira] [Resolved] (HBASE-8145) TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0
[ https://issues.apache.org/jira/browse/HBASE-8145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal resolved HBASE-8145.
----------------------------
    Resolution: Duplicate

> TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0
> ------------------------------------------------------------------
>
> Key: HBASE-8145
> URL: https://issues.apache.org/jira/browse/HBASE-8145
> Project: HBase
> Issue Type: Bug
> Components: test
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Trivial
> Fix For: 0.96.0
>
>
> I will check for 0.95.
> {code}
> for (HRegion region : regions) {
>   if (!region.getRegionInfo().getEncodedName().equals(toMove.getRegionInfo().getEncodedName())
>       && Bytes.BYTES_COMPARATOR.compare(region.getRegionInfo().getStartKey(), ROW_X) < 0) {
>     otherRow = region.getRegionInfo().getStartKey();
>     break;
>   }
> }
> {code}
> We are likely to sometimes get the startKey of the first region here, and that is an empty byte array. This makes the put creation fail, since there is now (with HBASE-8101) a check for empty rows at put creation.
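One possible way to avoid the empty start key described in HBASE-8145 is to skip zero-length keys while scanning. The sketch below is an assumption about how such a guard could look, not the fix that was actually applied: StartKeyPicker, compare and pickOtherRow are hypothetical names, plain byte arrays replace HRegion objects, and the check that excludes the region being moved is left out for brevity.

```java
import java.util.List;

/** Hypothetical helper illustrating the empty-start-key guard. */
final class StartKeyPicker {

  /** Unsigned lexicographic comparison of two byte arrays. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  }

  /**
   * Returns the first start key that is non-empty and sorts below the limit,
   * or null when no such key exists. The first region of a table has an
   * empty start key, which cannot be used as a row.
   */
  static byte[] pickOtherRow(List<byte[]> startKeys, byte[] limit) {
    for (byte[] key : startKeys) {
      if (key.length > 0 && compare(key, limit) < 0) {
        return key;
      }
    }
    return null;
  }
}
```

With the extra length check, the first region is never selected, so the row used to build the put is never empty.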
[jira] [Created] (HBASE-8145) TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0
nkeywal created HBASE-8145:
------------------------------

Summary: TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0
Key: HBASE-8145
URL: https://issues.apache.org/jira/browse/HBASE-8145
Project: HBase
Issue Type: Bug
Components: test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
Fix For: 0.96.0

I will check for 0.95.
{code}
for (HRegion region : regions) {
  if (!region.getRegionInfo().getEncodedName().equals(toMove.getRegionInfo().getEncodedName())
      && Bytes.BYTES_COMPARATOR.compare(region.getRegionInfo().getStartKey(), ROW_X) < 0) {
    otherRow = region.getRegionInfo().getStartKey();
    break;
  }
}
{code}
We are likely to sometimes get the startKey of the first region here, and that is an empty byte array. This makes the put creation fail, since there is now (with HBASE-8101) a check for empty rows at put creation.
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590: --- Resolution: Fixed Fix Version/s: 0.96.0 0.95.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available)
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-8135:
---------------------------
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Status: Resolved (was: Patch Available)

> Mutation should implement HeapSize
> ----------------------------------
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 0.95.0, 0.96.0, 0.94.5
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch, 8135.v5.patch, 8135.v5.patch, 8135.v5.patch
>
>
> Code is there already.
> Doing so would allow sharing some code when doing client-side buffering.
> The patch compiles locally and should not impact tests, but it has not been tested locally.
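The idea in HBASE-8135, that a common size interface lets client-side buffering treat every mutation the same way, can be sketched as follows. This is an illustration only: the HeapSize interface is declared locally as a stand-in with a single heapSize() method, SimpleMutation and WriteBuffer are hypothetical classes, and the overhead constants are assumed values rather than real JVM measurements.

```java
import java.util.ArrayList;
import java.util.List;

/** Local stand-in for a size-reporting interface with one method. */
interface HeapSize {
  long heapSize();
}

/** Hypothetical mutation whose size is a rough, illustrative estimate. */
final class SimpleMutation implements HeapSize {
  private static final long OBJECT_OVERHEAD = 16; // assumed, JVM dependent
  private static final long ARRAY_OVERHEAD = 16;  // assumed, JVM dependent

  private final byte[] row;
  private final byte[] value;

  SimpleMutation(byte[] row, byte[] value) {
    this.row = row;
    this.value = value;
  }

  @Override
  public long heapSize() {
    // One object header, two array headers, plus the payload bytes.
    return OBJECT_OVERHEAD + 2 * ARRAY_OVERHEAD + row.length + value.length;
  }
}

/** Hypothetical buffer that only needs heapSize(), whatever the mutation type. */
final class WriteBuffer {
  private final List<HeapSize> pending = new ArrayList<>();
  private final long limit;
  private long currentSize;

  WriteBuffer(long limit) {
    this.limit = limit;
  }

  /** Adds a mutation and reports whether the buffer is now over its limit. */
  boolean add(HeapSize mutation) {
    pending.add(mutation);
    currentSize += mutation.heapSize();
    return currentSize > limit;
  }

  long size() {
    return currentSize;
  }
}
```

The buffer never inspects the concrete mutation type, which is the code sharing the issue description refers to.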
[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606568#comment-13606568 ] nkeywal commented on HBASE-8135: This seems to prove that we're in the usual flakiness (thanks for having relaunched the tests, Ted). Committed to trunk and 0.95. Thanks for the review, Stack & Ted!
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Attachment: 8135.v5.patch
[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606196#comment-13606196 ] nkeywal commented on HBASE-8135: v5 is what I will commit if the build runs ok.
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Attachment: 8135.v5.patch
[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606188#comment-13606188 ] nkeywal commented on HBASE-8135: bq. There was a javadoc warning. Right. I'm going to hunt it.
[jira] [Updated] (HBASE-6674) Check behavior of current surefire trunk on Hadoop QA
[ https://issues.apache.org/jira/browse/HBASE-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-6674:
---------------------------
    Resolution: Duplicate
    Status: Resolved (was: Patch Available)

Since the JUnit part is done, I'm using HBASE-4955 to test Surefire.

> Check behavior of current surefire trunk on Hadoop QA
> -----------------------------------------------------
>
> Key: HBASE-6674
> URL: https://issues.apache.org/jira/browse/HBASE-6674
> Project: HBase
> Issue Type: Bug
> Components: test
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Trivial
> Attachments: 5processes.patch, 5processes.patch, 5processes.patch, 6674.patch, 6674.v2.patch, 6674.v2.patch, 6674.v2.patch, 6674.v2.patch
>
>
> Not to be committed.
> Surefire 2.13 is in progress. Let's check that it works for us before it's released. Locally it's acceptable.
[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606183#comment-13606183 ] nkeywal commented on HBASE-8135: Thanks a lot Ted. For v4 I've just moved the test 'Put' with the other tests. I will commit as soon as I get a +1.
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Attachment: 8135.v4.patch
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590: --- Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605698#comment-13605698 ] nkeywal commented on HBASE-7590: May be 13 is going to be my lucky number :-) ?
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590: --- Attachment: 7590.v13.patch
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590: --- Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-8128) HTable#put improvements
[ https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-8128:
---------------------------
    Fix Version/s: 0.94.8

> HTable#put improvements
> -----------------------
>
> Key: HBASE-8128
> URL: https://issues.apache.org/jira/browse/HBASE-8128
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 0.95.0, 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Trivial
> Fix For: 0.95.0, 0.96.0, 0.94.8
>
> Attachments: 8128.v1.patch
>
>
> 3 points:
> - When doing a single put, we're creating an object by calling Arrays.asList.
> - We're doing a size check every 10 puts. Not doing it seems simpler and better, and allows sharing some code between a single put and a list of puts.
> - We could call flushCommits on an empty write buffer, especially for someone using a lot of HTable instances instead of using a pool, as it's called in close().
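The three points of HBASE-8128 can be illustrated with a small sketch. It is not the HTable code: BufferedTable is a hypothetical class, puts are reduced to byte arrays, and the third point is read here as "a flush on an empty write buffer should return immediately", which is an interpretation of the description rather than something it states outright.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical table wrapper showing a shared put path and a cheap empty flush. */
final class BufferedTable {
  private final List<byte[]> writeBuffer = new ArrayList<>();
  private final long writeBufferSize;
  private long currentSize;
  int flushCount; // counts simulated round trips to the servers

  BufferedTable(long writeBufferSize) {
    this.writeBufferSize = writeBufferSize;
  }

  /** Single put: no temporary list is created. */
  void put(byte[] put) {
    doPut(put);
    flushIfNeeded();
  }

  /** List of puts: same helper, one size check at the end instead of every 10 puts. */
  void put(List<byte[]> puts) {
    for (byte[] put : puts) {
      doPut(put);
    }
    flushIfNeeded();
  }

  private void doPut(byte[] put) {
    writeBuffer.add(put);
    currentSize += put.length;
  }

  private void flushIfNeeded() {
    if (currentSize > writeBufferSize) {
      flushCommits();
    }
  }

  /** Returns immediately when there is nothing to send. */
  void flushCommits() {
    if (writeBuffer.isEmpty()) {
      return;
    }
    flushCount++;
    writeBuffer.clear();
    currentSize = 0;
  }

  /** close() always flushes, so the empty-buffer shortcut matters for short-lived tables. */
  void close() {
    flushCommits();
  }
}
```

An application that creates many short-lived tables and closes each one pays nothing for the flush in close() when no put was buffered.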
[jira] [Commented] (HBASE-8128) HTable#put improvements
[ https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605527#comment-13605527 ] nkeywal commented on HBASE-8128: Committed in 0.94 > HTable#put improvements > --- > > Key: HBASE-8128 > URL: https://issues.apache.org/jira/browse/HBASE-8128 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Trivial > Fix For: 0.95.0, 0.96.0 > > Attachments: 8128.v1.patch > > > 3 points: > - When doing a single put, we're creating an object by calling Arrays.asList > - we're doing a size check every 10 puts. Not doing it seems simpler, better > and allows sharing some code between a single put and a list of puts. > - we could call flushCommits on an empty write buffer, especially for someone > using a lot of HTable instances instead of using a pool, as it's called in close().
[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605503#comment-13605503 ] nkeywal commented on HBASE-4955: Yes, JUnit 4.11 contains what we need. Surefire contains what we need as well; we just need a version that comes with an acceptable set of regressions. Our Surefire is in Gary's repo. He will be the one getting the blame if Apache complains :-). BTW, I haven't done the update to JUnit in HBase 0.94, as it implies backporting a few jiras as well (in the required section). So we still need to have it in Gary's repo as well. > Use the official versions of surefire & junit > - > > Key: HBASE-4955 > URL: https://issues.apache.org/jira/browse/HBASE-4955 > Project: HBase > Issue Type: Improvement > Components: test >Affects Versions: 0.94.0 > Environment: all >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.96.0 > > Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, > 4955.v2.patch, 4955.v2.patch > > > We have used private versions of Surefire & JUnit since HBASE-4763. > This JIRA tracks what we need to move to official versions. > Surefire 2.11 is just out, but, after some tests, it does not contain > everything we need. > JUnit: Could be for JUnit 4.11. Issue to monitor: > https://github.com/KentBeck/junit/issues/359: fixed in our version, no > feedback for an integration on trunk > Surefire: Could be for Surefire 2.12. Issues to monitor are: > 329 (category support): fixed, we use the official implementation from the > trunk > 786 (@Category with forkMode=always): fixed, we use the official > implementation from the trunk > 791 (incorrect elapsed time on test failure): fixed, we use the official > implementation from the trunk > 793 (incorrect time in the XML report): Not fixed (reopened) on trunk, fixed on > our version. 
> 760 (does not take into account the test method): fixed in trunk, not fixed > in our version > 798 (print immediately the test class name): not fixed in trunk, not fixed in > our version > 799 (Allow test parallelization when forkMode=always): not fixed in trunk, > not fixed in our version > 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, > fixed on our version > 800 & 793 are the most important to monitor: they are the only ones that are > fixed in our version but not on trunk.
[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605483#comment-13605483 ] nkeywal commented on HBASE-8135: Yes, just locally (I do test before submitting :-) ). I used ClassSize.align for timerange, and it went OK. But for Put/Delete/Increment, there is an 88-byte difference I cannot explain. The code is in the v2 patch. > Mutation should implement HeapSize > -- > > Key: HBASE-8135 > URL: https://issues.apache.org/jira/browse/HBASE-8135 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0, 0.94.5 >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0, 0.96.0 > > Attachments: 8135.v1.patch, 8135.v2.patch > > > Code is there already. > Doing so would allow sharing some code when doing client-side buffering. > The patch compiles locally and should not impact tests, but has not been tested locally.
[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605470#comment-13605470 ] nkeywal commented on HBASE-8135: There is an issue: with the unit tests, I've got a huge gap I cannot explain: Expected :80 Actual :168 2013-03-18 19:54:28,845 DEBUG [main] util.ClassSize(246): 0 row class [B 2013-03-18 19:54:28,845 DEBUG [main] util.ClassSize(246): 1 ts long 2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(246): 2 writeToWAL boolean 2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(246): 3 familyMap interface java.util.NavigableMap 2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(246): 4 attributes interface java.util.Map 2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(273): Primitives=9, arrays=1, references(includes 2 for object overhead)=5, refSize 8, size=80, prealign_size=73 Any hint? > Mutation should implement HeapSize > -- > > Key: HBASE-8135 > URL: https://issues.apache.org/jira/browse/HBASE-8135 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0, 0.94.5 >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0, 0.96.0 > > Attachments: 8135.v1.patch, 8135.v2.patch > > > Code is there already. > Doing so would allow sharing some code when doing client-side buffering. > The patch compiles locally and should not impact tests, but has not been tested locally.
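The expected value in the log above (prealign_size=73, size=80) can be re-derived by hand. The sketch below is only an illustration with hypothetical names; the 24-byte array overhead is an assumption chosen because it makes the logged coefficients add up, and the sketch says nothing about where the remaining 88 bytes (168 - 80) come from.

```java
/** Hypothetical helper reproducing the arithmetic of the ClassSize log line. */
final class HeapSizeSketch {
  static final int REFERENCE = 8;         // "refSize 8" in the log
  static final int ARRAY = 3 * REFERENCE; // assumed array overhead: 24 bytes

  /** Rounds a size up to the next multiple of 8 bytes. */
  static long align(long size) {
    return (size + 7) & ~7L;
  }

  /** Sum of the three coefficients before alignment. */
  static long preAlign(int primitiveBytes, int arrays, int references) {
    return primitiveBytes + (long) arrays * ARRAY + (long) references * REFERENCE;
  }

  /** Aligned estimate of the fixed overhead of the declared fields. */
  static long estimate(int primitiveBytes, int arrays, int references) {
    return align(preAlign(primitiveBytes, arrays, references));
  }
}
```

With the logged coefficients, preAlign(9, 1, 5) is 9 + 24 + 40 = 73, and aligning 73 to 8 bytes gives 80, the "Expected" value of the failing test.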
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Attachment: 8135.v2.patch > Mutation should implement HeapSize > -- > > Key: HBASE-8135 > URL: https://issues.apache.org/jira/browse/HBASE-8135 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0, 0.94.5 >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0, 0.96.0 > > Attachments: 8135.v1.patch, 8135.v2.patch > > > Code is there already. > Doing so would allow sharing some code when doing client-side buffering. > The patch compiles locally and should not impact tests, but has not been tested locally.
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Status: Open (was: Patch Available) > Mutation should implement HeapSize > -- > > Key: HBASE-8135 > URL: https://issues.apache.org/jira/browse/HBASE-8135 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.94.5, 0.95.0, 0.96.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0, 0.96.0 > > Attachments: 8135.v1.patch, 8135.v2.patch > > > Code is there already. > Doing so would allow sharing some code when doing client-side buffering. > The patch compiles locally and should not impact tests, but has not been tested locally.
[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605362#comment-13605362 ] nkeywal commented on HBASE-8135: I was trying to find a generic way to get the size of an object. A Google search on this leads to quite a lot of terrible practices :-). It should be possible to do a static{} block for the fixed fields, but it won't bring much actual value. With the current implementation, it's better to have unit tests when one adds fields. I'm going to do this in this patch (including Increment); it will be simpler. > Mutation should implement HeapSize > -- > > Key: HBASE-8135 > URL: https://issues.apache.org/jira/browse/HBASE-8135 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0, 0.94.5 >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0, 0.96.0 > > Attachments: 8135.v1.patch > > > Code is there already. > Doing so would allow sharing some code when doing client-side buffering. > The patch compiles locally and should not impact tests, but has not been tested locally.
[jira] [Updated] (HBASE-6870) HTable#coprocessorExec always scan the whole table
[ https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-6870: --- Affects Version/s: 0.96.0 0.95.0 > HTable#coprocessorExec always scan the whole table > --- > > Key: HBASE-6870 > URL: https://issues.apache.org/jira/browse/HBASE-6870 > Project: HBase > Issue Type: Improvement > Components: Coprocessors >Affects Versions: 0.94.1, 0.95.0, 0.96.0 >Reporter: chunhui shen >Assignee: chunhui shen > Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, > HBASE-6870v2.patch, HBASE-6870v3.patch > > > In the current logic, HTable#coprocessorExec always scans the whole table; its > efficiency is low and it will affect the regionserver carrying .META. under > large coprocessorExec requests
[jira] [Resolved] (HBASE-8136) coprocessor service requires .meta. to be available all the time.
[ https://issues.apache.org/jira/browse/HBASE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal resolved HBASE-8136. Resolution: Duplicate Release Note: HBASE-6870 And the good news is that there is already a patch for HBASE-6870 :-). > coprocessor service requires .meta. to be available all the time. > - > > Key: HBASE-8136 > URL: https://issues.apache.org/jira/browse/HBASE-8136 > Project: HBase > Issue Type: Bug > Components: Client, Coprocessors >Affects Versions: 0.96.0 >Reporter: nkeywal >Priority: Minor > > HTable#getRegionLocations does not use a cache: all the calls to this > function go to .META. > So: > - we're missing an opportunity to reuse/update the location cache in the > HConnection. > - this method is called by the coprocessor service. So, for people using this > feature, they have .meta. on their execution path, and it's not good for > performance, scalability and reliability. > I'm not totally clear on the fix. I think it should be possible to use the > cache to see if we have all regions for the table. But it means we won't > always have the latest version when calling getRegionLocations. > Any thoughts on this?
[jira] [Created] (HBASE-8136) coprocessor service requires .meta. to be available all the time.
nkeywal created HBASE-8136: -- Summary: coprocessor service requires .meta. to be available all the time. Key: HBASE-8136 URL: https://issues.apache.org/jira/browse/HBASE-8136 Project: HBase Issue Type: Bug Components: Client, Coprocessors Affects Versions: 0.96.0 Reporter: nkeywal Priority: Minor HTable#getRegionLocations does not use a cache: all the calls to this function go to .META. So: - we're missing an opportunity to reuse/update the location cache in the HConnection. - this method is called by the coprocessor service. So, for people using this feature, they have .meta. on their execution path, and it's not good for performance, scalability and reliability. I'm not totally clear on the fix. I think it should be possible to use the cache to see if we have all regions for the table. But it means we won't always have the latest version when calling getRegionLocations. Any thoughts on this?
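The caching idea raised in the description could be sketched as below. All names are hypothetical and the meta lookup is passed in as a plain function, so this is not the HConnection API and not the fix that went into HBASE-6870; it only illustrates the trade-off between avoiding .META. and possibly returning stale locations.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical per-table location cache; not the HConnection implementation. */
final class RegionLocationCacheSketch {
  private final Map<String, List<String>> locationsByTable = new ConcurrentHashMap<>();
  private final Function<String, List<String>> metaScan; // assumed stand-in for a .META. scan

  RegionLocationCacheSketch(Function<String, List<String>> metaScan) {
    this.metaScan = metaScan;
  }

  /** Serves locations from the cache and scans meta only on a miss. */
  List<String> getRegionLocations(String table) {
    return locationsByTable.computeIfAbsent(table, metaScan);
  }

  /** Callers drop the entry when a location turns out to be stale. */
  void invalidate(String table) {
    locationsByTable.remove(table);
  }
}
```

Cached entries can lag behind region moves, which is the "we won't always have the latest version" concern from the description, so a caller would need to invalidate and retry when a request hits the wrong server.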
[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605224#comment-13605224 ] nkeywal commented on HBASE-4955: It's funny, because if I do: mvn clean test -Dsurefire.part2.skip=true -q -PrunAllTests -Dsurefire.part1.forkCount=10 the number of executed tests is a random number above 600, while with: mvn clean test -Dsurefire.part2.skip=true -q -PrunAllTests -Dsurefire.part1.forkCount=1 it's always 543. Less parallelism == less randomness (logical) but fewer tests executed (not logical). I don't reproduce it in a Surefire unit test. I'm going to try a little bit more; then we will have the option to wait for 2.15, hoping it will be identified and fixed. > Use the official versions of surefire & junit > - > > Key: HBASE-4955 > URL: https://issues.apache.org/jira/browse/HBASE-4955 > Project: HBase > Issue Type: Improvement > Components: test >Affects Versions: 0.94.0 > Environment: all >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.96.0 > > Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, > 4955.v2.patch, 4955.v2.patch > > > We have used private versions of Surefire & JUnit since HBASE-4763. > This JIRA tracks what we need to move to official versions. > Surefire 2.11 is just out, but, after some tests, it does not contain > everything we need. > JUnit: Could be for JUnit 4.11. Issue to monitor: > https://github.com/KentBeck/junit/issues/359: fixed in our version, no > feedback for an integration on trunk > Surefire: Could be for Surefire 2.12. Issues to monitor are: > 329 (category support): fixed, we use the official implementation from the > trunk > 786 (@Category with forkMode=always): fixed, we use the official > implementation from the trunk > 791 (incorrect elapsed time on test failure): fixed, we use the official > implementation from the trunk > 793 (incorrect time in the XML report): Not fixed (reopened) on trunk, fixed on > our version. 
> 760 (does not take into account the test method): fixed in trunk, not fixed > in our version > 798 (print immediately the test class name): not fixed in trunk, not fixed in > our version > 799 (Allow test parallelization when forkMode=always): not fixed in trunk, > not fixed in our version > 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, > fixed on our version > 800 & 793 are the most important to monitor: they are the only ones that are > fixed in our version but not on trunk.
[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605183#comment-13605183 ] nkeywal commented on HBASE-8135: Agreed, I will do that on commit. Are you +1 otherwise? > Mutation should implement HeapSize > -- > > Key: HBASE-8135 > URL: https://issues.apache.org/jira/browse/HBASE-8135 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0, 0.94.5 >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0, 0.96.0 > > Attachments: 8135.v1.patch > > > Code is there already. > Doing so would allow sharing some code when doing client-side buffering. > The patch compiles locally and should not impact tests, but has not been tested locally.
[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605139#comment-13605139 ] nkeywal commented on HBASE-4955: Likely bad news... Among the missing tests, we have this: @RunWith(Parameterized.class) @Category(SmallTests.class) public class TestFixedFileTrailer { i.e. there could be issues with parameterized tests (and that might not be enough to explain the 200 missing tests). Looking... > Use the official versions of surefire & junit > - > > Key: HBASE-4955 > URL: https://issues.apache.org/jira/browse/HBASE-4955 > Project: HBase > Issue Type: Improvement > Components: test >Affects Versions: 0.94.0 > Environment: all >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.96.0 > > Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, > 4955.v2.patch, 4955.v2.patch > > > We have used private versions of Surefire & JUnit since HBASE-4763. > This JIRA tracks what we need to move to official versions. > Surefire 2.11 is just out, but, after some tests, it does not contain > everything we need. > JUnit: Could be for JUnit 4.11. Issue to monitor: > https://github.com/KentBeck/junit/issues/359: fixed in our version, no > feedback for an integration on trunk > Surefire: Could be for Surefire 2.12. Issues to monitor are: > 329 (category support): fixed, we use the official implementation from the > trunk > 786 (@Category with forkMode=always): fixed, we use the official > implementation from the trunk > 791 (incorrect elapsed time on test failure): fixed, we use the official > implementation from the trunk > 793 (incorrect time in the XML report): Not fixed (reopened) on trunk, fixed on > our version. 
> 760 (does not take into account the test method): fixed in trunk, not fixed > in our version > 798 (print immediately the test class name): not fixed in trunk, not fixed in > our version > 799 (Allow test parallelization when forkMode=always): not fixed in trunk, > not fixed in our version > 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, > fixed on our version > 800 & 793 are the most important to monitor: they are the only ones that are > fixed in our version but not on trunk.
[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605046#comment-13605046 ] nkeywal commented on HBASE-7590: v12 with the comments on RB from Devaraj taken into account. Nearly there! > Add a costless notifications mechanism from master to regionservers & clients > - > > Key: HBASE-7590 > URL: https://issues.apache.org/jira/browse/HBASE-7590 > Project: HBase > Issue Type: Bug > Components: Client, master, regionserver >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, > 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, > 7590.v5.patch, 7590.v5.patch > > > It would be very useful to add a mechanism to distribute some information to > the clients and regionservers. In particular, it would be useful to know globally > (regionservers + client apps) that some regionservers are dead. This would > allow: > - to lower the load on the system, without clients using stale information > and going to dead machines > - to make the recovery faster from a client point of view. It's common to use > large timeouts on the client side, so the client may need a lot of time > before declaring a region server dead and trying another one. If the client > receives the information separately about a region server's state, it can take > the right decision, and continue/stop waiting accordingly. > We can also send more information, for example instructions like 'slow down' > to instruct the client to increase the retry delay and so on. > Technically, the master could send this information. To lower the load on > the system, we should: > - have a multicast communication (i.e. the master does not have to connect to > all servers by tcp), with one packet every 10 seconds or so. > - receivers should not depend on this: if the information is available, great. > If not, it should not break anything. 
> - it should be optional. > So at the end we would have a thread in the master sending a protobuf message > about the dead servers on a multicast socket. If the socket is not > configured, it does not do anything. On the client side, when we receive > information that a node is dead, we refresh the cache about it.
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590: --- Status: Patch Available (was: Open) > Add a costless notifications mechanism from master to regionservers & clients > - > > Key: HBASE-7590 > URL: https://issues.apache.org/jira/browse/HBASE-7590 > Project: HBase > Issue Type: Bug > Components: Client, master, regionserver >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, > 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, > 7590.v5.patch, 7590.v5.patch > > > It would be very useful to add a mechanism to distribute some information to > the clients and regionservers. In particular, it would be useful to know globally > (regionservers + client apps) that some regionservers are dead. This would > allow: > - to lower the load on the system, without clients using stale information > and going to dead machines > - to make the recovery faster from a client point of view. It's common to use > large timeouts on the client side, so the client may need a lot of time > before declaring a region server dead and trying another one. If the client > receives the information separately about a region server's state, it can take > the right decision, and continue/stop waiting accordingly. > We can also send more information, for example instructions like 'slow down' > to instruct the client to increase the retry delay and so on. > Technically, the master could send this information. To lower the load on > the system, we should: > - have a multicast communication (i.e. the master does not have to connect to > all servers by tcp), with one packet every 10 seconds or so. > - receivers should not depend on this: if the information is available, great. > If not, it should not break anything. > - it should be optional. 
> So at the end we would have a thread in the master sending a protobuf message > about the dead servers on a multicast socket. If the socket is not > configured, it does not do anything. On the client side, when we receive > information that a node is dead, we refresh the cache about it.
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590: --- Status: Open (was: Patch Available) > Add a costless notifications mechanism from master to regionservers & clients > - > > Key: HBASE-7590 > URL: https://issues.apache.org/jira/browse/HBASE-7590 > Project: HBase > Issue Type: Bug > Components: Client, master, regionserver >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, > 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, > 7590.v5.patch, 7590.v5.patch > > > It would be very useful to add a mechanism to distribute some information to > the clients and regionservers. In particular, it would be useful to know globally > (regionservers + client apps) that some regionservers are dead. This would > allow: > - to lower the load on the system, without clients using stale information > and going to dead machines > - to make the recovery faster from a client point of view. It's common to use > large timeouts on the client side, so the client may need a lot of time > before declaring a region server dead and trying another one. If the client > receives the information separately about a region server's state, it can take > the right decision, and continue/stop waiting accordingly. > We can also send more information, for example instructions like 'slow down' > to instruct the client to increase the retry delay and so on. > Technically, the master could send this information. To lower the load on > the system, we should: > - have a multicast communication (i.e. the master does not have to connect to > all servers by tcp), with one packet every 10 seconds or so. > - receivers should not depend on this: if the information is available, great. > If not, it should not break anything. > - it should be optional. 
> So at the end we would have a thread in the master sending a protobuf message > about the dead servers on a multicast socket. If the socket is not > configured, it does not do anything. On the client side, when we receive > information that a node is dead, we refresh the cache about it.
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590: --- Attachment: 7590.v12.patch > Add a costless notifications mechanism from master to regionservers & clients > - > > Key: HBASE-7590 > URL: https://issues.apache.org/jira/browse/HBASE-7590 > Project: HBase > Issue Type: Bug > Components: Client, master, regionserver >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, > 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, > 7590.v5.patch, 7590.v5.patch > > > It would be very useful to add a mechanism to distribute some information to > the clients and regionservers. In particular, it would be useful to know globally > (regionservers + client apps) that some regionservers are dead. This would > allow: > - to lower the load on the system, without clients using stale information > and going to dead machines > - to make the recovery faster from a client point of view. It's common to use > large timeouts on the client side, so the client may need a lot of time > before declaring a region server dead and trying another one. If the client > receives the information separately about a region server's state, it can take > the right decision, and continue/stop waiting accordingly. > We can also send more information, for example instructions like 'slow down' > to instruct the client to increase the retry delay and so on. > Technically, the master could send this information. To lower the load on > the system, we should: > - have a multicast communication (i.e. the master does not have to connect to > all servers by tcp), with one packet every 10 seconds or so. > - receivers should not depend on this: if the information is available, great. > If not, it should not break anything. > - it should be optional. 
> So at the end we would have a thread in the master sending a protobuf message > about the dead servers on a multicast socket. If the socket is not > configured, it does not do anything. On the client side, when we receive > information that a node is dead, we refresh the cache about it.
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-7590:
---
    Attachment: 7590.v12.patch

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
>                 Key: HBASE-7590
>                 URL: https://issues.apache.org/jira/browse/HBASE-7590
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, master, regionserver
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>         Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch,
> 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch,
> 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to
> the clients and regionservers. In particular, it would be useful to know globally
> (regionservers + client apps) that some regionservers are dead. This would
> allow:
> - to lower the load on the system, without clients using stale information
> and going to dead machines
> - to make the recovery faster from a client point of view. It's common to use
> large timeouts on the client side, so the client may need a lot of time
> before declaring a region server dead and trying another one. If the client
> receives the information about a region server's state separately, it can take
> the right decision, and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down'
> to instruct the client to increase the retry delay and so on.
> Technically, the master could send this information. To lower the load on
> the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to
> all servers by TCP), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available, great.
> If not, it should not break anything.
> - it should be optional.
> So at the end we would have a thread in the master sending a protobuf message
> about the dead servers on a multicast socket. If the socket is not
> configured, it does not do anything. On the client side, when we receive the
> information that a node is dead, we refresh the cache about it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
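A minimal sketch of the client-side behaviour described in the last sentence of the issue, assuming a cache that maps region names to server names. The class DeadServerNotificationSketch and its methods are hypothetical and are not HBase APIs; the multicast transport and the protobuf message are deliberately left out, so only the "refresh the cache when a node is reported dead" step is illustrated.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side location cache; names are illustrative, not HBase classes. */
public class DeadServerNotificationSketch {
  // region name -> server currently believed to host it (plain strings for brevity)
  private final Map<String, String> regionToServer = new ConcurrentHashMap<>();

  public void cacheLocation(String region, String server) {
    regionToServer.put(region, server);
  }

  /** Returns the cached server for a region, or null if nothing is cached. */
  public String cachedLocation(String region) {
    return regionToServer.get(region);
  }

  /**
   * Called when a best-effort notification reports that a server is dead.
   * Drops every cached location pointing at that server so the next request
   * looks the region up again instead of waiting on a dead machine.
   * The returned count is only exact when no other thread updates the cache.
   */
  public int onServerDead(String deadServer) {
    int before = regionToServer.size();
    regionToServer.values().removeIf(deadServer::equals);
    return before - regionToServer.size();
  }

  public static void main(String[] args) {
    DeadServerNotificationSketch cache = new DeadServerNotificationSketch();
    cache.cacheLocation("region-a", "rs1");
    cache.cacheLocation("region-b", "rs2");
    System.out.println("dropped " + cache.onServerDead("rs1") + " cached location(s)");
  }
}
```

Because the notification is optional, nothing in the sketch depends on it arriving: a client that never receives one simply keeps its cache and falls back to its usual timeouts.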
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Status: Patch Available (was: Open) > Mutation should implement HeapSize > -- > > Key: HBASE-8135 > URL: https://issues.apache.org/jira/browse/HBASE-8135 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.94.5, 0.95.0, 0.96.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0, 0.96.0 > > Attachments: 8135.v1.patch > > > Code is there already. > Doing so would allow to share some code when doing client side buffering. > patch compiles locally, should not impact tests, but not tested locally. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize
[ https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8135: --- Attachment: 8135.v1.patch > Mutation should implement HeapSize > -- > > Key: HBASE-8135 > URL: https://issues.apache.org/jira/browse/HBASE-8135 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0, 0.94.5 >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0, 0.96.0 > > Attachments: 8135.v1.patch > > > Code is there already. > Doing so would allow to share some code when doing client side buffering. > patch compiles locally, should not impact tests, but not tested locally. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8135) Mutation should implement HeapSize
nkeywal created HBASE-8135: -- Summary: Mutation should implement HeapSize Key: HBASE-8135 URL: https://issues.apache.org/jira/browse/HBASE-8135 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.94.5, 0.95.0, 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.95.0, 0.96.0 Attachments: 8135.v1.patch Code is there already. Doing so would allow to share some code when doing client side buffering. patch compiles locally, should not impact tests, but not tested locally. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4955: --- Status: Patch Available (was: Open) > Use the official versions of surefire & junit > - > > Key: HBASE-4955 > URL: https://issues.apache.org/jira/browse/HBASE-4955 > Project: HBase > Issue Type: Improvement > Components: test >Affects Versions: 0.94.0 > Environment: all >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.96.0 > > Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, > 4955.v2.patch, 4955.v2.patch > > > We currently use private versions for Surefire & JUnit since HBASE-4763. > This JIRA traks what we need to move to official versions. > Surefire 2.11 is just out, but, after some tests, it does not contain all > what we need. > JUnit. Could be for JUnit 4.11. Issue to monitor: > https://github.com/KentBeck/junit/issues/359: fixed in our version, no > feedback for an integration on trunk > Surefire: Could be for Surefire 2.12. Issues to monitor are: > 329 (category support): fixed, we use the official implementation from the > trunk > 786 (@Category with forkMode=always): fixed, we use the official > implementation from the trunk > 791 (incorrect elapsed time on test failure): fixed, we use the official > implementation from the trunk > 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on > our version. > 760 (does not take into account the test method): fixed in trunk, not fixed > in our version > 798 (print immediately the test class name): not fixed in trunk, not fixed in > our version > 799 (Allow test parallelization when forkMode=always): not fixed in trunk, > not fixed in our version > 800 (redirectTestOutputToFile not taken into account): not yet fix on trunk, > fixed on our version > 800 & 793 are the more important to monitor, it's the only ones that are > fixed in our version but not on trunk. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4955: --- Attachment: 4955.v2.patch > Use the official versions of surefire & junit > - > > Key: HBASE-4955 > URL: https://issues.apache.org/jira/browse/HBASE-4955 > Project: HBase > Issue Type: Improvement > Components: test >Affects Versions: 0.94.0 > Environment: all >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.96.0 > > Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, > 4955.v2.patch, 4955.v2.patch > > > We currently use private versions for Surefire & JUnit since HBASE-4763. > This JIRA traks what we need to move to official versions. > Surefire 2.11 is just out, but, after some tests, it does not contain all > what we need. > JUnit. Could be for JUnit 4.11. Issue to monitor: > https://github.com/KentBeck/junit/issues/359: fixed in our version, no > feedback for an integration on trunk > Surefire: Could be for Surefire 2.12. Issues to monitor are: > 329 (category support): fixed, we use the official implementation from the > trunk > 786 (@Category with forkMode=always): fixed, we use the official > implementation from the trunk > 791 (incorrect elapsed time on test failure): fixed, we use the official > implementation from the trunk > 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on > our version. > 760 (does not take into account the test method): fixed in trunk, not fixed > in our version > 798 (print immediately the test class name): not fixed in trunk, not fixed in > our version > 799 (Allow test parallelization when forkMode=always): not fixed in trunk, > not fixed in our version > 800 (redirectTestOutputToFile not taken into account): not yet fix on trunk, > fixed on our version > 800 & 793 are the more important to monitor, it's the only ones that are > fixed in our version but not on trunk. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4955: --- Status: Open (was: Patch Available) > Use the official versions of surefire & junit > - > > Key: HBASE-4955 > URL: https://issues.apache.org/jira/browse/HBASE-4955 > Project: HBase > Issue Type: Improvement > Components: test >Affects Versions: 0.94.0 > Environment: all >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.96.0 > > Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, > 4955.v2.patch > > > We currently use private versions for Surefire & JUnit since HBASE-4763. > This JIRA traks what we need to move to official versions. > Surefire 2.11 is just out, but, after some tests, it does not contain all > what we need. > JUnit. Could be for JUnit 4.11. Issue to monitor: > https://github.com/KentBeck/junit/issues/359: fixed in our version, no > feedback for an integration on trunk > Surefire: Could be for Surefire 2.12. Issues to monitor are: > 329 (category support): fixed, we use the official implementation from the > trunk > 786 (@Category with forkMode=always): fixed, we use the official > implementation from the trunk > 791 (incorrect elapsed time on test failure): fixed, we use the official > implementation from the trunk > 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on > our version. > 760 (does not take into account the test method): fixed in trunk, not fixed > in our version > 798 (print immediately the test class name): not fixed in trunk, not fixed in > our version > 799 (Allow test parallelization when forkMode=always): not fixed in trunk, > not fixed in our version > 800 (redirectTestOutputToFile not taken into account): not yet fix on trunk, > fixed on our version > 800 & 793 are the more important to monitor, it's the only ones that are > fixed in our version but not on trunk. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604981#comment-13604981 ]

nkeywal commented on HBASE-4955:

bq. Tests run: 35, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 281.479 sec
Useless log lines. That's SUREFIRE-969.

bq. Took 2 mo. 7 d.
That's SUREFIRE-970.

bq. 2,158 tests (-274)
It should be the flakiness of TestHTableMultiplexer.testHTableMultiplexer.

I'm going to retry because of point 3. For the first two, I tend to think they should not prevent us from committing. We don't have any issue today because I built a version that included everything we need. If we want to come back to an official version, we need to compromise. We can expect these points to be solved in a later version, but these later versions can also include regressions. We need to jump in at some point, and we've been waiting for more than a year now.

> Use the official versions of surefire & junit
> -
>
>                 Key: HBASE-4955
>                 URL: https://issues.apache.org/jira/browse/HBASE-4955
>             Project: HBase
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.94.0
>         Environment: all
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>             Fix For: 0.96.0
>
>         Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch,
> 4955.v2.patch
>
>
> We currently use private versions for Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need to move to official versions.
> Surefire 2.11 is just out, but, after some tests, it does not contain
> everything we need.
> JUnit. Could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official
> implementation from the trunk
> 793 (incorrect time in the XML report): not fixed (reopened) on trunk, fixed in
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk,
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk,
> fixed in our version
> 800 & 793 are the most important to monitor: they are the only ones that are
> fixed in our version but not on trunk.
[jira] [Updated] (HBASE-8128) HTable#put improvements
[ https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8128: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) > HTable#put improvements > --- > > Key: HBASE-8128 > URL: https://issues.apache.org/jira/browse/HBASE-8128 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Trivial > Fix For: 0.95.0, 0.96.0 > > Attachments: 8128.v1.patch > > > 3 points: > - When doing a single put, we're creating an object by calling Arrays.asList > - we're doing a size check every 10 put. Not doing it seems simpler, better > and allows to share some code between a single put and a list of puts. > - we could call flushCommits on empty write buffer, especially for someone > using a lot of HTable instead of using a pool, as it's called in close(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
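The three points of HBASE-8128 can be sketched roughly as follows. This is not the HTable code and not the content of 8128.v1.patch: WriteBufferSketch, its String "mutations" and its byte accounting are hypothetical stand-ins, and the third point is read here as "skip the flush when the write buffer is empty".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side write buffer; illustrative only, not HTable. */
public class WriteBufferSketch {
  private final List<String> writeBuffer = new ArrayList<>();
  private final long maxBufferedBytes;
  private long bufferedBytes = 0;
  private int flushCount = 0;

  public WriteBufferSketch(long maxBufferedBytes) {
    this.maxBufferedBytes = maxBufferedBytes;
  }

  /** Single put: no wrapper list is allocated just to reuse the list path. */
  public void put(String mutation) {
    buffer(mutation);
    flushIfNeeded();
  }

  /** List of puts: shares buffer() and flushIfNeeded() with the single put. */
  public void put(List<String> mutations) {
    for (String m : mutations) {
      buffer(m);
    }
    flushIfNeeded();
  }

  private void buffer(String mutation) {
    writeBuffer.add(mutation);
    bufferedBytes += mutation.length(); // stand-in for a heap size estimate
  }

  /** One size check per call, instead of a check every N puts. */
  private void flushIfNeeded() {
    if (bufferedBytes > maxBufferedBytes) {
      flushCommits();
    }
  }

  /** Does nothing when the buffer is empty, so close() on an idle instance is cheap. */
  public void flushCommits() {
    if (writeBuffer.isEmpty()) {
      return;
    }
    writeBuffer.clear(); // a real client would send the mutations here
    bufferedBytes = 0;
    flushCount++;
  }

  public void close() {
    flushCommits();
  }

  public int getFlushCount() {
    return flushCount;
  }
}
```

With this shape, the single put and the list put differ only in how they fill the buffer, which is the code sharing the issue asks for.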
[jira] [Commented] (HBASE-8128) HTable#put improvements
[ https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604970#comment-13604970 ] nkeywal commented on HBASE-8128: bq. Sound simple and efficient... Will it be possible to have it for 0.94 too? The patch should be directly applicable, so I can do it. As you and Lars want. bq. These classes could do with a general revamp. Agreed. I'm actually studying this currently. Committed in trunk and 0.95, thanks for the reviews! > HTable#put improvements > --- > > Key: HBASE-8128 > URL: https://issues.apache.org/jira/browse/HBASE-8128 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Trivial > Fix For: 0.95.0, 0.96.0 > > Attachments: 8128.v1.patch > > > 3 points: > - When doing a single put, we're creating an object by calling Arrays.asList > - we're doing a size check every 10 put. Not doing it seems simpler, better > and allows to share some code between a single put and a list of puts. > - we could call flushCommits on empty write buffer, especially for someone > using a lot of HTable instead of using a pool, as it's called in close(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8128) HTable#put improvements
nkeywal created HBASE-8128: -- Summary: HTable#put improvements Key: HBASE-8128 URL: https://issues.apache.org/jira/browse/HBASE-8128 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.95.0, 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Trivial Fix For: 0.95.0, 0.96.0 Attachments: 8128.v1.patch 3 points: - When doing a single put, we're creating an object by calling Arrays.asList - we're doing a size check every 10 put. Not doing it seems simpler, better and allows to share some code between a single put and a list of puts. - we could call flushCommits on empty write buffer, especially for someone using a lot of HTable instead of using a pool, as it's called in close(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8128) HTable#put improvements
[ https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8128: --- Status: Patch Available (was: Open) > HTable#put improvements > --- > > Key: HBASE-8128 > URL: https://issues.apache.org/jira/browse/HBASE-8128 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Trivial > Fix For: 0.95.0, 0.96.0 > > Attachments: 8128.v1.patch > > > 3 points: > - When doing a single put, we're creating an object by calling Arrays.asList > - we're doing a size check every 10 put. Not doing it seems simpler, better > and allows to share some code between a single put and a list of puts. > - we could call flushCommits on empty write buffer, especially for someone > using a lot of HTable instead of using a pool, as it's called in close(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8128) HTable#put improvements
[ https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8128: --- Attachment: 8128.v1.patch > HTable#put improvements > --- > > Key: HBASE-8128 > URL: https://issues.apache.org/jira/browse/HBASE-8128 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 0.95.0, 0.96.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Trivial > Fix For: 0.95.0, 0.96.0 > > Attachments: 8128.v1.patch > > > 3 points: > - When doing a single put, we're creating an object by calling Arrays.asList > - we're doing a size check every 10 put. Not doing it seems simpler, better > and allows to share some code between a single put and a list of puts. > - we could call flushCommits on empty write buffer, especially for someone > using a lot of HTable instead of using a pool, as it's called in close(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4955: --- Attachment: 4955.v2.patch > Use the official versions of surefire & junit > - > > Key: HBASE-4955 > URL: https://issues.apache.org/jira/browse/HBASE-4955 > Project: HBase > Issue Type: Improvement > Components: test >Affects Versions: 0.94.0 > Environment: all >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0 > > Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, > 4955.v2.patch > > > We currently use private versions for Surefire & JUnit since HBASE-4763. > This JIRA traks what we need to move to official versions. > Surefire 2.11 is just out, but, after some tests, it does not contain all > what we need. > JUnit. Could be for JUnit 4.11. Issue to monitor: > https://github.com/KentBeck/junit/issues/359: fixed in our version, no > feedback for an integration on trunk > Surefire: Could be for Surefire 2.12. Issues to monitor are: > 329 (category support): fixed, we use the official implementation from the > trunk > 786 (@Category with forkMode=always): fixed, we use the official > implementation from the trunk > 791 (incorrect elapsed time on test failure): fixed, we use the official > implementation from the trunk > 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on > our version. > 760 (does not take into account the test method): fixed in trunk, not fixed > in our version > 798 (print immediately the test class name): not fixed in trunk, not fixed in > our version > 799 (Allow test parallelization when forkMode=always): not fixed in trunk, > not fixed in our version > 800 (redirectTestOutputToFile not taken into account): not yet fix on trunk, > fixed on our version > 800 & 793 are the more important to monitor, it's the only ones that are > fixed in our version but not on trunk. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4955: --- Fix Version/s: (was: 0.95.0) 0.96.0 Status: Patch Available (was: Open) > Use the official versions of surefire & junit > - > > Key: HBASE-4955 > URL: https://issues.apache.org/jira/browse/HBASE-4955 > Project: HBase > Issue Type: Improvement > Components: test >Affects Versions: 0.94.0 > Environment: all >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.96.0 > > Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, > 4955.v2.patch > > > We currently use private versions for Surefire & JUnit since HBASE-4763. > This JIRA traks what we need to move to official versions. > Surefire 2.11 is just out, but, after some tests, it does not contain all > what we need. > JUnit. Could be for JUnit 4.11. Issue to monitor: > https://github.com/KentBeck/junit/issues/359: fixed in our version, no > feedback for an integration on trunk > Surefire: Could be for Surefire 2.12. Issues to monitor are: > 329 (category support): fixed, we use the official implementation from the > trunk > 786 (@Category with forkMode=always): fixed, we use the official > implementation from the trunk > 791 (incorrect elapsed time on test failure): fixed, we use the official > implementation from the trunk > 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on > our version. > 760 (does not take into account the test method): fixed in trunk, not fixed > in our version > 798 (print immediately the test class name): not fixed in trunk, not fixed in > our version > 799 (Allow test parallelization when forkMode=always): not fixed in trunk, > not fixed in our version > 800 (redirectTestOutputToFile not taken into account): not yet fix on trunk, > fixed on our version > 800 & 793 are the more important to monitor, it's the only ones that are > fixed in our version but not on trunk. 
[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4955: --- Status: Open (was: Patch Available) > Use the official versions of surefire & junit > - > > Key: HBASE-4955 > URL: https://issues.apache.org/jira/browse/HBASE-4955 > Project: HBase > Issue Type: Improvement > Components: test >Affects Versions: 0.94.0 > Environment: all >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0 > > Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch > > > We currently use private versions for Surefire & JUnit since HBASE-4763. > This JIRA traks what we need to move to official versions. > Surefire 2.11 is just out, but, after some tests, it does not contain all > what we need. > JUnit. Could be for JUnit 4.11. Issue to monitor: > https://github.com/KentBeck/junit/issues/359: fixed in our version, no > feedback for an integration on trunk > Surefire: Could be for Surefire 2.12. Issues to monitor are: > 329 (category support): fixed, we use the official implementation from the > trunk > 786 (@Category with forkMode=always): fixed, we use the official > implementation from the trunk > 791 (incorrect elapsed time on test failure): fixed, we use the official > implementation from the trunk > 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on > our version. > 760 (does not take into account the test method): fixed in trunk, not fixed > in our version > 798 (print immediately the test class name): not fixed in trunk, not fixed in > our version > 799 (Allow test parallelization when forkMode=always): not fixed in trunk, > not fixed in our version > 800 (redirectTestOutputToFile not taken into account): not yet fix on trunk, > fixed on our version > 800 & 793 are the more important to monitor, it's the only ones that are > fixed in our version but not on trunk. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-8097) MetaServerShutdownHandler may potentially keep bumping up DeadServer.numProcessing
[ https://issues.apache.org/jira/browse/HBASE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603218#comment-13603218 ]

nkeywal commented on HBASE-8097:

bq. The timestamp updated in DeadServer.add seems bogus to me.
You're right. It would be a putIfAbsent if it were a ConcurrentMap. The methods are synchronized and it should be critical, so the following implementation should be correct:

{code}
  /**
   * Adds the server to the dead server list if it's not there already.
   * @param sn the server name
   */
  public synchronized void add(ServerName sn) {
    this.numProcessing++;
    if (!deadServers.containsKey(sn)){
      deadServers.put(sn, EnvironmentEdgeManager.currentTimeMillis());
    }
  }
{code}

Tell me if you want me to create another JIRA for this or if you don't mind adding this into this JIRA.

For numProcessing, it seems there are several bugs as well: if there is an exception anywhere (for example in verifyAndAssignMetaWithRetries()), we end up in a broken state: we haven't decreased numProcessing but we're not working on the server anymore. If we want to be exception safe (as ServerShutdownHandler is), we need the finally, imho. I can't find a better solution than:

{code}
  @Override
  public void process() throws IOException {
    boolean gotException = true;
    try {
      try {
        LOG.info("Splitting META logs for " + serverName);
        if (this.shouldSplitHlog) {
          this.services.getMasterFileSystem().splitMetaLog(serverName);
        }
      } catch (IOException ioe) {
        this.deadServers.add(serverName);
        this.services.getExecutorService().submit(this);
        throw new IOException("failed log splitting for " + serverName + ", will retry", ioe);
      }

      // Assign root and meta if we were carrying them.
      if (isCarryingMeta()) { // .META.
        // Check again: region may be assigned to other where because of RIT
        // timeout
        if (this.services.getAssignmentManager().isCarryingMeta(serverName)) {
          LOG.info("Server " + serverName + " was carrying META. Trying to assign.");
          this.services.getAssignmentManager().regionOffline(
              HRegionInfo.FIRST_META_REGIONINFO);
          verifyAndAssignMetaWithRetries();
        } else {
          LOG.info("META has been assigned to otherwhere, skip assigning.");
        }
      }
      gotException = false;
    } finally {
      if (gotException){
        // If we had an exception we can't rely on super.process to say we finished the process.
        this.deadServers.finish(serverName);
      }
    }
    super.process();
  }
{code}

I can't say I like it, but it should do the job...

> MetaServerShutdownHandler may potentially keep bumping up
> DeadServer.numProcessing
> --
>
>                 Key: HBASE-8097
>                 URL: https://issues.apache.org/jira/browse/HBASE-8097
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jeffrey Zhong
>            Assignee: Jeffrey Zhong
>             Fix For: 0.96.0
>
>         Attachments: 8097.txt, hbase-8097_1.patch
>
>
> {code}
>     } catch (IOException ioe) {
>       this.services.getExecutorService().submit(this);
>       this.deadServers.add(serverName);
>       throw new IOException("failed log splitting for " +
>           serverName + ", will retry", ioe);
>     }
> {code}
> this.deadServers.add(serverName); will keep incrementing
> DeadServer.numProcessing
> We can't get rid of numProcessing by just checking deadServers.size() because
> deadServers is also used to report some historically failed RSs.
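To make the counter bookkeeping in the comment above easier to follow, here is a self-contained sketch under simplifying assumptions: server names are plain strings, the clock is passed in as a parameter, and the trailing finish() call stands in for super.process(), which is assumed here to call finish on the normal path. DeadServerSketch and its Step interface are hypothetical names, not HBase classes.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the dead-server bookkeeping; not the HBase DeadServer class. */
public class DeadServerSketch {
  /** A unit of recovery work that may fail. */
  public interface Step {
    void run() throws IOException;
  }

  private final Map<String, Long> deadServers = new HashMap<>();
  private int numProcessing = 0;

  /** Counts one more handler in flight; records the time of death only the first time. */
  public synchronized void add(String serverName, long now) {
    numProcessing++;
    if (!deadServers.containsKey(serverName)) {
      deadServers.put(serverName, now);
    }
  }

  public synchronized void finish(String serverName) {
    numProcessing--;
  }

  public synchronized int getNumProcessing() {
    return numProcessing;
  }

  public synchronized Long getTimeOfDeath(String serverName) {
    return deadServers.get(serverName);
  }

  /** Runs the work and keeps numProcessing balanced whether or not it throws. */
  public void process(String serverName, long now, Step work) throws IOException {
    boolean gotException = true;
    try {
      try {
        work.run();
      } catch (IOException ioe) {
        add(serverName, now); // accounts for the retry a real handler would resubmit
        throw new IOException("failed for " + serverName + ", will retry", ioe);
      }
      gotException = false;
    } finally {
      if (gotException) {
        finish(serverName); // this attempt is over even though it failed
      }
    }
    finish(serverName); // stand-in for super.process() on the normal path
  }
}
```

In this model a failed attempt increments the counter once (for the retry) and decrements it once (for the attempt that just ended), so the count returns to zero only after a retry finally succeeds, and the recorded time of death is never overwritten.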
[jira] [Commented] (HBASE-8105) RegionServer Doesn't Rejoin Cluster after Netsplit
[ https://issues.apache.org/jira/browse/HBASE-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602355#comment-13602355 ]

nkeywal commented on HBASE-8105:

I suppose you have a YouAreDeadException in the logs? This would be expected. The logic is that the region server cannot be trusted anymore as it was ejected from the cluster. Then yes, it could abort. On the other hand you may want to look at it in detail. Personally I would prefer to abort, to be sure I don't have clients trying to use this dead server.

Note that for questions or discussions, it's better to use the user mailing list.

> RegionServer Doesn't Rejoin Cluster after Netsplit
> --
>
>                 Key: HBASE-8105
>                 URL: https://issues.apache.org/jira/browse/HBASE-8105
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.92.1
>         Environment: Linux Ubuntu 10.04 LTS
>            Reporter: philo vivero
>
> Running a 15-node HBase cluster. Testing various failure scenarios. Segregate
> one RegionServer from the cluster by firewalling off every port except SSH
> (because we need to be able to re-enable the node later).
> After the RS is automatically removed from the cluster, we re-enable all
> ports again, but RS never rejoins the cluster.
> I suspect this may be the desired behaviour, but haven't found proof
> so far. The code doesn't have any comment indicating this is the behaviour
> desired:
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.92.2/org/apache/hadoop/hbase/regionserver/HRegionServer.java/
> See lines starting at 624, public void run(). It makes it through the first
> try/catch block, but then loops inside the second try/catch block. Our
> hypothesis is that it never gets out naturally.
> If we bounce the RegionServer process, then it rejoins the cluster.
[jira] [Commented] (HBASE-8025) zkcli fails when SERVER_GC_OPTS is enabled
[ https://issues.apache.org/jira/browse/HBASE-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602305#comment-13602305 ]

nkeywal commented on HBASE-8025:

Reading the patch, it seems OK to me. I will test it and commit to trunk and 0.95 tomorrow if there is no objection. I will wait for Lars for 0.94, but it seems it should be committed there as well. Also, Jean-Marc and Dave, if you want to study it more (cf. suggestion above), I will wait for you.

> zkcli fails when SERVER_GC_OPTS is enabled
> --
>
> Key: HBASE-8025
> URL: https://issues.apache.org/jira/browse/HBASE-8025
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.4
> Reporter: Dave Latham
> Fix For: 0.95.0, 0.98.0, 0.94.7
>
> Attachments: HBASE-8025-0.94.patch
>
>
> HBASE-7091 added logic to separate GC logging options for some client commands versus server commands. It uses a list of known client commands ("shell" "hbck" "hlog" "hfile" "zkcli") and uses the server GC logging options for all other invocations of bin/hbase. When zkcli is invoked, it in turn invokes "hbase org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServerArg" to gather the server command line arguments, but because org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServerArg is not on the white list, it enables server GC logging, which produces extra output that breaks the zkcli invocation. HBASE-7153 addressed this, but the fix only solved the array syntax, not the white list, so the zkcli command still fails.
> There are many other tools you can invoke that are more likely to need "client" than "server" options. For example, "bin/hbase org.jruby.Main region_mover.rb" or "bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable" or "bin/hbase version" or "bin/hbase org.apache.hadoop.hbase.mapreduce.Export". The whitelist of server commands is shorter and easier to maintain than a whitelist of client commands.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8101) Cleanup: findbugs and javadoc warning fixes as well as making it illegal passing null row to Put/Delete, etc.
[ https://issues.apache.org/jira/browse/HBASE-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602173#comment-13602173 ]

nkeywal commented on HBASE-8101:

  @Override
+ public int hashCode() {
+   // TODO: This is wrong. Can't have two gets the same just because on same row. But it
+   // matches how equals works currently and gets rid of the findbugs warning.
+   return this.getRow().hashCode();
+ }
=> You shouldn't call hashCode on an array; you could call java.util.Arrays.hashCode instead.

+ public Increment(final byte [] row, final int offset, final int length) {
+   if (row == null || length <= 0 || length > HConstants.MAX_ROW_LENGTH) {
      throw new IllegalArgumentException("Row key is invalid");
    }
=> When it happens in production, I like to have the actual values (i.e. row= offset= & so on ;-)

+ @edu.umd.cs.findbugs.annotations.SuppressWarnings(
+   value="CN_IDIOM_NO_SUPER_CALL",
+   justification="Its PITA calling the super.clone")
=> There is a good reason for this warning: subclasses won't be able to call super.clone themselves if we do that (the type will be wrong: Object.clone creates the right object). As it's private (i.e. we don't offer a public API that should be subclassed), I guess it's acceptable. At the very least we should put a warning in the justification.

+1 otherwise, thanks for doing this!

> Cleanup: findbugs and javadoc warning fixes as well as making it illegal passing null row to Put/Delete, etc.
> -
>
> Key: HBASE-8101
> URL: https://issues.apache.org/jira/browse/HBASE-8101
> Project: HBase
> Issue Type: Sub-task
> Components: IPC/RPC
> Reporter: stack
> Fix For: 0.95.0
>
> Attachments: 8101.txt, 8101v2.txt
>
>
> Part of hbase-7900 broken out so that patch gets smaller. This is a patch with cleanup, mostly findbugs fixes (general ones), as well as adding a check for a null row being passed to Put, Get, etc. This patch helps rpc along.
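The first two review points on HBASE-8101 (content-based hashing of a byte[] row, and an exception message that carries the actual values) can be illustrated with a small Java sketch. RowOp and its MAX_ROW_LENGTH constant are hypothetical stand-ins for illustration, not the classes touched by the patch, and the extra offset check is an assumption added so the sketch is safe to run:

```java
import java.util.Arrays;

/** Hypothetical stand-in for a row-keyed operation; not the HBase Get/Increment class. */
final class RowOp {
    // Assumed limit for illustration only; the patch uses HConstants.MAX_ROW_LENGTH.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    private final byte[] row;

    RowOp(byte[] row, int offset, int length) {
        // The offset checks are an addition of this sketch, not part of the quoted patch.
        if (row == null || offset < 0 || length <= 0 || length > MAX_ROW_LENGTH
                || offset > row.length - length) {
            // Report the actual values so a production failure can be diagnosed.
            throw new IllegalArgumentException("Row key is invalid: row="
                + (row == null ? "null" : "byte[" + row.length + "]")
                + ", offset=" + offset + ", length=" + length);
        }
        this.row = Arrays.copyOfRange(row, offset, offset + length);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof RowOp && Arrays.equals(row, ((RowOp) o).row);
    }

    @Override
    public int hashCode() {
        // Content-based hash; row.hashCode() would be identity-based and
        // would differ for two arrays holding the same bytes.
        return Arrays.hashCode(row);
    }

    public static void main(String[] args) {
        RowOp a = new RowOp(new byte[] {1, 2, 3}, 0, 3);
        RowOp b = new RowOp(new byte[] {1, 2, 3}, 0, 3);
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
    }
}
```

With Arrays.hashCode, two operations on equal row bytes hash identically, which keeps hashCode consistent with a content-based equals.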
[jira] [Commented] (HBASE-7840) Enhance the java it framework to start & stop a distributed hbase & hadoop cluster
[ https://issues.apache.org/jira/browse/HBASE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601454#comment-13601454 ] nkeywal commented on HBASE-7840: Waiting for review or +1 on this one, I can rebase if you like. > Enhance the java it framework to start & stop a distributed hbase & hadoop > cluster > --- > > Key: HBASE-7840 > URL: https://issues.apache.org/jira/browse/HBASE-7840 > Project: HBase > Issue Type: New Feature > Components: test >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Minor > Fix For: 0.95.0 > > Attachments: 7840.v1.patch, 7840.v3.patch > > > Needs are to use a development version of HBase & HDFS 1 & 2. > Ideally, should be nicely backportable to 0.94 to allow comparisons and > regression tests between versions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601449#comment-13601449 ]

nkeywal commented on HBASE-7590:

I will fix the 100 lines stuff on commit. Any +1 on the new version?

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
> Issue Type: Bug
> Components: Client, master, regionserver
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to the clients and regionservers. In particular, it would be useful to know globally (regionservers + client apps) that some regionservers are dead. This would allow:
> - to lower the load on the system, without clients using stale information and going to dead machines
> - to make the recovery faster from a client point of view. It's common to use large timeouts on the client side, so the client may need a lot of time before declaring a region server dead and trying another one. If the client receives the information about a region server's state separately, it can take the right decision, and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down' to instruct the client to increase the retry delay and so on.
> Technically, the master could send this information. To lower the load on the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to all servers by TCP), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available, great. If not, it should not break anything.
> - it should be optional.
> So at the end we would have a thread in the master sending a protobuf message about the dead servers on a multicast socket. If the socket is not configured, it does not do anything. On the client side, when we receive information that a node is dead, we refresh the cache about it.
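The client-side half of the HBASE-7590 description above (on learning that a node is dead, refresh the cache about it) can be sketched as follows. This is only an illustration under assumptions: RegionLocationCache, its String-keyed map, and the onDeadServers method are hypothetical names, not the HBase client API, and the multicast/protobuf transport is left out entirely:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side cache mapping a region name to the server hosting it. */
final class RegionLocationCache {
    private final Map<String, String> regionToServer = new ConcurrentHashMap<>();

    void put(String region, String server) {
        regionToServer.put(region, server);
    }

    /** Returns the cached server for a region, or null if the entry is unknown or was evicted. */
    String get(String region) {
        return regionToServer.get(region);
    }

    /**
     * Called when a dead-server notification arrives. Evicts every entry that points
     * to a dead server so the next lookup has to fetch a fresh location.
     * The returned count is exact only when no other thread modifies the cache concurrently.
     */
    int onDeadServers(Collection<String> deadServers) {
        int before = regionToServer.size();
        regionToServer.values().removeIf(deadServers::contains);
        return before - regionToServer.size();
    }

    public static void main(String[] args) {
        RegionLocationCache cache = new RegionLocationCache();
        cache.put("region-a", "rs1:60020");
        cache.put("region-b", "rs2:60020");
        int evicted = cache.onDeadServers(Arrays.asList("rs1:60020"));
        System.out.println("evicted=" + evicted + ", region-a=" + cache.get("region-a"));
    }
}
```

Because the notification is best effort, a missing packet only means the stale entry stays until the normal timeout path removes it, which matches the requirement that receivers should not depend on the mechanism.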
[jira] [Commented] (HBASE-8081) Backport HBASE-7213 (separate hlog for meta tables) to 0.94
[ https://issues.apache.org/jira/browse/HBASE-8081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600969#comment-13600969 ]

nkeywal commented on HBASE-8081:

In addition, there are a lot of critical scenarios that you can hit without separate logs:
- some blocks in the WAL may not be recoverable (corrupted, too many boxes missing). This risk is highly mitigated with a separate log. Without it, the whole cluster becomes unavailable when you're unlucky.
- if you run into HDFS issues during the recovery (an HDFS issue here being going to a dead datanode, something highly probable during a recovery), the recovery will be much slower.
- trying to run a recovery while .meta. is not available is also problematic.

Ensuring that .meta. comes back early simplifies a lot of critical scenarios. So having this in 0.94 is 'interesting' I would say :-).

> Backport HBASE-7213 (separate hlog for meta tables) to 0.94
> ---
>
> Key: HBASE-8081
> URL: https://issues.apache.org/jira/browse/HBASE-8081
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.94.5
> Reporter: Devaraj Das
> Assignee: Devaraj Das
> Fix For: 0.94.7
>
> Attachments: 7213-0.94-2.patch, 7213-0.94.patch
>
>
> I am interested in backporting HBASE-7213 to 0.94. It helps to address more of the MTTR story. Offline discussion with Lars indicated he is interested as well.
[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600250#comment-13600250 ] nkeywal commented on HBASE-7590: Comments taken into account, and I added the IOException instead of only ZooKeeperConnection exception... I add it on RB as well. > Add a costless notifications mechanism from master to regionservers & clients > - > > Key: HBASE-7590 > URL: https://issues.apache.org/jira/browse/HBASE-7590 > Project: HBase > Issue Type: Bug > Components: Client, master, regionserver >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7590.inprogress.patch, 7590.v1.patch, > 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, > 7590.v5.patch > > > t would be very useful to add a mechanism to distribute some information to > the clients and regionservers. Especially It would be useful to know globally > (regionservers + clients apps) that some regionservers are dead. This would > allow: > - to lower the load on the system, without clients using staled information > and going on dead machines > - to make the recovery faster from a client point of view. It's common to use > large timeouts on the client side, so the client may need a lot of time > before declaring a region server dead and trying another one. If the client > receives the information separatly about a region server states, it can take > the right decision, and continue/stop to wait accordingly. > We can also send more information, for example instructions like 'slow down' > to instruct the client to increase the retries delay and so on. > Technically, the master could send this information. To lower the load on > the system, we should: > - have a multicast communication (i.e. the master does not have to connect to > all servers by tcp), with once packet every 10 seconds or so. > - receivers should not depend on this: if the information is available great. 
> If not, it should not break anything. > - it should be optional. > So at the end we would have a thread in the master sending a protobuf message > about the dead servers on a multicast socket. If the socket is not > configured, it does not do anything. On the client side, when we receive an > information that a node is dead, we refresh the cache about it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590: --- Status: Patch Available (was: Open) > Add a costless notifications mechanism from master to regionservers & clients > - > > Key: HBASE-7590 > URL: https://issues.apache.org/jira/browse/HBASE-7590 > Project: HBase > Issue Type: Bug > Components: Client, master, regionserver >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7590.inprogress.patch, 7590.v1.patch, > 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, > 7590.v5.patch > > > t would be very useful to add a mechanism to distribute some information to > the clients and regionservers. Especially It would be useful to know globally > (regionservers + clients apps) that some regionservers are dead. This would > allow: > - to lower the load on the system, without clients using staled information > and going on dead machines > - to make the recovery faster from a client point of view. It's common to use > large timeouts on the client side, so the client may need a lot of time > before declaring a region server dead and trying another one. If the client > receives the information separatly about a region server states, it can take > the right decision, and continue/stop to wait accordingly. > We can also send more information, for example instructions like 'slow down' > to instruct the client to increase the retries delay and so on. > Technically, the master could send this information. To lower the load on > the system, we should: > - have a multicast communication (i.e. the master does not have to connect to > all servers by tcp), with once packet every 10 seconds or so. > - receivers should not depend on this: if the information is available great. > If not, it should not break anything. > - it should be optional. 
> So at the end we would have a thread in the master sending a protobuf message > about the dead servers on a multicast socket. If the socket is not > configured, it does not do anything. On the client side, when we receive an > information that a node is dead, we refresh the cache about it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590: --- Attachment: 7590.v5.patch > Add a costless notifications mechanism from master to regionservers & clients > - > > Key: HBASE-7590 > URL: https://issues.apache.org/jira/browse/HBASE-7590 > Project: HBase > Issue Type: Bug > Components: Client, master, regionserver >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7590.inprogress.patch, 7590.v1.patch, > 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, > 7590.v5.patch > > > t would be very useful to add a mechanism to distribute some information to > the clients and regionservers. Especially It would be useful to know globally > (regionservers + clients apps) that some regionservers are dead. This would > allow: > - to lower the load on the system, without clients using staled information > and going on dead machines > - to make the recovery faster from a client point of view. It's common to use > large timeouts on the client side, so the client may need a lot of time > before declaring a region server dead and trying another one. If the client > receives the information separatly about a region server states, it can take > the right decision, and continue/stop to wait accordingly. > We can also send more information, for example instructions like 'slow down' > to instruct the client to increase the retries delay and so on. > Technically, the master could send this information. To lower the load on > the system, we should: > - have a multicast communication (i.e. the master does not have to connect to > all servers by tcp), with once packet every 10 seconds or so. > - receivers should not depend on this: if the information is available great. > If not, it should not break anything. > - it should be optional. 
> So at the end we would have a thread in the master sending a protobuf message > about the dead servers on a multicast socket. If the socket is not > configured, it does not do anything. On the client side, when we receive an > information that a node is dead, we refresh the cache about it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7590: --- Status: Open (was: Patch Available) > Add a costless notifications mechanism from master to regionservers & clients > - > > Key: HBASE-7590 > URL: https://issues.apache.org/jira/browse/HBASE-7590 > Project: HBase > Issue Type: Bug > Components: Client, master, regionserver >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7590.inprogress.patch, 7590.v1.patch, > 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, > 7590.v5.patch > > > t would be very useful to add a mechanism to distribute some information to > the clients and regionservers. Especially It would be useful to know globally > (regionservers + clients apps) that some regionservers are dead. This would > allow: > - to lower the load on the system, without clients using staled information > and going on dead machines > - to make the recovery faster from a client point of view. It's common to use > large timeouts on the client side, so the client may need a lot of time > before declaring a region server dead and trying another one. If the client > receives the information separatly about a region server states, it can take > the right decision, and continue/stop to wait accordingly. > We can also send more information, for example instructions like 'slow down' > to instruct the client to increase the retries delay and so on. > Technically, the master could send this information. To lower the load on > the system, we should: > - have a multicast communication (i.e. the master does not have to connect to > all servers by tcp), with once packet every 10 seconds or so. > - receivers should not depend on this: if the information is available great. > If not, it should not break anything. > - it should be optional. 
> So at the end we would have a thread in the master sending a protobuf message > about the dead servers on a multicast socket. If the socket is not > configured, it does not do anything. On the client side, when we receive an > information that a node is dead, we refresh the cache about it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7327) Assignment Timeouts: Remove the code from the master
[ https://issues.apache.org/jira/browse/HBASE-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600109#comment-13600109 ] nkeywal commented on HBASE-7327: Yes (i.e. TOM is *not* activated by default) The idea is really to remove it but using baby steps. > Assignment Timeouts: Remove the code from the master > > > Key: HBASE-7327 > URL: https://issues.apache.org/jira/browse/HBASE-7327 > Project: HBase > Issue Type: Improvement > Components: master >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7327.v1.uncomplete.patch, 7327.v2.patch > > > As per HBASE-7247... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599849#comment-13599849 ]

nkeywal commented on HBASE-4955:

The regression in SUREFIRE-970 makes the move to 2.14 problematic. It would be nice to have SUREFIRE-969 as well but it's more mandatory. A bigger issue is that in my tests it seems that having multiple executions with different parameters does not work anymore. I will need to have a look at that to get it fixed in a release...

> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
> Issue Type: Improvement
> Components: test
> Affects Versions: 0.94.0
> Environment: all
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Minor
> Fix For: 0.95.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch
>
>
> We currently use private versions for Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need to move to official versions.
> Surefire 2.11 is just out, but, after some tests, it does not contain everything we need.
> JUnit. Could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the trunk
> 786 (@Category with forkMode=always): fixed, we use the official implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official implementation from the trunk
> 793 (incorrect time in the XML report): not fixed (reopened) on trunk, fixed in our version
> 760 (does not take into account the test method): fixed in trunk, not fixed in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, fixed in our version
> 800 & 793 are the most important to monitor: they are the only ones that are fixed in our version but not on trunk.
[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening
[ https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598749#comment-13598749 ]

nkeywal commented on HBASE-7247:

TimeOutManagement is now optional and deactivated by default. I will redo the measurements.

> Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening
> 
>
> Key: HBASE-7247
> URL: https://issues.apache.org/jira/browse/HBASE-7247
> Project: HBase
> Issue Type: Improvement
> Components: master, Region Assignment, regionserver
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Fix For: 0.95.0
>
> Attachments: 7247.v1.patch
>
>
> The regionserver.OpenRegionHandler#tickleOpening updates the region znode as "Do this so master doesn't timeout this region-in-transition.".
> However, on the usual test, this makes the assignment time of 1500 regions go from 70s to 100s, that is, we're 50% slower because of this.
> More generally, ZooKeeper commits every data update to disk, and this takes time. Using it to provide a keep-alive seems overkill. At the very least, it could be made asynchronous.
> I'm not sure how necessary these updates are (I need to go deeper into the internals, feedback welcome), but it seems very important to optimize this... The trivial fix would be to make this optional.
[jira] [Updated] (HBASE-7327) Assignment Timeouts: Remove the code from the master
[ https://issues.apache.org/jira/browse/HBASE-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7327: --- Resolution: Later Status: Resolved (was: Patch Available) Since the final decision was to make this optional, the code is still there, to be removed later. > Assignment Timeouts: Remove the code from the master > > > Key: HBASE-7327 > URL: https://issues.apache.org/jira/browse/HBASE-7327 > Project: HBase > Issue Type: Improvement > Components: master >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7327.v1.uncomplete.patch, 7327.v2.patch > > > As per HBASE-7247... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7927) Two versions of netty with hadoop.profile=2.0: 3.5.9 and 3.2.4
[ https://issues.apache.org/jira/browse/HBASE-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-7927: --- Assignee: nkeywal > Two versions of netty with hadoop.profile=2.0: 3.5.9 and 3.2.4 > -- > > Key: HBASE-7927 > URL: https://issues.apache.org/jira/browse/HBASE-7927 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > > I don't know why, but when you do a mvn dependency:tree, everything looks > fine. When you look at the generated target/cached_classpath.txt you see 2 > versions of netty: netty-3.2.4.Final.jar and netty-3.5.9.Final.jar. > This is bad and can lead to unpredictable behavior. > I haven't looked at the other dependencies. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7713) Maven build fails for hbase-common on windows environment
[ https://issues.apache.org/jira/browse/HBASE-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598736#comment-13598736 ] nkeywal commented on HBASE-7713: No feedback. Closing, @[~rdoppalapudi], please reopen if you think differently. > Maven build fails for hbase-common on windows environment > - > > Key: HBASE-7713 > URL: https://issues.apache.org/jira/browse/HBASE-7713 > Project: HBase > Issue Type: Bug > Environment: Windows Environment >Reporter: Raghu Doppalapudi >Priority: Minor > > build fails with following error message > "org.codehaus.plexus.resource.loader.ResourceNotFoundException: Could not > find resource 'dev-support/findbugs-exclude.xml' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-7713) Maven build fails for hbase-common on windows environment
[ https://issues.apache.org/jira/browse/HBASE-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal resolved HBASE-7713. Resolution: Cannot Reproduce > Maven build fails for hbase-common on windows environment > - > > Key: HBASE-7713 > URL: https://issues.apache.org/jira/browse/HBASE-7713 > Project: HBase > Issue Type: Bug > Environment: Windows Environment >Reporter: Raghu Doppalapudi >Priority: Minor > > build fails with following error message > "org.codehaus.plexus.resource.loader.ResourceNotFoundException: Could not > find resource 'dev-support/findbugs-exclude.xml' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7938) Add integration test for various MapReduce workflows
[ https://issues.apache.org/jira/browse/HBASE-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597375#comment-13597375 ]

nkeywal commented on HBASE-7938:

I'm +1 of course.

I don't really know the IntegrationTestsDriver; I use maven to run the integration tests (I'm not saying it's a good thing, actually maybe it's a bad thing). I will have a look.

If it has not changed recently, we don't start a mini map reduce cluster when we do a non-distributed start-hbase. Maybe we should add it, to ease manual tests? Following the work done in HBASE-7840, we could also add the start/stop of the map reduce part. Then the integration tests depending on map reduce could start a real cluster.

For the path, I don't know: if we include the test code in HBase, it will have been copied all over the place anyway, so we won't be able to check this. The code would have to be quite smart to explicitly ask maven to do nothing about it.

> Add integration test for various MapReduce workflows
> 
>
> Key: HBASE-7938
> URL: https://issues.apache.org/jira/browse/HBASE-7938
> Project: HBase
> Issue Type: Bug
> Components: mapreduce
> Reporter: Nick Dimiduk
> Fix For: 0.95.0, 0.98.0, 0.94.7
>
>
> We have existing unit tests for smoke-testing the packaged MR jobs, however they do not create a runtime environment that is true to running on a real MR cluster. This is particularly true in regard to classpaths (HBASE-7934) but also other static state (HBASE-4802). An integration test that can be pointed to run on a pseudo-distributed Hadoop deployed on localhost would find these kinds of problems.
[jira] [Commented] (HBASE-7989) Client with a cache info on a dead server will wait for 20s before trying another one.
[ https://issues.apache.org/jira/browse/HBASE-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596263#comment-13596263 ]

nkeywal commented on HBASE-7989:

Yes. There is a 20s timeout for connect by default. And here there are two issues:
- we should be able to have a much lower timeout for connect, as it doesn't depend on GC stuff and it's a clear error (we are sure that the action is not done on the server, contrary to a read or write timeout)
- we should not even go to the server in some cases (we know it's dead).

> Client with a cache info on a dead server will wait for 20s before trying another one.
> --
>
> Key: HBASE-7989
> URL: https://issues.apache.org/jira/browse/HBASE-7989
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 0.95.0, 0.98.0
> Reporter: nkeywal
>
> Scenario is:
> - fetch the cache in the client
> - a server dies
> - try to use a region that is on the dead server
> This will lead to a 20 second connect timeout. We don't see this in unit tests because it happens only if the remote box does not answer. In the unit tests we immediately get a connection refused from the OS.
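The first point in the HBASE-7989 comment above (a connect timeout much lower than the read timeout) can be illustrated with plain java.net sockets. This is a sketch under assumptions, not the HBase RPC client: ConnectTimeouts and open are hypothetical names, and the timeout values in main are arbitrary examples:

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical helper that keeps the connect timeout separate from the read timeout. */
final class ConnectTimeouts {

    /**
     * Opens a socket with a short connect timeout and an independent read timeout.
     * A connect failure is unambiguous (nothing was executed on the server),
     * so it can be much shorter than the read timeout.
     */
    static Socket open(InetSocketAddress addr, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        Socket socket = new Socket();
        try {
            // Throws SocketTimeoutException if the connection is not established in time.
            socket.connect(addr, connectTimeoutMs);
            // Applies only to blocking reads after the connection is up.
            socket.setSoTimeout(readTimeoutMs);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Local listener so the example does not depend on any remote host.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            InetSocketAddress addr =
                new InetSocketAddress(InetAddress.getLoopbackAddress(), server.getLocalPort());
            try (Socket socket = open(addr, 1000, 20000)) {
                System.out.println("connected=" + socket.isConnected());
            }
        }
    }
}
```

As the issue description notes, a closed port on a live host is refused immediately by the OS, so the connect timeout only matters when the remote box does not answer at all.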
[jira] [Commented] (HBASE-8022) Site target fails
[ https://issues.apache.org/jira/browse/HBASE-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596174#comment-13596174 ] nkeywal commented on HBASE-8022: +1. We should add this to our precommit tests imho. > Site target fails > - > > Key: HBASE-8022 > URL: https://issues.apache.org/jira/browse/HBASE-8022 > Project: HBase > Issue Type: Bug >Affects Versions: 0.95.0, 0.96.0 >Reporter: Andrew Purtell > Attachments: HBASE-8022.patch > > > {noformat} > mvn -DskipTests -Dhadoop.profile=2.0 clean install site assembly:assembly > [...] > Recoverable error > org.xml.sax.SAXParseException: Include operation failed, reverting to > fallback. Resource error reading file as XML > (href='../../target/site/hbase-default.xml'). Reason: > /usr/src/Hadoop/hbase/target/site/hbase-default.xml (No such file or > directory) > Error on line 672 column 52 of > file:///usr/src/Hadoop/hbase/src/docbkx/configuration.xml: > Error reported by XML parser: An 'include' failed, and no 'fallback' > element was found. > [INFO] > > [INFO] > > [INFO] Skipping HBase > [INFO] This project has been banned from the build due to previous failures. > [INFO] > > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] HBase . FAILURE [5:34.980s] > [INFO] HBase - Common SKIPPED > [INFO] HBase - Protocol .. SKIPPED > [INFO] HBase - Client SKIPPED > [INFO] HBase - Prefix Tree ... SKIPPED > [INFO] HBase - Hadoop Compatibility .. SKIPPED > [INFO] HBase - Hadoop Two Compatibility .. SKIPPED > [INFO] HBase - Server SKIPPED > [INFO] HBase - Integration Tests . SKIPPED > [INFO] HBase - Examples .. SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 5:36.029s > [INFO] Finished at: Thu Mar 07 21:59:14 CST 2013 > [INFO] Final Memory: 29M/297M > [INFO] > > [ERROR] Failed to execute goal > com.agilejava.docbkx:docbkx-maven-plugin:2.0.14:generate-html (multipage) on > project hbase: Failed to transform configuration.xml. 
> org.xml.sax.SAXParseException: An 'include' failed, and no 'fallback' element > was found. -> [Help 1] > {noformat}
[jira] [Commented] (HBASE-8023) Assembly target fails
[ https://issues.apache.org/jira/browse/HBASE-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596012#comment-13596012 ] nkeywal commented on HBASE-8023: Ah, I'm just seeing that you created HBASE-8022 for the site part. I confirm assembly w/o the site works here. If the assembly w/o the site still fails for you, it's worth doing a mvn clean install -DskipTests first. > Assembly target fails > - > > Key: HBASE-8023 > URL: https://issues.apache.org/jira/browse/HBASE-8023 > Project: HBase > Issue Type: Bug >Affects Versions: 0.95.0, 0.96.0 >Reporter: Andrew Purtell > > The assembly target fails when using the 2.0 Hadoop profile (at least). > {noformat} > mvn -DskipTests -Dhadoop.profile=2.0 clean install site assembly:assembly > [...] > [INFO] --- maven-assembly-plugin:2.3:assembly (default-cli) @ hbase --- > [INFO] Reading assembly descriptor: src/assembly/hadoop-two-compat.xml > [WARNING] [DEPRECATION] moduleSet/binaries section detected in root-project > assembly. > MODULE BINARIES MAY NOT BE AVAILABLE FOR THIS ASSEMBLY! > To refactor, move this assembly into a child project and use the flag > true in each moduleSet. 
> [INFO] Processing sources for module project: > org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-protocol:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-client:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-prefix-tree:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-hadoop-compat:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-hadoop2-compat:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-server:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-it:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-examples:jar:0.97-SNAPSHOT > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] HBase . FAILURE [15.877s] > [INFO] HBase - Common SUCCESS [4.633s] > [INFO] HBase - Protocol .. SUCCESS [2.629s] > [INFO] HBase - Client SUCCESS [2.901s] > [INFO] HBase - Prefix Tree ... SUCCESS [3.085s] > [INFO] HBase - Hadoop Compatibility .. SUCCESS [2.647s] > [INFO] HBase - Hadoop Two Compatibility .. SUCCESS [2.005s] > [INFO] HBase - Server SUCCESS [1.888s] > [INFO] HBase - Integration Tests . SUCCESS [6.917s] > [INFO] HBase - Examples .. SUCCESS [2.815s] > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 6:41.503s > [INFO] Finished at: Thu Mar 07 22:14:08 CST 2013 > [INFO] Final Memory: 67M/448M > [INFO] > > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-assembly-plugin:2.3:assembly (default-cli) on > project hbase: Failed to create assembly: Artifact: > org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT (included by module) does not > have an artifact with a file. Please ensure the package phase is run before > the assembly is generated. 
-> [Help 1] > {noformat}
[jira] [Commented] (HBASE-8023) Assembly target fails
[ https://issues.apache.org/jira/browse/HBASE-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595996#comment-13595996 ] nkeywal commented on HBASE-8023: mvn clean package -DskipTests assembly:assembly -Dhadoop.profile=2.0 works here. but mvn clean package site -DskipTests assembly:assembly -Dhadoop.profile=2.0 does not [INFO] Processing input file: book.xml [INFO] Applying customization parameters [INFO] Chunking output. Recoverable error org.xml.sax.SAXParseException: Include operation failed, reverting to fallback. Resource error reading file as XML (href='../../target/site/hbase-default.xml'). Reason: /home/liochon/dev/hbase/target/site/hbase-default.xml (No such file or directory) Error on line 672 column 52 of file:///home/liochon/dev/hbase/src/docbkx/configuration.xml: Error reported by XML parser: An 'include' failed, and no 'fallback' element was found. Error on line 70 column 85 of file:///home/liochon/dev/hbase/src/docbkx/book.xml: Error reported by XML parser: Error attempting to parse XML file (href='configuration.xml'). [INFO] I'm testing but I think it's not related to the 2.0 profile, just site. > Assembly target fails > - > > Key: HBASE-8023 > URL: https://issues.apache.org/jira/browse/HBASE-8023 > Project: HBase > Issue Type: Bug >Affects Versions: 0.95.0, 0.96.0 >Reporter: Andrew Purtell > > The assembly target fails when using the 2.0 Hadoop profile (at least). > {noformat} > mvn -DskipTests -Dhadoop.profile=2.0 clean install site assembly:assembly > [...] > [INFO] --- maven-assembly-plugin:2.3:assembly (default-cli) @ hbase --- > [INFO] Reading assembly descriptor: src/assembly/hadoop-two-compat.xml > [WARNING] [DEPRECATION] moduleSet/binaries section detected in root-project > assembly. > MODULE BINARIES MAY NOT BE AVAILABLE FOR THIS ASSEMBLY! > To refactor, move this assembly into a child project and use the flag > true in each moduleSet. 
> [INFO] Processing sources for module project: > org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-protocol:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-client:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-prefix-tree:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-hadoop-compat:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-hadoop2-compat:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-server:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-it:jar:0.97-SNAPSHOT > [INFO] Processing sources for module project: > org.apache.hbase:hbase-examples:jar:0.97-SNAPSHOT > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] HBase . FAILURE [15.877s] > [INFO] HBase - Common SUCCESS [4.633s] > [INFO] HBase - Protocol .. SUCCESS [2.629s] > [INFO] HBase - Client SUCCESS [2.901s] > [INFO] HBase - Prefix Tree ... SUCCESS [3.085s] > [INFO] HBase - Hadoop Compatibility .. SUCCESS [2.647s] > [INFO] HBase - Hadoop Two Compatibility .. SUCCESS [2.005s] > [INFO] HBase - Server SUCCESS [1.888s] > [INFO] HBase - Integration Tests . SUCCESS [6.917s] > [INFO] HBase - Examples .. SUCCESS [2.815s] > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 6:41.503s > [INFO] Finished at: Thu Mar 07 22:14:08 CST 2013 > [INFO] Final Memory: 67M/448M > [INFO] > > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-assembly-plugin:2.3:assembly (default-cli) on > project hbase: Failed to create assembly: Artifact: > org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT (included by module) does not > have an artifact with a file. Please ensure the package phase is run before > the assembly is generated. 
-> [Help 1] > {noformat}
[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers
[ https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8002: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) > Make TimeOut Management for Assignment optional in master and regionservers > --- > > Key: HBASE-8002 > URL: https://issues.apache.org/jira/browse/HBASE-8002 > Project: HBase > Issue Type: Bug > Components: Client, master, Region Assignment >Affects Versions: 0.95.0, 0.98.0 >Reporter: nkeywal >Assignee: nkeywal > Fix For: 0.95.0, 0.98.0 > > Attachments: 8002.v3.patch, 8002.v4.patch > > > See HBASE-7327 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers
[ https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595737#comment-13595737 ] nkeywal commented on HBASE-8002: And thanks for the review! > Make TimeOut Management for Assignment optional in master and regionservers > --- > > Key: HBASE-8002 > URL: https://issues.apache.org/jira/browse/HBASE-8002 > Project: HBase > Issue Type: Bug > Components: Client, master, Region Assignment >Affects Versions: 0.95.0, 0.98.0 >Reporter: nkeywal >Assignee: nkeywal > Fix For: 0.95.0, 0.98.0 > > Attachments: 8002.v3.patch, 8002.v4.patch > > > See HBASE-7327 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers
[ https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595736#comment-13595736 ] nkeywal commented on HBASE-8002: v4 is what I committed on trunk & 0.95 > Make TimeOut Management for Assignment optional in master and regionservers > --- > > Key: HBASE-8002 > URL: https://issues.apache.org/jira/browse/HBASE-8002 > Project: HBase > Issue Type: Bug > Components: Client, master, Region Assignment >Affects Versions: 0.95.0, 0.98.0 >Reporter: nkeywal >Assignee: nkeywal > Fix For: 0.95.0, 0.98.0 > > Attachments: 8002.v3.patch, 8002.v4.patch > > > See HBASE-7327 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers
[ https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8002: --- Attachment: 8002.v4.patch > Make TimeOut Management for Assignment optional in master and regionservers > --- > > Key: HBASE-8002 > URL: https://issues.apache.org/jira/browse/HBASE-8002 > Project: HBase > Issue Type: Bug > Components: Client, master, Region Assignment >Affects Versions: 0.95.0, 0.98.0 >Reporter: nkeywal >Assignee: nkeywal > Fix For: 0.95.0, 0.98.0 > > Attachments: 8002.v3.patch, 8002.v4.patch > > > See HBASE-7327 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients
[ https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595034#comment-13595034 ] nkeywal commented on HBASE-7590: It's on RB, waiting for reviews before being committed :-). > Add a costless notifications mechanism from master to regionservers & clients > - > > Key: HBASE-7590 > URL: https://issues.apache.org/jira/browse/HBASE-7590 > Project: HBase > Issue Type: Bug > Components: Client, master, regionserver >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: nkeywal > Attachments: 7590.inprogress.patch, 7590.v1.patch, > 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch > > > It would be very useful to add a mechanism to distribute some information to > the clients and regionservers. In particular, it would be useful to know globally > (regionservers + client apps) that some regionservers are dead. This would > allow: > - to lower the load on the system, without clients using stale information > and going to dead machines > - to make the recovery faster from a client point of view. It's common to use > large timeouts on the client side, so the client may need a lot of time > before declaring a region server dead and trying another one. If the client > receives the information about a region server's state separately, it can take > the right decision, and continue or stop waiting accordingly. > We can also send more information, for example instructions like 'slow down' > to instruct the client to increase the retry delay and so on. > Technically, the master could send this information. To lower the load on > the system, we should: > - have a multicast communication (i.e. the master does not have to connect to > all servers by tcp), with one packet every 10 seconds or so. > - receivers should not depend on this: if the information is available, great. > If not, it should not break anything. > - it should be optional.
> So at the end we would have a thread in the master sending a protobuf message > about the dead servers on a multicast socket. If the socket is not > configured, it does not do anything. On the client side, when we receive > information that a node is dead, we refresh the cache about it.
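The description above (an optional, best-effort sender in the master that publishes the dead servers to a multicast group) could be sketched roughly as follows. Everything in this block is an assumption made for illustration: the class DeadServerNotifier, the newline-separated UTF-8 payload (a stand-in for the protobuf message mentioned in the issue), and the convention that a null group means "not configured". It is not the code from the attached patches.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical best-effort publisher of dead server names. */
final class DeadServerNotifier {
  private final InetAddress group; // null means the feature is not configured
  private final int port;

  DeadServerNotifier(InetAddress group, int port) {
    this.group = group;
    this.port = port;
  }

  /** Stand-in encoding: one server name per line (the issue proposes protobuf). */
  static byte[] encode(List<String> deadServers) {
    return String.join("\n", deadServers).getBytes(StandardCharsets.UTF_8);
  }

  /** Receiver side: decode the payload so the local cache can be refreshed. */
  static List<String> decode(byte[] data, int length) {
    String text = new String(data, 0, length, StandardCharsets.UTF_8);
    if (text.isEmpty()) {
      return new ArrayList<>();
    }
    return new ArrayList<>(Arrays.asList(text.split("\n")));
  }

  /**
   * Sends one datagram to the group. Returns false when the notifier is not
   * configured or the send fails; callers must not depend on delivery.
   */
  boolean publish(List<String> deadServers) {
    if (group == null) {
      return false; // optional feature: do nothing when not configured
    }
    byte[] payload = encode(deadServers);
    try (DatagramSocket socket = new DatagramSocket()) {
      socket.send(new DatagramPacket(payload, payload.length, group, port));
      return true;
    } catch (IOException e) {
      return false; // best effort: a lost packet must not break anything
    }
  }
}
```

In the master, a scheduled thread would call publish every 10 seconds or so. Because a single packet goes to the group rather than to each receiver, the cost does not grow with the number of clients.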
[jira] [Commented] (HBASE-6772) Make the Distributed Split HDFS Location aware
[ https://issues.apache.org/jira/browse/HBASE-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595023#comment-13595023 ] nkeywal commented on HBASE-6772: The new design is better than my original proposal. I'm +1. Devaraj's comment is important as well imho, so we should put this info in ZK as well. Just one point: the master should provide the full list of regionservers owning a copy. This way: - if one of the regionservers is actually dead, the log can be picked up by another one - it's possible to optimize the choice in the regionserver: if the RS sees it's the only one for a block, it can pick that block instead of another one that has more potential regionservers. - + the rack already mentioned by Devaraj. > Make the Distributed Split HDFS Location aware > -- > > Key: HBASE-6772 > URL: https://issues.apache.org/jira/browse/HBASE-6772 > Project: HBase > Issue Type: Improvement > Components: master, regionserver >Affects Versions: 0.96.0 >Reporter: nkeywal >Assignee: Jeffrey Zhong > > During a hlog split, each log file (a single hdfs block) is allocated to a > different region server. This region server reads the file and creates the > recovery edit files. > The allocation to the region server is random. We could take into account the > locations of the log file to split: > - the reads would be local, hence faster. This allows short circuit as well. > - less network i/o used during a failure (and this is important) > - we would be sure to read from a working datanode, hence we're sure we won't > have read errors. Read errors slow the split process a lot, as we often enter > the "timeouted world". > We need to limit the calls to the namenode however. > Typical algo could be: > - the master gets the locations of the hlog files > - it writes it into ZK, if possible in one transaction (this way all the > tasks are visible all together, allowing some arbitration by the region server).
> - when the regionserver receives the event, it checks for all logs and all > locations. > - if there is a match, it takes it > - if not, it waits something like 0.2s (to give other regionservers the time > to take it if the location matches), and takes any remaining task. > Drawbacks are: > - a 0.2s delay added if there is no regionserver available on one of the > locations. It's likely possible to remove it with some extra synchronization. > - Small increase in complexity and dependency on HDFS > Considering the advantages, it's worth it imho.
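The regionserver side of the "typical algo" above can be sketched as a small selection function. This is a simplified illustration, not the implementation discussed on the issue: the class SplitTaskPicker, the map from log name to replica hosts, and the backoff parameter are all hypothetical, and a real regionserver would re-read the task list from ZooKeeper after the delay instead of reusing the same map.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical task selection for a location-aware distributed log split. */
final class SplitTaskPicker {

  /** Returns the first log that has a replica on localHost, if any. */
  static Optional<String> pickLocal(Map<String, List<String>> logLocations, String localHost) {
    for (Map.Entry<String, List<String>> entry : logLocations.entrySet()) {
      if (entry.getValue().contains(localHost)) {
        return Optional.of(entry.getKey());
      }
    }
    return Optional.empty();
  }

  /**
   * Prefers a local log. Otherwise waits backoffMs (the issue suggests about
   * 0.2s) so that regionservers holding a replica can claim their tasks first,
   * then takes any remaining task.
   */
  static Optional<String> pick(
      Map<String, List<String>> logLocations, String localHost, long backoffMs) {
    Optional<String> local = pickLocal(logLocations, localHost);
    if (local.isPresent()) {
      return local;
    }
    try {
      Thread.sleep(backoffMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return Optional.empty();
    }
    // Simplification: a real implementation would refresh the task list here.
    return logLocations.keySet().stream().findFirst();
  }
}
```

The refinement suggested in the comment (prefer a block for which this regionserver is the only candidate) would need the full list of owners per log, which is why the master should publish all locations.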
[jira] [Commented] (HBASE-7948) client doesn't need to refresh meta while the region is opening
[ https://issues.apache.org/jira/browse/HBASE-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593930#comment-13593930 ] nkeywal commented on HBASE-7948: bq. Can you please elaborate about more dangerous parts? I was thinking about the code that we're slowly removing with HBASE-8002. It has 3 side effects: 1) It was decreasing performance; this has been fixed in numerous patches, but there are still scary comments and issues (HBASE-7247) 2) It was hiding issues. In the tests we had a very low timeout, so master failover scenarios seemed to be working. In production, we were depending on a 10-minute timeout but we didn't know. 3) It was causing double assignment issues, i.e. data corruption. This was exactly the same logic (don't trust the RS), with more dramatic consequences. bq. In my experience it's not safe to trust anything forever as a general principle, not because I think RS code is unreliable. I'm not against this, but in this case we need to tackle this the standard way: watchdog the process, and exclude the fuzzy ones from the group. Before doing this, I would like to see the chaos monkey test working with kill -9 for a while (I doubt it does today :-( ) But I agree with your point, and we will have this sooner or later (BTW, it's exactly why there are checksums in hdfs: because you can't trust the storage). bq. But for client there's no data loss potential from flushing the cache, but there's potential to be stuck forever in case of abnormal RS behavior. With remote things I prefer to be defensive on all sides if practical Yeah. I'm likely biased. So, imho - the patch is an improvement. HBase is better with this patch than without. - it would be simpler without the RS-trust part - to me, at the margin, under degraded conditions, it would be more efficient without the RS-trust part as well.
As we need to make progress :-), I propose: 1) Well, if you're not against the idea of removing the RS-trust part, we're done 2) If you really want to keep it, let's wait a few days if someone wants to come by. If no one does, let's commit on Friday. What do you think? > client doesn't need to refresh meta while the region is opening > --- > > Key: HBASE-7948 > URL: https://issues.apache.org/jira/browse/HBASE-7948 > Project: HBase > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HBASE-7948-v0.patch, HBASE-7948-v1.patch, > HBASE-7948-v1.patch, HBASE-7948-v2.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers
[ https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593888#comment-13593888 ] nkeywal commented on HBASE-8002: bq. is it possible to remove the setting from HConstants and add to where it applies? Ok, will do. bq. where everything is set to null doesn't appear necessary. I need it because the attributes are final, so they must be set on all execution paths. I would prefer to keep them final, but I can change this if you like. bq. Precondition instead of assert since we don't usually enable assert when we run and the timeout checks are infrequent enough it shouldn't be costly running precondition. It was more for documentation & unit tests. But ok, will do. I will commit the patch with points 1) & 3) tomorrow if there is no objection. If you ask for 2) I will include it. > Make TimeOut Management for Assignment optional in master and regionservers > --- > > Key: HBASE-8002 > URL: https://issues.apache.org/jira/browse/HBASE-8002 > Project: HBase > Issue Type: Bug > Components: Client, master, Region Assignment >Affects Versions: 0.95.0, 0.98.0 >Reporter: nkeywal >Assignee: nkeywal > Fix For: 0.95.0, 0.98.0 > > Attachments: 8002.v3.patch > > > See HBASE-7327
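Two of the review points above are general Java matters and can be shown in isolation: final fields must be assigned on every constructor path (hence the explicit null assignments), and an always-on precondition check differs from an assert, which is skipped unless the JVM runs with -ea. The class below is a hypothetical sketch, not code from 8002.v3.patch; the name AssignmentTimeoutConfig, its fields, and the local checkArgument helper (used instead of a library precondition class to stay self-contained) are assumptions.

```java
/** Hypothetical holder for optional assignment-timeout settings. */
final class AssignmentTimeoutConfig {
  // Final fields: the compiler requires an assignment on every constructor path,
  // so the disabled branch has to set them to null explicitly.
  private final Long timeoutMs;
  private final Long checkPeriodMs;

  AssignmentTimeoutConfig(boolean enabled, long timeoutMs, long checkPeriodMs) {
    if (enabled) {
      // Always evaluated, unlike `assert`, which needs -ea to run.
      checkArgument(timeoutMs > 0, "timeout must be positive");
      checkArgument(checkPeriodMs > 0, "check period must be positive");
      this.timeoutMs = timeoutMs;
      this.checkPeriodMs = checkPeriodMs;
    } else {
      this.timeoutMs = null;
      this.checkPeriodMs = null;
    }
  }

  private static void checkArgument(boolean condition, String message) {
    if (!condition) {
      throw new IllegalArgumentException(message);
    }
  }

  boolean isEnabled() {
    return timeoutMs != null;
  }

  /** Only meaningful when isEnabled() is true. */
  long timeoutMs() {
    checkArgument(isEnabled(), "timeout management is disabled");
    return timeoutMs;
  }
}
```

The cost of the precondition is a single comparison per call, which matches the reviewer's remark that infrequent timeout checks can afford it.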
[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
[ https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8003: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) > Threads#getBoundedCachedThreadPool harcodes the time unit to seconds > > > Key: HBASE-8003 > URL: https://issues.apache.org/jira/browse/HBASE-8003 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Trivial > Fix For: 0.98.0 > > Attachments: 8003.v1.patch, 8003.v1.patch > > > /** >* Create a new CachedThreadPool with a bounded number as the maximum >* thread size in the pool. >* >* @param maxCachedThread the maximum thread could be created in the pool >* @param timeout the maximum time to wait >* @param unit the time unit of the timeout argument >* @param threadFactory the factory to use when creating new threads >* @return threadPoolExecutor the cachedThreadPool with a bounded number >* as the maximum thread size in the pool. >*/ > public static ThreadPoolExecutor getBoundedCachedThreadPool( > int maxCachedThread, long timeout, TimeUnit unit, > ThreadFactory threadFactory) { > ThreadPoolExecutor boundedCachedThreadPool = > new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout, > TimeUnit.SECONDS, new LinkedBlockingQueue(), threadFactory); > // allow the core pool threads timeout and terminate > boundedCachedThreadPool.allowCoreThreadTimeOut(true); > return boundedCachedThreadPool; > } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
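The method quoted in the issue description accepts a unit argument but passes TimeUnit.SECONDS to the ThreadPoolExecutor constructor, so a caller asking for milliseconds silently gets seconds. A minimal sketch of the obvious correction is to forward the caller's unit. The wrapper class name BoundedPools is hypothetical, and this is not necessarily identical to what 8003.v1.patch contains.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical holder for the corrected factory method. */
final class BoundedPools {

  /**
   * Creates a pool with at most maxCachedThread threads whose idle threads,
   * core threads included, terminate after the given timeout.
   */
  static ThreadPoolExecutor getBoundedCachedThreadPool(
      int maxCachedThread, long timeout, TimeUnit unit, ThreadFactory threadFactory) {
    ThreadPoolExecutor boundedCachedThreadPool =
        new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
            unit, // forward the caller's unit instead of hardcoding TimeUnit.SECONDS
            new LinkedBlockingQueue<Runnable>(), threadFactory);
    // Allow the core pool threads to time out and terminate.
    boundedCachedThreadPool.allowCoreThreadTimeOut(true);
    return boundedCachedThreadPool;
  }
}
```

Note that allowCoreThreadTimeOut(true) requires a keep-alive time greater than zero, so callers should pass a positive timeout.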
[jira] [Commented] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
[ https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593634#comment-13593634 ] nkeywal commented on HBASE-8003: Tried locally, all passed on the first try except TestReplicationQueueFailover. This one succeeded the second time. Committed, thanks for the review. > Threads#getBoundedCachedThreadPool harcodes the time unit to seconds > > > Key: HBASE-8003 > URL: https://issues.apache.org/jira/browse/HBASE-8003 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Trivial > Fix For: 0.98.0 > > Attachments: 8003.v1.patch, 8003.v1.patch > > > /** >* Create a new CachedThreadPool with a bounded number as the maximum >* thread size in the pool. >* >* @param maxCachedThread the maximum thread could be created in the pool >* @param timeout the maximum time to wait >* @param unit the time unit of the timeout argument >* @param threadFactory the factory to use when creating new threads >* @return threadPoolExecutor the cachedThreadPool with a bounded number >* as the maximum thread size in the pool. >*/ > public static ThreadPoolExecutor getBoundedCachedThreadPool( > int maxCachedThread, long timeout, TimeUnit unit, > ThreadFactory threadFactory) { > ThreadPoolExecutor boundedCachedThreadPool = > new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout, > TimeUnit.SECONDS, new LinkedBlockingQueue(), threadFactory); > // allow the core pool threads timeout and terminate > boundedCachedThreadPool.allowCoreThreadTimeOut(true); > return boundedCachedThreadPool; > }
[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers
[ https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593429#comment-13593429 ] nkeywal commented on HBASE-8002: Ok locally. But the "return of the flakiness" makes this a little bit complicated. > Make TimeOut Management for Assignment optional in master and regionservers > --- > > Key: HBASE-8002 > URL: https://issues.apache.org/jira/browse/HBASE-8002 > Project: HBase > Issue Type: Bug > Components: Client, master, Region Assignment >Affects Versions: 0.95.0, 0.98.0 >Reporter: nkeywal >Assignee: nkeywal > Fix For: 0.95.0, 0.98.0 > > Attachments: 8002.v3.patch > > > See HBASE-7327 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers
[ https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8002: --- Status: Patch Available (was: Open) > Make TimeOut Management for Assignment optional in master and regionservers > --- > > Key: HBASE-8002 > URL: https://issues.apache.org/jira/browse/HBASE-8002 > Project: HBase > Issue Type: Bug > Components: Client, master, Region Assignment >Affects Versions: 0.95.0, 0.98.0 >Reporter: nkeywal >Assignee: nkeywal > Fix For: 0.95.0, 0.98.0 > > Attachments: 8002.v3.patch > > > See HBASE-7327 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers
[ https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8002: --- Attachment: 8002.v3.patch > Make TimeOut Management for Assignment optional in master and regionservers > --- > > Key: HBASE-8002 > URL: https://issues.apache.org/jira/browse/HBASE-8002 > Project: HBase > Issue Type: Bug > Components: Client, master, Region Assignment >Affects Versions: 0.95.0, 0.98.0 >Reporter: nkeywal >Assignee: nkeywal > Fix For: 0.95.0, 0.98.0 > > Attachments: 8002.v3.patch > > > See HBASE-7327 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
[ https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8003: --- Status: Patch Available (was: Open) > Threads#getBoundedCachedThreadPool harcodes the time unit to seconds > > > Key: HBASE-8003 > URL: https://issues.apache.org/jira/browse/HBASE-8003 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Trivial > Fix For: 0.98.0 > > Attachments: 8003.v1.patch, 8003.v1.patch > > > /** >* Create a new CachedThreadPool with a bounded number as the maximum >* thread size in the pool. >* >* @param maxCachedThread the maximum thread could be created in the pool >* @param timeout the maximum time to wait >* @param unit the time unit of the timeout argument >* @param threadFactory the factory to use when creating new threads >* @return threadPoolExecutor the cachedThreadPool with a bounded number >* as the maximum thread size in the pool. >*/ > public static ThreadPoolExecutor getBoundedCachedThreadPool( > int maxCachedThread, long timeout, TimeUnit unit, > ThreadFactory threadFactory) { > ThreadPoolExecutor boundedCachedThreadPool = > new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout, > TimeUnit.SECONDS, new LinkedBlockingQueue(), threadFactory); > // allow the core pool threads timeout and terminate > boundedCachedThreadPool.allowCoreThreadTimeOut(true); > return boundedCachedThreadPool; > } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
[ https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-8003: --- Status: Open (was: Patch Available) > Threads#getBoundedCachedThreadPool harcodes the time unit to seconds > > > Key: HBASE-8003 > URL: https://issues.apache.org/jira/browse/HBASE-8003 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0 >Reporter: nkeywal >Assignee: nkeywal >Priority: Trivial > Fix For: 0.98.0 > > Attachments: 8003.v1.patch, 8003.v1.patch > > > /** >* Create a new CachedThreadPool with a bounded number as the maximum >* thread size in the pool. >* >* @param maxCachedThread the maximum thread could be created in the pool >* @param timeout the maximum time to wait >* @param unit the time unit of the timeout argument >* @param threadFactory the factory to use when creating new threads >* @return threadPoolExecutor the cachedThreadPool with a bounded number >* as the maximum thread size in the pool. >*/ > public static ThreadPoolExecutor getBoundedCachedThreadPool( > int maxCachedThread, long timeout, TimeUnit unit, > ThreadFactory threadFactory) { > ThreadPoolExecutor boundedCachedThreadPool = > new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout, > TimeUnit.SECONDS, new LinkedBlockingQueue(), threadFactory); > // allow the core pool threads timeout and terminate > boundedCachedThreadPool.allowCoreThreadTimeOut(true); > return boundedCachedThreadPool; > } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
[ https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-8003:
---
    Attachment: 8003.v1.patch

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Trivial
> Fix For: 0.98.0
> Attachments: 8003.v1.patch, 8003.v1.patch
[jira] [Commented] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
[ https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593312#comment-13593312 ]

nkeywal commented on HBASE-8003:

Back to flakiness it seems. All this should be totally unrelated. Let's give it another go.

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Trivial
> Fix For: 0.98.0
> Attachments: 8003.v1.patch, 8003.v1.patch
[jira] [Commented] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
[ https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593251#comment-13593251 ]

nkeywal commented on HBASE-8003:

trivial patch, as it's always used with seconds today.

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
> Issue Type: Bug
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Trivial
> Attachments: 8003.v1.patch
[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
[ https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-8003:
---
    Fix Version/s: 0.98.0
    Affects Version/s: 0.98.0
    Status: Patch Available (was: Open)

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Trivial
> Fix For: 0.98.0
> Attachments: 8003.v1.patch
[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
[ https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-8003:
---
    Attachment: 8003.v1.patch

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
> Issue Type: Bug
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Trivial
> Attachments: 8003.v1.patch
[jira] [Created] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
nkeywal created HBASE-8003:
--

    Summary: Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
    Key: HBASE-8003
    URL: https://issues.apache.org/jira/browse/HBASE-8003
    Project: HBase
    Issue Type: Bug
    Reporter: nkeywal
    Assignee: nkeywal
    Priority: Trivial

  /**
   * Create a new CachedThreadPool with a bounded number as the maximum
   * thread size in the pool.
   *
   * @param maxCachedThread the maximum thread could be created in the pool
   * @param timeout the maximum time to wait
   * @param unit the time unit of the timeout argument
   * @param threadFactory the factory to use when creating new threads
   * @return threadPoolExecutor the cachedThreadPool with a bounded number
   * as the maximum thread size in the pool.
   */
  public static ThreadPoolExecutor getBoundedCachedThreadPool(
      int maxCachedThread, long timeout, TimeUnit unit,
      ThreadFactory threadFactory) {
    ThreadPoolExecutor boundedCachedThreadPool =
        new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
            TimeUnit.SECONDS, new LinkedBlockingQueue(), threadFactory);
    // allow the core pool threads timeout and terminate
    boundedCachedThreadPool.allowCoreThreadTimeOut(true);
    return boundedCachedThreadPool;
  }
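
For readers following the thread: the snippet in the description accepts a `unit` parameter but passes `TimeUnit.SECONDS` to the `ThreadPoolExecutor` constructor, so the caller's unit is ignored. The attached 8003.v1.patch is not reproduced in these messages, so the following is only a minimal sketch of what the title implies, assuming the fix is simply to forward `unit`. The class name `BoundedPoolSketch` is hypothetical and used for illustration only; it is not the HBase `Threads` class.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical holder class for illustration; not the HBase Threads class.
public class BoundedPoolSketch {

  /**
   * Same shape as the method quoted in the issue, but the keep-alive
   * time is interpreted in the caller-supplied unit (assumed fix).
   */
  public static ThreadPoolExecutor getBoundedCachedThreadPool(
      int maxCachedThread, long timeout, TimeUnit unit,
      ThreadFactory threadFactory) {
    ThreadPoolExecutor boundedCachedThreadPool =
        new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
            unit, // was TimeUnit.SECONDS in the quoted code
            new LinkedBlockingQueue<Runnable>(), threadFactory);
    // Let idle core threads time out and terminate, as in the original.
    boundedCachedThreadPool.allowCoreThreadTimeOut(true);
    return boundedCachedThreadPool;
  }
}
```

With the quoted code, a call such as `getBoundedCachedThreadPool(4, 500, TimeUnit.MILLISECONDS, factory)` would keep idle threads alive for 500 seconds. With the sketch, `getKeepAliveTime(TimeUnit.MILLISECONDS)` on the returned pool reports 500. As noted in the comments, existing callers reportedly all pass seconds, so they would see no change in behavior.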