[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-19 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606904#comment-13606904
 ] 

nkeywal commented on HBASE-7590:


ok, will do (as for rb) by the end of this week.

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
> 7590.v3.patch, 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to
> the clients and regionservers. In particular, it would be useful to know
> globally (regionservers + client apps) that some regionservers are dead. This
> would allow:
> - lowering the load on the system, since clients would no longer use stale
> information and go to dead machines
> - making recovery faster from a client point of view. It's common to use
> large timeouts on the client side, so the client may need a lot of time
> before declaring a region server dead and trying another one. If the client
> receives the information about a region server's state separately, it can
> make the right decision and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down'
> to tell the client to increase the retry delay, and so on.
> Technically, the master could send this information. To lower the load on
> the system:
> - we should use multicast communication (i.e. the master does not have to
> connect to all servers by TCP), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available, great.
> If not, it should not break anything.
> - it should be optional.
> So in the end we would have a thread in the master sending a protobuf message
> about the dead servers on a multicast socket. If the socket is not
> configured, it does nothing. On the client side, when we receive the
> information that a node is dead, we refresh the cache for it.
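
The client-side half of the description above (drop cached locations when a dead-server notice arrives) could look roughly like the sketch below. It is only an illustration under stated assumptions: RegionLocationCacheSketch and its methods are hypothetical names, not classes from the patch, and the multicast transport and protobuf decoding are left out entirely. The cache works the same whether or not a notice ever arrives, which matches the "receivers should not depend on this" requirement.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical client-side cache: region name -> server believed to host it.
final class RegionLocationCacheSketch {
    private final Map<String, String> regionToServer = new HashMap<>();

    void cacheLocation(String region, String server) {
        regionToServer.put(region, server);
    }

    // Returns the cached server, or null if the client has to look the region up again.
    String cachedServer(String region) {
        return regionToServer.get(region);
    }

    // Called when a best-effort dead-server notice is received.
    // Evicts every entry pointing at a dead server and returns how many were dropped.
    int onDeadServers(List<String> deadServers) {
        int dropped = 0;
        Iterator<Map.Entry<String, String>> it = regionToServer.entrySet().iterator();
        while (it.hasNext()) {
            if (deadServers.contains(it.next().getValue())) {
                it.remove();
                dropped++;
            }
        }
        return dropped;
    }
}
```

If no notice is ever delivered, nothing is evicted and the client falls back to its normal timeout-based behavior.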

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-8145) TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal resolved HBASE-8145.


Resolution: Duplicate

> TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0
> --
>
> Key: HBASE-8145
> URL: https://issues.apache.org/jira/browse/HBASE-8145
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.96.0
>
>
> I will check for 0.95.
> {code}
> for (HRegion region : regions) {
>   if (!region.getRegionInfo().getEncodedName().equals(toMove.getRegionInfo().getEncodedName())
>       && Bytes.BYTES_COMPARATOR.compare(region.getRegionInfo().getStartKey(), ROW_X) < 0) {
>     otherRow = region.getRegionInfo().getStartKey();
>     break;
>   }
> }
> {code}
> We will sometimes get the startKey of the first region here, which is an
> empty byte array. This makes the put creation fail, since there is now
> (with HBASE-8101) a check for empty rows at put creation.



[jira] [Created] (HBASE-8145) TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0

2013-03-19 Thread nkeywal (JIRA)
nkeywal created HBASE-8145:
--

 Summary: TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0
 Key: HBASE-8145
 URL: https://issues.apache.org/jira/browse/HBASE-8145
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.96.0


I will check for 0.95.

{code}
for (HRegion region : regions) {
  if (!region.getRegionInfo().getEncodedName().equals(toMove.getRegionInfo().getEncodedName())
      && Bytes.BYTES_COMPARATOR.compare(region.getRegionInfo().getStartKey(), ROW_X) < 0) {
    otherRow = region.getRegionInfo().getStartKey();
    break;
  }
}
{code}

We will sometimes get the startKey of the first region here, which is an
empty byte array. This makes the put creation fail, since there is now
(with HBASE-8101) a check for empty rows at put creation.
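
One possible way to avoid the empty row, sketched without any HBase dependency: skip start keys of length 0 when picking the other row. StartKeyPickerSketch, compare and pickOtherRow are hypothetical names used only for illustration, the local compare stands in for Bytes.BYTES_COMPARATOR, and the encoded-name check against toMove is omitted for brevity. This is not necessarily the fix that was committed.

```java
import java.util.List;

final class StartKeyPickerSketch {
    // Unsigned lexicographic comparison of two byte arrays
    // (local stand-in for a byte-array comparator).
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Returns the first start key that is non-empty and sorts before rowX,
    // or null when no such key exists. The empty start key of the first
    // region is skipped, so it can never be used as a row for a put.
    static byte[] pickOtherRow(List<byte[]> startKeys, byte[] rowX) {
        for (byte[] startKey : startKeys) {
            if (startKey.length > 0 && compare(startKey, rowX) < 0) {
                return startKey;
            }
        }
        return null;
    }
}
```

A caller would then have to handle the null case (no suitable region) explicitly instead of failing at put creation.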



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

   Resolution: Fixed
Fix Version/s: 0.96.0, 0.95.0
 Hadoop Flags: Reviewed
       Status: Resolved  (was: Patch Available)

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
> 7590.v3.patch, 7590.v5.patch, 7590.v5.patch


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
> 8135.v4.patch, 8135.v5.patch, 8135.v5.patch, 8135.v5.patch
>
>
> Code is there already.
> Doing so would allow sharing some code when doing client-side buffering.
> The patch compiles locally and should not impact tests, but has not been
> tested locally.
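
To illustrate the kind of code sharing the description refers to, here is a minimal sketch of a buffer that only needs a size-reporting interface from its items. HeapSizeSketch is a local stand-in for a HeapSize-style interface with a single heapSize() method; MutationSketch, SizedBufferSketch and the FIXED_OVERHEAD constant are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for an interface exposing an object's approximate heap footprint.
interface HeapSizeSketch {
    long heapSize();
}

// Hypothetical mutation: estimate = fixed overhead + row bytes + value bytes.
final class MutationSketch implements HeapSizeSketch {
    static final long FIXED_OVERHEAD = 64; // illustrative constant, not a measured value
    private final byte[] row;
    private final byte[] value;

    MutationSketch(byte[] row, byte[] value) {
        this.row = row;
        this.value = value;
    }

    @Override
    public long heapSize() {
        return FIXED_OVERHEAD + row.length + value.length;
    }
}

// The buffer depends only on heapSize(), so every mutation type can share
// the same size accounting.
final class SizedBufferSketch<T extends HeapSizeSketch> {
    private final List<T> items = new ArrayList<>();
    private long currentSize;

    void add(T item) {
        items.add(item);
        currentSize += item.heapSize();
    }

    long currentSize() {
        return currentSize;
    }

    boolean shouldFlush(long limitBytes) {
        return currentSize > limitBytes;
    }
}
```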



[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606568#comment-13606568
 ] 

nkeywal commented on HBASE-8135:


This seems to prove that we're in the usual flakiness (thanks for having 
relaunched the tests, Ted). Committed to trunk and 0.95.

Thanks for the review, Stack & Ted!

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
> 8135.v4.patch, 8135.v5.patch, 8135.v5.patch, 8135.v5.patch


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Patch Available  (was: Open)

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.94.5, 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
> 8135.v4.patch, 8135.v5.patch, 8135.v5.patch


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Open  (was: Patch Available)

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.94.5, 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
> 8135.v4.patch, 8135.v5.patch, 8135.v5.patch


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Attachment: 8135.v5.patch

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
> 8135.v4.patch, 8135.v5.patch, 8135.v5.patch


[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606196#comment-13606196
 ] 

nkeywal commented on HBASE-8135:


v5 is what I will commit if the build runs ok.

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
> 8135.v4.patch, 8135.v5.patch


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Attachment: 8135.v5.patch

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
> 8135.v4.patch, 8135.v5.patch


[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606188#comment-13606188
 ] 

nkeywal commented on HBASE-8135:


bq. There was a javadoc warning. 
Right. I'm going to hunt it.

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch


[jira] [Updated] (HBASE-6674) Check behavior of current surefire trunk on Hadoop QA

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6674:
---

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Since the JUnit part is done, I'm using HBASE-4955 to test surefire.

> Check behavior of current surefire trunk on Hadoop QA
> -
>
> Key: HBASE-6674
> URL: https://issues.apache.org/jira/browse/HBASE-6674
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Attachments: 5processes.patch, 5processes.patch, 5processes.patch, 
> 6674.patch, 6674.v2.patch, 6674.v2.patch, 6674.v2.patch, 6674.v2.patch
>
>
> Not to be committed.
> Surefire 2.13 is in progress. Let's check that it works for us before it's 
> released. Locally it's acceptable.



[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606183#comment-13606183
 ] 

nkeywal commented on HBASE-8135:


Thanks a lot Ted. For v4 I've just moved the test 'Put' with the other tests. I 
will commit as soon as I get a +1.

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Attachment: 8135.v4.patch

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Patch Available  (was: Open)

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.94.5, 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Open  (was: Patch Available)

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.94.5, 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch


[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Patch Available  (was: Open)

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
> 7590.v3.patch, 7590.v5.patch, 7590.v5.patch


[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605698#comment-13605698
 ] 

nkeywal commented on HBASE-7590:


Maybe 13 is going to be my lucky number :-) ?

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
> 7590.v3.patch, 7590.v5.patch, 7590.v5.patch


[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v13.patch

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
> 7590.v3.patch, 7590.v5.patch, 7590.v5.patch


[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Open  (was: Patch Available)

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
> 7590.v3.patch, 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to 
> the clients and regionservers. In particular, it would be useful to know globally 
> (regionservers + client apps) that some regionservers are dead. This would 
> allow:
> - to lower the load on the system, since clients would no longer use stale 
> information and go to dead machines
> - to make the recovery faster from a client point of view. It's common to use 
> large timeouts on the client side, so the client may need a lot of time 
> before declaring a region server dead and trying another one. If the client 
> receives the information about a region server's state separately, it can make 
> the right decision, and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down' 
> to instruct the client to increase the retry delay and so on.
> Technically, the master could send this information. To lower the load on 
> the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to 
> all servers by tcp), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available, great. 
> If not, it should not break anything.
> - it should be optional.
> So in the end we would have a thread in the master sending a protobuf message 
> about the dead servers on a multicast socket. If the socket is not 
> configured, it does not do anything. On the client side, when we receive 
> information that a node is dead, we refresh the cache for it.
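
The client-side half of the description above (refresh the cache when a node is reported dead) can be sketched as follows. This is a minimal illustration, not code from the patch: the class LocationCacheSketch, its methods, and the "host:port" server strings are all hypothetical, and the multicast/protobuf transport is left out entirely.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side cache: region name -> hosting server ("host:port"). */
class LocationCacheSketch {
    private final Map<String, String> regionToServer = new ConcurrentHashMap<>();

    void cache(String region, String server) {
        regionToServer.put(region, server);
    }

    /** Returns the cached server, or null if the client has to look the region up again. */
    String lookup(String region) {
        return regionToServer.get(region);
    }

    /**
     * Called when a (hypothetical) status message listing dead servers arrives.
     * Drops every cached location hosted on a dead server and returns the number dropped.
     * If no message ever arrives, the cache behaves exactly as before.
     */
    int onDeadServers(Set<String> deadServers) {
        int dropped = 0;
        Iterator<Map.Entry<String, String>> it = regionToServer.entrySet().iterator();
        while (it.hasNext()) {
            if (deadServers.contains(it.next().getValue())) {
                it.remove();
                dropped++;
            }
        }
        return dropped;
    }
}
```

Because the notification only removes entries, a lost or never-configured notification costs nothing: the client falls back to its usual timeout and retry path, which matches the "receivers should not depend on this" requirement.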



[jira] [Updated] (HBASE-8128) HTable#put improvements

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8128:
---

Fix Version/s: 0.94.8

> HTable#put improvements
> ---
>
> Key: HBASE-8128
> URL: https://issues.apache.org/jira/browse/HBASE-8128
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.95.0, 0.96.0, 0.94.8
>
> Attachments: 8128.v1.patch
>
>
> 3 points:
>  - When doing a single put, we're creating an object by calling Arrays.asList
>  - we're doing a size check every 10 puts. Not doing it seems simpler, better, 
> and allows sharing some code between a single put and a list of puts.
>  - we could call flushCommits on an empty write buffer, especially for someone 
> using a lot of HTable instances instead of using a pool, as it's called in close().
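
The three points above can be illustrated with a small stand-in for a client write buffer. This is only a sketch under assumptions: WriteBufferSketch and its members are hypothetical and are not the HTable code, mutations are modelled as plain byte arrays, the buffered size is assumed to be tracked incrementally so it can be compared after every call, and the third point is read as "do no work when the buffer is empty".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a client write buffer; not the HTable implementation. */
class WriteBufferSketch {
    private final List<byte[]> buffer = new ArrayList<>();
    private final long flushThresholdBytes;
    private long bufferedBytes = 0;
    int flushCount = 0; // number of flushes that actually had data to send

    WriteBufferSketch(long flushThresholdBytes) {
        this.flushThresholdBytes = flushThresholdBytes;
    }

    /** Single put: no temporary list is allocated. */
    void put(byte[] mutation) {
        add(mutation);
        flushIfFull();
    }

    /** List of puts: goes through the same add path as the single put. */
    void put(List<byte[]> mutations) {
        for (byte[] m : mutations) {
            add(m);
        }
        flushIfFull();
    }

    private void add(byte[] mutation) {
        buffer.add(mutation);
        bufferedBytes += mutation.length;
    }

    /** Size is checked on every call instead of every Nth put. */
    private void flushIfFull() {
        if (bufferedBytes > flushThresholdBytes) {
            flush();
        }
    }

    /** Returns immediately on an empty buffer, so close() is cheap when nothing is pending. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        buffer.clear(); // a real client would send the buffered mutations here
        bufferedBytes = 0;
        flushCount++;
    }

    void close() {
        flush();
    }
}
```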



[jira] [Commented] (HBASE-8128) HTable#put improvements

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605527#comment-13605527
 ] 

nkeywal commented on HBASE-8128:


Committed in 0.94

> HTable#put improvements
> ---
>
> Key: HBASE-8128
> URL: https://issues.apache.org/jira/browse/HBASE-8128
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8128.v1.patch
>
>
> 3 points:
>  - When doing a single put, we're creating an object by calling Arrays.asList
>  - we're doing a size check every 10 puts. Not doing it seems simpler, better, 
> and allows sharing some code between a single put and a list of puts.
>  - we could call flushCommits on an empty write buffer, especially for someone 
> using a lot of HTable instances instead of using a pool, as it's called in close().



[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605503#comment-13605503
 ] 

nkeywal commented on HBASE-4955:


Yes, JUnit 4.11 contains what we need. Surefire contains what we need as well; 
we just need to get a version that comes with an acceptable set of 
regressions.

Our Surefire is in Gary's repo. He will be the one getting the blame if Apache 
complains :-).
BTW, I haven't done the update to JUnit in HBase 0.94, as it implies 
backporting a few jiras as well (in the required section). So we still need to 
have it in Gary's repo as well.

> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
> 4955.v2.patch, 4955.v2.patch
>
>
> We currently use private versions for Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need to move to official versions.
> Surefire 2.11 is just out, but, after some tests, it does not contain all 
> that we need.
> JUnit. Could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605483#comment-13605483
 ] 

nkeywal commented on HBASE-8135:


Yes, just locally (I do test before submitting :-) ). I used ClassSize.align for 
timerange, and it went ok. But for Put/Delete/Increment, there are 88 bytes of 
difference I cannot explain. The code is in v2.patch.

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch
>
>
> Code is there already.
> Doing so would allow sharing some code when doing client-side buffering.
> The patch compiles locally and should not impact tests, but is not tested locally.



[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605470#comment-13605470
 ] 

nkeywal commented on HBASE-8135:


There is an issue: with the unit tests, I've got a huge gap I cannot explain:

Expected :80
Actual   :168

2013-03-18 19:54:28,845 DEBUG [main] util.ClassSize(246): 0 row class [B
2013-03-18 19:54:28,845 DEBUG [main] util.ClassSize(246): 1 ts long
2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(246): 2 writeToWAL boolean
2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(246): 3 familyMap interface java.util.NavigableMap
2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(246): 4 attributes interface java.util.Map
2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(273): Primitives=9, arrays=1, references(includes 2 for object overhead)=5, refSize 8, size=80, prealign_size=73

Any hint?
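
For reference, the expected figure in the log above can be reproduced by hand. The sketch below is not the ClassSize source: the class name is hypothetical, and the 24-byte array header is an assumption (three 8-byte references, aligned) chosen because it reproduces the logged prealign_size of 73. It only covers the expected side (80) and does not explain the 88-byte gap to the actual 168.

```java
/** Reproduces the arithmetic from the log; constants are assumptions for 8-byte references. */
class HeapSizeArithmeticSketch {
    static final int REFERENCE = 8;                // refSize reported in the log
    static final int ARRAY = align(3 * REFERENCE); // assumed array header size: 24 bytes

    /** Rounds up to the next multiple of 8. */
    static int align(int num) {
        return ((num + 7) >> 3) << 3;
    }

    /** primitiveBytes is a byte count; arrays and references are field counts. */
    static int preAlignSize(int primitiveBytes, int arrays, int references) {
        return primitiveBytes + align(arrays * ARRAY) + references * REFERENCE;
    }
}
```

With the logged values, preAlignSize(9, 1, 5) is 9 + 24 + 40 = 73, and align(73) gives 80. The 9 primitive bytes correspond to the long ts (8) plus the boolean writeToWAL (1).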

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch
>
>
> Code is there already.
> Doing so would allow sharing some code when doing client-side buffering.
> The patch compiles locally and should not impact tests, but is not tested locally.



[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Attachment: 8135.v2.patch

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch
>
>
> Code is there already.
> Doing so would allow sharing some code when doing client-side buffering.
> The patch compiles locally and should not impact tests, but is not tested locally.



[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Open  (was: Patch Available)

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.94.5, 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch, 8135.v2.patch
>
>
> Code is there already.
> Doing so would allow sharing some code when doing client-side buffering.
> The patch compiles locally and should not impact tests, but is not tested locally.



[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605362#comment-13605362
 ] 

nkeywal commented on HBASE-8135:


I was trying to find a generic way to get the size of an object. A Google 
search on this leads to quite a lot of terrible practices :-). It should be 
possible to do a static{} block for the fixed fields, but it won't bring much 
actual value. With the current implementation, it's better to have unit tests 
for when one adds fields. I'm going to do this in this patch (including 
Increment); it will be simpler.

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch
>
>
> Code is there already.
> Doing so would allow sharing some code when doing client-side buffering.
> The patch compiles locally and should not impact tests, but is not tested locally.



[jira] [Updated] (HBASE-6870) HTable#coprocessorExec always scan the whole table

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6870:
---

Affects Version/s: 0.96.0
   0.95.0

> HTable#coprocessorExec always scan the whole table 
> ---
>
> Key: HBASE-6870
> URL: https://issues.apache.org/jira/browse/HBASE-6870
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 0.94.1, 0.95.0, 0.96.0
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, 
> HBASE-6870v2.patch, HBASE-6870v3.patch
>
>
> In current logic, HTable#coprocessorExec always scan the whole table, its 
> efficiency is low and will affect the Regionserver carrying .META. under 
> large coprocessorExec requests



[jira] [Resolved] (HBASE-8136) coprocessor service requires .meta. to be available all the time.

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal resolved HBASE-8136.


  Resolution: Duplicate
Release Note: HBASE-6870 

And the good news is that there is already a patch for HBASE-6870 :-).

> coprocessor service requires .meta. to be available all the time.
> -
>
> Key: HBASE-8136
> URL: https://issues.apache.org/jira/browse/HBASE-8136
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Coprocessors
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Priority: Minor
>
> HTable#getRegionLocations does not use a cache: all the calls to this 
> function go to .META.
> So:
> - we're missing an opportunity to reuse/update the location cache in the 
> HConnection.
> - this method is called by the coprocessor service. So people using this 
> feature have .meta. on their execution path, which is not good for 
> performance, scalability and reliability.
> I'm not totally clear on the fix. I think it should be possible to use the 
> cache to see if we have all regions for the table. But it means we won't 
> always have the latest version when calling getRegionLocations.
> Any thoughts on this?



[jira] [Created] (HBASE-8136) coprocessor service requires .meta. to be available all the time.

2013-03-18 Thread nkeywal (JIRA)
nkeywal created HBASE-8136:
--

 Summary: coprocessor service requires .meta. to be available all 
the time.
 Key: HBASE-8136
 URL: https://issues.apache.org/jira/browse/HBASE-8136
 Project: HBase
  Issue Type: Bug
  Components: Client, Coprocessors
Affects Versions: 0.96.0
Reporter: nkeywal
Priority: Minor



HTable#getRegionLocations does not use a cache: all the calls to this function 
go to .META.

So:
- we're missing an opportunity to reuse/update the location cache in the 
HConnection.
- this method is called by the coprocessor service. So people using this 
feature have .meta. on their execution path, which is not good for 
performance, scalability and reliability.

I'm not totally clear on the fix. I think it should be possible to use the 
cache to see if we have all regions for the table. But it means we won't always 
have the latest version when calling getRegionLocations.

Any thoughts on this?
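
One way to picture the "use the cache to see if we have all regions for the table" idea is a coverage check over the cached key ranges. This is a sketch only: RegionCoverageSketch and CachedRegion are hypothetical names, keys are simplified to strings with "" as the open-ended first/last key, and nothing here is taken from HConnection or HTable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class RegionCoverageSketch {
    /** Hypothetical cached region covering [startKey, endKey); "" means open-ended. */
    static final class CachedRegion {
        final String startKey;
        final String endKey;

        CachedRegion(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** True if the cached regions form an unbroken chain from the first key to the last. */
    static boolean coversWholeTable(List<CachedRegion> cached) {
        if (cached.isEmpty()) {
            return false;
        }
        List<CachedRegion> sorted = new ArrayList<>(cached);
        sorted.sort(Comparator.comparing((CachedRegion r) -> r.startKey));
        String expectedStart = ""; // the first region must start at the open-ended key
        for (CachedRegion r : sorted) {
            if (!r.startKey.equals(expectedStart)) {
                return false; // gap or overlap: fall back to reading the catalog table
            }
            expectedStart = r.endKey;
        }
        return expectedStart.isEmpty(); // the last region must be open-ended
    }
}
```

A check like this only says the cache is complete, not that it is current: after a split or a move the chain can still look unbroken, which is the staleness trade-off raised above.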



[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605224#comment-13605224
 ] 

nkeywal commented on HBASE-4955:


It's funny, because if I do:
mvn clean test -Dsurefire.part2.skip=true -q -PrunAllTests -Dsurefire.part1.forkCount=10
the number of executed tests is a random number above 600

while with
mvn clean test -Dsurefire.part2.skip=true -q -PrunAllTests -Dsurefire.part1.forkCount=1
it's always 543

less parallelism == less randomness (logical) but fewer tests executed (not logical)

I can't reproduce it with a Surefire unit test. I'm going to try a little bit 
more; after that, we will have the option to wait for 2.15, hoping it will be 
identified and fixed.


> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
> 4955.v2.patch, 4955.v2.patch
>
>
> We currently use private versions for Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need to move to official versions.
> Surefire 2.11 is just out, but, after some tests, it does not contain all 
> that we need.
> JUnit. Could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605183#comment-13605183
 ] 

nkeywal commented on HBASE-8135:


Agreed, I will do that on commit. Are you +1 otherwise?

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch
>
>
> Code is there already.
> Doing so would allow sharing some code when doing client-side buffering.
> The patch compiles locally and should not impact tests, but is not tested locally.



[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605139#comment-13605139
 ] 

nkeywal commented on HBASE-4955:


Likely bad news... Among the missing tests, we have this:

@RunWith(Parameterized.class)
@Category(SmallTests.class)
public class TestFixedFileTrailer {


i.e. there could be issues with parameterized tests (though that would not be 
enough to explain the 200 missing tests).

Looking...


> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
> 4955.v2.patch, 4955.v2.patch
>
>
> We currently use private versions for Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need to move to official versions.
> Surefire 2.11 is just out, but, after some tests, it does not contain all 
> that we need.
> JUnit. Could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605046#comment-13605046
 ] 

nkeywal commented on HBASE-7590:


v12 with the comments on RB from Devaraj taken into account. Nearly there!


> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 
> 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to 
> the clients and regionservers. In particular, it would be useful to know globally 
> (regionservers + client apps) that some regionservers are dead. This would 
> allow:
> - to lower the load on the system, since clients would no longer use stale 
> information and go to dead machines
> - to make the recovery faster from a client point of view. It's common to use 
> large timeouts on the client side, so the client may need a lot of time 
> before declaring a region server dead and trying another one. If the client 
> receives the information about a region server's state separately, it can make 
> the right decision, and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down' 
> to instruct the client to increase the retry delay and so on.
> Technically, the master could send this information. To lower the load on 
> the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to 
> all servers by tcp), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available, great. 
> If not, it should not break anything.
> - it should be optional.
> So in the end we would have a thread in the master sending a protobuf message 
> about the dead servers on a multicast socket. If the socket is not 
> configured, it does not do anything. On the client side, when we receive 
> information that a node is dead, we refresh the cache for it.



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Patch Available  (was: Open)

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 
> 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to 
> the clients and regionservers. In particular, it would be useful to know globally 
> (regionservers + client apps) that some regionservers are dead. This would 
> allow:
> - to lower the load on the system, since clients would no longer use stale 
> information and go to dead machines
> - to make the recovery faster from a client point of view. It's common to use 
> large timeouts on the client side, so the client may need a lot of time 
> before declaring a region server dead and trying another one. If the client 
> receives the information about a region server's state separately, it can make 
> the right decision, and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down' 
> to instruct the client to increase the retry delay and so on.
> Technically, the master could send this information. To lower the load on 
> the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to 
> all servers by tcp), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available, great. 
> If not, it should not break anything.
> - it should be optional.
> So in the end we would have a thread in the master sending a protobuf message 
> about the dead servers on a multicast socket. If the socket is not 
> configured, it does not do anything. On the client side, when we receive 
> information that a node is dead, we refresh the cache for it.



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Open  (was: Patch Available)

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 
> 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to 
> the clients and regionservers. In particular, it would be useful to know globally 
> (regionservers + client apps) that some regionservers are dead. This would 
> allow:
> - to lower the load on the system, since clients would no longer use stale 
> information and go to dead machines
> - to make the recovery faster from a client point of view. It's common to use 
> large timeouts on the client side, so the client may need a lot of time 
> before declaring a region server dead and trying another one. If the client 
> receives the information about a region server's state separately, it can make 
> the right decision, and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down' 
> to instruct the client to increase the retry delay and so on.
> Technically, the master could send this information. To lower the load on 
> the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to 
> all servers by tcp), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available, great. 
> If not, it should not break anything.
> - it should be optional.
> So in the end we would have a thread in the master sending a protobuf message 
> about the dead servers on a multicast socket. If the socket is not 
> configured, it does not do anything. On the client side, when we receive 
> information that a node is dead, we refresh the cache for it.



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v12.patch

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 
> 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to 
> the clients and regionservers. In particular, it would be useful to know globally 
> (regionservers + client apps) that some regionservers are dead. This would 
> allow:
> - to lower the load on the system, since clients would no longer use stale 
> information and go to dead machines
> - to make the recovery faster from a client point of view. It's common to use 
> large timeouts on the client side, so the client may need a lot of time 
> before declaring a region server dead and trying another one. If the client 
> receives the information about a region server's state separately, it can make 
> the right decision, and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down' 
> to instruct the client to increase the retry delay and so on.
> Technically, the master could send this information. To lower the load on 
> the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to 
> all servers by tcp), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available, great. 
> If not, it should not break anything.
> - it should be optional.
> So in the end we would have a thread in the master sending a protobuf message 
> about the dead servers on a multicast socket. If the socket is not 
> configured, it does not do anything. On the client side, when we receive 
> information that a node is dead, we refresh the cache for it.



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v12.patch

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
> 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 
> 7590.v5.patch, 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to 
> the clients and regionservers. In particular, it would be useful to know globally 
> (regionservers + client apps) that some regionservers are dead. This would 
> allow:
> - to lower the load on the system, as clients would stop using stale information 
> and connecting to dead machines
> - to make the recovery faster from a client point of view. It's common to use 
> large timeouts on the client side, so the client may need a lot of time 
> before declaring a region server dead and trying another one. If the client 
> separately receives information about a region server's state, it can make 
> the right decision, and continue or stop waiting accordingly.
> We can also send more information, for example instructions like 'slow down' 
> to instruct the client to increase the retry delay and so on.
> Technically, the master could send this information. To lower the load on 
> the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to 
> all servers by TCP), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available, great. 
> If not, it should not break anything.
> - it should be optional.
> So in the end we would have a thread in the master sending a protobuf message 
> about the dead servers on a multicast socket. If the socket is not 
> configured, it does not do anything. On the client side, when we receive 
> information that a node is dead, we refresh the cache about it.



[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Patch Available  (was: Open)

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.94.5, 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch
>
>
> Code is there already.
> Doing so would allow sharing some code when doing client-side buffering.
> The patch compiles locally and should not impact tests, but it has not been tested locally.



[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Attachment: 8135.v1.patch

> Mutation should implement HeapSize
> --
>
> Key: HBASE-8135
> URL: https://issues.apache.org/jira/browse/HBASE-8135
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0, 0.94.5
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8135.v1.patch
>
>
> Code is there already.
> Doing so would allow sharing some code when doing client-side buffering.
> The patch compiles locally and should not impact tests, but it has not been tested locally.



[jira] [Created] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)
nkeywal created HBASE-8135:
--

 Summary: Mutation should implement HeapSize
 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.5, 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0
 Attachments: 8135.v1.patch

Code is there already.
Doing so would allow sharing some code when doing client-side buffering.
The patch compiles locally and should not impact tests, but it has not been tested locally.
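To illustrate why a common size interface helps client-side buffering, here is a minimal sketch. The names HeapSizeSketch, MutationSketch and BufferAccountingSketch are local stand-ins invented for this example, and the fixed overhead constant is an assumption, not a measured value; the real patch is not shown in this thread.

```java
import java.util.List;

// Local stand-in for an interface exposing an object's estimated heap footprint.
interface HeapSizeSketch {
  long heapSize();
}

// Hypothetical mutation: a row key plus one value, both raw bytes.
class MutationSketch implements HeapSizeSketch {
  private static final long FIXED_OVERHEAD = 32L; // assumed constant, not measured
  private final byte[] row;
  private final byte[] value;

  MutationSketch(byte[] row, byte[] value) {
    this.row = row;
    this.value = value;
  }

  @Override
  public long heapSize() {
    return FIXED_OVERHEAD + row.length + value.length;
  }
}

class BufferAccountingSketch {
  // Written once against the interface, so every mutation type can share it.
  static long totalHeapSize(List<? extends HeapSizeSketch> items) {
    long total = 0L;
    for (HeapSizeSketch item : items) {
      total += item.heapSize();
    }
    return total;
  }
}
```

With the size exposed at the base-class level, the buffering code no longer needs a separate accounting path per mutation type.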



[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Status: Patch Available  (was: Open)

> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
> 4955.v2.patch, 4955.v2.patch
>
>
> We have used private versions of Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need in order to move to the official versions.
> Surefire 2.11 is just out but, after some tests, it does not contain 
> everything we need.
> JUnit: could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Attachment: 4955.v2.patch

> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
> 4955.v2.patch, 4955.v2.patch
>
>
> We have used private versions of Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need in order to move to the official versions.
> Surefire 2.11 is just out but, after some tests, it does not contain 
> everything we need.
> JUnit: could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Status: Open  (was: Patch Available)

> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
> 4955.v2.patch
>
>
> We have used private versions of Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need in order to move to the official versions.
> Surefire 2.11 is just out but, after some tests, it does not contain 
> everything we need.
> JUnit: could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604981#comment-13604981
 ] 

nkeywal commented on HBASE-4955:


bq. Tests run: 35, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 281.479 sec
Useless log lines. That's SUREFIRE-969 

bq. Took 2 mo. 7 d.
That's SUREFIRE-970.

bq. 2,158 tests (-274)
This is probably due to the flakiness of TestHTableMultiplexer.testHTableMultiplexer


I'm going to retry because of point 3.
For the first two, I tend to think they should not prevent us from committing. 
We don't have any issue today because I built a version that included everything we 
need. If we want to come back to an official version, we need to compromise. We 
can expect these points to be solved in a later version, but those 
later versions can also include regressions. We need to jump in at some point, 
and we've been waiting for more than a year now.

> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
> 4955.v2.patch
>
>
> We have used private versions of Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need in order to move to the official versions.
> Surefire 2.11 is just out but, after some tests, it does not contain 
> everything we need.
> JUnit: could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Updated] (HBASE-8128) HTable#put improvements

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8128:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> HTable#put improvements
> ---
>
> Key: HBASE-8128
> URL: https://issues.apache.org/jira/browse/HBASE-8128
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8128.v1.patch
>
>
> 3 points:
>  - When doing a single put, we're creating an object by calling Arrays.asList
>  - we're doing a size check every 10 puts. Not doing it seems simpler, better 
> and allows sharing some code between a single put and a list of puts.
>  - we could call flushCommits on an empty write buffer, especially for someone 
> using a lot of HTable instances instead of using a pool, as it's called in close().



[jira] [Commented] (HBASE-8128) HTable#put improvements

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604970#comment-13604970
 ] 

nkeywal commented on HBASE-8128:


bq. Sound simple and efficient... Will it be possible to have it for 0.94 too?
The patch should be directly applicable, so I can do it; whatever you and Lars prefer.

bq. These classes could do with a general revamp.
Agreed. I'm studying this at the moment.

Committed in trunk and 0.95, thanks for the reviews!

> HTable#put improvements
> ---
>
> Key: HBASE-8128
> URL: https://issues.apache.org/jira/browse/HBASE-8128
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8128.v1.patch
>
>
> 3 points:
>  - When doing a single put, we're creating an object by calling Arrays.asList
>  - we're doing a size check every 10 puts. Not doing it seems simpler, better 
> and allows sharing some code between a single put and a list of puts.
>  - we could call flushCommits on an empty write buffer, especially for someone 
> using a lot of HTable instances instead of using a pool, as it's called in close().



[jira] [Created] (HBASE-8128) HTable#put improvements

2013-03-16 Thread nkeywal (JIRA)
nkeywal created HBASE-8128:
--

 Summary: HTable#put improvements
 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0
 Attachments: 8128.v1.patch

3 points:
 - When doing a single put, we're creating an object by calling Arrays.asList
 - we're doing a size check every 10 puts. Not doing it seems simpler, better 
and allows sharing some code between a single put and a list of puts.
 - we could call flushCommits on an empty write buffer, especially for someone 
using a lot of HTable instances instead of using a pool, as it's called in close().
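The three points above can be sketched with a toy write buffer. This is an illustration under assumptions, not the 8128.v1.patch: WriteBufferSketch is a hypothetical class, mutations are plain strings whose length stands in for their heap size, and the third point is read here as "flushCommits should return early when the buffer is empty".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a client-side write buffer.
class WriteBufferSketch {
  private final List<String> buffer = new ArrayList<>();
  private final long flushThresholdBytes;
  private long bufferedBytes;
  private int flushCount;

  WriteBufferSketch(long flushThresholdBytes) {
    this.flushThresholdBytes = flushThresholdBytes;
  }

  // Single put: no temporary list is created (point 1).
  void put(String mutation) {
    add(mutation);
    flushIfNeeded();
  }

  // List of puts: shares add() and flushIfNeeded() with the single put (point 2).
  void put(List<String> mutations) {
    for (String mutation : mutations) {
      add(mutation);
    }
    flushIfNeeded();
  }

  private void add(String mutation) {
    buffer.add(mutation);
    bufferedBytes += mutation.length();
  }

  // One size check per call instead of a check every N puts.
  private void flushIfNeeded() {
    if (bufferedBytes > flushThresholdBytes) {
      flushCommits();
    }
  }

  void flushCommits() {
    if (buffer.isEmpty()) {
      return; // nothing to send, e.g. close() on an unused table (point 3)
    }
    buffer.clear(); // a real client would send the mutations here
    bufferedBytes = 0;
    flushCount++;
  }

  void close() {
    flushCommits();
  }

  int getFlushCount() {
    return flushCount;
  }
}
```

The early return matters most for callers that create many short-lived table objects, since each close() would otherwise go through the flush path for no data.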



[jira] [Updated] (HBASE-8128) HTable#put improvements

2013-03-16 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8128:
---

Status: Patch Available  (was: Open)

> HTable#put improvements
> ---
>
> Key: HBASE-8128
> URL: https://issues.apache.org/jira/browse/HBASE-8128
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8128.v1.patch
>
>
> 3 points:
>  - When doing a single put, we're creating an object by calling Arrays.asList
>  - we're doing a size check every 10 puts. Not doing it seems simpler, better 
> and allows sharing some code between a single put and a list of puts.
>  - we could call flushCommits on an empty write buffer, especially for someone 
> using a lot of HTable instances instead of using a pool, as it's called in close().



[jira] [Updated] (HBASE-8128) HTable#put improvements

2013-03-16 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8128:
---

Attachment: 8128.v1.patch

> HTable#put improvements
> ---
>
> Key: HBASE-8128
> URL: https://issues.apache.org/jira/browse/HBASE-8128
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.95.0, 0.96.0
>
> Attachments: 8128.v1.patch
>
>
> 3 points:
>  - When doing a single put, we're creating an object by calling Arrays.asList
>  - we're doing a size check every 10 puts. Not doing it seems simpler, better 
> and allows sharing some code between a single put and a list of puts.
>  - we could call flushCommits on an empty write buffer, especially for someone 
> using a lot of HTable instances instead of using a pool, as it's called in close().



[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-16 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Attachment: 4955.v2.patch

> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
> 4955.v2.patch
>
>
> We have used private versions of Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need in order to move to the official versions.
> Surefire 2.11 is just out but, after some tests, it does not contain 
> everything we need.
> JUnit: could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-16 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Fix Version/s: (was: 0.95.0)
   0.96.0
   Status: Patch Available  (was: Open)

> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
> 4955.v2.patch
>
>
> We have used private versions of Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need in order to move to the official versions.
> Surefire 2.11 is just out but, after some tests, it does not contain 
> everything we need.
> JUnit: could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-16 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Status: Open  (was: Patch Available)

> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch
>
>
> We have used private versions of Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need in order to move to the official versions.
> Surefire 2.11 is just out but, after some tests, it does not contain 
> everything we need.
> JUnit: could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Commented] (HBASE-8097) MetaServerShutdownHandler may potentially keep bumping up DeadServer.numProcessing

2013-03-15 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603218#comment-13603218
 ] 

nkeywal commented on HBASE-8097:


bq. The timestamp updated in DeadServer.add seems bogus to me.
You're right. It should be a putIfAbsent if it were a ConcurrentMap. The methods 
are synchronized and it should be critical, so the following implementation 
should be correct:
{code}
  /**
   * Adds the server to the dead server list if it's not there already.
   * @param sn the server name
   */
  public synchronized void add(ServerName sn) {
    this.numProcessing++;
    if (!deadServers.containsKey(sn)) {
      deadServers.put(sn, EnvironmentEdgeManager.currentTimeMillis());
    }
  }
{code}
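For comparison, the putIfAbsent variant mentioned above could look like the sketch below if the map were a ConcurrentMap. DeadServerSketch is a hypothetical stand-in (server names are plain strings and the clock is passed in), not the actual DeadServer class.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for DeadServer, using lock-free structures.
class DeadServerSketch {
  private final ConcurrentMap<String, Long> deadServers = new ConcurrentHashMap<>();
  private final AtomicInteger numProcessing = new AtomicInteger();

  // Records the server as dead; an existing timestamp is never overwritten.
  void add(String serverName, long nowMillis) {
    numProcessing.incrementAndGet();
    deadServers.putIfAbsent(serverName, nowMillis);
  }

  // Marks one unit of processing as done.
  void finish(String serverName) {
    numProcessing.decrementAndGet();
  }

  Long getTimeOfDeath(String serverName) {
    return deadServers.get(serverName);
  }

  int getNumProcessing() {
    return numProcessing.get();
  }
}
```

Both versions keep the first time of death when add is called twice for the same server; the synchronized version is simpler when the other methods already hold the same lock.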

Tell me if you want me to create another JIRA for this or if you don't mind 
adding this into this JIRA.

For numProcessing, it seems there are several bugs as well: if you have an 
exception anywhere (for example in verifyAndAssignMetaWithRetries();) we have a 
broken state: we haven't decreased the numProcessing but we're not working on 
it anymore. If we want to be exception safe (as ServerShutdownHandler is) we 
need the finally imho. I can't find a better solution than:

{code}
  @Override
  public void process() throws IOException {
    boolean gotException = true;
    try {
      try {
        LOG.info("Splitting META logs for " + serverName);
        if (this.shouldSplitHlog) {
          this.services.getMasterFileSystem().splitMetaLog(serverName);
        }
      } catch (IOException ioe) {
        this.deadServers.add(serverName);
        this.services.getExecutorService().submit(this);
        throw new IOException("failed log splitting for " +
            serverName + ", will retry", ioe);
      }

      // Assign root and meta if we were carrying them.
      if (isCarryingMeta()) { // .META.
        // Check again: region may be assigned to other where because of RIT
        // timeout
        if (this.services.getAssignmentManager().isCarryingMeta(serverName)) {
          LOG.info("Server " + serverName
              + " was carrying META. Trying to assign.");
          this.services.getAssignmentManager().regionOffline(
              HRegionInfo.FIRST_META_REGIONINFO);
          verifyAndAssignMetaWithRetries();
        } else {
          LOG.info("META has been assigned to otherwhere, skip assigning.");
        }
      }
      gotException = false;
    } finally {
      if (gotException) {
        // If we had an exception we can't rely on super.process to say we
        // finished the process.
        this.deadServers.finish(serverName);
      }
    }
    super.process();
  }
{code}

I can't say I like it, but it should do the job...
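
The flag-plus-finally pattern can be looked at in isolation. The sketch below 
is only an illustration under assumptions: ProcessingCounter, inFlight() and 
the Runnable step are hypothetical names, unchecked exceptions replace the 
IOException to keep it short, and the last decrement stands in for what 
super.process() is expected to do on the normal path.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of the counter-balancing pattern; names do not come from HBase. */
final class ProcessingCounter {
  private final AtomicInteger numProcessing = new AtomicInteger();

  /** Runs the step; if it throws, the counter is decremented before the exception propagates. */
  void process(Runnable step) {
    numProcessing.incrementAndGet();
    boolean gotException = true;
    try {
      step.run();
      gotException = false;
    } finally {
      if (gotException) {
        // Failure path: nobody else will report completion, so do it here.
        numProcessing.decrementAndGet();
      }
    }
    // Normal path: stands in for the hand-off that reports completion.
    numProcessing.decrementAndGet();
  }

  int inFlight() {
    return numProcessing.get();
  }
}
```

Either way the counter is decremented exactly once per call, whether the 
step returns normally or throws.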



> MetaServerShutdownHandler may potentially keep bumping up 
> DeadServer.numProcessing
> --
>
> Key: HBASE-8097
> URL: https://issues.apache.org/jira/browse/HBASE-8097
> Project: HBase
>  Issue Type: Bug
>Reporter: Jeffrey Zhong
>Assignee: Jeffrey Zhong
> Fix For: 0.96.0
>
> Attachments: 8097.txt, hbase-8097_1.patch
>
>
> {code}
> } catch (IOException ioe) {
>   this.services.getExecutorService().submit(this);
>   this.deadServers.add(serverName);
>   throw new IOException("failed log splitting for " +
>   serverName + ", will retry", ioe);
> }
> {code}
> this.deadServers.add(serverName); will keep incrementing 
> DeadServer.numProcessing
> We can't get rid of numProcessing by just checking deadServers.size() because 
> deadServers is also used to report some historically failed RSs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-8105) RegionServer Doesn't Rejoin Cluster after Netsplit

2013-03-14 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602355#comment-13602355
 ] 

nkeywal commented on HBASE-8105:


I suppose you have a YouAreDeadException in the logs?
This would be expected. The logic is that the region server cannot be trusted 
anymore as it was ejected from the cluster. Then yes, it could abort. On the 
other hand you may want to look at it in detail. Personally I would prefer to 
abort to be sure I don't have clients trying to use this dead server.

Note that for questions or discussions, it's better to use the user mailing 
list.

> RegionServer Doesn't Rejoin Cluster after Netsplit
> --
>
> Key: HBASE-8105
> URL: https://issues.apache.org/jira/browse/HBASE-8105
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.92.1
> Environment: Linux Ubuntu 10.04 LTS
>Reporter: philo vivero
>
> Running a 15-node HBase cluster. Testing various failure scenarios. Segregate 
> one RegionServer from the cluster by firewalling off every port except SSH 
> (because we need to be able to re-enable the node later).
> After the RS is automatically removed from the cluster, we re-enable all 
> ports again, but RS never rejoins the cluster.
> I suspect the possibility this is desired behaviour, but haven't found proof 
> so far. The code doesn't have any comment indicating this is the behaviour 
> desired:
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.92.2/org/apache/hadoop/hbase/regionserver/HRegionServer.java/
> See lines starting at 624, public void run(). It makes it through the first 
> try/catch block, but then loops inside the second try/catch block. Our 
> hypothesis is that it never gets out naturally.
> If we bounce the RegionServer process, then it rejoins the cluster.



[jira] [Commented] (HBASE-8025) zkcli fails when SERVER_GC_OPTS is enabled

2013-03-14 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602305#comment-13602305
 ] 

nkeywal commented on HBASE-8025:


Reading the patch, it seems ok to me. I will test it & commit to trunk & 0.95 
tomorrow if there is no objection. Will wait for Lars for 0.94, but it seems it 
should be committed there as well. Also, Jean-Marc & Dave, if you want to 
study it more (cf. suggestion above), I will wait for you.

> zkcli fails when SERVER_GC_OPTS is enabled
> --
>
> Key: HBASE-8025
> URL: https://issues.apache.org/jira/browse/HBASE-8025
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.4
>Reporter: Dave Latham
> Fix For: 0.95.0, 0.98.0, 0.94.7
>
> Attachments: HBASE-8025-0.94.patch
>
>
> HBASE-7091 added logic to separate GC logging options for some client 
> commands versus server commands.  It uses a list of known client commands 
> ("shell" "hbck" "hlog" "hfile" "zkcli") and uses the server GC logging 
> options for all other invocations of bin/hbase.  When zkcli is invoked, it in 
> turn invokes "hbase org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServerArg" 
> to gather the server command line arguments, but because 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServerArg is not on the white 
> list it enables server GC logging, which causes extra output that causes the 
> zkcli invocation to break.  HBASE-7153 addressed this but the fix only solved 
> the array syntax - not the white list, so the zkcli command still fails.
> There are many other tools you can invoke that are more likely to want "client" 
> than "server" options. For example, "bin/hbase org.jruby.Main 
> region_mover.rb" or "bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable" 
> or "bin/hbase version" or "bin/hbase 
> org.apache.hadoop.hbase.mapreduce.Export". The whitelist of server commands 
> is shorter and easier to maintain than a whitelist of client commands.



[jira] [Commented] (HBASE-8101) Cleanup: findbugs and javadoc warning fixes as well as making it illegal passing null row to Put/Delete, etc.

2013-03-14 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602173#comment-13602173
 ] 

nkeywal commented on HBASE-8101:


   @Override
+  public int hashCode() {
+    // TODO: This is wrong.  Can't have two gets the same just because on same row.  But it
+    // matches how equals works currently and gets rid of the findbugs warning.
+    return this.getRow().hashCode();
+  }
=> You shouldn't call hashCode on an array, you could call 
java.util.Arrays.hashCode
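
A small sketch of the difference, under the assumption that the row is a plain 
byte[]. RowKey is a hypothetical holder class used only for illustration; it 
is not the Get class from the patch.

```java
import java.util.Arrays;

/** Hypothetical row-key holder illustrating content-based hashing of a byte[]. */
final class RowKey {
  private final byte[] row;

  RowKey(byte[] row) {
    this.row = row.clone();
  }

  @Override
  public boolean equals(Object o) {
    // Compare the array contents, not the array references.
    return o instanceof RowKey && Arrays.equals(row, ((RowKey) o).row);
  }

  @Override
  public int hashCode() {
    // row.hashCode() would be identity-based, so two equal keys would
    // usually get different hash codes. Arrays.hashCode uses the contents.
    return Arrays.hashCode(row);
  }
}
```

Two RowKey instances built from equal bytes are then equal and share a hash 
code, which is what the equals/hashCode contract requires.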


+  public Increment(final byte [] row, final int offset, final int length) {
+    if (row == null || length <= 0 || length > HConstants.MAX_ROW_LENGTH) {
       throw new IllegalArgumentException("Row key is invalid");
     }
=> When it happens in production, I like to have the actual values (i.e. row=, 
offset= & so on) ;-)
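
One possible shape for such a message, as a sketch only. RowChecks and 
checkRow are hypothetical names, and the MAX_ROW_LENGTH constant below is a 
local placeholder rather than HConstants.MAX_ROW_LENGTH. The message prints 
the array length instead of the row bytes, which is one choice among several.

```java
/** Hypothetical validation helper; not part of the patch under review. */
final class RowChecks {
  // Placeholder limit for this sketch; the patch uses HConstants.MAX_ROW_LENGTH.
  static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

  /** Same condition as in the patch, but the message carries the actual values. */
  static void checkRow(byte[] row, int offset, int length) {
    if (row == null || length <= 0 || length > MAX_ROW_LENGTH) {
      throw new IllegalArgumentException("Row key is invalid: row="
          + (row == null ? "null" : "byte[" + row.length + "]")
          + ", offset=" + offset
          + ", length=" + length
          + ", max=" + MAX_ROW_LENGTH);
    }
  }
}
```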


+@edu.umd.cs.findbugs.annotations.SuppressWarnings(
+value="CN_IDIOM_NO_SUPER_CALL",
+justification="Its PITA calling the super.clone")
=> There is a good reason for this warning: subclasses won't be able to call 
super.clone themselves if we do that (the type will be wrong: Object.clone 
creates the right object). As it's private (i.e. we don't offer a public API 
that should be subclassed) I guess it's acceptable. At the very least we should 
put a warning in the justification.
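
To make the concern concrete, here is a sketch with four hypothetical classes 
(none of them come from the patch). The first pair clones through a 
constructor, so a subclass calling super.clone() gets an object of the wrong 
type. The second pair goes through Object.clone(), which keeps the runtime 
class.

```java
/** Clones via a constructor: the result is always the base type. */
class ConstructorClone implements Cloneable {
  @Override
  public ConstructorClone clone() {
    return new ConstructorClone();
  }
}

class ConstructorCloneChild extends ConstructorClone {
  @Override
  public ConstructorCloneChild clone() {
    // Compiles, but fails at runtime: the parent returned a plain ConstructorClone.
    return (ConstructorCloneChild) super.clone();
  }
}

/** Clones via Object.clone(): the result has the runtime class of the receiver. */
class SuperClone implements Cloneable {
  @Override
  public SuperClone clone() {
    try {
      return (SuperClone) super.clone();
    } catch (CloneNotSupportedException e) {
      throw new AssertionError(e); // cannot happen: the class is Cloneable
    }
  }
}

class SuperCloneChild extends SuperClone {
  @Override
  public SuperCloneChild clone() {
    return (SuperCloneChild) super.clone();
  }
}
```

Calling clone() on a ConstructorCloneChild throws a ClassCastException, while 
cloning a SuperCloneChild yields a SuperCloneChild.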

+1 otherwise, thanks for doing this!

> Cleanup: findbugs and javadoc warning fixes as well as making it illegal 
> passing null row to Put/Delete, etc.
> -
>
> Key: HBASE-8101
> URL: https://issues.apache.org/jira/browse/HBASE-8101
> Project: HBase
>  Issue Type: Sub-task
>  Components: IPC/RPC
>Reporter: stack
> Fix For: 0.95.0
>
> Attachments: 8101.txt, 8101v2.txt
>
>
> Part of hbase-7900 broken out so that patch gets smaller.  This is a patch 
> with cleanup mostly findbugs fixes (general ones) as well as adding check for 
> null row being passed to Put, Get, etc.  This patch helps rpc along.



[jira] [Commented] (HBASE-7840) Enhance the java it framework to start & stop a distributed hbase & hadoop cluster

2013-03-13 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601454#comment-13601454
 ] 

nkeywal commented on HBASE-7840:


Waiting for review or +1 on this one, I can rebase if you like.

> Enhance the java it framework to start & stop a distributed hbase & hadoop 
> cluster 
> ---
>
> Key: HBASE-7840
> URL: https://issues.apache.org/jira/browse/HBASE-7840
> Project: HBase
>  Issue Type: New Feature
>  Components: test
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0
>
> Attachments: 7840.v1.patch, 7840.v3.patch
>
>
> Needs are to use a development version of HBase & HDFS 1 & 2.
> Ideally, should be nicely backportable to 0.94 to allow comparisons and 
> regression tests between versions.



[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-13 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601449#comment-13601449
 ] 

nkeywal commented on HBASE-7590:


I will fix the 100 lines stuff on commit. Any +1 on the new version?

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v1.patch, 
> 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
> 7590.v5.patch
>
>
> It would be very useful to add a mechanism to distribute some information to 
> the clients and regionservers. In particular, it would be useful to know globally 
> (regionservers + clients apps) that some regionservers are dead. This would 
> allow:
> - to lower the load on the system, without clients using stale information 
> and going on dead machines
> - to make the recovery faster from a client point of view. It's common to use 
> large timeouts on the client side, so the client may need a lot of time 
> before declaring a region server dead and trying another one. If the client 
> receives the information separately about a region server's state, it can take 
> the right decision, and continue/stop to wait accordingly.
> We can also send more information, for example instructions like 'slow down' 
> to instruct the client to increase the retries delay and so on.
>  Technically, the master could send this information. To lower the load on 
> the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to 
> all servers by tcp), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available great. 
> If not, it should not break anything.
> - it should be optional.
> So at the end we would have a thread in the master sending a protobuf message 
> about the dead servers on a multicast socket. If the socket is not 
> configured, it does not do anything. On the client side, when we receive 
> information that a node is dead, we refresh the cache about it.
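
The sender side described above could look roughly like the sketch below. It 
rests on several assumptions: DeadServerPublisher, encode and publish are 
hypothetical names, the newline-separated text payload only stands in for the 
protobuf message, and the group address and port would come from 
configuration. It is not the code from the attached patches.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical sketch of an optional, best-effort dead-server publisher. */
final class DeadServerPublisher {
  private final InetAddress group; // null means "not configured"
  private final int port;

  DeadServerPublisher(InetAddress group, int port) {
    this.group = group;
    this.port = port;
  }

  /** Joins the server names with newlines; stands in for the protobuf payload. */
  static byte[] encode(List<String> deadServers) {
    return String.join("\n", deadServers).getBytes(StandardCharsets.UTF_8);
  }

  /**
   * Sends one datagram to the multicast group.
   * @return false if nothing was sent (not configured, or the send failed)
   */
  boolean publish(List<String> deadServers) {
    if (group == null) {
      return false; // optional feature: do nothing when not configured
    }
    byte[] payload = encode(deadServers);
    try (DatagramSocket socket = new DatagramSocket()) {
      socket.send(new DatagramPacket(payload, payload.length, group, port));
      return true;
    } catch (IOException e) {
      return false; // receivers must not depend on delivery, so this is not fatal
    }
  }
}
```

A scheduled task in the master could call publish every 10 seconds or so. 
Since UDP delivery is not guaranteed, clients would treat a received message 
as a hint to refresh their cache, never as a requirement.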



[jira] [Commented] (HBASE-8081) Backport HBASE-7213 (separate hlog for meta tables) to 0.94

2013-03-13 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600969#comment-13600969
 ] 

nkeywal commented on HBASE-8081:


In addition, there are a lot of critical scenarios that you can hit without 
separate logs:
- some blocks in the WAL may not be recoverable (corrupted, too many boxes 
missing). This risk is highly mitigated with a separate log. Without it, the 
whole cluster becomes unavailable when you're unlucky.
- if you run into hdfs issues during the recovery (for example going to a 
dead datanode, which is highly probable during a recovery), the recovery will 
be much slower.
- trying to run a recovery while .meta. is not available is also problematic. 
Ensuring that .meta. comes back early simplifies a lot of critical scenarios.

So having this in 0.94 is 'interesting' I would say :-).

> Backport HBASE-7213 (separate hlog for meta tables) to 0.94
> ---
>
> Key: HBASE-8081
> URL: https://issues.apache.org/jira/browse/HBASE-8081
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.5
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.94.7
>
> Attachments: 7213-0.94-2.patch, 7213-0.94.patch
>
>
> I am interested in backporting HBASE-7213 to 0.94. Helps to address more of 
> the MTTR story. Offline discussion with Lars indicated he is interested as 
> well.



[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-12 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600250#comment-13600250
 ] 

nkeywal commented on HBASE-7590:


Comments taken into account, and I added the IOException instead of only 
ZooKeeperConnection exception...
I added it on RB as well.

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v1.patch, 
> 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
> 7590.v5.patch
>
>


[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-12 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Patch Available  (was: Open)

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v1.patch, 
> 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
> 7590.v5.patch
>
>


[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-12 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v5.patch

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v1.patch, 
> 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
> 7590.v5.patch
>
>


[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-12 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Open  (was: Patch Available)

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v1.patch, 
> 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
> 7590.v5.patch
>
>


[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-12 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v5.patch

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v1.patch, 
> 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
> 7590.v5.patch
>
>


[jira] [Commented] (HBASE-7327) Assignment Timeouts: Remove the code from the master

2013-03-12 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600109#comment-13600109
 ] 

nkeywal commented on HBASE-7327:


Yes (i.e. TOM is *not* activated by default). The idea is really to remove it, 
but in baby steps.

> Assignment Timeouts: Remove the code from the master
> 
>
> Key: HBASE-7327
> URL: https://issues.apache.org/jira/browse/HBASE-7327
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7327.v1.uncomplete.patch, 7327.v2.patch
>
>
> As per HBASE-7247...



[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2013-03-12 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599849#comment-13599849
 ] 

nkeywal commented on HBASE-4955:


The regression in SUREFIRE-970 makes the move to 2.14 problematic. It would be 
nice to have SUREFIRE-969 as well, but it's not mandatory. A bigger issue is 
that in my tests it seems that having multiple executions with different 
parameters does not work anymore. I will need to have a look at that to get it 
fixed in a release...

> Use the official versions of surefire & junit
> -
>
> Key: HBASE-4955
> URL: https://issues.apache.org/jira/browse/HBASE-4955
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.94.0
> Environment: all
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Minor
> Fix For: 0.95.0
>
> Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch
>
>
> We currently use private versions for Surefire & JUnit since HBASE-4763.
> This JIRA tracks what we need to move to official versions.
> Surefire 2.11 is just out, but, after some tests, it does not contain 
> everything we need.
> JUnit. Could be for JUnit 4.11. Issue to monitor:
> https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
> feedback for an integration on trunk
> Surefire: Could be for Surefire 2.12. Issues to monitor are:
> 329 (category support): fixed, we use the official implementation from the 
> trunk
> 786 (@Category with forkMode=always): fixed, we use the official 
> implementation from the trunk
> 791 (incorrect elapsed time on test failure): fixed, we use the official 
> implementation from the trunk
> 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
> our version.
> 760 (does not take into account the test method): fixed in trunk, not fixed 
> in our version
> 798 (print immediately the test class name): not fixed in trunk, not fixed in 
> our version
> 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
> not fixed in our version
> 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
> fixed on our version
> 800 & 793 are the most important to monitor: they are the only ones that are 
> fixed in our version but not on trunk.



[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-03-11 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598749#comment-13598749
 ] 

nkeywal commented on HBASE-7247:


TimeOutManagement is now optional and deactivated by default. I will redo the 
measurements.

> Assignment performances decreased by 50% because of 
> regionserver.OpenRegionHandler#tickleOpening
> 
>
> Key: HBASE-7247
> URL: https://issues.apache.org/jira/browse/HBASE-7247
> Project: HBase
>  Issue Type: Improvement
>  Components: master, Region Assignment, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0
>
> Attachments: 7247.v1.patch
>
>
> The regionserver.OpenRegionHandler#tickleOpening updates the region znode as 
> "Do this so master doesn't timeout this region-in-transition.".
> However, on the usual test, this makes the assignment time of 1500 regions 
> go from 70s to 100s, that is, we're 50% slower because of this.
> More generally, ZooKeeper commits to disk all the data updates, and this takes 
> time. Using it to provide a keep alive seems overkill. At the very least, it 
> could be made asynchronous.
> I'm not sure how necessary these updates are (I need to go deeper in 
> the internals, feedback welcome), but it seems very important to optimize 
> this... The trivial fix would be to make this optional.



[jira] [Updated] (HBASE-7327) Assignment Timeouts: Remove the code from the master

2013-03-11 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7327:
---

Resolution: Later
Status: Resolved  (was: Patch Available)

Since the final decision was to make this optional, the code is still there, to 
be removed later.

> Assignment Timeouts: Remove the code from the master
> 
>
> Key: HBASE-7327
> URL: https://issues.apache.org/jira/browse/HBASE-7327
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7327.v1.uncomplete.patch, 7327.v2.patch
>
>
> As per HBASE-7247...



[jira] [Updated] (HBASE-7927) Two versions of netty with hadoop.profile=2.0: 3.5.9 and 3.2.4

2013-03-11 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7927:
---

Assignee: nkeywal

> Two versions of netty with hadoop.profile=2.0: 3.5.9 and 3.2.4
> --
>
> Key: HBASE-7927
> URL: https://issues.apache.org/jira/browse/HBASE-7927
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
>
> I don't know why, but when you do a mvn dependency:tree, everything looks 
> fine. When you look at the generated target/cached_classpath.txt you see 2 
> versions of netty: netty-3.2.4.Final.jar and netty-3.5.9.Final.jar.
> This is bad and can lead to unpredictable behavior.
> I haven't looked at the other dependencies.
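
A quick way to check the other dependencies is to scan the generated classpath for artifacts that appear with more than one version. The sketch below assumes that `target/cached_classpath.txt` holds a single classpath string separated by the platform path separator, and that jar names follow the usual `artifact-version.jar` pattern; both are assumptions, and the class name is hypothetical.

```java
import java.io.File;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports artifacts present with several versions.
final class ClasspathDuplicatesSketch {
  // artifact = shortest prefix followed by "-<digit>", version = the rest.
  private static final Pattern JAR =
      Pattern.compile("^(.+?)-(\\d[\\w.\\-]*)\\.jar$");

  static Map<String, SortedSet<String>> findDuplicates(String classpath) {
    Map<String, SortedSet<String>> versions = new TreeMap<>();
    for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
      Matcher m = JAR.matcher(new File(entry.trim()).getName());
      if (m.matches()) {
        versions.computeIfAbsent(m.group(1), k -> new TreeSet<>())
            .add(m.group(2));
      }
    }
    // Keep only the artifacts that show up with two or more versions.
    versions.values().removeIf(v -> v.size() < 2);
    return versions;
  }
}
```

Fed the classpath described above, this would report `netty` with the versions 3.2.4.Final and 3.5.9.Final. Jars whose names do not follow the assumed pattern are silently ignored.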



[jira] [Commented] (HBASE-7713) Maven build fails for hbase-common on windows environment

2013-03-11 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598736#comment-13598736
 ] 

nkeywal commented on HBASE-7713:


No feedback. Closing, @[~rdoppalapudi], please reopen if you think differently.

> Maven build fails for hbase-common on windows environment
> -
>
> Key: HBASE-7713
> URL: https://issues.apache.org/jira/browse/HBASE-7713
> Project: HBase
>  Issue Type: Bug
> Environment: Windows Environment
>Reporter: Raghu Doppalapudi
>Priority: Minor
>
> The build fails with the following error message: 
> "org.codehaus.plexus.resource.loader.ResourceNotFoundException: Could not 
> find resource 'dev-support/findbugs-exclude.xml'"



[jira] [Resolved] (HBASE-7713) Maven build fails for hbase-common on windows environment

2013-03-11 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal resolved HBASE-7713.


Resolution: Cannot Reproduce

> Maven build fails for hbase-common on windows environment
> -
>
> Key: HBASE-7713
> URL: https://issues.apache.org/jira/browse/HBASE-7713
> Project: HBase
>  Issue Type: Bug
> Environment: Windows Environment
>Reporter: Raghu Doppalapudi
>Priority: Minor
>
> The build fails with the following error message: 
> "org.codehaus.plexus.resource.loader.ResourceNotFoundException: Could not 
> find resource 'dev-support/findbugs-exclude.xml'"



[jira] [Commented] (HBASE-7938) Add integration test for various MapReduce workflows

2013-03-08 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597375#comment-13597375
 ] 

nkeywal commented on HBASE-7938:


I'm +1 of course.

I don't really know the IntegrationTestsDriver; I use maven to run the 
integration tests (I'm not saying it's a good thing, actually maybe it's a bad 
thing). I will have a look.

If it has not changed recently, we don't start a mini map reduce cluster when 
we do a non-distributed start-hbase. Maybe we should add it, to ease manual 
tests?
Following the work done in HBASE-7840, we could also add the start/stop of the 
map reduce part.

Then the integration tests depending on map reduce could start a real 
cluster. For the path, I don't know: if we include the test code in HBase, it 
will be copied all over the place anyway, so we won't be able to check this. 
The code would have to be quite smart to explicitly ask maven to do nothing 
about it.




> Add integration test for various MapReduce workflows
> 
>
> Key: HBASE-7938
> URL: https://issues.apache.org/jira/browse/HBASE-7938
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Reporter: Nick Dimiduk
> Fix For: 0.95.0, 0.98.0, 0.94.7
>
>
> We have existing unit tests for smoke-testing the packaged MR jobs, however 
> they do not create a runtime environment that is true to running on a real MR 
> cluster. This is particularly true in regard to classpaths (HBASE-7934) but 
> also other static state (HBASE-4802). An integration test that can be pointed 
> to run on a pseudo-distributed Hadoop deployed on localhost would find these 
> kinds of problems.



[jira] [Commented] (HBASE-7989) Client with a cache info on a dead server will wait for 20s before trying another one.

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596263#comment-13596263
 ] 

nkeywal commented on HBASE-7989:


Yes. There is a 20s timeout for connect by default. And here there are two 
issues:
- we should be able to have a much lower timeout for connect as it doesn't 
depend on GC stuff and it's a clear error (we are sure that the action is not 
done on the server, contrary to a read or write timeout) 
- we should not even go to the server in some cases (we know it's dead).

> Client with a cache info on a dead server will wait for 20s before trying 
> another one.
> --
>
> Key: HBASE-7989
> URL: https://issues.apache.org/jira/browse/HBASE-7989
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.95.0, 0.98.0
>Reporter: nkeywal
>
> Scenario is:
> - fetch the cache in the client
> - a server dies
> - try to use a region that is on the dead server
> This will lead to a 20 second connect timeout. We don't see this in unit 
> tests because it happens only if the remote box does not answer. In the unit 
> tests we immediately get a connection refused from the OS.
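
The difference between the two cases can be reproduced with plain `java.net` sockets. The sketch below is not the HBase client code; the class and enum names are hypothetical. It only shows that the connect timeout is a separate, per-call setting, and that a closed port on a live host fails fast while a silent host runs into the timeout.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical sketch: classify the outcome of a connect attempt.
final class ConnectTimeoutSketch {
  enum Outcome { CONNECTED, REFUSED, TIMED_OUT, OTHER_ERROR }

  static Outcome tryConnect(String host, int port, int connectTimeoutMs) {
    try (Socket socket = new Socket()) {
      // The timeout applies to the connect only, not to later reads/writes.
      socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
      return Outcome.CONNECTED;
    } catch (SocketTimeoutException e) {
      // Nobody answered: the case seen when the remote box is dead.
      return Outcome.TIMED_OUT;
    } catch (ConnectException e) {
      // Typically "connection refused": the OS answered at once,
      // which is what the unit tests see on a local machine.
      return Outcome.REFUSED;
    } catch (IOException e) {
      return Outcome.OTHER_ERROR;
    }
  }
}
```

Since a failed connect guarantees that nothing was executed on the server, a value well below 20s could be passed here without the ambiguity of a read or write timeout.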



[jira] [Commented] (HBASE-8022) Site target fails

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596174#comment-13596174
 ] 

nkeywal commented on HBASE-8022:


+1. We should add this to our precommit tests imho.

> Site target fails
> -
>
> Key: HBASE-8022
> URL: https://issues.apache.org/jira/browse/HBASE-8022
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.95.0, 0.96.0
>Reporter: Andrew Purtell
> Attachments: HBASE-8022.patch
>
>
> {noformat}
> mvn -DskipTests -Dhadoop.profile=2.0 clean install site assembly:assembly
> [...]
> Recoverable error
> org.xml.sax.SAXParseException: Include operation failed, reverting to 
> fallback. Resource error reading file as XML 
> (href='../../target/site/hbase-default.xml'). Reason: 
> /usr/src/Hadoop/hbase/target/site/hbase-default.xml (No such file or 
> directory)
> Error on line 672 column 52 of 
> file:///usr/src/Hadoop/hbase/src/docbkx/configuration.xml:
>   Error reported by XML parser: An 'include' failed, and no 'fallback' 
> element was found.
> [INFO]
>  
> [INFO] 
> 
> [INFO] Skipping HBase
> [INFO] This project has been banned from the build due to previous failures.
> [INFO] 
> 
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO] 
> [INFO] HBase . FAILURE [5:34.980s]
> [INFO] HBase - Common  SKIPPED
> [INFO] HBase - Protocol .. SKIPPED
> [INFO] HBase - Client  SKIPPED
> [INFO] HBase - Prefix Tree ... SKIPPED
> [INFO] HBase - Hadoop Compatibility .. SKIPPED
> [INFO] HBase - Hadoop Two Compatibility .. SKIPPED
> [INFO] HBase - Server  SKIPPED
> [INFO] HBase - Integration Tests . SKIPPED
> [INFO] HBase - Examples .. SKIPPED
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 5:36.029s
> [INFO] Finished at: Thu Mar 07 21:59:14 CST 2013
> [INFO] Final Memory: 29M/297M
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> com.agilejava.docbkx:docbkx-maven-plugin:2.0.14:generate-html (multipage) on 
> project hbase: Failed to transform configuration.xml. 
> org.xml.sax.SAXParseException: An 'include' failed, and no 'fallback' element 
> was found. -> [Help 1]
> {noformat}



[jira] [Commented] (HBASE-8023) Assembly target fails

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596012#comment-13596012
 ] 

nkeywal commented on HBASE-8023:


Ah, I'm just seeing that you created HBASE-8022 for the site part. I confirm 
assembly w/o the site works here.
If the assembly w/o the site still fails for you, it's worth doing a mvn clean 
install -DskipTests first.

> Assembly target fails
> -
>
> Key: HBASE-8023
> URL: https://issues.apache.org/jira/browse/HBASE-8023
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.95.0, 0.96.0
>Reporter: Andrew Purtell
>
> The assembly target fails when using the 2.0 Hadoop profile (at least).
> {noformat}
> mvn -DskipTests -Dhadoop.profile=2.0 clean install site assembly:assembly
> [...]
> [INFO] --- maven-assembly-plugin:2.3:assembly (default-cli) @ hbase ---
> [INFO] Reading assembly descriptor: src/assembly/hadoop-two-compat.xml
> [WARNING] [DEPRECATION] moduleSet/binaries section detected in root-project 
> assembly.
> MODULE BINARIES MAY NOT BE AVAILABLE FOR THIS ASSEMBLY!
>  To refactor, move this assembly into a child project and use the flag 
> true in each moduleSet.
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-protocol:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-client:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-prefix-tree:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-hadoop-compat:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-hadoop2-compat:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-server:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-it:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-examples:jar:0.97-SNAPSHOT
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO] 
> [INFO] HBase . FAILURE [15.877s]
> [INFO] HBase - Common  SUCCESS [4.633s]
> [INFO] HBase - Protocol .. SUCCESS [2.629s]
> [INFO] HBase - Client  SUCCESS [2.901s]
> [INFO] HBase - Prefix Tree ... SUCCESS [3.085s]
> [INFO] HBase - Hadoop Compatibility .. SUCCESS [2.647s]
> [INFO] HBase - Hadoop Two Compatibility .. SUCCESS [2.005s]
> [INFO] HBase - Server  SUCCESS [1.888s]
> [INFO] HBase - Integration Tests . SUCCESS [6.917s]
> [INFO] HBase - Examples .. SUCCESS [2.815s]
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 6:41.503s
> [INFO] Finished at: Thu Mar 07 22:14:08 CST 2013
> [INFO] Final Memory: 67M/448M
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-assembly-plugin:2.3:assembly (default-cli) on 
> project hbase: Failed to create assembly: Artifact: 
> org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT (included by module) does not 
> have an artifact with a file. Please ensure the package phase is run before 
> the assembly is generated. -> [Help 1]
> {noformat}



[jira] [Commented] (HBASE-8023) Assembly target fails

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595996#comment-13595996
 ] 

nkeywal commented on HBASE-8023:


mvn clean package -DskipTests assembly:assembly -Dhadoop.profile=2.0
works here.

but
mvn clean package site -DskipTests assembly:assembly -Dhadoop.profile=2.0
does not
[INFO] Processing input file: book.xml
[INFO] Applying customization parameters
[INFO] Chunking output.
Recoverable error
org.xml.sax.SAXParseException: Include operation failed, reverting to fallback. 
Resource error reading file as XML 
(href='../../target/site/hbase-default.xml'). Reason: 
/home/liochon/dev/hbase/target/site/hbase-default.xml (No such file or 
directory)
Error on line 672 column 52 of 
file:///home/liochon/dev/hbase/src/docbkx/configuration.xml:
  Error reported by XML parser: An 'include' failed, and no 'fallback' element 
was found.
Error on line 70 column 85 of 
file:///home/liochon/dev/hbase/src/docbkx/book.xml:
  Error reported by XML parser: Error attempting to parse XML file 
(href='configuration.xml').
[INFO] 


I'm testing but I think it's not related to the 2.0 profile, just site.

> Assembly target fails
> -
>
> Key: HBASE-8023
> URL: https://issues.apache.org/jira/browse/HBASE-8023
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.95.0, 0.96.0
>Reporter: Andrew Purtell
>
> The assembly target fails when using the 2.0 Hadoop profile (at least).
> {noformat}
> mvn -DskipTests -Dhadoop.profile=2.0 clean install site assembly:assembly
> [...]
> [INFO] --- maven-assembly-plugin:2.3:assembly (default-cli) @ hbase ---
> [INFO] Reading assembly descriptor: src/assembly/hadoop-two-compat.xml
> [WARNING] [DEPRECATION] moduleSet/binaries section detected in root-project 
> assembly.
> MODULE BINARIES MAY NOT BE AVAILABLE FOR THIS ASSEMBLY!
>  To refactor, move this assembly into a child project and use the flag 
> true in each moduleSet.
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-protocol:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-client:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-prefix-tree:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-hadoop-compat:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-hadoop2-compat:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-server:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-it:jar:0.97-SNAPSHOT
> [INFO] Processing sources for module project: 
> org.apache.hbase:hbase-examples:jar:0.97-SNAPSHOT
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO] 
> [INFO] HBase . FAILURE [15.877s]
> [INFO] HBase - Common  SUCCESS [4.633s]
> [INFO] HBase - Protocol .. SUCCESS [2.629s]
> [INFO] HBase - Client  SUCCESS [2.901s]
> [INFO] HBase - Prefix Tree ... SUCCESS [3.085s]
> [INFO] HBase - Hadoop Compatibility .. SUCCESS [2.647s]
> [INFO] HBase - Hadoop Two Compatibility .. SUCCESS [2.005s]
> [INFO] HBase - Server  SUCCESS [1.888s]
> [INFO] HBase - Integration Tests . SUCCESS [6.917s]
> [INFO] HBase - Examples .. SUCCESS [2.815s]
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 6:41.503s
> [INFO] Finished at: Thu Mar 07 22:14:08 CST 2013
> [INFO] Final Memory: 67M/448M
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-assembly-plugin:2.3:assembly (default-cli) on 
> project hbase: Failed to create assembly: Artifact: 
> org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT (included by module) does not 
> have an artifact with a file. Please ensure the package phase is run before 
> the assembly is generated. -> [Help 1]
> {noformat}



[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-07 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8002:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> Make TimeOut Management for Assignment optional in master and regionservers
> ---
>
> Key: HBASE-8002
> URL: https://issues.apache.org/jira/browse/HBASE-8002
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, Region Assignment
>Affects Versions: 0.95.0, 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0, 0.98.0
>
> Attachments: 8002.v3.patch, 8002.v4.patch
>
>
> See HBASE-7327



[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595737#comment-13595737
 ] 

nkeywal commented on HBASE-8002:


And thanks for the review!

> Make TimeOut Management for Assignment optional in master and regionservers
> ---
>
> Key: HBASE-8002
> URL: https://issues.apache.org/jira/browse/HBASE-8002
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, Region Assignment
>Affects Versions: 0.95.0, 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0, 0.98.0
>
> Attachments: 8002.v3.patch, 8002.v4.patch
>
>
> See HBASE-7327



[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595736#comment-13595736
 ] 

nkeywal commented on HBASE-8002:


v4 is what I committed on trunk & 0.95

> Make TimeOut Management for Assignment optional in master and regionservers
> ---
>
> Key: HBASE-8002
> URL: https://issues.apache.org/jira/browse/HBASE-8002
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, Region Assignment
>Affects Versions: 0.95.0, 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0, 0.98.0
>
> Attachments: 8002.v3.patch, 8002.v4.patch
>
>
> See HBASE-7327



[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-07 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8002:
---

Attachment: 8002.v4.patch

> Make TimeOut Management for Assignment optional in master and regionservers
> ---
>
> Key: HBASE-8002
> URL: https://issues.apache.org/jira/browse/HBASE-8002
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, Region Assignment
>Affects Versions: 0.95.0, 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0, 0.98.0
>
> Attachments: 8002.v3.patch, 8002.v4.patch
>
>
> See HBASE-7327



[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-06 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595034#comment-13595034
 ] 

nkeywal commented on HBASE-7590:


It's on RB, waiting for reviews before being committed :-).

> Add a costless notifications mechanism from master to regionservers & clients
> -
>
> Key: HBASE-7590
> URL: https://issues.apache.org/jira/browse/HBASE-7590
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: nkeywal
> Attachments: 7590.inprogress.patch, 7590.v1.patch, 
> 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch
>
>
> It would be very useful to add a mechanism to distribute some information to 
> the clients and regionservers. In particular, it would be useful to know 
> globally (regionservers + clients apps) that some regionservers are dead. 
> This would allow:
> - to lower the load on the system, without clients using stale information 
> and going on dead machines
> - to make the recovery faster from a client point of view. It's common to use 
> large timeouts on the client side, so the client may need a lot of time 
> before declaring a region server dead and trying another one. If the client 
> receives the information about a region server's state separately, it can 
> take the right decision, and continue/stop to wait accordingly.
> We can also send more information, for example instructions like 'slow down' 
> to instruct the client to increase the retries delay and so on.
>  Technically, the master could send this information. To lower the load on 
> the system, we should:
> - have a multicast communication (i.e. the master does not have to connect to 
> all servers by tcp), with one packet every 10 seconds or so.
> - receivers should not depend on this: if the information is available great. 
> If not, it should not break anything.
> - it should be optional.
> So at the end we would have a thread in the master sending a protobuf message 
> about the dead servers on a multicast socket. If the socket is not 
> configured, it does not do anything. On the client side, when we receive the 
> information that a node is dead, we refresh the cache about it.
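
The flow described above (the master publishes the dead servers on a multicast socket, and the client drops what it cached about them) can be sketched as follows. This is only an illustration under loud assumptions: the description calls for a protobuf message, while this sketch uses a newline-separated UTF-8 payload to stay self-contained, and every class and method name here is hypothetical.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the master-to-client dead-server notification.
final class DeadServerNoticeSketch {
  /** Master side: one server name per line (stand-in for the protobuf message). */
  static byte[] encode(Collection<String> deadServers) {
    return String.join("\n", deadServers).getBytes(StandardCharsets.UTF_8);
  }

  /** Receiver side: parse the first 'length' bytes of a received datagram. */
  static Set<String> decode(byte[] payload, int length) {
    String text = new String(payload, 0, length, StandardCharsets.UTF_8);
    Set<String> servers = new LinkedHashSet<>();
    for (String name : text.split("\n")) {
      if (!name.isEmpty()) {
        servers.add(name);
      }
    }
    return servers;
  }

  /** Master side: fire-and-forget send to the multicast group, no TCP connection. */
  static void publish(InetAddress group, int port, Collection<String> dead)
      throws IOException {
    byte[] payload = encode(dead);
    try (DatagramSocket socket = new DatagramSocket()) {
      socket.send(new DatagramPacket(payload, payload.length, group, port));
    }
  }

  /** Client side: region name -> server name, as cached from previous lookups. */
  static final class LocationCache {
    private final Map<String, String> regionToServer = new ConcurrentHashMap<>();

    void put(String region, String server) { regionToServer.put(region, server); }

    String get(String region) { return regionToServer.get(region); }

    /** Forget every location that points at a server reported dead. */
    void onNotice(Set<String> deadServers) {
      regionToServer.values().removeIf(deadServers::contains);
    }
  }
}
```

Because the receiver only removes cache entries, a lost or missing packet leaves the client exactly where it is today, which keeps the "receivers should not depend on this" requirement.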



[jira] [Commented] (HBASE-6772) Make the Distributed Split HDFS Location aware

2013-03-06 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595023#comment-13595023
 ] 

nkeywal commented on HBASE-6772:


The new design is better than my original proposition. I'm +1. Devaraj's comment 
is important as well imho, so we should put this info as well in ZK.
Just one point: the master should provide the full list of regionservers owning 
a copy. This way:
 - if one of the regionservers is actually dead, the task can be picked up by 
another one
 - it's possible to optimize the choice in the regionserver: if the RS sees 
it's the only one for a block, it can pick it instead of another one that has 
more potential regionservers.
 - + the rack already mentioned by Devaraj.
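
The second point above can be sketched as a selection rule: among the tasks whose log is local to this regionserver, prefer the one with the fewest candidate regionservers. The code is a hypothetical illustration, not the proposed patch; `SplitTask` and `pick` are made-up names, and rack awareness is left out.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: pick the local task that has the fewest other takers.
final class ScarcestLocalTaskSketch {
  static final class SplitTask {
    final String logFile;
    final Set<String> replicaHosts; // regionservers owning a copy of the block

    SplitTask(String logFile, String... replicaHosts) {
      this.logFile = logFile;
      this.replicaHosts = new HashSet<>(Arrays.asList(replicaHosts));
    }
  }

  static Optional<SplitTask> pick(String myHost, List<SplitTask> tasks) {
    return tasks.stream()
        .filter(t -> t.replicaHosts.contains(myHost))
        .min(Comparator.comparingInt((SplitTask t) -> t.replicaHosts.size()));
  }
}
```

This only works if the master publishes the full list of owners for each log, which is the request made in the comment.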


> Make the Distributed Split HDFS Location aware
> --
>
> Key: HBASE-6772
> URL: https://issues.apache.org/jira/browse/HBASE-6772
> Project: HBase
>  Issue Type: Improvement
>  Components: master, regionserver
>Affects Versions: 0.96.0
>Reporter: nkeywal
>Assignee: Jeffrey Zhong
>
> During a hlog split, each log file (a single hdfs block) is allocated to a 
> different region server. This region server reads the file and creates the 
> recovery edit files.
> The allocation to the region server is random. We could take into account the 
> locations of the log file to split:
> - the reads would be local, hence faster. This allows short circuit as well.
> - less network i/o used during a failure (and this is important)
> - we would be sure to read from a working datanode, hence we're sure we won't 
> have read errors. Read errors slow the split process a lot, as we often enter 
> the "timeouted world". 
> We need to limit the calls to the namenode however.
> Typical algo could be:
> - the master gets the locations of the hlog files
> - it writes them into ZK, if possible in one transaction (this way all the 
> tasks are visible all together, allowing some arbitration by the region 
> server).
> - when the regionserver receives the event, it checks for all logs and all 
> locations.
> - if there is a match, it takes it
> - if not it waits something like 0.2s (to give other regionservers the time 
> to take it if the location matches), and takes any remaining task.
> Drawbacks are:
> - a 0.2s delay added if there is no regionserver available on one of the 
> locations. It's likely possible to remove it with some extra synchronization.
> - Small increase in complexity and dependency on HDFS
> Considering the advantages, it's worth it imho.
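
The "typical algo" above can be sketched from the regionserver's point of view. This is a simplified illustration under stated assumptions: an in-memory map with `putIfAbsent` stands in for the ZooKeeper task ownership, and all names (`LocalityAwareClaimSketch`, `SplitTask`, `grab`) are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: take a local task first, any remaining task after a delay.
final class LocalityAwareClaimSketch {
  static final class SplitTask {
    final String logFile;
    final Set<String> replicaHosts; // hosts holding a copy of the log's block

    SplitTask(String logFile, String... replicaHosts) {
      this.logFile = logFile;
      this.replicaHosts = new HashSet<>(Arrays.asList(replicaHosts));
    }
  }

  // Stand-in for ZK: log file -> owner; putIfAbsent makes the claim atomic.
  private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

  private boolean claim(SplitTask task, String host) {
    return owners.putIfAbsent(task.logFile, host) == null;
  }

  Optional<SplitTask> grab(String myHost, List<SplitTask> tasks, long graceMs) {
    // 1) A task whose log is stored locally wins immediately.
    for (SplitTask t : tasks) {
      if (t.replicaHosts.contains(myHost) && claim(t, myHost)) {
        return Optional.of(t);
      }
    }
    // 2) Otherwise give the local regionservers a head start (e.g. 200 ms).
    try {
      Thread.sleep(graceMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return Optional.empty();
    }
    // 3) Then take whatever is still unclaimed.
    for (SplitTask t : tasks) {
      if (claim(t, myHost)) {
        return Optional.of(t);
      }
    }
    return Optional.empty();
  }
}
```

The grace delay in step 2 is the 0.2s drawback listed above: it is only paid by a regionserver that has no local log to split.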



[jira] [Commented] (HBASE-7948) client doesn't need to refresh meta while the region is opening

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593930#comment-13593930
 ] 

nkeywal commented on HBASE-7948:


bq. Can you please elaborate about more dangerous parts? 
I was thinking about the code that we're slowly removing with HBASE-8002. It 
has 3 side effects:
1) It was decreasing performance; this has been fixed in numerous patches, 
but there are still scary comments and issues (HBASE-7247)
2) It was hiding issues. In the tests we had a very low timeout, so master 
failover scenarios seemed to be working. In production, we were depending on a 
10 minute timeout but we didn't know.
3) It was causing double assignment issues, i.e. data corruption.

This was exactly the same logic (don't trust the RS), with more dramatic 
consequences.

bq.  In my experience it's not safe to trust anything forever as a general 
principle, not because I think RS code is unreliable.
I'm not against this, but in this case we need to tackle this the standard way: 
watchdog the process, and exclude the fuzzy ones from the group. Before doing 
this, I would like to see the chaos monkey test working with kill -9 for a 
while (I doubt it does today :-( )
But I agree with your point, and we will have this sooner or later (BTW, it's 
exactly why there are checksums in hdfs: because you can't trust the storage).

bq.  But for client there's no data loss potential from flushing the cache, but 
there's potential to be stuck forever in case of abnormal RS behavior. With 
remote things I prefer to be defensive on all sides if practical 

Yeah. I'm likely biased. So, imho
- the patch is an improvement. HBase is better with this patch than without.
- it would be simpler without the RS-trust part
- to me, at the margin on degraded conditions, it would be more efficient 
without the RS-trust part as well.

As we need to make progress :-), I propose:
1) Well, if you're not against the idea of removing the RS-trust part, we're 
done
2) If you really want to keep it, let's wait a few days if someone wants to 
come by. If no one does, let's commit on Friday.

What do you think?



> client doesn't need to refresh meta while the region is opening
> ---
>
> Key: HBASE-7948
> URL: https://issues.apache.org/jira/browse/HBASE-7948
> Project: HBase
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7948-v0.patch, HBASE-7948-v1.patch, 
> HBASE-7948-v1.patch, HBASE-7948-v2.patch
>
>




[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593888#comment-13593888
 ] 

nkeywal commented on HBASE-8002:


bq. is it possible to remove the setting from HConstants and add to where it 
applies? 
Ok, will do.

bq. where everything is set to null doesn't appear necessary.
I need it because the attributes are final, so must be set on all execution 
paths. I would prefer to keep them final, but I can change this if you like.

bq. Precondition instead of assert since we don't usually enable assert when we 
run and the timeout checks are infrequent enough that it shouldn't be costly 
to run the precondition.
It was more for documentation & unit tests. But ok, will do.


I will commit the patch with point 1) & 3) tomorrow if there is no objection. 
If you ask for 2) I will include it.
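
Points 2) and 3) can be illustrated with a small sketch. It is not the code from the patch: the class, field and method names are hypothetical. It shows why `final` fields force an explicit assignment to null on the disabled path, and how an always-on check differs from an `assert`, which only runs when the JVM is started with `-ea`. The reviewer presumably had something like Guava's `Preconditions.checkState` in mind; a plain exception is used here to avoid the dependency.

```java
// Hypothetical sketch: final fields set on every path, plus an always-on check.
final class AssignmentTimeoutSketch {
  private final Long timeoutMs;     // null when timeout management is off
  private final String monitorName; // null when timeout management is off

  AssignmentTimeoutSketch(boolean enabled, long timeoutMs) {
    if (enabled) {
      this.timeoutMs = timeoutMs;
      this.monitorName = "timeout-monitor";
    } else {
      // Without this branch the compiler rejects the class: a final field
      // must be definitely assigned on every path through the constructor.
      this.timeoutMs = null;
      this.monitorName = null;
    }
  }

  boolean isEnabled() {
    return timeoutMs != null;
  }

  long remaining(long elapsedMs) {
    // Explicit check instead of 'assert timeoutMs != null', so that it also
    // fires in production where assertions are usually disabled.
    if (timeoutMs == null) {
      throw new IllegalStateException("timeout management is disabled");
    }
    return Math.max(0L, timeoutMs - elapsedMs);
  }
}
```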

> Make TimeOut Management for Assignment optional in master and regionservers
> ---
>
> Key: HBASE-8002
> URL: https://issues.apache.org/jira/browse/HBASE-8002
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, Region Assignment
>Affects Versions: 0.95.0, 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0, 0.98.0
>
> Attachments: 8002.v3.patch
>
>
> See HBASE-7327



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
> 
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: 8003.v1.patch, 8003.v1.patch
>
>
>   /**
>    * Create a new CachedThreadPool with a bounded number as the maximum
>    * thread size in the pool.
>    *
>    * @param maxCachedThread the maximum thread could be created in the pool
>    * @param timeout the maximum time to wait
>    * @param unit the time unit of the timeout argument
>    * @param threadFactory the factory to use when creating new threads
>    * @return threadPoolExecutor the cachedThreadPool with a bounded number
>    * as the maximum thread size in the pool.
>    */
>   public static ThreadPoolExecutor getBoundedCachedThreadPool(
>       int maxCachedThread, long timeout, TimeUnit unit,
>       ThreadFactory threadFactory) {
>     ThreadPoolExecutor boundedCachedThreadPool =
>       new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
>         TimeUnit.SECONDS, new LinkedBlockingQueue(), threadFactory);
>     // allow the core pool threads timeout and terminate
>     boundedCachedThreadPool.allowCoreThreadTimeOut(true);
>     return boundedCachedThreadPool;
>   }



[jira] [Commented] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593634#comment-13593634
 ] 

nkeywal commented on HBASE-8003:


Tried locally: all tests passed on the first try except 
TestReplicationQueueFailover, which succeeded on the second run.

Committed, thanks for the review.

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
> 
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: 8003.v1.patch, 8003.v1.patch



[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593429#comment-13593429
 ] 

nkeywal commented on HBASE-8002:


Ok locally. But the "return of the flakiness" makes this a little bit 
complicated.

> Make TimeOut Management for Assignment optional in master and regionservers
> ---
>
> Key: HBASE-8002
> URL: https://issues.apache.org/jira/browse/HBASE-8002
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, Region Assignment
>Affects Versions: 0.95.0, 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0, 0.98.0
>
> Attachments: 8002.v3.patch
>
>
> See HBASE-7327



[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8002:
---

Status: Patch Available  (was: Open)

> Make TimeOut Management for Assignment optional in master and regionservers
> ---
>
> Key: HBASE-8002
> URL: https://issues.apache.org/jira/browse/HBASE-8002
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, Region Assignment
>Affects Versions: 0.95.0, 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0, 0.98.0
>
> Attachments: 8002.v3.patch
>
>
> See HBASE-7327



[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8002:
---

Attachment: 8002.v3.patch

> Make TimeOut Management for Assignment optional in master and regionservers
> ---
>
> Key: HBASE-8002
> URL: https://issues.apache.org/jira/browse/HBASE-8002
> Project: HBase
>  Issue Type: Bug
>  Components: Client, master, Region Assignment
>Affects Versions: 0.95.0, 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
> Fix For: 0.95.0, 0.98.0
>
> Attachments: 8002.v3.patch
>
>
> See HBASE-7327



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

Status: Patch Available  (was: Open)

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
> 
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: 8003.v1.patch, 8003.v1.patch



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

Status: Open  (was: Patch Available)

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
> 
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: 8003.v1.patch, 8003.v1.patch



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

Attachment: 8003.v1.patch

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
> 
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: 8003.v1.patch, 8003.v1.patch



[jira] [Commented] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593312#comment-13593312
 ] 

nkeywal commented on HBASE-8003:


Back to flakiness, it seems. All of this should be totally unrelated to the 
patch. Let's give it another go.

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
> 
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: 8003.v1.patch, 8003.v1.patch



[jira] [Commented] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593251#comment-13593251
 ] 

nkeywal commented on HBASE-8003:


Trivial patch, as the method is always called with seconds today.

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
> 
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
>  Issue Type: Bug
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Attachments: 8003.v1.patch



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

Fix Version/s: 0.98.0
Affects Version/s: 0.98.0
   Status: Patch Available  (was: Open)

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
> 
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: 8003.v1.patch



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

Attachment: 8003.v1.patch

> Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
> 
>
> Key: HBASE-8003
> URL: https://issues.apache.org/jira/browse/HBASE-8003
> Project: HBase
>  Issue Type: Bug
>Reporter: nkeywal
>Assignee: nkeywal
>Priority: Trivial
> Attachments: 8003.v1.patch



[jira] [Created] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)
nkeywal created HBASE-8003:
--

 Summary: Threads#getBoundedCachedThreadPool harcodes the time unit 
to seconds
 Key: HBASE-8003
 URL: https://issues.apache.org/jira/browse/HBASE-8003
 Project: HBase
  Issue Type: Bug
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial


  /**
   * Create a new CachedThreadPool with a bounded number as the maximum 
   * thread size in the pool.
   * 
   * @param maxCachedThread the maximum thread could be created in the pool
   * @param timeout the maximum time to wait
   * @param unit the time unit of the timeout argument
   * @param threadFactory the factory to use when creating new threads
   * @return threadPoolExecutor the cachedThreadPool with a bounded number 
   * as the maximum thread size in the pool. 
   */
  public static ThreadPoolExecutor getBoundedCachedThreadPool(
      int maxCachedThread, long timeout, TimeUnit unit,
      ThreadFactory threadFactory) {
    ThreadPoolExecutor boundedCachedThreadPool =
      new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
        TimeUnit.SECONDS, new LinkedBlockingQueue(), threadFactory);
    // allow the core pool threads timeout and terminate
    boundedCachedThreadPool.allowCoreThreadTimeOut(true);
    return boundedCachedThreadPool;
  }
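
The method above accepts a `unit` argument but passes TimeUnit.SECONDS to the 
ThreadPoolExecutor constructor, so the caller's unit is ignored. The attached 
patch is not reproduced in this thread. What follows is only a sketch of the 
presumed fix, which is to forward `unit` to the constructor. The holder class 
name BoundedPoolSketch is hypothetical and stands in for the real Threads class.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical holder class for illustration; not the HBase Threads class.
final class BoundedPoolSketch {
  /**
   * Same contract as the method quoted above, except that the keep-alive
   * time is interpreted in the caller-supplied unit.
   */
  static ThreadPoolExecutor getBoundedCachedThreadPool(
      int maxCachedThread, long timeout, TimeUnit unit,
      ThreadFactory threadFactory) {
    ThreadPoolExecutor boundedCachedThreadPool =
        new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
            unit, new LinkedBlockingQueue<Runnable>(), threadFactory);
    // Let idle core threads time out and terminate as well.
    // Note: allowCoreThreadTimeOut(true) requires a keep-alive time > 0.
    boundedCachedThreadPool.allowCoreThreadTimeOut(true);
    return boundedCachedThreadPool;
  }
}
```

With this change, a caller passing `500, TimeUnit.MILLISECONDS` gets a 
keep-alive of 500 milliseconds rather than 500 seconds. Callers that already 
pass TimeUnit.SECONDS see no change in behavior.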


