[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Open  (was: Patch Available)

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.5, 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch


 The code is already there.
 Doing so would allow sharing some code when doing client-side buffering.
 The patch compiles locally and should not impact tests, but it has not been tested locally.
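
As a rough illustration of the motivation (a sketch only, not the attached patch): if every mutation type exposes its approximate size through one interface, a client-side write buffer can do its memory accounting the same way for any operation. The `HeapSize` name comes from this issue; `SimpleMutation`, `SimplePut`, `SimpleDelete`, `WriteBuffer` and the fixed overhead constant are hypothetical stand-ins, not HBase classes.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the HeapSize interface named in this issue: one method
// returning an approximate heap footprint in bytes.
interface HeapSize {
  long heapSize();
}

// Hypothetical mutation base class. The per-object overhead is an assumed
// constant for illustration, not a measured value.
abstract class SimpleMutation implements HeapSize {
  private static final long ASSUMED_OVERHEAD = 64L;
  protected final byte[] row;
  protected final byte[] value;

  SimpleMutation(byte[] row, byte[] value) {
    this.row = row;
    this.value = value;
  }

  @Override
  public long heapSize() {
    return ASSUMED_OVERHEAD + row.length + value.length;
  }
}

class SimplePut extends SimpleMutation {
  SimplePut(byte[] row, byte[] value) {
    super(row, value);
  }
}

class SimpleDelete extends SimpleMutation {
  SimpleDelete(byte[] row) {
    super(row, new byte[0]);
  }
}

// Hypothetical client-side buffer. It depends only on HeapSize, so the same
// accounting code serves puts, deletes, or any other mutation.
class WriteBuffer<T extends HeapSize> {
  private final List<T> pending = new ArrayList<>();
  private final long maxBytes;
  private long currentBytes = 0;
  private int flushCount = 0;

  WriteBuffer(long maxBytes) {
    this.maxBytes = maxBytes;
  }

  void add(T item) {
    pending.add(item);
    currentBytes += item.heapSize();
    if (currentBytes > maxBytes) {
      flush();
    }
  }

  void flush() {
    // A real client would send the pending items to the servers here.
    pending.clear();
    currentBytes = 0;
    flushCount++;
  }

  long bufferedBytes() {
    return currentBytes;
  }

  int flushCount() {
    return flushCount;
  }
}
```

With this shape, a `WriteBuffer<SimpleMutation>` accepts both `SimplePut` and `SimpleDelete` instances and flushes once the summed `heapSize()` exceeds the configured limit.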

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Attachment: 8135.v4.patch

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch





[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Patch Available  (was: Open)

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.5, 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch





[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606183#comment-13606183
 ] 

nkeywal commented on HBASE-8135:


Thanks a lot, Ted. For v4, I've just moved the 'Put' test in with the other 
tests. I will commit as soon as I get a +1.

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch





[jira] [Updated] (HBASE-6674) Check behavior of current surefire trunk on Hadoop QA

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6674:
---

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Since the JUnit part is done, I'm using HBASE-4955 to test Surefire.

 Check behavior of current surefire trunk on Hadoop QA
 -

 Key: HBASE-6674
 URL: https://issues.apache.org/jira/browse/HBASE-6674
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 5processes.patch, 5processes.patch, 5processes.patch, 
 6674.patch, 6674.v2.patch, 6674.v2.patch, 6674.v2.patch, 6674.v2.patch


 Not to be committed.
 Surefire 2.13 is in progress. Let's check that it works for us before it's 
 released. Locally it's acceptable.



[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606188#comment-13606188
 ] 

nkeywal commented on HBASE-8135:


bq. There was a javadoc warning. 
Right. I'm going to hunt it down.

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 8135.v4.patch





[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Attachment: 8135.v5.patch

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
 8135.v4.patch, 8135.v5.patch





[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606196#comment-13606196
 ] 

nkeywal commented on HBASE-8135:


v5 is what I will commit if the build runs ok.

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
 8135.v4.patch, 8135.v5.patch





[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Open  (was: Patch Available)

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.5, 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
 8135.v4.patch, 8135.v5.patch, 8135.v5.patch





[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Attachment: 8135.v5.patch

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
 8135.v4.patch, 8135.v5.patch, 8135.v5.patch





[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Patch Available  (was: Open)

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.5, 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
 8135.v4.patch, 8135.v5.patch, 8135.v5.patch





[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606568#comment-13606568
 ] 

nkeywal commented on HBASE-8135:


This seems to confirm that we're seeing the usual flakiness (thanks for 
relaunching the tests, Ted). Committed to trunk and 0.95.

Thanks for the review, Stack & Ted!

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
 8135.v4.patch, 8135.v5.patch, 8135.v5.patch, 8135.v5.patch





[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch, 8135-v3.txt, 
 8135.v4.patch, 8135.v5.patch, 8135.v5.patch, 8135.v5.patch





[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

   Resolution: Fixed
Fix Version/s: 0.96.0
   0.95.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.96.0

 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch


 It would be very useful to add a mechanism to distribute some information to 
 the clients and regionservers. In particular, it would be useful to know 
 globally (regionservers + client apps) that some regionservers are dead. This 
 would allow:
 - lowering the load on the system, by keeping clients from using stale 
 information and going to dead machines
 - making the recovery faster from a client point of view. It's common to use 
 large timeouts on the client side, so the client may need a lot of time 
 before declaring a region server dead and trying another one. If the client 
 receives the information about a region server's state separately, it can 
 make the right decision, and continue or stop waiting accordingly.
 We can also send more information, for example instructions like 'slow down' 
 to instruct the client to increase the retry delay and so on.
 Technically, the master could send this information. To lower the load on 
 the system, we should:
 - have a multicast communication (i.e. the master does not have to connect to 
 all servers by TCP), with one packet every 10 seconds or so.
 - make sure receivers do not depend on this: if the information is available, 
 great. If not, it should not break anything.
 - make it optional.
 So in the end we would have a thread in the master sending a protobuf message 
 about the dead servers on a multicast socket. If the socket is not 
 configured, it does not do anything. On the client side, when we receive 
 information that a node is dead, we refresh the cache about it.
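
The flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed design: the transport is reduced to a plain callback (the proposal uses a multicast socket carrying a protobuf message), and `DeadServerPublisher` and `RegionLocationCache` are hypothetical names. The sketch only shows the two properties the description insists on: the publisher does nothing when it is not configured, and the client treats a notification as a hint that invalidates cached locations.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical master-side publisher. The transport is any callback here;
// in the proposal it would be a multicast socket sending a protobuf message.
class DeadServerPublisher {
  // null means "not configured": the feature is optional.
  private final Consumer<List<String>> transport;

  DeadServerPublisher(Consumer<List<String>> transport) {
    this.transport = transport;
  }

  // Returns true if a notification was sent. An unconfigured publisher,
  // or an empty list of dead servers, results in no work at all.
  boolean publish(List<String> deadServers) {
    if (transport == null || deadServers.isEmpty()) {
      return false;
    }
    transport.accept(deadServers);
    return true;
  }
}

// Hypothetical client-side cache mapping a region name to a server name.
class RegionLocationCache {
  private final Map<String, String> regionToServer = new HashMap<>();

  void put(String region, String server) {
    regionToServer.put(region, server);
  }

  // Returns the cached server, or null if the location must be looked up again.
  String locate(String region) {
    return regionToServer.get(region);
  }

  // Drop every entry that points at a server reported dead, so the next
  // lookup refreshes the location instead of waiting on a long timeout.
  void onDeadServers(List<String> deadServers) {
    regionToServer.values().removeIf(deadServers::contains);
  }
}
```

Because the cache keeps working when no notification ever arrives, receivers do not depend on the mechanism, which matches the "it should not break anything" requirement.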



[jira] [Created] (HBASE-8145) TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0

2013-03-19 Thread nkeywal (JIRA)
nkeywal created HBASE-8145:
--

 Summary: TestHCM flaky: java.lang.IllegalArgumentException: Row 
length is 0
 Key: HBASE-8145
 URL: https://issues.apache.org/jira/browse/HBASE-8145
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.96.0


I will check for 0.95.

{code}
for (HRegion region : regions) {
  // Take the start key of any region other than toMove that sorts before ROW_X.
  if (!region.getRegionInfo().getEncodedName().equals(toMove.getRegionInfo().getEncodedName())
      && Bytes.BYTES_COMPARATOR.compare(region.getRegionInfo().getStartKey(), ROW_X) < 0) {
    otherRow = region.getRegionInfo().getStartKey();
    break;
  }
}
{code}

We're likely to sometimes get the startKey of the first region here, and that's 
an empty byte array. This makes the put creation fail, since there is now 
(with HBASE-8101) a check for empty rows at put creation.
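
One possible way to avoid picking the empty start key is sketched below. This is an assumption for illustration only: this issue was resolved as a duplicate and the actual fix is not shown in this thread. `StartKeyPicker` is a hypothetical helper that works on plain byte arrays rather than `HRegion` objects, and its `compare` method is a hand-written unsigned lexicographic comparison standing in for `Bytes.BYTES_COMPARATOR`. It also leaves out the `toMove` exclusion from the loop above.

```java
import java.util.List;

// Hypothetical helper, not part of HBase or of TestHCM.
class StartKeyPicker {

  // Unsigned lexicographic comparison of two byte arrays. An empty array
  // sorts before any non-empty array, which is why the first region's
  // start key always satisfies a "< ROW_X" condition.
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  // Returns the first non-empty start key that sorts before rowX,
  // or null if there is none.
  static byte[] pickOtherRow(List<byte[]> startKeys, byte[] rowX) {
    for (byte[] startKey : startKeys) {
      if (startKey.length > 0 && compare(startKey, rowX) < 0) {
        return startKey;
      }
    }
    return null;
  }
}
```

The extra `startKey.length > 0` condition is the only change compared with the selection logic in the loop above: the first region's empty start key is skipped, so the row handed to the put is never empty.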



[jira] [Resolved] (HBASE-8145) TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0

2013-03-19 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal resolved HBASE-8145.


Resolution: Duplicate

 TestHCM flaky: java.lang.IllegalArgumentException: Row length is 0
 --

 Key: HBASE-8145
 URL: https://issues.apache.org/jira/browse/HBASE-8145
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.96.0





[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-19 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606904#comment-13606904
 ] 

nkeywal commented on HBASE-7590:


ok, will do (as for rb) by the end of this week.

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.96.0

 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch





[jira] [Commented] (HBASE-8128) HTable#put improvements

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604970#comment-13604970
 ] 

nkeywal commented on HBASE-8128:


bq. Sound simple and efficient... Will it be possible to have it for 0.94 too?
The patch should be directly applicable, so I can do it. Whatever you and Lars 
prefer.

bq. These classes could do with a general revamp.
Agreed. I'm currently looking into this.

Committed to trunk and 0.95, thanks for the reviews!

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0

 Attachments: 8128.v1.patch


 Three points:
  - When doing a single put, we're creating an extra object by calling 
 Arrays.asList.
  - We're doing a size check every 10 puts. Not doing it seems simpler and 
 better, and allows sharing some code between a single put and a list of puts.
  - We can end up calling flushCommits on an empty write buffer, especially for 
 someone using a lot of HTable instances instead of a pool, as it's called in 
 close().
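
The three points can be sketched together as below. This is a simplified illustration, not the 8128.v1.patch: `BufferedTable` is a hypothetical class, puts are represented by plain strings, and the string length stands in for the real size of a put.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a table with a client-side write buffer.
class BufferedTable {
  private final List<String> writeBuffer = new ArrayList<>();
  private final long maxBufferedBytes;
  private long bufferedBytes = 0;
  private int rpcCount = 0;

  BufferedTable(long maxBufferedBytes) {
    this.maxBufferedBytes = maxBufferedBytes;
  }

  // Single put: goes straight to the shared helper, without wrapping
  // the argument in a temporary list.
  void put(String put) {
    doPut(put);
    flushIfNeeded();
  }

  // List of puts: same helper, and a single size check at the end
  // instead of one check every N puts.
  void put(List<String> puts) {
    for (String put : puts) {
      doPut(put);
    }
    flushIfNeeded();
  }

  private void doPut(String put) {
    writeBuffer.add(put);
    bufferedBytes += put.length(); // assumed size measure for this sketch
  }

  private void flushIfNeeded() {
    if (bufferedBytes > maxBufferedBytes) {
      flushCommits();
    }
  }

  void flushCommits() {
    if (writeBuffer.isEmpty()) {
      return; // nothing buffered: skip the round trip entirely
    }
    rpcCount++; // a real client would send the buffer to the servers here
    writeBuffer.clear();
    bufferedBytes = 0;
  }

  void close() {
    flushCommits();
  }

  int rpcCount() {
    return rpcCount;
  }
}
```

With the early return in `flushCommits()`, opening and closing many short-lived instances that never write anything costs no flush at all.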



[jira] [Updated] (HBASE-8128) HTable#put improvements

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8128:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0

 Attachments: 8128.v1.patch





[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604981#comment-13604981
 ] 

nkeywal commented on HBASE-4955:


bq. Tests run: 35, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 281.479 sec
Useless log lines. That's SUREFIRE-969 

bq. Took 2 mo. 7 d.
That's SUREFIRE-970.

bq. 2,158 tests (-274)
It should be the flakiness of TestHTableMultiplexer.testHTableMultiplexer


I'm going to retry because of point 3.
For the first two, I tend to think they should not prevent us from committing. 
We don't have any issue today because I built a version that included all we 
need. If we want to come back to an official version, we need to compromise. We 
can expect these points to be solved in a later version, but a later version 
can also include regressions. We need to jump in at some point, and we've been 
waiting for more than a year now.

 Use the official versions of surefire & junit
 -

 Key: HBASE-4955
 URL: https://issues.apache.org/jira/browse/HBASE-4955
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
 4955.v2.patch


 We have been using private versions of Surefire & JUnit since HBASE-4763.
 This JIRA tracks what we need to move to official versions.
 Surefire 2.11 is just out, but, after some tests, it does not contain all 
 that we need.
 JUnit: could be for JUnit 4.11. Issue to monitor:
 https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
 feedback for an integration on trunk
 Surefire: could be for Surefire 2.12. Issues to monitor are:
 329 (category support): fixed, we use the official implementation from the 
 trunk
 786 (@Category with forkMode=always): fixed, we use the official 
 implementation from the trunk
 791 (incorrect elapsed time on test failure): fixed, we use the official 
 implementation from the trunk
 793 (incorrect time in the XML report): not fixed (reopened) on trunk, fixed 
 in our version.
 760 (does not take into account the test method): fixed in trunk, not fixed 
 in our version
 798 (print immediately the test class name): not fixed in trunk, not fixed in 
 our version
 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
 not fixed in our version
 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
 fixed in our version
 800 & 793 are the most important to monitor; they are the only ones that are 
 fixed in our version but not on trunk.



[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Status: Open  (was: Patch Available)

 Use the official versions of surefire & junit
 -

 Key: HBASE-4955
 URL: https://issues.apache.org/jira/browse/HBASE-4955
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
 4955.v2.patch





[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Attachment: 4955.v2.patch

 Use the official versions of surefire & junit
 -

 Key: HBASE-4955
 URL: https://issues.apache.org/jira/browse/HBASE-4955
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
 4955.v2.patch, 4955.v2.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Status: Patch Available  (was: Open)



[jira] [Created] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)
nkeywal created HBASE-8135:
--

 Summary: Mutation should implement HeapSize
 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.5, 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0
 Attachments: 8135.v1.patch

The code is there already.
Doing so would allow sharing some code when doing client-side buffering.
The patch compiles locally and should not impact tests, but it has not been 
tested locally.



[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Patch Available  (was: Open)

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.5, 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch




[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Attachment: 8135.v1.patch

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch




[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v12.patch

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 
 7590.v5.patch, 7590.v5.patch


 It would be very useful to add a mechanism to distribute some information to 
 the clients and regionservers. In particular, it would be useful to know 
 globally (regionservers + client apps) that some regionservers are dead. This 
 would allow:
 - to lower the load on the system, since clients would no longer use stale 
 information and go to dead machines
 - to make the recovery faster from a client point of view. It's common to use 
 large timeouts on the client side, so the client may need a lot of time 
 before declaring a region server dead and trying another one. If the client 
 receives the information about a region server's state separately, it can 
 take the right decision, and continue or stop waiting accordingly.
 We can also send more information, for example instructions like 'slow down' 
 to instruct the client to increase the retry delay and so on.
 Technically, the master could send this information. To lower the load on 
 the system, we should:
 - have a multicast communication (i.e. the master does not have to connect to 
 all servers by TCP), with one packet every 10 seconds or so.
 - receivers should not depend on this: if the information is available, 
 great. If not, it should not break anything.
 - it should be optional.
 So in the end we would have a thread in the master sending a protobuf message 
 about the dead servers on a multicast socket. If the socket is not 
 configured, it does not do anything. On the client side, when we receive 
 information that a node is dead, we refresh the cache about it.
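
 The mechanism described above can be sketched roughly as follows. This is 
 only an illustration, not the code in the attached patches: the class and 
 method names are hypothetical, a newline-joined UTF-8 string stands in for 
 the protobuf message, and the group address and port are whatever the 
 deployment would configure.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical master-side helper; not an HBase class.
class DeadServerNotifier {
    // Newline-joined UTF-8 is a stand-in for the protobuf message (assumption).
    static byte[] encode(List<String> deadServers) {
        return String.join("\n", deadServers).getBytes(StandardCharsets.UTF_8);
    }

    static List<String> decode(byte[] data, int length) {
        String text = new String(data, 0, length, StandardCharsets.UTF_8);
        List<String> servers = new ArrayList<>();
        for (String s : text.split("\n")) {
            if (!s.isEmpty()) {
                servers.add(s);
            }
        }
        return servers;
    }

    // One datagram to the multicast group: no per-receiver TCP connection.
    // If no group is configured, do nothing (the feature is optional).
    static void publish(String group, int port, List<String> deadServers) throws Exception {
        if (group == null) {
            return;
        }
        byte[] payload = encode(deadServers);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length,
                    InetAddress.getByName(group), port));
        }
    }
}

// Hypothetical client-side cache: region name -> server name.
class ClientLocationCache {
    final Map<String, String> regionToServer = new ConcurrentHashMap<>();

    // Drop every cached location that points at a server reported dead.
    void onDeadServers(List<String> dead) {
        regionToServer.values().removeIf(dead::contains);
    }
}
```

 Because receivers only evict cache entries, a lost or missing packet leaves 
 them exactly where they are today: they find out about the dead server 
 through the usual timeouts.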



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Open  (was: Patch Available)



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v12.patch



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Patch Available  (was: Open)



[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605046#comment-13605046
 ] 

nkeywal commented on HBASE-7590:


v12 with the comments on RB from Devaraj taken into account. Nearly there!




[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605139#comment-13605139
 ] 

nkeywal commented on HBASE-4955:


Likely bad news... Among the missing tests, we have this:

@RunWith(Parameterized.class)
@Category(SmallTests.class)
public class TestFixedFileTrailer {


i.e. there could be issues with parameterized tests (though that alone may not 
be enough to explain the 200 missing tests).

Looking...




[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605183#comment-13605183
 ] 

nkeywal commented on HBASE-8135:


Agreed, I will do that on commit. Are you +1 otherwise?

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch




[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605224#comment-13605224
 ] 

nkeywal commented on HBASE-4955:


It's funny, because if I do:
mvn clean test -Dsurefire.part2.skip=true -q -PrunAllTests -Dsurefire.part1.forkCount=10
the number of executed tests is a random number above 600,

while with
mvn clean test -Dsurefire.part2.skip=true -q -PrunAllTests -Dsurefire.part1.forkCount=1
it's always 543.

Less parallelism == less randomness (logical) but fewer tests executed (not 
logical).

I can't reproduce it with the Surefire unit tests. I'm going to try a little 
more; after that, we have the option of waiting for 2.15, hoping the issue 
will be identified and fixed.




[jira] [Created] (HBASE-8136) coprocessor service requires .meta. to be available all the time.

2013-03-18 Thread nkeywal (JIRA)
nkeywal created HBASE-8136:
--

 Summary: coprocessor service requires .meta. to be available all 
the time.
 Key: HBASE-8136
 URL: https://issues.apache.org/jira/browse/HBASE-8136
 Project: HBase
  Issue Type: Bug
  Components: Client, Coprocessors
Affects Versions: 0.96.0
Reporter: nkeywal
Priority: Minor



HTable#getRegionLocations does not use a cache: all the calls to this function 
go to .META.

So:
- we're missing an opportunity to reuse/update the location cache in the 
HConnection.
- this method is called by the coprocessor service. So people using this 
feature have .meta. on their execution path, which is not good for 
performance, scalability and reliability.

I'm not totally clear on the fix. I think it should be possible to use the 
cache to see if we have all regions for the table. But it means we won't always 
have the latest version when calling getRegionLocations.

Any thoughts on this?
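
One way to picture the "do we have all regions in the cache" check is the 
sketch below. It is an assumption-laden illustration, not HBase code: the 
class name is hypothetical, String keys stand in for byte[] row keys, and the 
empty string marks the open start and end of the table.

```java
import java.util.Map;

// Hypothetical helper; not part of HTable or HConnection.
class RegionCoverage {
    // cachedRegions maps each cached region's start key to its end key.
    // Returns true only if the regions chain from "" (table start) to "" (table end).
    static boolean coversWholeTable(Map<String, String> cachedRegions) {
        String next = "";
        // Bounded loop: at most one hop per cached region, so bad data cannot spin forever.
        for (int i = 0; i < cachedRegions.size(); i++) {
            String end = cachedRegions.get(next);
            if (end == null) {
                return false; // hole in the cache: a lookup in .META. is still needed
            }
            if (end.isEmpty()) {
                return true;  // reached the last region of the table
            }
            next = end;
        }
        return false;
    }
}
```

A complete chain only says the cache is self-consistent. As noted above, it 
can still be out of date if regions have split or moved since they were 
cached.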



[jira] [Resolved] (HBASE-8136) coprocessor service requires .meta. to be available all the time.

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal resolved HBASE-8136.


  Resolution: Duplicate
Release Note: HBASE-6870 

And the good news is that there is already a patch for HBASE-6870 :-).

 coprocessor service requires .meta. to be available all the time.
 -

 Key: HBASE-8136
 URL: https://issues.apache.org/jira/browse/HBASE-8136
 Project: HBase
  Issue Type: Bug
  Components: Client, Coprocessors
Affects Versions: 0.96.0
Reporter: nkeywal
Priority: Minor



[jira] [Updated] (HBASE-6870) HTable#coprocessorExec always scan the whole table

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-6870:
---

Affects Version/s: 0.96.0
   0.95.0

 HTable#coprocessorExec always scan the whole table 
 ---

 Key: HBASE-6870
 URL: https://issues.apache.org/jira/browse/HBASE-6870
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors
Affects Versions: 0.94.1, 0.95.0, 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, 
 HBASE-6870v2.patch, HBASE-6870v3.patch


 In the current logic, HTable#coprocessorExec always scans the whole table. 
 This is inefficient and, under a large number of coprocessorExec requests, 
 will affect the regionserver carrying .META.



[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605362#comment-13605362
 ] 

nkeywal commented on HBASE-8135:


I was trying to find a generic way to get the size of an object. A Google 
search on this leads to quite a lot of terrible practices :-). It should be 
possible to use a static{} block for the fixed fields, but it won't bring much 
actual value. With the current implementation, it's better to have unit tests 
that catch it when someone adds fields. I'm going to do this in this patch 
(including Increment); it will be simpler.
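
For illustration, the two options mentioned here (a static{} block for the 
fixed part, and a test that notices new fields) could look like the sketch 
below. Everything in it is hypothetical: SketchMutation is not the HBase 
Mutation class, and the sizes assume a 64-bit JVM with 8-byte references and 
a 24-byte array header.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for a mutation-like class; not the HBase Mutation.
class SketchMutation {
    // Assumed sizes for a 64-bit JVM without compressed references.
    static final int REFERENCE = 8;
    static final int OBJECT_HEADER = 2 * REFERENCE;
    static final int ARRAY_HEADER = 24;
    // Must be updated by hand whenever an instance field is added or removed.
    static final int EXPECTED_FIELD_COUNT = 3;
    static final long FIXED_OVERHEAD;

    static {
        // object header + reference to row + long ts + boolean writeToWAL
        FIXED_OVERHEAD = align(OBJECT_HEADER + REFERENCE + Long.BYTES + 1);
    }

    byte[] row = new byte[0];
    long ts;
    boolean writeToWAL;

    // Round up to the next multiple of 8 bytes.
    static long align(long size) {
        return (size + 7) & ~7L;
    }

    // Fixed part computed once, plus the variable part (the row array).
    long heapSize() {
        return FIXED_OVERHEAD + align(ARRAY_HEADER + row.length);
    }

    // What a unit test would compare with EXPECTED_FIELD_COUNT.
    static int instanceFieldCount(Class<?> c) {
        int count = 0;
        for (Field f : c.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                count++;
            }
        }
        return count;
    }
}
```

The field-count check does not verify the arithmetic itself. It only fails 
when a field is added without anyone revisiting the overhead constant, which 
is the situation the unit tests are meant to catch.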

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch




[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Status: Open  (was: Patch Available)

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.5, 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch




[jira] [Updated] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8135:
---

Attachment: 8135.v2.patch

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch




[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605470#comment-13605470
 ] 

nkeywal commented on HBASE-8135:


There is an issue with the unit tests: I get a large gap that I cannot explain:

Expected :80
Actual   :168

2013-03-18 19:54:28,845 DEBUG [main] util.ClassSize(246): 0 row class [B
2013-03-18 19:54:28,845 DEBUG [main] util.ClassSize(246): 1 ts long
2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(246): 2 writeToWAL boolean
2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(246): 3 familyMap interface java.util.NavigableMap
2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(246): 4 attributes interface java.util.Map
2013-03-18 19:54:28,846 DEBUG [main] util.ClassSize(273): Primitives=9, arrays=1, references(includes 2 for object overhead)=5, refSize 8, size=80, prealign_size=73

Any hint?
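
For reference, the expected value of 80 can be reproduced from the last log line. The sketch below is only an illustration of that arithmetic, not the ClassSize code: the class and method names are hypothetical, and the 24-byte array overhead is an assumption inferred from the logged numbers (73 - 9 - 5*8 = 24). It does not explain where the extra 88 bytes of the actual value come from.

```java
/** Hypothetical helper that replays the arithmetic shown in the ClassSize log line. */
final class HeapSizeArithmeticSketch {
  static final int REFERENCE = 8; // "refSize 8" in the log
  static final int ARRAY = 24;    // assumption: inferred from prealign_size=73, not confirmed

  /** Rounds a size up to the next multiple of 8 bytes. */
  static long align(long size) {
    return (size + 7) & ~7L;
  }

  /** Sum before alignment: primitives + arrays * ARRAY + references * REFERENCE. */
  static long prealign(int primitives, int arrays, int references) {
    return primitives + (long) arrays * ARRAY + (long) references * REFERENCE;
  }

  public static void main(String[] args) {
    long raw = prealign(9, 1, 5); // values taken from the log line
    System.out.println("prealign_size=" + raw + ", size=" + align(raw));
  }
}
```

With the logged inputs this gives prealign_size=73 and size=80, which matches the "Expected" value of the failing test.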

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch




[jira] [Commented] (HBASE-8135) Mutation should implement HeapSize

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605483#comment-13605483
 ] 

nkeywal commented on HBASE-8135:


Yes, just locally (I do test before submitting :-) ). I used ClassSize.align for
timerange, and it went OK. But for Put/Delete/Increment, there are 88 bytes of
difference that I cannot explain. The code is in v2.patch.

 Mutation should implement HeapSize
 --

 Key: HBASE-8135
 URL: https://issues.apache.org/jira/browse/HBASE-8135
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0, 0.94.5
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0, 0.96.0

 Attachments: 8135.v1.patch, 8135.v2.patch




[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605503#comment-13605503
 ] 

nkeywal commented on HBASE-4955:


Yes, JUnit 4.11 contains what we need. Surefire contains what we need as well, 
we just need to get a version that comes up with an acceptable set of 
regressions.

Our Surefire is in Gary's repo. He will be the one getting the blame if Apache 
complains :-).
BTW, I haven't done the update to JUnit in HBase 0.94, as it implies 
backporting a few jiras as well (in the required section). So we still need to 
have it in Gary's repo as well.

 Use the official versions of surefire & junit
 -

 Key: HBASE-4955
 URL: https://issues.apache.org/jira/browse/HBASE-4955
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
 4955.v2.patch, 4955.v2.patch


 We currently use private versions of Surefire & JUnit since HBASE-4763.
 This JIRA tracks what we need in order to move to the official versions.
 Surefire 2.11 is just out but, after some tests, it does not contain
 everything we need.
 JUnit: could be JUnit 4.11. Issue to monitor:
 https://github.com/KentBeck/junit/issues/359: fixed in our version, no
 feedback on an integration into trunk
 Surefire: could be Surefire 2.12. Issues to monitor are:
 329 (category support): fixed, we use the official implementation from
 trunk
 786 (@Category with forkMode=always): fixed, we use the official
 implementation from trunk
 791 (incorrect elapsed time on test failure): fixed, we use the official
 implementation from trunk
 793 (incorrect time in the XML report): not fixed (reopened) on trunk, fixed
 in our version
 760 (does not take the test method into account): fixed in trunk, not fixed
 in our version
 798 (print the test class name immediately): not fixed in trunk, not fixed in
 our version
 799 (allow test parallelization when forkMode=always): not fixed in trunk,
 not fixed in our version
 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk,
 fixed in our version
 800 & 793 are the most important to monitor: they are the only ones that are
 fixed in our version but not on trunk.



[jira] [Commented] (HBASE-8128) HTable#put improvements

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605527#comment-13605527
 ] 

nkeywal commented on HBASE-8128:


Committed in 0.94

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0

 Attachments: 8128.v1.patch


 3 points:
  - When doing a single put, we create an extra object by calling Arrays.asList.
  - We do a size check every 10 puts. Dropping it seems simpler and better,
 and lets us share some code between a single put and a list of puts.
  - We can end up calling flushCommits on an empty write buffer, especially for
 someone using many HTable instances instead of a pool, as it is called in close().
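
The three points can be illustrated with a small stand-alone sketch. This is not the HTable code or the attached patch: WriteBufferSketch and all its members are hypothetical names, mutations are reduced to byte arrays, and the "flush" only counts calls instead of sending anything.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side write buffer; a stand-in for illustration only. */
final class WriteBufferSketch {
  private final List<byte[]> buffer = new ArrayList<>();
  private final long maxBytes;
  private long currentBytes = 0;
  private int flushCount = 0;

  WriteBufferSketch(long maxBytes) {
    this.maxBytes = maxBytes;
  }

  /** Single put: no Arrays.asList wrapper, same code path as the list variant. */
  void put(byte[] mutation) {
    add(mutation);
    flushIfFull();
  }

  /** List of puts: one size check after the whole list, not every 10 elements. */
  void put(List<byte[]> mutations) {
    for (byte[] m : mutations) {
      add(m);
    }
    flushIfFull();
  }

  private void add(byte[] mutation) {
    buffer.add(mutation);
    currentBytes += mutation.length;
  }

  private void flushIfFull() {
    if (currentBytes > maxBytes) {
      flushCommits();
    }
  }

  /** Does nothing when the buffer is empty, so close() is cheap for idle instances. */
  void flushCommits() {
    if (buffer.isEmpty()) {
      return;
    }
    buffer.clear();
    currentBytes = 0;
    flushCount++;
  }

  void close() {
    flushCommits();
  }

  int flushCount() {
    return flushCount;
  }
}
```

Closing an instance that never buffered anything leaves flushCount() at 0, which is the behaviour the third point asks for.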



[jira] [Updated] (HBASE-8128) HTable#put improvements

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8128:
---

Fix Version/s: 0.94.8

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0, 0.94.8

 Attachments: 8128.v1.patch




[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Open  (was: Patch Available)

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch


 It would be very useful to add a mechanism to distribute some information to
 the clients and regionservers. In particular, it would be useful to know
 globally (regionservers + client apps) that some regionservers are dead. This
 would allow:
 - to lower the load on the system, as clients would stop using stale
 information and going to dead machines
 - to make the recovery faster from a client point of view. It's common to use
 large timeouts on the client side, so the client may need a lot of time
 before declaring a region server dead and trying another one. If the client
 receives the information about a region server's state separately, it can
 take the right decision, and continue or stop waiting accordingly.
 We can also send more information, for example instructions like 'slow down'
 to tell the client to increase the retry delay and so on.
 Technically, the master could send this information. To lower the load on
 the system, we should:
 - have a multicast communication (i.e. the master does not have to connect to
 all servers by TCP), with one packet every 10 seconds or so.
 - receivers should not depend on this: if the information is available, great.
 If not, it should not break anything.
 - it should be optional.
 So in the end we would have a thread in the master sending a protobuf message
 about the dead servers on a multicast socket. If the socket is not
 configured, it does nothing. On the client side, when we receive the
 information that a node is dead, we refresh the cache about it.
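
A minimal sketch of the idea, under several assumptions: DeadServerNotifierSketch and its methods are hypothetical names, the payload is newline-separated text rather than a protobuf message, and the client cache is reduced to a region-to-server map. Only the sending side touches the network, and it is a no-op when no multicast group is configured.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a best-effort dead-server notification. */
final class DeadServerNotifierSketch {

  /** Master side: serializes the dead server names, one per line (not protobuf). */
  static byte[] encode(Set<String> deadServers) {
    return String.join("\n", deadServers).getBytes(StandardCharsets.UTF_8);
  }

  /** Receiver side: parses the first 'length' bytes of a received packet. */
  static Set<String> decode(byte[] data, int length) {
    String text = new String(data, 0, length, StandardCharsets.UTF_8);
    Set<String> result = new LinkedHashSet<>();
    for (String name : text.split("\n")) {
      if (!name.isEmpty()) {
        result.add(name);
      }
    }
    return result;
  }

  /**
   * Sends one datagram to the multicast group. Returns false, without failing,
   * when the group is not configured or the send does not work: the feature is
   * optional and nothing should depend on it.
   */
  static boolean publish(String groupOrNull, int port, Set<String> deadServers) {
    if (groupOrNull == null) {
      return false;
    }
    byte[] payload = encode(deadServers);
    try (DatagramSocket socket = new DatagramSocket()) {
      InetAddress group = InetAddress.getByName(groupOrNull);
      socket.send(new DatagramPacket(payload, payload.length, group, port));
      return true;
    } catch (IOException e) {
      return false;
    }
  }

  /** Client side: drops cached locations that point at a server reported dead. */
  static void onNotification(Set<String> deadServers, Map<String, String> regionToServer) {
    regionToServer.values().removeIf(deadServers::contains);
  }
}
```

A master thread would call publish(...) every 10 seconds or so; a receiver would join the group, call decode(...) on each packet and pass the result to onNotification(...).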



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v13.patch

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch




[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605698#comment-13605698
 ] 

nkeywal commented on HBASE-7590:


Maybe 13 is going to be my lucky number :-) ?

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch




[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-18 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Patch Available  (was: Open)

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v12.patch, 7590.v12.patch, 
 7590.v13.patch, 7590.v1.patch, 7590.v1-rebased.patch, 7590.v2.patch, 
 7590.v3.patch, 7590.v5.patch, 7590.v5.patch




[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-16 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Status: Open  (was: Patch Available)

 Use the official versions of surefire & junit
 -

 Key: HBASE-4955
 URL: https://issues.apache.org/jira/browse/HBASE-4955
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0

 Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch




[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-16 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Attachment: 4955.v2.patch

 Use the official versions of surefire & junit
 -

 Key: HBASE-4955
 URL: https://issues.apache.org/jira/browse/HBASE-4955
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0

 Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
 4955.v2.patch




[jira] [Updated] (HBASE-4955) Use the official versions of surefire & junit

2013-03-16 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4955:
---

Fix Version/s: (was: 0.95.0)
   0.96.0
   Status: Patch Available  (was: Open)

 Use the official versions of surefire & junit
 -

 Key: HBASE-4955
 URL: https://issues.apache.org/jira/browse/HBASE-4955
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 
 4955.v2.patch




[jira] [Created] (HBASE-8128) HTable#put improvements

2013-03-16 Thread nkeywal (JIRA)
nkeywal created HBASE-8128:
--

 Summary: HTable#put improvements
 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0
 Attachments: 8128.v1.patch

3 points:
 - When doing a single put, we create an extra object by calling Arrays.asList.
 - We do a size check every 10 puts. Dropping it seems simpler and better,
and lets us share some code between a single put and a list of puts.
 - We can end up calling flushCommits on an empty write buffer, especially for
someone using many HTable instances instead of a pool, as it is called in close().



[jira] [Updated] (HBASE-8128) HTable#put improvements

2013-03-16 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8128:
---

Status: Patch Available  (was: Open)

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0

 Attachments: 8128.v1.patch




[jira] [Updated] (HBASE-8128) HTable#put improvements

2013-03-16 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8128:
---

Attachment: 8128.v1.patch

 HTable#put improvements
 ---

 Key: HBASE-8128
 URL: https://issues.apache.org/jira/browse/HBASE-8128
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.95.0, 0.96.0

 Attachments: 8128.v1.patch




[jira] [Commented] (HBASE-8097) MetaServerShutdownHandler may potentially keep bumping up DeadServer.numProcessing

2013-03-15 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603218#comment-13603218
 ] 

nkeywal commented on HBASE-8097:


bq. The timestamp updated in DeadServer.add seems bogus to me.
You're right. It would be a putIfAbsent if it were a ConcurrentMap. The methods
are synchronized and it should be critical, so the following implementation
should be correct:
{code}
  /**
   * Adds the server to the dead server list if it's not there already.
   * @param sn the server name
   */
  public synchronized void add(ServerName sn) {
    this.numProcessing++;
    if (!deadServers.containsKey(sn)){
      deadServers.put(sn, EnvironmentEdgeManager.currentTimeMillis());
    }
  }
{code}
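
As a side note, the same "first timestamp wins" behaviour can be sketched on its own with Map.putIfAbsent, which plain maps have had since Java 8. DeadServerSketch is a hypothetical stand-in, not the DeadServer class: server names are plain strings and the clock value is passed in so the behaviour is easy to check.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the dead server list; illustration only. */
final class DeadServerSketch {
  private final Map<String, Long> deadServers = new HashMap<>();
  private int numProcessing = 0;

  /** Counts every call, but keeps the time of death recorded by the first call. */
  synchronized void add(String serverName, long nowMillis) {
    numProcessing++;
    deadServers.putIfAbsent(serverName, nowMillis);
  }

  synchronized Long timeOfDeath(String serverName) {
    return deadServers.get(serverName);
  }

  synchronized int numProcessing() {
    return numProcessing;
  }
}
```

Calling add twice for the same server leaves the first timestamp in place, while numProcessing still goes up on each call, which is the part discussed next.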

Tell me if you want me to create another JIRA for this or if you don't mind 
adding this into this JIRA.

For numProcessing, it seems there are several bugs as well: if you have an 
exception anywhere (for example in verifyAndAssignMetaWithRetries();) we have a 
broken state: we haven't decreased the numProcessing but we're not working on 
it anymore. If we want to be exception safe (as ServerShutdownHandler is) we 
need the finally imho. I can't find a better solution than:

{code}
  @Override
  public void process() throws IOException {
    boolean gotException = true;
    try {
      try {
        LOG.info("Splitting META logs for " + serverName);
        if (this.shouldSplitHlog) {
          this.services.getMasterFileSystem().splitMetaLog(serverName);
        }
      } catch (IOException ioe) {
        this.deadServers.add(serverName);
        this.services.getExecutorService().submit(this);
        throw new IOException("failed log splitting for " +
            serverName + ", will retry", ioe);
      }

      // Assign root and meta if we were carrying them.
      if (isCarryingMeta()) { // .META.
        // Check again: region may be assigned to other where because of RIT
        // timeout
        if (this.services.getAssignmentManager().isCarryingMeta(serverName)) {
          LOG.info("Server " + serverName
              + " was carrying META. Trying to assign.");
          this.services.getAssignmentManager().regionOffline(
              HRegionInfo.FIRST_META_REGIONINFO);
          verifyAndAssignMetaWithRetries();
        } else {
          LOG.info("META has been assigned to otherwhere, skip assigning.");
        }
      }
      gotException = false;
    } finally {
      if (gotException){
        // If we had an exception we can't rely on super.process to say we
        // finished the process.
        this.deadServers.finish(serverName);
      }
    }
    super.process();
  }
{code}

I can't say I like it, but it should do the job...
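
Stripped of the HBase specifics, the flag-plus-finally pattern looks like the sketch below. ProcessingCounterSketch is a hypothetical class: the Runnable stands in for the handler body, and the trailing finish() stands in for what super.process() is assumed to do on the normal path.

```java
/** Hypothetical sketch of keeping a processing counter balanced on failure. */
final class ProcessingCounterSketch {
  private int numProcessing = 0;

  synchronized void add() {
    numProcessing++;
  }

  synchronized void finish() {
    numProcessing--;
  }

  synchronized int numProcessing() {
    return numProcessing;
  }

  /** Runs the step and releases the counter exactly once, whether or not it throws. */
  void process(Runnable step) {
    boolean gotException = true;
    try {
      step.run();
      gotException = false;
    } finally {
      if (gotException) {
        // The normal completion below is skipped when an exception propagates.
        finish();
      }
    }
    finish(); // normal path; stands in for super.process()
  }
}
```

Either way the counter returns to its previous value, so an exception in the middle of the handler no longer leaves a server counted as "being processed" forever.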



 MetaServerShutdownHandler may potentially keep bumping up 
 DeadServer.numProcessing
 --

 Key: HBASE-8097
 URL: https://issues.apache.org/jira/browse/HBASE-8097
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.96.0

 Attachments: 8097.txt, hbase-8097_1.patch


 {code}
 } catch (IOException ioe) {
   this.services.getExecutorService().submit(this);
   this.deadServers.add(serverName);
   throw new IOException("failed log splitting for " +
       serverName + ", will retry", ioe);
 }
 {code}
 this.deadServers.add(serverName); will keep incrementing 
 DeadServer.numProcessing
 We can't get rid of numProcessing by just checking deadServers.size() because 
 deadServers is also used to report some historically failed RSs.



[jira] [Commented] (HBASE-8101) Cleanup: findbugs and javadoc warning fixes as well as making it illegal passing null row to Put/Delete, etc.

2013-03-14 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602173#comment-13602173
 ] 

nkeywal commented on HBASE-8101:


   @Override
+  public int hashCode() {
+// TODO: This is wrong.  Can't have two gets the same just because on same row.  But it
+// matches how equals works currently and gets rid of the findbugs warning.
+return this.getRow().hashCode();
+  }
=> You shouldn't call hashCode on an array; you could call 
java.util.Arrays.hashCode instead.
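
A minimal sketch of the point above, assuming a hypothetical `RowKeyed` class that stands in for a row-keyed operation (it is not the HBase `Get` class). Calling `hashCode()` on a `byte[]` gives an identity-based value, so two equal rows would usually hash differently; `Arrays.hashCode` hashes the contents and stays consistent with a content-based `equals`.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a row-keyed operation; not an HBase class. */
final class RowKeyed {
  private final byte[] row;

  RowKeyed(byte[] row) {
    this.row = row.clone(); // defensive copy so callers cannot mutate the key
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof RowKeyed && Arrays.equals(row, ((RowKeyed) other).row);
  }

  @Override
  public int hashCode() {
    // byte[].hashCode() is identity-based; Arrays.hashCode hashes the contents.
    return Arrays.hashCode(row);
  }
}
```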


+  public Increment(final byte [] row, final int offset, final int length) {
+if (row == null || length <= 0 || length > HConstants.MAX_ROW_LENGTH) {
   throw new IllegalArgumentException("Row key is invalid");
 }
=> When it happens in production, I like to have the actual values (i.e. row=, 
offset=, and so on) ;-)


+@edu.umd.cs.findbugs.annotations.SuppressWarnings(
+value="CN_IDIOM_NO_SUPER_CALL",
+justification="Its PITA calling the super.clone")
=> There is a good reason for this warning: subclasses won't be able to call 
super.clone themselves if we do that (the type will be wrong: Object.clone 
creates the right object). As it's private (i.e. we don't offer a public API 
that should be subclassed), I guess it's acceptable. At the very least we should 
put a warning in the justification.
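
The reason behind the warning can be sketched as follows, with hypothetical `Base` and `Derived` classes. Because `Base.clone()` delegates to `Object.clone()`, the returned object has the runtime class of the receiver, so the cast in `Derived.clone()` is safe. Had `Base.clone()` returned `new Base()`, that cast would fail with a `ClassCastException`.

```java
/** Hypothetical classes, for illustration only. */
class Base implements Cloneable {
  int value = 1;

  @Override
  public Base clone() {
    try {
      // Object.clone creates an instance of the runtime class, not of Base.
      return (Base) super.clone();
    } catch (CloneNotSupportedException e) {
      throw new AssertionError(e); // cannot happen: Base is Cloneable
    }
  }
}

class Derived extends Base {
  @Override
  public Derived clone() {
    // Safe only because Base.clone() went through super.clone().
    return (Derived) super.clone();
  }
}
```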

+1 otherwise, thanks for doing this!

 Cleanup: findbugs and javadoc warning fixes as well as making it illegal 
 passing null row to Put/Delete, etc.
 -

 Key: HBASE-8101
 URL: https://issues.apache.org/jira/browse/HBASE-8101
 Project: HBase
  Issue Type: Sub-task
  Components: IPC/RPC
Reporter: stack
 Fix For: 0.95.0

 Attachments: 8101.txt, 8101v2.txt


 Part of hbase-7900 broken out so that patch gets smaller.  This is a patch 
 with cleanup mostly findbugs fixes (general ones) as well as adding check for 
 null row being passed to Put, Get, etc.  This patch helps rpc along.



[jira] [Commented] (HBASE-8025) zkcli fails when SERVER_GC_OPTS is enabled

2013-03-14 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602305#comment-13602305
 ] 

nkeywal commented on HBASE-8025:


Reading the patch, it seems ok to me. I will test it & commit to trunk & 0.95 
tomorrow if there is no objection. Will wait for Lars for 0.94, but it seems it 
should be committed there as well. Also, Jean-Marc & Dave, if you want to 
study it more (cf. suggestion above), I will wait for you.

 zkcli fails when SERVER_GC_OPTS is enabled
 --

 Key: HBASE-8025
 URL: https://issues.apache.org/jira/browse/HBASE-8025
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.4
Reporter: Dave Latham
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: HBASE-8025-0.94.patch


 HBASE-7091 added logic to separate GC logging options for some client 
 commands versus server commands.  It uses a list of known client commands 
 (shell hbck hlog hfile zkcli) and uses the server GC logging 
 options for all other invocations of bin/hbase.  When zkcli is invoked, it in 
 turn invokes hbase org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServerArg 
 to gather the server command line arguments, but because 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServerArg is not on the white 
 list it enables server GC logging, which causes extra output that causes the 
 zkcli invocation to break.  HBASE-7153 addressed this but the fix only solved 
 the array syntax - not the white list, so the zkcli command still fails.
 There are many other tools you can invoke that are more likely to need client 
 than server options. For example, bin/hbase org.jruby.Main 
 region_mover.rb or bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable 
 or bin/hbase version or bin/hbase 
 org.apache.hadoop.hbase.mapreduce.Export. The whitelist of server commands 
 is shorter and easier to maintain than a whitelist of client commands.
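
The whitelist idea above could look roughly like this. The actual logic lives in the bin/hbase shell script; this Java sketch only models the decision, and both the `GcOptsChooser` class and the list of server commands are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class GcOptsChooser {
  // Assumed list of server commands, for illustration only.
  private static final Set<String> SERVER_COMMANDS = new HashSet<>(
      Arrays.asList("master", "regionserver", "thrift", "rest", "zookeeper"));

  /** Server GC logging is enabled only for known server commands. */
  static boolean useServerGcOpts(String command) {
    return SERVER_COMMANDS.contains(command);
  }
}
```

With this shape, any class name or tool that is not explicitly listed falls back to client options, so a helper such as ZooKeeperMainServerArg would not produce extra GC output.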



[jira] [Commented] (HBASE-8105) RegionServer Doesn't Rejoin Cluster after Netsplit

2013-03-14 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602355#comment-13602355
 ] 

nkeywal commented on HBASE-8105:


I suppose you have a YouAreDeadException in the logs?
This would be expected. The logic is that the region server cannot be trusted 
anymore as it was ejected from the cluster. Then yes, it could abort. On the 
other hand you may want to look at it in detail. Personally I would prefer to 
abort to be sure I don't have clients trying to use this dead server.

Note that for questions or discussions, it's better to use the user mailing 
list.

 RegionServer Doesn't Rejoin Cluster after Netsplit
 --

 Key: HBASE-8105
 URL: https://issues.apache.org/jira/browse/HBASE-8105
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.1
 Environment: Linux Ubuntu 10.04 LTS
Reporter: philo vivero

 Running a 15-node HBase cluster. Testing various failure scenarios. Segregate 
 one RegionServer from the cluster by firewalling off every port except SSH 
 (because we need to be able to re-enable the node later).
 After the RS is automatically removed from the cluster, we re-enable all 
 ports again, but RS never rejoins the cluster.
 I suspect this may be the desired behaviour, but haven't found proof 
 so far. The code doesn't have any comment indicating this is the behaviour 
 desired:
 http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.92.2/org/apache/hadoop/hbase/regionserver/HRegionServer.java/
 See lines starting at 624, public void run(). It makes it through the first 
 try/catch block, but then loops inside the second try/catch block. Our 
 hypothesis is that it never gets out naturally.
 If we bounce the RegionServer process, then it rejoins the cluster.



[jira] [Commented] (HBASE-8081) Backport HBASE-7213 (separate hlog for meta tables) to 0.94

2013-03-13 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600969#comment-13600969
 ] 

nkeywal commented on HBASE-8081:


In addition, there are a lot of critical scenarios that you can hit without 
separate logs:
- some blocks in the WAL may not be recoverable (corrupted, too many boxes 
missing). This risk is highly mitigated with a separate log. Without this, the 
whole cluster becomes unavailable when you're unlucky.
- if you run into hdfs issues during the recovery (an hdfs issue being going to a 
dead datanode, something highly probable during a recovery), the recovery will 
be much slower.
- trying to run a recovery while .meta. is not available is also problematic. 
Ensuring that .meta. comes back early simplifies a lot of critical scenarios.

So having this in 0.94 is 'interesting' I would say :-).

 Backport HBASE-7213 (separate hlog for meta tables) to 0.94
 ---

 Key: HBASE-8081
 URL: https://issues.apache.org/jira/browse/HBASE-8081
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.5
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.94.7

 Attachments: 7213-0.94-2.patch, 7213-0.94.patch


 I am interested in backporting HBASE-7213 to 0.94. Helps to address more of 
 the MTTR story. Offline discussion with Lars indicated he is interested as 
 well.



[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-13 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13601449#comment-13601449
 ] 

nkeywal commented on HBASE-7590:


I will fix the 100 lines stuff on commit. Any +1 on the new version?

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v1.patch, 
 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
 7590.v5.patch


 It would be very useful to add a mechanism to distribute some information to 
 the clients and regionservers. In particular, it would be useful to know globally 
 (regionservers + client apps) that some regionservers are dead. This would 
 allow:
 - to lower the load on the system, without clients using stale information 
 and going to dead machines
 - to make the recovery faster from a client point of view. It's common to use 
 large timeouts on the client side, so the client may need a lot of time 
 before declaring a region server dead and trying another one. If the client 
 receives the information about a region server's state separately, it can take 
 the right decision, and continue/stop waiting accordingly.
 We can also send more information, for example instructions like 'slow down' 
 to instruct the client to increase the retry delay and so on.
 Technically, the master could send this information. To lower the load on 
 the system, we should:
 - have a multicast communication (i.e. the master does not have to connect to 
 all servers by tcp), with one packet every 10 seconds or so.
 - receivers should not depend on this: if the information is available, great. 
 If not, it should not break anything.
 - it should be optional.
 So in the end we would have a thread in the master sending a protobuf message 
 about the dead servers on a multicast socket. If the socket is not 
 configured, it does not do anything. On the client side, when we receive 
 information that a node is dead, we refresh the cache about it.
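
The client-side half of this proposal can be sketched without any networking. `RegionLocationCache` and its methods are hypothetical names, not the HBase client API; the sketch only shows the "refresh the cache when a node is reported dead" step, and the cache keeps working unchanged if no notification ever arrives.

```java
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side cache mapping region names to server names. */
final class RegionLocationCache {
  private final Map<String, String> regionToServer = new ConcurrentHashMap<>();

  void put(String region, String server) {
    regionToServer.put(region, server);
  }

  /** Returns the cached server, or null if the location must be looked up again. */
  String locate(String region) {
    return regionToServer.get(region);
  }

  /** Called when an optional dead-server notification arrives. */
  void onDeadServers(Collection<String> deadServers) {
    // Drop every cached location that points at a server reported dead.
    regionToServer.values().removeIf(deadServers::contains);
  }
}
```

A client holding such a cache would re-resolve a region on the next request instead of waiting out a long timeout against a dead machine.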



[jira] [Commented] (HBASE-7840) Enhance the java it framework to start & stop a distributed hbase & hadoop cluster

2013-03-13 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13601454#comment-13601454
 ] 

nkeywal commented on HBASE-7840:


Waiting for review or +1 on this one, I can rebase if you like.

 Enhance the java it framework to start & stop a distributed hbase & hadoop 
 cluster 
 ---

 Key: HBASE-7840
 URL: https://issues.apache.org/jira/browse/HBASE-7840
 Project: HBase
  Issue Type: New Feature
  Components: test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0

 Attachments: 7840.v1.patch, 7840.v3.patch


 Needs are to use a development version of HBase & HDFS 1 & 2.
 Ideally, should be nicely backportable to 0.94 to allow comparisons and 
 regression tests between versions.



[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2013-03-12 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599849#comment-13599849
 ] 

nkeywal commented on HBASE-4955:


The regression in SUREFIRE-970 makes the move to 2.14 problematic. It would be 
nice to have SUREFIRE-969 as well but it's more mandatory. A bigger issue is 
that in my tests it seems that having multiple executions with different 
parameters does not work anymore. I will need to have a look at that to get it 
fixed in a release...

 Use the official versions of surefire & junit
 -

 Key: HBASE-4955
 URL: https://issues.apache.org/jira/browse/HBASE-4955
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.95.0

 Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch


 We currently use private versions for Surefire & JUnit since HBASE-4763.
 This JIRA tracks what we need to move to official versions.
 Surefire 2.11 is just out, but, after some tests, it does not contain all 
 that we need.
 JUnit. Could be for JUnit 4.11. Issue to monitor:
 https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
 feedback for an integration on trunk
 Surefire: Could be for Surefire 2.12. Issues to monitor are:
 329 (category support): fixed, we use the official implementation from the 
 trunk
 786 (@Category with forkMode=always): fixed, we use the official 
 implementation from the trunk
 791 (incorrect elapsed time on test failure): fixed, we use the official 
 implementation from the trunk
 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on 
 our version.
 760 (does not take into account the test method): fixed in trunk, not fixed 
 in our version
 798 (print immediately the test class name): not fixed in trunk, not fixed in 
 our version
 799 (Allow test parallelization when forkMode=always): not fixed in trunk, 
 not fixed in our version
 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, 
 fixed on our version
 800 & 793 are the most important to monitor; they're the only ones that are 
 fixed in our version but not on trunk.



[jira] [Commented] (HBASE-7327) Assignment Timeouts: Remove the code from the master

2013-03-12 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600109#comment-13600109
 ] 

nkeywal commented on HBASE-7327:


Yes (i.e. TOM is *not* activated by default). The idea is really to remove it, 
but using baby steps.

 Assignment Timeouts: Remove the code from the master
 

 Key: HBASE-7327
 URL: https://issues.apache.org/jira/browse/HBASE-7327
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7327.v1.uncomplete.patch, 7327.v2.patch


 As per HBASE-7247...



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-12 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v5.patch

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v1.patch, 
 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
 7590.v5.patch




[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-12 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Open  (was: Patch Available)

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v1.patch, 
 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
 7590.v5.patch




[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-12 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Attachment: 7590.v5.patch

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v1.patch, 
 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
 7590.v5.patch




[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-12 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7590:
---

Status: Patch Available  (was: Open)

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v1.patch, 
 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
 7590.v5.patch




[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-12 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600250#comment-13600250
 ] 

nkeywal commented on HBASE-7590:


Comments taken into account, and I added the IOException instead of only 
ZooKeeperConnection exception...
I added it on RB as well.

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v1.patch, 
 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch, 7590.v5.patch, 
 7590.v5.patch




[jira] [Resolved] (HBASE-7713) Maven build fails for hbase-common on windows environment

2013-03-11 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal resolved HBASE-7713.


Resolution: Cannot Reproduce

 Maven build fails for hbase-common on windows environment
 -

 Key: HBASE-7713
 URL: https://issues.apache.org/jira/browse/HBASE-7713
 Project: HBase
  Issue Type: Bug
 Environment: Windows Environment
Reporter: Raghu Doppalapudi
Priority: Minor

 build fails with following error message 
 org.codehaus.plexus.resource.loader.ResourceNotFoundException: Could not 
 find resource 'dev-support/findbugs-exclude.xml'



[jira] [Commented] (HBASE-7713) Maven build fails for hbase-common on windows environment

2013-03-11 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13598736#comment-13598736
 ] 

nkeywal commented on HBASE-7713:


No feedback. Closing, @[~rdoppalapudi], please reopen if you think differently.

 Maven build fails for hbase-common on windows environment
 -

 Key: HBASE-7713
 URL: https://issues.apache.org/jira/browse/HBASE-7713
 Project: HBase
  Issue Type: Bug
 Environment: Windows Environment
Reporter: Raghu Doppalapudi
Priority: Minor

 build fails with following error message 
 org.codehaus.plexus.resource.loader.ResourceNotFoundException: Could not 
 find resource 'dev-support/findbugs-exclude.xml'



[jira] [Updated] (HBASE-7927) Two versions of netty with hadoop.profile=2.0: 3.5.9 and 3.2.4

2013-03-11 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7927:
---

Assignee: nkeywal

 Two versions of netty with hadoop.profile=2.0: 3.5.9 and 3.2.4
 --

 Key: HBASE-7927
 URL: https://issues.apache.org/jira/browse/HBASE-7927
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal

 I don't know why, but when you do a mvn dependency:tree, everything looks 
 fine. When you look at the generated target/cached_classpath.txt you see 2 
 versions of netty: netty-3.2.4.Final.jar and netty-3.5.9.Final.jar.
 This is bad and can lead to unpredictable behavior.
 I haven't looked at the other dependencies.



[jira] [Updated] (HBASE-7327) Assignment Timeouts: Remove the code from the master

2013-03-11 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7327:
---

Resolution: Later
Status: Resolved  (was: Patch Available)

Since the final decision was to make this optional, the code is still there, to 
be removed later.

 Assignment Timeouts: Remove the code from the master
 

 Key: HBASE-7327
 URL: https://issues.apache.org/jira/browse/HBASE-7327
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7327.v1.uncomplete.patch, 7327.v2.patch


 As per HBASE-7247...



[jira] [Commented] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

2013-03-11 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13598749#comment-13598749
 ] 

nkeywal commented on HBASE-7247:


TimeOutManagement is now optional and deactivated by default. I will redo the 
measurements.

 Assignment performances decreased by 50% because of 
 regionserver.OpenRegionHandler#tickleOpening
 

 Key: HBASE-7247
 URL: https://issues.apache.org/jira/browse/HBASE-7247
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0

 Attachments: 7247.v1.patch


 The regionserver.OpenRegionHandler#tickleOpening updates the region znode, as 
 its comment says, to "Do this so master doesn't timeout this region-in-transition".
 However, on the usual test, this makes the assignment time of 1500 regions 
 go from 70s to 100s, that is, we're 50% slower because of this.
 More generally, ZooKeeper commits every data update to disk, and this takes 
 time. Using it to provide a keep-alive seems overkill. At the very least, it 
 could be made asynchronous.
 I'm not sure how necessary these updates are (I need to go deeper into 
 the internals, feedback welcome), but it seems very important to optimize 
 this... The trivial fix would be to make this optional.
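
 The two options mentioned above (optional, or at least asynchronous) can be sketched together as follows. This is not the actual HBase code: OptionalTickler, its enabled flag and the znodeUpdate callback are hypothetical names, and the real ZooKeeper write is represented by a plain Runnable.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class OptionalTickler {
  private final boolean enabled;   // hypothetical switch, e.g. read from configuration
  private final Executor executor; // where the (slow) znode update is run

  OptionalTickler(boolean enabled, Executor executor) {
    this.enabled = enabled;
    this.executor = executor;
  }

  /** Schedules the keep-alive update; returns false when keep-alives are disabled. */
  boolean tickle(Runnable znodeUpdate) {
    if (!enabled) {
      return false; // optional: nothing is written at all
    }
    executor.execute(znodeUpdate); // asynchronous: the caller does not wait for the write
    return true;
  }

  public static void main(String[] args) {
    ExecutorService background = Executors.newSingleThreadExecutor();
    OptionalTickler tickler = new OptionalTickler(true, background);
    tickler.tickle(() -> System.out.println("znode refreshed (placeholder for the ZooKeeper write)"));
    background.shutdown(); // already queued updates still run
  }
}
```

 With the flag off, the region opening path never touches ZooKeeper for keep-alives; with it on, the write no longer sits on the critical path of the open.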



[jira] [Commented] (HBASE-7938) Add integration test for various MapReduce workflows

2013-03-08 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13597375#comment-13597375
 ] 

nkeywal commented on HBASE-7938:


I'm +1 of course.

I don't really know the IntegrationTestsDriver; I use maven to run the 
integration tests (I'm not saying it's a good thing, actually maybe it's a bad 
thing). I will have a look.

If it has not changed recently, we don't start a mini map reduce cluster when 
we do a non-distributed start-hbase. Maybe we should add it, to ease manual 
tests?
Following the work done in HBASE-7840, we could also add the start/stop of the 
map reduce part.

Then the integration tests depending on map reduce could start a real 
cluster. For the path, I don't know: if we include the test code in HBase, it 
will have been copied all over the place anyway, so we won't be able to check 
this. The code would have to be quite smart to explicitly ask maven to do 
nothing about it.




 Add integration test for various MapReduce workflows
 

 Key: HBASE-7938
 URL: https://issues.apache.org/jira/browse/HBASE-7938
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Reporter: Nick Dimiduk
 Fix For: 0.95.0, 0.98.0, 0.94.7


 We have existing unit tests for smoke-testing the packaged MR jobs, however 
 they do not create a runtime environment that is true to running on a real MR 
 cluster. This is particularly true in regard to classpaths (HBASE-7934) but 
 also other static state (HBASE-4802). An integration test that can be pointed 
 to run on a pseudo-distributed Hadoop deployed on localhost would find these 
 kinds of problems.



[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-07 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8002:
---

Attachment: 8002.v4.patch

 Make TimeOut Management for Assignment optional in master and regionservers
 ---

 Key: HBASE-8002
 URL: https://issues.apache.org/jira/browse/HBASE-8002
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Region Assignment
Affects Versions: 0.95.0, 0.98.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.98.0

 Attachments: 8002.v3.patch, 8002.v4.patch


 See HBASE-7327



[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13595736#comment-13595736
 ] 

nkeywal commented on HBASE-8002:


v4 is what I committed on trunk & 0.95

 Make TimeOut Management for Assignment optional in master and regionservers
 ---

 Key: HBASE-8002
 URL: https://issues.apache.org/jira/browse/HBASE-8002
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Region Assignment
Affects Versions: 0.95.0, 0.98.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.98.0

 Attachments: 8002.v3.patch, 8002.v4.patch


 See HBASE-7327



[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13595737#comment-13595737
 ] 

nkeywal commented on HBASE-8002:


And thanks for the review!

 Make TimeOut Management for Assignment optional in master and regionservers
 ---

 Key: HBASE-8002
 URL: https://issues.apache.org/jira/browse/HBASE-8002
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Region Assignment
Affects Versions: 0.95.0, 0.98.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.98.0

 Attachments: 8002.v3.patch, 8002.v4.patch


 See HBASE-7327



[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-07 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8002:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 Make TimeOut Management for Assignment optional in master and regionservers
 ---

 Key: HBASE-8002
 URL: https://issues.apache.org/jira/browse/HBASE-8002
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Region Assignment
Affects Versions: 0.95.0, 0.98.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.98.0

 Attachments: 8002.v3.patch, 8002.v4.patch


 See HBASE-7327



[jira] [Commented] (HBASE-8023) Assembly target fails

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13595996#comment-13595996
 ] 

nkeywal commented on HBASE-8023:


mvn clean package -DskipTests assembly:assembly -Dhadoop.profile=2.0
works here.

but
mvn clean package site -DskipTests assembly:assembly -Dhadoop.profile=2.0
does not
[INFO] Processing input file: book.xml
[INFO] Applying customization parameters
[INFO] Chunking output.
Recoverable error
org.xml.sax.SAXParseException: Include operation failed, reverting to fallback. 
Resource error reading file as XML 
(href='../../target/site/hbase-default.xml'). Reason: 
/home/liochon/dev/hbase/target/site/hbase-default.xml (No such file or 
directory)
Error on line 672 column 52 of 
file:///home/liochon/dev/hbase/src/docbkx/configuration.xml:
  Error reported by XML parser: An 'include' failed, and no 'fallback' element 
was found.
Error on line 70 column 85 of 
file:///home/liochon/dev/hbase/src/docbkx/book.xml:
  Error reported by XML parser: Error attempting to parse XML file 
(href='configuration.xml').
[INFO] 


I'm testing but I think it's not related to the 2.0 profile, just site.

 Assembly target fails
 -

 Key: HBASE-8023
 URL: https://issues.apache.org/jira/browse/HBASE-8023
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0
Reporter: Andrew Purtell

 The assembly target fails when using the 2.0 Hadoop profile (at least).
 {noformat}
 mvn -DskipTests -Dhadoop.profile=2.0 clean install site assembly:assembly
 [...]
 [INFO] --- maven-assembly-plugin:2.3:assembly (default-cli) @ hbase ---
 [INFO] Reading assembly descriptor: src/assembly/hadoop-two-compat.xml
 [WARNING] [DEPRECATION] moduleSet/binaries section detected in root-project 
 assembly.
 MODULE BINARIES MAY NOT BE AVAILABLE FOR THIS ASSEMBLY!
  To refactor, move this assembly into a child project and use the flag 
 <useAllReactorProjects>true</useAllReactorProjects> in each moduleSet.
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-protocol:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-client:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-prefix-tree:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-hadoop-compat:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-hadoop2-compat:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-server:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-it:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-examples:jar:0.97-SNAPSHOT
 [INFO] 
 
 [INFO] Reactor Summary:
 [INFO] 
 [INFO] HBase . FAILURE [15.877s]
 [INFO] HBase - Common  SUCCESS [4.633s]
 [INFO] HBase - Protocol .. SUCCESS [2.629s]
 [INFO] HBase - Client  SUCCESS [2.901s]
 [INFO] HBase - Prefix Tree ... SUCCESS [3.085s]
 [INFO] HBase - Hadoop Compatibility .. SUCCESS [2.647s]
 [INFO] HBase - Hadoop Two Compatibility .. SUCCESS [2.005s]
 [INFO] HBase - Server  SUCCESS [1.888s]
 [INFO] HBase - Integration Tests . SUCCESS [6.917s]
 [INFO] HBase - Examples .. SUCCESS [2.815s]
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 6:41.503s
 [INFO] Finished at: Thu Mar 07 22:14:08 CST 2013
 [INFO] Final Memory: 67M/448M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-assembly-plugin:2.3:assembly (default-cli) on 
 project hbase: Failed to create assembly: Artifact: 
 org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT (included by module) does not 
 have an artifact with a file. Please ensure the package phase is run before 
 the assembly is generated. - [Help 1]
 {noformat}



[jira] [Commented] (HBASE-8023) Assembly target fails

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596012#comment-13596012
 ] 

nkeywal commented on HBASE-8023:


Ah, I'm just seeing that you created HBASE-8022 for the site part. I confirm 
assembly w/o the site works here.
If the assembly w/o the site still fails for you, it's worth doing a mvn clean 
install -DskipTests first.

 Assembly target fails
 -

 Key: HBASE-8023
 URL: https://issues.apache.org/jira/browse/HBASE-8023
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0
Reporter: Andrew Purtell

 The assembly target fails when using the 2.0 Hadoop profile (at least).
 {noformat}
 mvn -DskipTests -Dhadoop.profile=2.0 clean install site assembly:assembly
 [...]
 [INFO] --- maven-assembly-plugin:2.3:assembly (default-cli) @ hbase ---
 [INFO] Reading assembly descriptor: src/assembly/hadoop-two-compat.xml
 [WARNING] [DEPRECATION] moduleSet/binaries section detected in root-project 
 assembly.
 MODULE BINARIES MAY NOT BE AVAILABLE FOR THIS ASSEMBLY!
  To refactor, move this assembly into a child project and use the flag 
 <useAllReactorProjects>true</useAllReactorProjects> in each moduleSet.
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-protocol:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-client:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-prefix-tree:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-hadoop-compat:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-hadoop2-compat:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-server:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-it:jar:0.97-SNAPSHOT
 [INFO] Processing sources for module project: 
 org.apache.hbase:hbase-examples:jar:0.97-SNAPSHOT
 [INFO] 
 
 [INFO] Reactor Summary:
 [INFO] 
 [INFO] HBase . FAILURE [15.877s]
 [INFO] HBase - Common  SUCCESS [4.633s]
 [INFO] HBase - Protocol .. SUCCESS [2.629s]
 [INFO] HBase - Client  SUCCESS [2.901s]
 [INFO] HBase - Prefix Tree ... SUCCESS [3.085s]
 [INFO] HBase - Hadoop Compatibility .. SUCCESS [2.647s]
 [INFO] HBase - Hadoop Two Compatibility .. SUCCESS [2.005s]
 [INFO] HBase - Server  SUCCESS [1.888s]
 [INFO] HBase - Integration Tests . SUCCESS [6.917s]
 [INFO] HBase - Examples .. SUCCESS [2.815s]
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 6:41.503s
 [INFO] Finished at: Thu Mar 07 22:14:08 CST 2013
 [INFO] Final Memory: 67M/448M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-assembly-plugin:2.3:assembly (default-cli) on 
 project hbase: Failed to create assembly: Artifact: 
 org.apache.hbase:hbase-common:jar:0.97-SNAPSHOT (included by module) does not 
 have an artifact with a file. Please ensure the package phase is run before 
 the assembly is generated. - [Help 1]
 {noformat}



[jira] [Commented] (HBASE-8022) Site target fails

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596174#comment-13596174
 ] 

nkeywal commented on HBASE-8022:


+1. We should add this to our precommit tests imho.

 Site target fails
 -

 Key: HBASE-8022
 URL: https://issues.apache.org/jira/browse/HBASE-8022
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.0, 0.96.0
Reporter: Andrew Purtell
 Attachments: HBASE-8022.patch


 {noformat}
 mvn -DskipTests -Dhadoop.profile=2.0 clean install site assembly:assembly
 [...]
 Recoverable error
 org.xml.sax.SAXParseException: Include operation failed, reverting to 
 fallback. Resource error reading file as XML 
 (href='../../target/site/hbase-default.xml'). Reason: 
 /usr/src/Hadoop/hbase/target/site/hbase-default.xml (No such file or 
 directory)
 Error on line 672 column 52 of 
 file:///usr/src/Hadoop/hbase/src/docbkx/configuration.xml:
   Error reported by XML parser: An 'include' failed, and no 'fallback' 
 element was found.
 [INFO]
  
 [INFO] 
 
 [INFO] Skipping HBase
 [INFO] This project has been banned from the build due to previous failures.
 [INFO] 
 
 [INFO] 
 
 [INFO] Reactor Summary:
 [INFO] 
 [INFO] HBase . FAILURE [5:34.980s]
 [INFO] HBase - Common  SKIPPED
 [INFO] HBase - Protocol .. SKIPPED
 [INFO] HBase - Client  SKIPPED
 [INFO] HBase - Prefix Tree ... SKIPPED
 [INFO] HBase - Hadoop Compatibility .. SKIPPED
 [INFO] HBase - Hadoop Two Compatibility .. SKIPPED
 [INFO] HBase - Server  SKIPPED
 [INFO] HBase - Integration Tests . SKIPPED
 [INFO] HBase - Examples .. SKIPPED
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 5:36.029s
 [INFO] Finished at: Thu Mar 07 21:59:14 CST 2013
 [INFO] Final Memory: 29M/297M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 com.agilejava.docbkx:docbkx-maven-plugin:2.0.14:generate-html (multipage) on 
 project hbase: Failed to transform configuration.xml. 
 org.xml.sax.SAXParseException: An 'include' failed, and no 'fallback' element 
 was found. - [Help 1]
 {noformat}



[jira] [Commented] (HBASE-7989) Client with a cache info on a dead server will wait for 20s before trying another one.

2013-03-07 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596263#comment-13596263
 ] 

nkeywal commented on HBASE-7989:


Yes. There is a 20s timeout for connect by default. And here there are two 
issues:
- we should be able to have a much lower timeout for connect as it doesn't 
depend on GC stuff and it's a clear error (we are sure that the action is not 
done on the server, contrary to a read or write timeout) 
- we should not even go to the server in some cases (we know it's dead).
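
Both points can be sketched with plain java.net sockets: a connect timeout that is much shorter than the read timeout, and a check that skips servers already known to be dead. This is an illustration under stated assumptions, not the HBase client code; FastFailConnector, markDead and the timeout values are hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class FastFailConnector {
  private final Set<InetSocketAddress> deadServers = ConcurrentHashMap.newKeySet();
  private final int connectTimeoutMs; // can be short: a failed connect means nothing was executed
  private final int readTimeoutMs;    // stays long: the server may be busy or in a GC pause

  FastFailConnector(int connectTimeoutMs, int readTimeoutMs) {
    this.connectTimeoutMs = connectTimeoutMs;
    this.readTimeoutMs = readTimeoutMs;
  }

  void markDead(InetSocketAddress server) {
    deadServers.add(server);
  }

  Socket connect(InetSocketAddress server) throws IOException {
    if (deadServers.contains(server)) {
      // Do not even try the network when the server is known to be dead.
      throw new IOException("Server known to be dead: " + server);
    }
    Socket socket = new Socket();
    try {
      socket.connect(server, connectTimeoutMs);
      socket.setSoTimeout(readTimeoutMs);
      return socket;
    } catch (IOException e) {
      socket.close();
      throw e;
    }
  }

  public static void main(String[] args) {
    FastFailConnector connector = new FastFailConnector(1_000, 20_000);
    InetSocketAddress rs = InetSocketAddress.createUnresolved("rs1.example.org", 60020);
    connector.markDead(rs);
    try {
      connector.connect(rs).close();
    } catch (IOException e) {
      System.out.println(e.getMessage()); // fails immediately, without any network wait
    }
  }
}
```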

 Client with a cache info on a dead server will wait for 20s before trying 
 another one.
 --

 Key: HBASE-7989
 URL: https://issues.apache.org/jira/browse/HBASE-7989
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.95.0, 0.98.0
Reporter: nkeywal

 Scenario is:
 - fetch the cache in the client
 - a server dies
 - try to use a region that is on the dead server
 This will lead to a 20 second connect timeout. We don't see this in unit 
 tests because it only happens if the remote box does not answer. In the unit 
 tests we immediately get a connection refused from the OS.



[jira] [Commented] (HBASE-6772) Make the Distributed Split HDFS Location aware

2013-03-06 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13595023#comment-13595023
 ] 

nkeywal commented on HBASE-6772:


The new design is better than my original proposition. I'm +1. Devaraj's 
comment is important as well imho, so we should put this info in ZK too.
Just one point: the master should provide the full list of regionservers owning 
a copy. This way:
 - if one of the regionservers is actually dead, the task can be picked up by 
another one
 - it's possible to optimize the choice in the regionserver: if the RS sees 
it's the only one for a block, it can pick it instead of another block that has 
more potential regionservers.
 - + the rack already mentioned by Devaraj.
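
The second point can be sketched as "take the task for which I have the fewest live competitors". The names below (ScarcestTaskChooser, the task and server identifiers) are hypothetical and the rack criterion is left out; it only assumes the master publishes, for each task, the full list of regionservers owning a copy.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class ScarcestTaskChooser {
  /**
   * Among the tasks this regionserver holds a local copy of, returns the one
   * with the fewest live owners, so that blocks few others can read locally go first.
   */
  static Optional<String> choose(String me, Map<String, List<String>> taskOwners,
                                 Set<String> deadServers) {
    String best = null;
    int bestLiveOwners = Integer.MAX_VALUE;
    for (Map.Entry<String, List<String>> task : taskOwners.entrySet()) {
      if (!task.getValue().contains(me)) {
        continue; // not local for this regionserver
      }
      int liveOwners = 0;
      for (String owner : task.getValue()) {
        if (!deadServers.contains(owner)) {
          liveOwners++;
        }
      }
      if (liveOwners < bestLiveOwners) {
        best = task.getKey();
        bestLiveOwners = liveOwners;
      }
    }
    return Optional.ofNullable(best);
  }

  public static void main(String[] args) {
    Map<String, List<String>> owners = new LinkedHashMap<>();
    owners.put("hlog-1", Arrays.asList("rs1", "rs2", "rs3"));
    owners.put("hlog-2", Arrays.asList("rs1"));
    // rs1 is the only owner of hlog-2, so it takes that one first.
    System.out.println(choose("rs1", owners, Collections.<String>emptySet()));
  }
}
```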


 Make the Distributed Split HDFS Location aware
 --

 Key: HBASE-6772
 URL: https://issues.apache.org/jira/browse/HBASE-6772
 Project: HBase
  Issue Type: Improvement
  Components: master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: Jeffrey Zhong

 During a hlog split, each log file (a single hdfs block) is allocated to a 
 different region server. This region server reads the file and creates the 
 recovery edit files.
 The allocation to the region server is random. We could take into account the 
 locations of the log file to split:
 - the reads would be local, hence faster. This allows short circuit as well.
 - less network i/o used during a failure (and this is important)
 - we would be sure to read from a working datanode, hence we're sure we won't 
 have read errors. Read errors slow the split process a lot, as we often end up 
 waiting for timeouts. 
 We need to limit the calls to the namenode however.
 Typical algo could be:
 - the master gets the locations of the hlog files
 - it writes them into ZK, if possible in one transaction (this way all the 
 tasks are visible together, allowing some arbitration by the region server).
 - when the regionserver receives the event, it checks for all logs and all 
 locations.
 - if there is a match, it takes it
 - if not, it waits something like 0.2s (to give other regionservers time 
 to take it if the location matches), and takes any remaining task.
 Drawbacks are:
 - a 0.2s delay added if there is no regionserver available on one of the 
 locations. It's likely possible to remove it with some extra synchronization.
 - Small increase in complexity and dependency on HDFS
 Considering the advantages, it's worth it imho.
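
 The regionserver side of the algorithm above can be sketched as: take a task with a matching location if there is one, otherwise wait for the grace period and take whatever is left. This is a simplified, single-process illustration; LocalFirstSplitPicker and the map of task to locations are hypothetical stand-ins for what would be read from ZK, and claiming a task atomically is not shown.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LocalFirstSplitPicker {
  /** Returns the first task whose locations include this regionserver, or null. */
  static String pickLocal(String me, Map<String, List<String>> taskLocations) {
    for (Map.Entry<String, List<String>> task : taskLocations.entrySet()) {
      if (task.getValue().contains(me)) {
        return task.getKey();
      }
    }
    return null;
  }

  /**
   * Prefers a local task. Without a match, waits graceMs (0.2s in the proposal)
   * so that regionservers with a local copy can go first, then takes any remaining task.
   */
  static String pick(String me, Map<String, List<String>> taskLocations, long graceMs) {
    String local = pickLocal(me, taskLocations);
    if (local != null) {
      return local;
    }
    try {
      Thread.sleep(graceMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    }
    // The map is assumed to be a live view: tasks claimed meanwhile are gone from it.
    Iterator<String> remaining = taskLocations.keySet().iterator();
    return remaining.hasNext() ? remaining.next() : null;
  }

  public static void main(String[] args) {
    Map<String, List<String>> tasks = new LinkedHashMap<>();
    tasks.put("hlog-1", Arrays.asList("rs1", "rs2"));
    tasks.put("hlog-2", Arrays.asList("rs3"));
    System.out.println(pick("rs3", tasks, 200)); // local match, no wait
    System.out.println(pick("rs9", tasks, 200)); // no match: waits, then takes a remaining task
  }
}
```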



[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers & clients

2013-03-06 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13595034#comment-13595034
 ] 

nkeywal commented on HBASE-7590:


It's on RB, waiting for reviews before being committed :-).

 Add a costless notifications mechanism from master to regionservers & clients
 -

 Key: HBASE-7590
 URL: https://issues.apache.org/jira/browse/HBASE-7590
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7590.inprogress.patch, 7590.v1.patch, 
 7590.v1-rebased.patch, 7590.v2.patch, 7590.v3.patch


 It would be very useful to add a mechanism to distribute some information to 
 the clients and regionservers. In particular, it would be useful to know 
 globally (regionservers + client apps) that some regionservers are dead. This 
 would allow:
 - to lower the load on the system, without clients using stale information 
 and going to dead machines
 - to make the recovery faster from a client point of view. It's common to use 
 large timeouts on the client side, so the client may need a lot of time 
 before declaring a region server dead and trying another one. If the client 
 receives the information about a region server's state separately, it can 
 take the right decision, and continue or stop waiting accordingly.
 We can also send more information, for example instructions like 'slow down' 
 to instruct the client to increase the retries delay and so on.
  Technically, the master could send this information. To lower the load on 
 the system, we should:
 - have a multicast communication (i.e. the master does not have to connect to 
 all servers by tcp), with one packet every 10 seconds or so.
 - receivers should not depend on this: if the information is available great. 
 If not, it should not break anything.
 - it should be optional.
 So at the end we would have a thread in the master sending a protobuf message 
 about the dead servers on a multicast socket. If the socket is not 
 configured, it does not do anything. On the client side, when we receive the 
 information that a node is dead, we refresh the cache about it.
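
 The client-side reaction described in the last sentence can be sketched as a cache that drops every region location pointing at a server once that server is reported dead. Only this part is shown; the multicast transport and the protobuf message are left out, and RegionLocationCache and onServerDead are hypothetical names.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class RegionLocationCache {
  private final Map<String, String> regionToServer = new ConcurrentHashMap<>();

  void put(String region, String server) {
    regionToServer.put(region, server);
  }

  /** Returns the cached server for the region, or null if it must be looked up again. */
  String get(String region) {
    return regionToServer.get(region);
  }

  /** Called when a "server is dead" notification arrives; returns the number of evicted entries. */
  int onServerDead(String server) {
    int evicted = 0;
    Iterator<Map.Entry<String, String>> it = regionToServer.entrySet().iterator();
    while (it.hasNext()) {
      if (it.next().getValue().equals(server)) {
        it.remove(); // the next access to this region triggers a fresh lookup
        evicted++;
      }
    }
    return evicted;
  }

  public static void main(String[] args) {
    RegionLocationCache cache = new RegionLocationCache();
    cache.put("region-a", "rs1");
    cache.put("region-b", "rs2");
    System.out.println("evicted: " + cache.onServerDead("rs1"));
    System.out.println("region-a -> " + cache.get("region-a"));
  }
}
```

 Because the notification is only a hint, a missed packet leaves the cache as it is today: the client simply falls back to its normal timeout and retry path.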



[jira] [Commented] (HBASE-7327) Assignment Timeouts: Remove the code from the master

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13593249#comment-13593249
 ] 

nkeywal commented on HBASE-7327:


Sorry [~jxiang] I didn't see your comment. I should not have said client, it's 
the regionserver code. I need to disable it because it's expensive as it 
updates ZooKeeper (that's HBASE-7247).

 Assignment Timeouts: Remove the code from the master
 

 Key: HBASE-7327
 URL: https://issues.apache.org/jira/browse/HBASE-7327
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 7327.v1.uncomplete.patch, 7327.v2.patch


 As per HBASE-7247...



[jira] [Created] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-05 Thread nkeywal (JIRA)
nkeywal created HBASE-8002:
--

 Summary: Make TimeOut Management for Assignment optional in master 
and regionservers
 Key: HBASE-8002
 URL: https://issues.apache.org/jira/browse/HBASE-8002
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Region Assignment
Affects Versions: 0.95.0, 0.98.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.98.0


As per HBASE-7327



[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8002:
---

Description: See HBASE-7327  (was: As per HBASE-7327)

 Make TimeOut Management for Assignment optional in master and regionservers
 ---

 Key: HBASE-8002
 URL: https://issues.apache.org/jira/browse/HBASE-8002
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Region Assignment
Affects Versions: 0.95.0, 0.98.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.98.0


 See HBASE-7327



[jira] [Created] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)
nkeywal created HBASE-8003:
--

 Summary: Threads#getBoundedCachedThreadPool harcodes the time unit 
to seconds
 Key: HBASE-8003
 URL: https://issues.apache.org/jira/browse/HBASE-8003
 Project: HBase
  Issue Type: Bug
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial


  /**
   * Create a new CachedThreadPool with a bounded number as the maximum 
   * thread size in the pool.
   * 
   * @param maxCachedThread the maximum thread could be created in the pool
   * @param timeout the maximum time to wait
   * @param unit the time unit of the timeout argument
   * @param threadFactory the factory to use when creating new threads
   * @return threadPoolExecutor the cachedThreadPool with a bounded number 
   * as the maximum thread size in the pool. 
   */
  public static ThreadPoolExecutor getBoundedCachedThreadPool(
  int maxCachedThread, long timeout, TimeUnit unit,
  ThreadFactory threadFactory) {
ThreadPoolExecutor boundedCachedThreadPool =
  new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), threadFactory);
// allow the core pool threads timeout and terminate
boundedCachedThreadPool.allowCoreThreadTimeOut(true);
return boundedCachedThreadPool;
  }
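
The method above accepts a unit argument but always passes TimeUnit.SECONDS to the ThreadPoolExecutor constructor. One way to honour the argument is to forward it, as in the sketch below. The attached patch is not reproduced here; the wrapper class BoundedPools is hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPools {
  static ThreadPoolExecutor getBoundedCachedThreadPool(
      int maxCachedThread, long timeout, TimeUnit unit,
      ThreadFactory threadFactory) {
    ThreadPoolExecutor boundedCachedThreadPool =
        new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
            unit, // forward the caller's unit instead of a hardcoded TimeUnit.SECONDS
            new LinkedBlockingQueue<Runnable>(), threadFactory);
    // allow the core pool threads to time out and terminate
    boundedCachedThreadPool.allowCoreThreadTimeOut(true);
    return boundedCachedThreadPool;
  }

  public static void main(String[] args) {
    ThreadPoolExecutor pool = getBoundedCachedThreadPool(
        2, 500, TimeUnit.MILLISECONDS, Executors.defaultThreadFactory());
    // With the unit forwarded, the keep-alive is 500 ms rather than 500 s.
    System.out.println(pool.getKeepAliveTime(TimeUnit.MILLISECONDS) + " ms");
    pool.shutdown();
  }
}
```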



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

Attachment: 8003.v1.patch

 Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
 

 Key: HBASE-8003
 URL: https://issues.apache.org/jira/browse/HBASE-8003
 Project: HBase
  Issue Type: Bug
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 8003.v1.patch


   /**
* Create a new CachedThreadPool with a bounded number as the maximum 
* thread size in the pool.
* 
* @param maxCachedThread the maximum thread could be created in the pool
* @param timeout the maximum time to wait
* @param unit the time unit of the timeout argument
* @param threadFactory the factory to use when creating new threads
* @return threadPoolExecutor the cachedThreadPool with a bounded number 
* as the maximum thread size in the pool. 
*/
   public static ThreadPoolExecutor getBoundedCachedThreadPool(
   int maxCachedThread, long timeout, TimeUnit unit,
   ThreadFactory threadFactory) {
 ThreadPoolExecutor boundedCachedThreadPool =
   new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
 TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), threadFactory);
 // allow the core pool threads timeout and terminate
 boundedCachedThreadPool.allowCoreThreadTimeOut(true);
 return boundedCachedThreadPool;
   }



[jira] [Commented] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13593251#comment-13593251
 ] 

nkeywal commented on HBASE-8003:


trivial patch, as it's always used with seconds today.

 Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
 

 Key: HBASE-8003
 URL: https://issues.apache.org/jira/browse/HBASE-8003
 Project: HBase
  Issue Type: Bug
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 8003.v1.patch


   /**
* Create a new CachedThreadPool with a bounded number as the maximum 
* thread size in the pool.
* 
* @param maxCachedThread the maximum thread could be created in the pool
* @param timeout the maximum time to wait
* @param unit the time unit of the timeout argument
* @param threadFactory the factory to use when creating new threads
* @return threadPoolExecutor the cachedThreadPool with a bounded number 
* as the maximum thread size in the pool. 
*/
   public static ThreadPoolExecutor getBoundedCachedThreadPool(
   int maxCachedThread, long timeout, TimeUnit unit,
   ThreadFactory threadFactory) {
 ThreadPoolExecutor boundedCachedThreadPool =
   new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
 TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), threadFactory);
 // allow the core pool threads timeout and terminate
 boundedCachedThreadPool.allowCoreThreadTimeOut(true);
 return boundedCachedThreadPool;
   }
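
For illustration only, here is a minimal sketch of what honoring the unit
argument could look like. It is not the content of 8003.v1.patch, which is not
reproduced in this thread; the class name BoundedPoolSketch is hypothetical,
and the only change from the method quoted above is that the caller's unit is
passed to the ThreadPoolExecutor constructor instead of TimeUnit.SECONDS.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical holder class; in HBase the method lives in Threads.
final class BoundedPoolSketch {

  // Same shape as the quoted method, but the keep-alive time is interpreted
  // in the caller-supplied unit rather than a hardcoded TimeUnit.SECONDS.
  static ThreadPoolExecutor getBoundedCachedThreadPool(
      int maxCachedThread, long timeout, TimeUnit unit,
      ThreadFactory threadFactory) {
    ThreadPoolExecutor boundedCachedThreadPool =
        new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
            unit, new LinkedBlockingQueue<Runnable>(), threadFactory);
    // Let idle core threads time out and terminate. This call requires a
    // keep-alive time greater than zero.
    boundedCachedThreadPool.allowCoreThreadTimeOut(true);
    return boundedCachedThreadPool;
  }
}
```

With this change, a caller passing 500 and TimeUnit.MILLISECONDS would get a
500 ms keep-alive, where the hardcoded version would treat the value as 500
seconds.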



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

Fix Version/s: 0.98.0
Affects Version/s: 0.98.0
   Status: Patch Available  (was: Open)

 Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
 

 Key: HBASE-8003
 URL: https://issues.apache.org/jira/browse/HBASE-8003
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.98.0

 Attachments: 8003.v1.patch


   /**
* Create a new CachedThreadPool with a bounded number as the maximum 
* thread size in the pool.
* 
* @param maxCachedThread the maximum thread could be created in the pool
* @param timeout the maximum time to wait
* @param unit the time unit of the timeout argument
* @param threadFactory the factory to use when creating new threads
* @return threadPoolExecutor the cachedThreadPool with a bounded number 
* as the maximum thread size in the pool. 
*/
   public static ThreadPoolExecutor getBoundedCachedThreadPool(
   int maxCachedThread, long timeout, TimeUnit unit,
   ThreadFactory threadFactory) {
 ThreadPoolExecutor boundedCachedThreadPool =
   new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
 TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), threadFactory);
 // allow the core pool threads timeout and terminate
 boundedCachedThreadPool.allowCoreThreadTimeOut(true);
 return boundedCachedThreadPool;
   }



[jira] [Commented] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593312#comment-13593312
 ] 

nkeywal commented on HBASE-8003:


Back to flakiness it seems. All this should be totally unrelated. Let's give it 
another go.

 Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
 

 Key: HBASE-8003
 URL: https://issues.apache.org/jira/browse/HBASE-8003
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.98.0

 Attachments: 8003.v1.patch, 8003.v1.patch


   /**
* Create a new CachedThreadPool with a bounded number as the maximum 
* thread size in the pool.
* 
* @param maxCachedThread the maximum thread could be created in the pool
* @param timeout the maximum time to wait
* @param unit the time unit of the timeout argument
* @param threadFactory the factory to use when creating new threads
* @return threadPoolExecutor the cachedThreadPool with a bounded number 
* as the maximum thread size in the pool. 
*/
   public static ThreadPoolExecutor getBoundedCachedThreadPool(
   int maxCachedThread, long timeout, TimeUnit unit,
   ThreadFactory threadFactory) {
 ThreadPoolExecutor boundedCachedThreadPool =
   new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
 TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), threadFactory);
 // allow the core pool threads timeout and terminate
 boundedCachedThreadPool.allowCoreThreadTimeOut(true);
 return boundedCachedThreadPool;
   }



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

Status: Open  (was: Patch Available)

 Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
 

 Key: HBASE-8003
 URL: https://issues.apache.org/jira/browse/HBASE-8003
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.98.0

 Attachments: 8003.v1.patch, 8003.v1.patch


   /**
* Create a new CachedThreadPool with a bounded number as the maximum 
* thread size in the pool.
* 
* @param maxCachedThread the maximum thread could be created in the pool
* @param timeout the maximum time to wait
* @param unit the time unit of the timeout argument
* @param threadFactory the factory to use when creating new threads
* @return threadPoolExecutor the cachedThreadPool with a bounded number 
* as the maximum thread size in the pool. 
*/
   public static ThreadPoolExecutor getBoundedCachedThreadPool(
   int maxCachedThread, long timeout, TimeUnit unit,
   ThreadFactory threadFactory) {
 ThreadPoolExecutor boundedCachedThreadPool =
   new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
 TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), threadFactory);
 // allow the core pool threads timeout and terminate
 boundedCachedThreadPool.allowCoreThreadTimeOut(true);
 return boundedCachedThreadPool;
   }



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

Attachment: 8003.v1.patch

 Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
 

 Key: HBASE-8003
 URL: https://issues.apache.org/jira/browse/HBASE-8003
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.98.0

 Attachments: 8003.v1.patch, 8003.v1.patch


   /**
* Create a new CachedThreadPool with a bounded number as the maximum 
* thread size in the pool.
* 
* @param maxCachedThread the maximum thread could be created in the pool
* @param timeout the maximum time to wait
* @param unit the time unit of the timeout argument
* @param threadFactory the factory to use when creating new threads
* @return threadPoolExecutor the cachedThreadPool with a bounded number 
* as the maximum thread size in the pool. 
*/
   public static ThreadPoolExecutor getBoundedCachedThreadPool(
   int maxCachedThread, long timeout, TimeUnit unit,
   ThreadFactory threadFactory) {
 ThreadPoolExecutor boundedCachedThreadPool =
   new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
 TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), threadFactory);
 // allow the core pool threads timeout and terminate
 boundedCachedThreadPool.allowCoreThreadTimeOut(true);
 return boundedCachedThreadPool;
   }



[jira] [Updated] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8003:
---

Status: Patch Available  (was: Open)

 Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
 

 Key: HBASE-8003
 URL: https://issues.apache.org/jira/browse/HBASE-8003
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.98.0

 Attachments: 8003.v1.patch, 8003.v1.patch


   /**
* Create a new CachedThreadPool with a bounded number as the maximum 
* thread size in the pool.
* 
* @param maxCachedThread the maximum thread could be created in the pool
* @param timeout the maximum time to wait
* @param unit the time unit of the timeout argument
* @param threadFactory the factory to use when creating new threads
* @return threadPoolExecutor the cachedThreadPool with a bounded number 
* as the maximum thread size in the pool. 
*/
   public static ThreadPoolExecutor getBoundedCachedThreadPool(
   int maxCachedThread, long timeout, TimeUnit unit,
   ThreadFactory threadFactory) {
 ThreadPoolExecutor boundedCachedThreadPool =
   new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
 TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), threadFactory);
 // allow the core pool threads timeout and terminate
 boundedCachedThreadPool.allowCoreThreadTimeOut(true);
 return boundedCachedThreadPool;
   }



[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8002:
---

Status: Patch Available  (was: Open)

 Make TimeOut Management for Assignment optional in master and regionservers
 ---

 Key: HBASE-8002
 URL: https://issues.apache.org/jira/browse/HBASE-8002
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Region Assignment
Affects Versions: 0.95.0, 0.98.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.98.0

 Attachments: 8002.v3.patch


 See HBASE-7327



[jira] [Updated] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-05 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-8002:
---

Attachment: 8002.v3.patch

 Make TimeOut Management for Assignment optional in master and regionservers
 ---

 Key: HBASE-8002
 URL: https://issues.apache.org/jira/browse/HBASE-8002
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Region Assignment
Affects Versions: 0.95.0, 0.98.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.98.0

 Attachments: 8002.v3.patch


 See HBASE-7327



[jira] [Commented] (HBASE-8002) Make TimeOut Management for Assignment optional in master and regionservers

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593429#comment-13593429
 ] 

nkeywal commented on HBASE-8002:


Ok locally. But the return of the flakiness makes this a little bit 
complicated.

 Make TimeOut Management for Assignment optional in master and regionservers
 ---

 Key: HBASE-8002
 URL: https://issues.apache.org/jira/browse/HBASE-8002
 Project: HBase
  Issue Type: Bug
  Components: Client, master, Region Assignment
Affects Versions: 0.95.0, 0.98.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.95.0, 0.98.0

 Attachments: 8002.v3.patch


 See HBASE-7327



[jira] [Commented] (HBASE-8003) Threads#getBoundedCachedThreadPool harcodes the time unit to seconds

2013-03-05 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593634#comment-13593634
 ] 

nkeywal commented on HBASE-8003:


Tried locally; all tests passed on the first try except 
TestReplicationQueueFailover, which succeeded the second time.

Committed, thanks for the review.

 Threads#getBoundedCachedThreadPool harcodes the time unit to seconds
 

 Key: HBASE-8003
 URL: https://issues.apache.org/jira/browse/HBASE-8003
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Fix For: 0.98.0

 Attachments: 8003.v1.patch, 8003.v1.patch


   /**
* Create a new CachedThreadPool with a bounded number as the maximum 
* thread size in the pool.
* 
* @param maxCachedThread the maximum thread could be created in the pool
* @param timeout the maximum time to wait
* @param unit the time unit of the timeout argument
* @param threadFactory the factory to use when creating new threads
* @return threadPoolExecutor the cachedThreadPool with a bounded number 
* as the maximum thread size in the pool. 
*/
   public static ThreadPoolExecutor getBoundedCachedThreadPool(
   int maxCachedThread, long timeout, TimeUnit unit,
   ThreadFactory threadFactory) {
 ThreadPoolExecutor boundedCachedThreadPool =
   new ThreadPoolExecutor(maxCachedThread, maxCachedThread, timeout,
 TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), threadFactory);
 // allow the core pool threads timeout and terminate
 boundedCachedThreadPool.allowCoreThreadTimeOut(true);
 return boundedCachedThreadPool;
   }


