[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-04-09 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626503#comment-13626503 ]

Hudson commented on HBASE-8031:
---

Integrated in HBase-TRUNK #4045 (See 
[https://builds.apache.org/job/HBase-TRUNK/4045/])
HBASE-8031 Document for hbase.rpc.timeout (Revision 1465887)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/hbase-common/src/main/resources/hbase-default.xml


 Adopt goraci as an Integration test
 ---

 Key: HBASE-8031
 URL: https://issues.apache.org/jira/browse/HBASE-8031
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.98.0, 0.94.6, 0.95.0

 Attachments: hbase-8031_v1-0.94.patch, hbase-8031_v1.patch


 As you might know, I am a big fan of the goraci test that Keith Turner 
 developed, which in turn is inspired by the Accumulo test called Continuous 
 Ingest. 
 As much as I hate to say it, having to rely on Gora and an external GitHub 
 repository makes using this test cumbersome. And lately we have had to use it 
 for testing against secure clusters and with Hadoop 2, which Gora does not 
 support for now. 
 So, I am proposing we add this test as an IT in the HBase code base so that 
 all HBase devs can benefit from it.
 The original source code can be found here:
  * https://github.com/keith-turner/goraci
  * https://github.com/enis/goraci/
 From the javadoc:
 {code}
 * Apache Accumulo [0] has a simple test suite that verifies that data is not
 * lost at scale. This test suite is called continuous ingest. This test runs
 * many ingest clients that continually create linked lists containing 25
 * million nodes. At some point the clients are stopped and a map reduce job is
 * run to ensure no linked list has a hole. A hole indicates data was lost.
 *
 * The nodes in the linked list are random. This causes each linked list to
 * spread across the table. Therefore if one part of a table loses data, then it
 * will be detected by references in another part of the table.
 *
 * Below is a rough sketch of how data is written. For specific details look at
 * the Generator code.
 *
 * 1 Write out 1 million nodes
 * 2 Flush the client
 * 3 Write out 1 million that reference previous million
 * 4 If this is the 25th set of 1 million nodes, then update 1st set of
 *   million to point to last
 * 5 goto 1
 *
 * The key is that nodes only reference flushed nodes. Therefore a node should
 * never reference a missing node, even if the ingest client is killed at any
 * point in time.
 *
 * Some ASCII art time:
 * [ . . . ] represents one batch of random longs of length WIDTH
 *
 *                _________________________
 *               |                  ______ |
 *               |                 |      ||
 *             __+_________________+_____ ||
 *             v v                 v     |||
 * first   = [ . . . . . . . . . . . ]   |||
 *             ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^     |||
 *             | | | | | | | | | | |     |||
 * prev    = [ . . . . . . . . . . . ]   |||
 *             ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^     |||
 *             | | | | | | | | | | |     |||
 * current = [ . . . . . . . . . . . ]   |||
 *                                       |||
 * ...                                   |||
 *                                       |||
 * last    = [ . . . . . . . . . . . ]   |||
 *             | | | | | | | | | | |-----|||
 *             |                 |--------||
 *             |___________________________|
 {code}
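The write/verify cycle described in the javadoc can be illustrated with a toy in-memory model. All names below (WIDTH, WRAP, the map standing in for the table) are illustrative only; the real test writes batches of 1 million nodes to an HBase table and verifies them with a MapReduce job.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy model of the continuous-ingest pattern: each batch of random keys
// references the previously flushed batch, and the first batch is finally
// updated to point at the last, closing the linked lists into loops.
public class LinkedListIngestSketch {
    static final int WIDTH = 5; // nodes per batch (real test: 1 million)
    static final int WRAP = 4;  // batches before wrapping (real test: 25)

    public static void main(String[] args) {
        Random rnd = new Random(42);
        // key -> referenced key; -1 means "no reference yet"
        Map<Long, Long> table = new HashMap<>();

        List<Long> first = null, prev = null;
        for (int batch = 0; batch < WRAP; batch++) {
            List<Long> current = new ArrayList<>();
            for (int i = 0; i < WIDTH; i++) {
                long key = rnd.nextLong();
                // only reference nodes from an already-flushed batch
                long ref = (prev == null) ? -1L : prev.get(i);
                table.put(key, ref); // "flush": now visible to readers
                current.add(key);
            }
            if (first == null) first = current;
            prev = current;
        }
        // wrap around: point the first batch at the last one
        for (int i = 0; i < WIDTH; i++) {
            table.put(first.get(i), prev.get(i));
        }

        // Verification step: a reference to a missing key is a "hole",
        // i.e. evidence of data loss.
        long holes = table.values().stream()
            .filter(ref -> ref != -1L && !table.containsKey(ref))
            .count();
        System.out.println("holes=" + holes);
    }
}
```

Because writes only ever reference already-flushed nodes, killing the ingest client at any point leaves no holes, which is exactly the invariant the verify job checks.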

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-04-09 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626740#comment-13626740 ]

Hudson commented on HBASE-8031:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #489 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/489/])
HBASE-8031 Document for hbase.rpc.timeout (Revision 1465887)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-common/src/main/resources/hbase-default.xml




[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-04-04 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13623201#comment-13623201 ]

Hudson commented on HBASE-8031:
---

Integrated in HBase-0.94-security-on-Hadoop-23 #13 (See 
[https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/13/])
HBASE-8031 Adopt goraci as an Integration test (Revision 1456284)

 Result = FAILURE
enis : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java




[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-14 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603097#comment-13603097 ]

Hudson commented on HBASE-8031:
---

Integrated in HBase-0.94-security #124 (See 
[https://builds.apache.org/job/HBase-0.94-security/124/])
HBASE-8031 Adopt goraci as an Integration test (Revision 1456284)

 Result = FAILURE
enis : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java




[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-13 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601634#comment-13601634 ]

Hudson commented on HBASE-8031:
---

Integrated in HBase-TRUNK #3955 (See 
[https://builds.apache.org/job/HBase-TRUNK/3955/])
HBASE-8031 Adopt goraci as an Integration test (Revision 1456078)

 Result = FAILURE
enis : 
Files : 
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java




[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-13 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601704#comment-13601704 ]

Hudson commented on HBASE-8031:
---

Integrated in hbase-0.95 #71 (See 
[https://builds.apache.org/job/hbase-0.95/71/])
HBASE-8031 Adopt goraci as an Integration test (Revision 1456079)

 Result = SUCCESS
enis : 
Files : 
* 
/hbase/branches/0.95/hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java




[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-13 Thread Enis Soztutar (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601738#comment-13601738 ]

Enis Soztutar commented on HBASE-8031:
--

Will also commit this to 0.94, since it is only a new test. Patch coming 
shortly. 



[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-13 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601828#comment-13601828 ]

Hadoop QA commented on HBASE-8031:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12573603/hbase-8031_v1-0.94.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 4 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4803//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4803//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4803//console

This message is automatically generated.


[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-13 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601857#comment-13601857 ]

Hudson commented on HBASE-8031:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #446 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/446/])
HBASE-8031 Adopt goraci as an Integration test (Revision 1456078)

 Result = FAILURE
enis : 
Files : 
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 Adopt goraci as an Integration test
 ---

 Key: HBASE-8031
 URL: https://issues.apache.org/jira/browse/HBASE-8031
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: hbase-8031_v1-0.94.patch, hbase-8031_v1.patch


 As you might know, I am a big fan of the goraci test that Keith Turner has 
 developed, which in turn is inspired by the Accumulo test called Continuous 
 Ingest. 
 As much as I hate to say it, having to rely on gora and and external github 
 library makes using this lib cumbersome. And lately we had to use this for 
 testing against secure clusters and with Hadoop2, which gora does not support 
 for now. 
 So, I am proposing we add this test as an IT in the HBase code base so that 
 all HBase devs can benefit from it.
 The original source code can be found here:
  * https://github.com/keith-turner/goraci
  * https://github.com/enis/goraci/
 From the javadoc:
 {code}
 Apache Accumulo [0] has a simple test suite that verifies that data is not
  * lost at scale. This test suite is called continuous ingest. This test runs
  * many ingest clients that continually create linked lists containing 25
  * million nodes. At some point the clients are stopped and a map reduce job 
 is
  * run to ensure no linked list has a hole. A hole indicates data was lost.··
  *
  * The nodes in the linked list are random. This causes each linked list to
  * spread across the table. Therefore if one part of a table loses data, then 
 it
  * will be detected by references in another part of the table.
  *
 Below is rough sketch of how data is written. For specific details look at
  * the Generator code.
  *
  * 1 Write out 1 million nodes· 2 Flush the client· 3 Write out 1 million that
  * reference previous million· 4 If this is the 25th set of 1 million nodes,
  * then update 1st set of million to point to last· 5 goto 1
  *
  * The key is that nodes only reference flushed nodes. Therefore a node should
  * never reference a missing node, even if the ingest client is killed at any
  * point in time.
  *
  * Some ASCII art time:
  * [ . . . ] represents one batch of random longs of length WIDTH
  *
  *              _______________________________________
  *             |                                       |
  * first   = [ . . . . . . . . . . . ]                 |
  *             ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^                   |
  *             | | | | | | | | | | |                   |
  * prev    = [ . . . . . . . . . . . ]                 |
  *             ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^                   |
  *             | | | | | | | | | | |                   |
  * current = [ . . . . . . . . . . . ]                 |
  *                                                     |
  * ...                                                 |
  *                                                     |
  * last    = [ . . . . . . . . . . . ]                 |
  *             ^                                       |
  *             |_______________________________________|
 {code}
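The scheme in the javadoc above can be sketched in plain Java, with an in-memory map standing in for the HBase table. This is a hedged illustration of the technique, not the actual IntegrationTestBigLinkedList code; the names `ContinuousIngestSketch`, `generate`, and `findHoles` are invented for this sketch, and the batch sizes are shrunk from millions to a handful:

```java
import java.util.*;

/** In-memory sketch of the continuous-ingest linked-list scheme. */
public class ContinuousIngestSketch {
    static final int WIDTH = 11;   // nodes per batch (1 million in the real test)
    static final int BATCHES = 5;  // batches per loop (25 in the real test)

    /** "Table": node key -> key of the node it references. */
    static Map<Long, Long> generate(Random rnd) {
        Map<Long, Long> table = new HashMap<>();
        long[] first = null, prev = null;
        for (int b = 0; b < BATCHES; b++) {
            long[] current = new long[WIDTH];
            for (int i = 0; i < WIDTH; i++) {
                current[i] = rnd.nextLong();  // random keys spread across the table
                // a node only references the previous, already-flushed batch
                // (0L marks "no reference yet" for the very first batch)
                table.put(current[i], prev == null ? 0L : prev[i]);
            }
            if (first == null) first = current;
            prev = current;  // "flush", then move on
        }
        // close the loop: update the first batch to point at the last one
        for (int i = 0; i < WIDTH; i++) table.put(first[i], prev[i]);
        return table;
    }

    /** Verify step: a referenced key absent from the table is a hole. */
    static List<Long> findHoles(Map<Long, Long> table) {
        List<Long> holes = new ArrayList<>();
        for (long ref : table.values())
            if (ref != 0L && !table.containsKey(ref)) holes.add(ref);
        return holes;
    }

    public static void main(String[] args) {
        Map<Long, Long> table = generate(new Random(42));
        System.out.println("holes before loss: " + findHoles(table).size());
        table.remove(table.keySet().iterator().next());  // simulate lost data
        System.out.println("holes after loss:  " + findHoles(table).size());
    }
}
```

Because every node is referenced by exactly one other node once the loop is closed, dropping any single row produces exactly one detectable hole.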

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13601910#comment-13601910
 ] 

Hudson commented on HBASE-8031:
---

Integrated in hbase-0.95-on-hadoop2 #25 (See 
[https://builds.apache.org/job/hbase-0.95-on-hadoop2/25/])
HBASE-8031 Adopt goraci as an Integration test (Revision 1456079)

 Result = FAILURE
enis : 
Files : 
* 
/hbase/branches/0.95/hbase-it/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 Adopt goraci as an Integration test
 ---

 Key: HBASE-8031
 URL: https://issues.apache.org/jira/browse/HBASE-8031
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: hbase-8031_v1-0.94.patch, hbase-8031_v1.patch





[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13601948#comment-13601948
 ] 

Hudson commented on HBASE-8031:
---

Integrated in HBase-0.94 #897 (See 
[https://builds.apache.org/job/HBase-0.94/897/])
HBASE-8031 Adopt goraci as an Integration test (Revision 1456284)

 Result = FAILURE
enis : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/test/IntegrationTestBigLinkedList.java


 Adopt goraci as an Integration test
 ---

 Key: HBASE-8031
 URL: https://issues.apache.org/jira/browse/HBASE-8031
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: hbase-8031_v1-0.94.patch, hbase-8031_v1.patch





[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600388#comment-13600388
 ] 

stack commented on HBASE-8031:
--

[~enis] That is excellent.

 Adopt goraci as an Integration test
 ---

 Key: HBASE-8031
 URL: https://issues.apache.org/jira/browse/HBASE-8031
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: hbase-8031_v1.patch





[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600446#comment-13600446
 ] 

Hadoop QA commented on HBASE-8031:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573277/hbase-8031_v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestFromClientSide

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4782//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4782//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4782//console

This message is automatically generated.

 Adopt goraci as an Integration test
 ---

 Key: HBASE-8031
 URL: https://issues.apache.org/jira/browse/HBASE-8031
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: hbase-8031_v1.patch



[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600462#comment-13600462
 ] 

stack commented on HBASE-8031:
--

+1 on commit.  Make a nice release note from the class comment.  I should add a 
note in the refguide too...

 Adopt goraci as an Integration test
 ---

 Key: HBASE-8031
 URL: https://issues.apache.org/jira/browse/HBASE-8031
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: hbase-8031_v1.patch





[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-12 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600506#comment-13600506
 ] 

Keith Turner commented on HBASE-8031:
-

Cool to see HBase adopt this test.  I am not positive, but it seems like this 
patch contains the change that we had a discussion [1] about on GitHub.  I am 
still conceptually opposed to this change.  I think in some situations a mapper 
rewriting the same data, because the task failed previously, could cover up 
the fact that data was lost in HBase/Accumulo.  Since I created the test to 
detect data loss, the change bothers me a bit.  Granted, the situation seems 
unlikely, but when running the test on a large cluster the unlikely sometimes 
becomes likely.

[1] : 
https://github.com/enis/goraci/commit/c320c50f5a5c562a13fa7a77b8da46c4e65e4f41#commitcomment-1489337



 Adopt goraci as an Integration test
 ---

 Key: HBASE-8031
 URL: https://issues.apache.org/jira/browse/HBASE-8031
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: hbase-8031_v1.patch





[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600581#comment-13600581
 ] 

Enis Soztutar commented on HBASE-8031:
--

Thanks Keith for chiming in. 
bq. I am not positive, but it seems like this patch contains the change that we 
had a discussion [1] about on github.
Indeed. Your concerns are very valid. But I find this implementation simpler.
bq. I think in some situations a mapper rewritting the same data, because the 
task failed previously, could cover up the fact that data was lost in 
Hbase/Accumulo. Since I created the test to detect data loss, the change 
bothers me a bit. Granted the situation seems unlikely...
In case of data loss in HBase, two things would have to happen for the lost 
data to be rewritten: 
 (1) Some of the map tasks would also have to fail. 
 (2) The failed map task would have to contain the rows that were lost. 

(1) and RS data loss should be independent, and we can rely on (1) not 
happening that often. Even if we lose a node hosting both an RS and a TT, 
since the region and data distribution is balanced, we should still detect 
data loss from the data on that RS that was written by other TTs.
(2) is also highly unlikely. Plus, for Loop, we are writing data 
incrementally, which means a re-run can only rewrite data within the same 
iteration, while the Verify step verifies all the data, not just the current 
iteration. Having a large loop count should further mitigate this problem. 

 Adopt goraci as an Integration test
 ---

 Key: HBASE-8031
 URL: https://issues.apache.org/jira/browse/HBASE-8031
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: hbase-8031_v1.patch




[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-12 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600639#comment-13600639
 ] 

Keith Turner commented on HBASE-8031:
-

I thought of two example bugs that rewriting could possibly mask.  I am sure 
there are more.

Client code reports data as flushed when it is not really flushed; it is 
actually still flushing data in background threads, out of order.  If the 
client dies and you verify, you may see lost data.  If you always rewrite data 
after the client dies, you may never see this flush bug.  Maybe close really 
does flush everything properly, so under normal runs you never lose data.  
Data loss does not always happen on the RS; it could happen in client code.  
We run continuous ingest for a long period and then just kill all of the 
clients.  This example makes me think we should kill ingest clients more 
frequently as part of testing (in addition to killing various server 
processes).

3 out of 4 racks are accidentally power-cycled by admins during the test.  A 
tiny bit of data is lost, a lot of data is rewritten, and that may cover up 
the data loss.
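This masking concern can be made concrete with a toy sketch (an in-memory map stands in for the store; the class `RewriteMasking` and method `writeBatch` are invented for illustration): a deterministic writer that re-executes after a failure re-inserts exactly the keys the store dropped, so the verify step never sees the hole.

```java
import java.util.*;

/** Toy illustration of how re-running a deterministic writer masks data loss. */
public class RewriteMasking {
    /** A deterministic "mapper" writing five keyed nodes into the table. */
    static void writeBatch(Map<Long, Long> table, long seed, long prevRef) {
        Random rnd = new Random(seed);  // same seed => same keys on every re-run
        for (int i = 0; i < 5; i++) table.put(rnd.nextLong(), prevRef);
    }

    public static void main(String[] args) {
        Map<Long, Long> table = new HashMap<>();
        writeBatch(table, 1L, 0L);
        long lost = table.keySet().iterator().next();
        table.remove(lost);  // the store silently loses one node

        // The verify step would now see a hole at the lost key...
        boolean holeVisible = !table.containsKey(lost);

        // ...but a re-executed task rewrites the identical keys and hides it.
        writeBatch(table, 1L, 0L);
        boolean holeAfterRewrite = !table.containsKey(lost);

        System.out.println(holeVisible + " -> " + holeAfterRewrite); // true -> false
    }
}
```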


 Adopt goraci as an Integration test
 ---

 Key: HBASE-8031
 URL: https://issues.apache.org/jira/browse/HBASE-8031
 Project: HBase
  Issue Type: Improvement
  Components: test
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.95.0, 0.98.0, 0.94.7

 Attachments: hbase-8031_v1.patch





[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-12 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600643#comment-13600643
 ] 

Jean-Marc Spaggiari commented on HBASE-8031:


Can't we check the number of versions for the entries and make sure they all 
have the same number within the same bucket? That way, even if we re-write a 
missing row, we should still be able to detect that it was missed previously.
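This idea amounts to keeping a per-row write count (e.g. via cell versions) and flagging rows whose count differs from the rest of their batch. A hedged sketch, with a plain count map standing in for HBase cell versions and an invented name `oddOnesOut`:

```java
import java.util.*;

/** Sketch: detect rewritten rows by comparing per-row write counts in a batch. */
public class VersionCountCheck {
    /** Returns keys whose write count differs from the batch majority. */
    static Set<Long> oddOnesOut(Map<Long, Integer> counts) {
        // the expected count is whatever most rows in the batch have
        Map<Integer, Integer> freq = new HashMap<>();
        for (int c : counts.values()) freq.merge(c, 1, Integer::sum);
        int majority =
            Collections.max(freq.entrySet(), Map.Entry.comparingByValue()).getKey();
        Set<Long> odd = new TreeSet<>();
        for (Map.Entry<Long, Integer> e : counts.entrySet())
            if (e.getValue() != majority) odd.add(e.getKey());
        return odd;
    }

    public static void main(String[] args) {
        Map<Long, Integer> counts = new HashMap<>();
        for (long k = 1; k <= 10; k++) counts.put(k, 1); // batch written once
        counts.put(7L, 2); // row 7 was lost and then rewritten by a re-run task
        System.out.println(oddOnesOut(counts)); // prints [7]
    }
}
```

A row that was lost and rewritten carries one more version than its peers, so the mismatch survives even though the verify step no longer sees a hole.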



[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600737#comment-13600737
 ] 

Enis Soztutar commented on HBASE-8031:
--

bq. Client code reports data as flushed when it's not really flushed; it's 
actually still flushing data in background threads, out of order. If the client 
dies and you verify, you may see lost data
This is an interesting test case. However, in this test we do not track what is 
flushed and what is not. Even if we kill clients intentionally, triggering this 
kind of potential bug seems hard. 
We have a test called LoadTestTool, in which a bunch of writer threads keep 
flushing data and send the info about what has been flushed to the reader 
threads, which in turn query for it. That seems like a more appropriate test 
for client-side flush semantics. 
bq. Can't we check the number of versions for the entries and make sure they 
have the same number within the same bucket? 
A client might die before writing everything. Within the same bucket, some of 
the rows would then have 2 versions while the rest have 1.

Regardless, I think we can go with this patch and implement your suggestions 
in a follow-up. We would have to abandon relying on an exact row count and 
allow UNREFERENCED rows to lie around, which I think I can live with. 
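The verification semantics discussed here can be sketched in miniature: rows that nothing points to (UNREFERENCED) are tolerated, while a row that is referenced but missing indicates data loss. This is a hypothetical in-memory stand-in for the real MapReduce verify job; the class name and the UNDEFINED counter name are illustrative.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory sketch of the verify step (the real test uses MapReduce).
// UNDEFINED = a row that is referenced but was never written: a hole, i.e. lost data.
public class VerifySketch {
    static Map<String, Long> verify(Map<Long, Long> table) {
        Set<Long> referenced = new HashSet<>();
        for (Long v : table.values()) if (v != null) referenced.add(v);

        long ref = 0, unref = 0, undef = 0;
        for (Long key : table.keySet())
            if (referenced.contains(key)) ref++; else unref++;
        for (Long r : referenced)
            if (!table.containsKey(r)) undef++; // referenced but missing: a hole

        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("REFERENCED", ref);
        counts.put("UNREFERENCED", unref); // tolerated: e.g. a client died mid-run
        counts.put("UNDEFINED", undef);    // must be zero; anything else is data loss
        return counts;
    }

    public static void main(String[] args) {
        Map<Long, Long> table = new HashMap<>();
        table.put(1L, null); table.put(2L, 1L); table.put(3L, 2L);
        System.out.println(verify(table)); // {REFERENCED=2, UNREFERENCED=1, UNDEFINED=0}
        table.remove(2L); // simulate lost data: row 3 still references row 2
        System.out.println(verify(table)); // {REFERENCED=0, UNREFERENCED=2, UNDEFINED=1}
    }
}
```

Note that the check is insensitive to how many unreferenced rows exist, which is why an exact row count cannot be relied upon.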



[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-07 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596334#comment-13596334
 ] 

Nick Dimiduk commented on HBASE-8031:
-

+1

What's the plan of attack here, port the code and remove dependence on Gora?



[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-07 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596361#comment-13596361
 ] 

Enis Soztutar commented on HBASE-8031:
--

bq. What's the plan of attack here, port the code and remove dependence on Gora?
Indeed. I have a working patch, will attach shortly. 



[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596664#comment-13596664
 ] 

Andrew Purtell commented on HBASE-8031:
---

{quote}
bq. What's the plan of attack here, port the code and remove dependence on Gora?
Indeed
{quote}
+1



[jira] [Commented] (HBASE-8031) Adopt goraci as an Integration test

2013-03-07 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596774#comment-13596774
 ] 

Anoop Sam John commented on HBASE-8031:
---

Thanks Enis for bringing this up. We were also in need of a testing framework 
like this. Looking forward to your patch.
