[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-11-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146066#comment-13146066
 ] 

Hudson commented on HBASE-4552:
---

Integrated in HBase-0.92 #119 (See 
[https://builds.apache.org/job/HBase-0.92/119/])
HBASE-4740  [bulk load] the HBASE-4552 API can't tell if errors on region 
server are recoverable
   (Jonathan Hsieh)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/main/resources/hbase-default.xml
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java


 multi-CF bulk load is not atomic across column families
 ---

 Key: HBASE-4552
 URL: https://issues.apache.org/jira/browse/HBASE-4552
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Jonathan Hsieh
 Fix For: 0.92.0

 Attachments: hbase-4552.consolidated.patch, 
 hbase-4552.consolidated.v2.patch, hbase-4552.consolidated.v3.patch, 
 hbase-4552.consolidated.v4.patch


 Currently the bulk load API simply imports one HFile at a time. With 
 multi-column-family support, this is inappropriate, since different CFs show 
 up separately. Instead, the IPC endpoint should take a map of CF -> HFiles, so 
 we can online them all under a single region-wide lock.
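
 The idea in the description can be illustrated with a small in-memory sketch. 
 It is not the HBase implementation: RegionSketch, bulkLoad and snapshot are 
 hypothetical names, and HFiles are modelled as plain strings. It only shows 
 why taking all families' files in one call, under one region-wide write lock, 
 keeps readers from seeing some CFs updated and others not.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region; this is NOT the HBase HRegion class.
class RegionSketch {
    // One lock for the whole region, shared by every column family.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Column family name -> names of the HFiles currently online for it.
    private final Map<String, List<String>> storeFiles = new HashMap<>();

    RegionSketch(String... families) {
        for (String cf : families) {
            storeFiles.put(cf, new ArrayList<>());
        }
    }

    // Brings every HFile in the map online under a single write lock, so a
    // reader sees either none of the new files or all of them.
    void bulkLoad(Map<String, List<String>> familyToHFiles) {
        lock.writeLock().lock();
        try {
            // Validate first so an unknown family leaves the region untouched.
            for (String cf : familyToHFiles.keySet()) {
                if (!storeFiles.containsKey(cf)) {
                    throw new IllegalArgumentException("unknown column family: " + cf);
                }
            }
            for (Map.Entry<String, List<String>> e : familyToHFiles.entrySet()) {
                storeFiles.get(e.getKey()).addAll(e.getValue());
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers take the shared lock and copy a view that is consistent
    // across all column families.
    Map<String, List<String>> snapshot() {
        lock.readLock().lock();
        try {
            Map<String, List<String>> copy = new HashMap<>();
            for (Map.Entry<String, List<String>> e : storeFiles.entrySet()) {
                copy.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
            return copy;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

 Loading one HFile per call, by contrast, would release the lock between 
 families, which is the window in which different CFs "show up separately".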

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-11-07 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146100#comment-13146100
 ] 

Hudson commented on HBASE-4552:
---

Integrated in HBase-TRUNK #2420 (See 
[https://builds.apache.org/job/HBase-TRUNK/2420/])
HBASE-4740  [bulk load] the HBASE-4552 API can't tell if errors on region 
server are recoverable
   (Jonathan Hsieh)

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/resources/hbase-default.xml
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java






[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-31 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140212#comment-13140212
 ] 

ramkrishna.s.vasudevan commented on HBASE-4552:
---

In test case TestLoadIncrementalHFilesSplitRecovery, 
new HTable(tableName) is being used.

{code}
+  LOG.info("Creating table " + table);
+  HTableDescriptor htd = new HTableDescriptor(table);
{code}
As part of HBASE-4253 it was found that using new HTableDescriptor(conf, 
tablename) is the best way.  Also check HBASE-4138 (comment 25/Aug/11 09:19) for 
reference.
This will prevent the failure that happened in 
https://builds.apache.org/job/PreCommit-HBASE-Build/99/testReport/org.apache.hadoop.hbase.mapreduce/TestLoadIncrementalHFilesSplitRecovery/testBulkLoadPhaseRecovery/

Correct me if I am wrong. 






[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-31 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140317#comment-13140317
 ] 

Jonathan Hsieh commented on HBASE-4552:
---

@Ram, on trunk or 0.92 branches, HTableDescriptor(conf,tablename) doesn't seem 
to be in the API.  In patch v4, it seems like all the HTable constructors have 
been updated to explicitly take the configuration reference.

I'm assuming you meant HTable? 





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-31 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140344#comment-13140344
 ] 

Ted Yu commented on HBASE-4552:
---

Integrated to 0.92 and TRUNK.

Thanks for the patch Jonathan.

Thanks for the review Todd.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-31 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140367#comment-13140367
 ] 

ramkrishna.s.vasudevan commented on HBASE-4552:
---

@Jon
Yes, I was wrong. Sorry for that. I will check the reason for the failure 
once again. Thanks, Jon.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-31 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140584#comment-13140584
 ] 

Hudson commented on HBASE-4552:
---

Integrated in HBase-TRUNK #2392 (See 
[https://builds.apache.org/job/HBase-TRUNK/2392/])
HBASE-4552  multi-CF bulk load is not atomic across column families 
(Jonathan Hsieh)

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/main/resources/hbase-default.xml
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionServerBulkLoad.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java






[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-31 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140834#comment-13140834
 ] 

Hudson commented on HBASE-4552:
---

Integrated in HBase-0.92 #90 (See 
[https://builds.apache.org/job/HBase-0.92/90/])
HBASE-4552  multi-CF bulk load is not atomic across column families 
(Jonathan Hsieh)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/branches/0.92/src/main/resources/hbase-default.xml
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionServerBulkLoad.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java






[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-29 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139136#comment-13139136
 ] 

Hadoop QA commented on HBASE-4552:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12501415/hbase-4552.consolidated.v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 7 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -166 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.coprocessor.TestMasterObserver
   org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
   org.apache.hadoop.hbase.master.TestMasterFailover
   org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
   org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/95//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/95//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/95//console

This message is automatically generated.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-29 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139398#comment-13139398
 ] 

Hadoop QA commented on HBASE-4552:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12501459/hbase-4552.consolidated.v3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 7 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -166 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
   org.apache.hadoop.hbase.master.TestMasterFailover
   org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/99//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/99//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/99//console

This message is automatically generated.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-29 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139509#comment-13139509
 ] 

Hadoop QA commented on HBASE-4552:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12501481/hbase-4552.consolidated.v4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -166 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestMasterFailover
   org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
   org.apache.hadoop.hbase.master.TestDistributedLogSplitting
   org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/101//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/101//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/101//console

This message is automatically generated.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-29 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139533#comment-13139533
 ] 

Ted Yu commented on HBASE-4552:
---

I verified that except for TestRegionServerCoprocessorExceptionWithXXX tests, 
the other failures were caused by too many open files.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-28 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138918#comment-13138918
 ] 

Ted Yu commented on HBASE-4552:
---

After applying the consolidated patch, I got:
{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure: Compilation failure:
[ERROR] /Users/zhihyu/trunk-hbase/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java:[190,34] cannot find symbol
[ERROR] symbol  : method getTestDir(java.lang.String)
[ERROR] location: class org.apache.hadoop.hbase.HBaseTestingUtility
[ERROR] 
[ERROR] /Users/zhihyu/trunk-hbase/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java:[216,34] cannot find symbol
[ERROR] symbol  : method getTestDir(java.lang.String)
[ERROR] location: class org.apache.hadoop.hbase.HBaseTestingUtility
[ERROR] 
[ERROR] /Users/zhihyu/trunk-hbase/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java:[309,36] cannot find symbol
[ERROR] symbol  : method getTestDir(java.lang.String)
[ERROR] location: class org.apache.hadoop.hbase.HBaseTestingUtility
[ERROR] 
[ERROR] /Users/zhihyu/trunk-hbase/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionServerBulkLoad.java:[123,36] cannot find symbol
[ERROR] symbol  : method getTestDir(java.lang.String)
[ERROR] location: class org.apache.hadoop.hbase.HBaseTestingUtility
{code}
For TRUNK, there is no such error.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-28 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138977#comment-13138977
 ] 

Jonathan Hsieh commented on HBASE-4552:
---

This was due to HBASE-4634 which got committed two days ago.  The old 
getTestDir was a public method and apparently was just removed.  This will 
probably break on trunk as well.

https://github.com/apache/hbase/commit/ed21cd6c4c266f610352d76d3d4b6f5cff492a97#src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java

I think this should be replaced with getDataTestDir calls (that's what the old 
bulk load test calls to getTestDir were changed to).





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-28 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139074#comment-13139074
 ] 

Hadoop QA commented on HBASE-4552:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12501415/hbase-4552.consolidated.v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 7 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -166 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestMultiParallel
  org.apache.hadoop.hbase.coprocessor.TestMasterObserver
  org.apache.hadoop.hbase.TestRegionRebalancing
  org.apache.hadoop.hbase.master.TestDefaultLoadBalancer
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/92//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/92//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/92//console

This message is automatically generated.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-24 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134138#comment-13134138
 ] 

Jonathan Hsieh commented on HBASE-4552:
---

Created recovery mechanism jira at HBASE-4652





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-23 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133677#comment-13133677
 ] 

Jonathan Hsieh commented on HBASE-4552:
---

One more piece: a mechanism to atomically roll back if a partial failure is 
encountered when attempting to bulk load multiple families.  

For example, let's say I want to bulk load a region with CFs A, B, and C.  I 
issue a call to an RS to atomically bulk load the HFiles.  The RS loads A and B 
successfully but fails on C (HDFS failure, RS goes down, etc.).  We should 
roll back A and B; if we don't, we would have A and B loaded but not C, which 
is an atomicity violation.  
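
The A, B, C scenario above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Store`, `loadFile`, `unloadFile` and `loadAllOrNothing` are hypothetical names rather than HBase APIs, and the sketch covers only a failure that surfaces as an exception inside a live process. A region server crash would need something durable, which this does not attempt.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from HBase.
class AtomicLoadSketch {
  /** Stand-in for a per-family store that can simulate an HDFS failure. */
  static class Store {
    final String family;
    final boolean failOnLoad;
    final List<String> files = new ArrayList<>();

    Store(String family, boolean failOnLoad) {
      this.family = family;
      this.failOnLoad = failOnLoad;
    }

    void loadFile(String path) throws IOException {
      if (failOnLoad) {
        throw new IOException("simulated failure loading " + path + " into " + family);
      }
      files.add(path);
    }

    void unloadFile(String path) {
      files.remove(path);
    }
  }

  /** Loads paths[i] into stores[i]; on failure, undoes completed loads in reverse order. */
  static void loadAllOrNothing(List<Store> stores, List<String> paths) throws IOException {
    List<Integer> done = new ArrayList<>();
    try {
      for (int i = 0; i < stores.size(); i++) {
        stores.get(i).loadFile(paths.get(i));
        done.add(i);
      }
    } catch (IOException e) {
      for (int j = done.size() - 1; j >= 0; j--) {
        int i = done.get(j);
        stores.get(i).unloadFile(paths.get(i));
      }
      throw e; // the caller still learns that the bulk load failed
    }
  }
}
```

With stores A and B healthy and C failing, the call throws and leaves A and B empty again, so no family is visible without the others.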










[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-23 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133679#comment-13133679
 ] 

Ted Yu commented on HBASE-4552:
---

In case of such a fault (e.g. an HDFS failure), would it be easier to record 
which column families encountered errors and retry loading them after the 
fault clears?





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-23 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133696#comment-13133696
 ] 

Jonathan Hsieh commented on HBASE-4552:
---

If we have an HDFS failure, we won't be able to record or update information 
about what failed. 
 
This makes me think we need to journal/log the intended atomic actions.  Once 
we have the log, we can act depending on the situation:
* If we complete successfully, we remove/invalidate the log and carry on.  
* If we fail (can't write, RS goes down and restarts), we check to see if 
everything is in.  If it isn't, we roll back the subset of HFile loads that had 
happened.  If rollback fails, we still have the log, so we can try later or 
maybe we kill the RS?  

How about we make this a subtask/follow-on JIRA?  The first cut will just 
detect the situation and log error messages (similar to what currently 
happens).  A follow-on task will discuss and implement a recovery mechanism.
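
A minimal sketch of the journaling idea, assuming an in-memory journal purely for illustration (a real one would have to live somewhere that survives the failure, which is left open here). `BulkLoadJournalSketch` and its method names are hypothetical and do not come from HBase.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an intent journal; kept in memory for illustration only.
class BulkLoadJournalSketch {
  /** What one atomic bulk load intended to do and how far it got. */
  static class Entry {
    final List<String> intended;
    final List<String> loaded = new ArrayList<>();

    Entry(List<String> intended) {
      this.intended = new ArrayList<>(intended);
    }
  }

  private final Map<String, Entry> open = new LinkedHashMap<>();

  /** Record the intent before touching any file. */
  void begin(String loadId, List<String> files) {
    open.put(loadId, new Entry(files));
  }

  /** Record that one file of the load is now in place. */
  void markLoaded(String loadId, String file) {
    open.get(loadId).loaded.add(file);
  }

  /** Success: remove/invalidate the journal entry. */
  void complete(String loadId) {
    open.remove(loadId);
  }

  /** On recovery: files that belong to a load where not everything made it in. */
  List<String> filesToRollBack() {
    List<String> result = new ArrayList<>();
    for (Entry e : open.values()) {
      if (e.loaded.size() < e.intended.size()) {
        result.addAll(e.loaded);
      }
    }
    return result;
  }
}
```

An entry that is still open but has every intended file loaded yields nothing to roll back, matching the "check to see if everything is in" step.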






[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-23 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133701#comment-13133701
 ] 

Ted Yu commented on HBASE-4552:
---

It is fine to implement recovery in another JIRA. 





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-21 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133190#comment-13133190
 ] 

Jonathan Hsieh commented on HBASE-4552:
---

Plan

1) A test to show there is an atomicity problem.  It likely will not use 
LoadIncrementalHFiles.
2) Fix for the region server side.
3) Rewrite of LoadIncrementalHFiles so that it groups the proper HFiles into 
the new bulkLoadHFile calls.  This will likely have two parallel steps: the 
first gathers enough info to group HFiles, and the second attempts the 
bulk load.
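
Step 3 could look roughly like the sketch below. It is not the LoadIncrementalHFiles rewrite: `TwoPhaseLoadSketch`, the `locator` function (HFile to region name) and the `loader` callback (one call per region) are hypothetical stand-ins, and HFiles and regions are plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical sketch of a two-phase parallel load; not HBase code.
class TwoPhaseLoadSketch {
  static Map<String, List<String>> run(List<String> hfiles,
      Function<String, String> locator,
      BiConsumer<String, List<String>> loader, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Phase 1: look up the target region of every HFile in parallel.
      List<Callable<String>> lookups = new ArrayList<>();
      for (String f : hfiles) {
        lookups.add(() -> locator.apply(f));
      }
      // invokeAll returns futures in the same order as the submitted tasks.
      List<Future<String>> regions = pool.invokeAll(lookups);
      Map<String, List<String>> groups = new TreeMap<>();
      for (int i = 0; i < hfiles.size(); i++) {
        groups.computeIfAbsent(regions.get(i).get(), r -> new ArrayList<>())
            .add(hfiles.get(i));
      }
      // Phase 2: one bulk load call per region, also in parallel.
      List<Callable<Void>> loads = new ArrayList<>();
      for (Map.Entry<String, List<String>> e : groups.entrySet()) {
        loads.add(() -> {
          loader.accept(e.getKey(), e.getValue());
          return null;
        });
      }
      for (Future<Void> done : pool.invokeAll(loads)) {
        done.get(); // surfaces any failure from a load task
      }
      return groups;
    } catch (InterruptedException | ExecutionException ex) {
      throw new IllegalStateException("bulk load sketch failed", ex);
    } finally {
      pool.shutdown();
    }
  }
}
```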





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-17 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129013#comment-13129013
 ] 

Todd Lipcon commented on HBASE-4552:


The trick is making sure it's atomic inside the region server - not just that 
the client sends all of the files for a given region in one RPC. If there are 
any concurrent scanners, then they should either see all of the new data or 
none of the new data on a given row. So we need some region-wide coordination. 
I think probably we have to take a write-lock on HRegion#lock
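
A simplified sketch of the region-wide coordination, assuming a `java.util.concurrent.locks.ReentrantReadWriteLock` stands in for HRegion#lock. `RegionLockSketch` is a hypothetical class, not the real HRegion, and a real scanner stays open far longer than the single read shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region; not the real HRegion class.
class RegionLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, List<String>> filesByFamily = new HashMap<>();

  /** Adds every family's file under the write lock, so readers never see a partial load. */
  void bulkLoad(Map<String, String> familyToPath) {
    lock.writeLock().lock();
    try {
      for (Map.Entry<String, String> e : familyToPath.entrySet()) {
        filesByFamily.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
            .add(e.getValue());
      }
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Reader-side view taken under the read lock: number of files per family. */
  Map<String, Integer> snapshotCounts() {
    lock.readLock().lock();
    try {
      Map<String, Integer> counts = new HashMap<>();
      for (Map.Entry<String, List<String>> e : filesByFamily.entrySet()) {
        counts.put(e.getKey(), e.getValue().size());
      }
      return counts;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Because the write lock excludes all readers, a snapshot taken concurrently with a bulk load sees either none or all of the new files.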





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125987#comment-13125987
 ] 

Ted Yu commented on HBASE-4552:
---

The above optimization for reducing calls to the region server can be done in a 
separate JIRA.
server.bulkLoadHFile() expects the name of the region where the HFile fits. 
Region name resolution needs to call conn.getRegionServerWithRetries().





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-12 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126044#comment-13126044
 ] 

Ted Yu commented on HBASE-4552:
---

Since LoadIncrementalHFiles uses ExecutorService to achieve parallelism, we 
should use Queue<Pair<byte[], String>> in place of List above so that a 
concurrent queue can be instantiated.
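
As a rough illustration of why a concurrent queue helps here, the sketch below lets several worker threads drain one shared queue. It assumes `java.util.Map.Entry<byte[], String>` as a stand-in for HBase's Pair, and `ConcurrentQueueSketch.drain` is a hypothetical helper.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; Map.Entry stands in for a (family, hfilePath) Pair.
class ConcurrentQueueSketch {
  /** Drains the queue with nThreads workers and returns how many items were taken. */
  static int drain(Queue<Map.Entry<byte[], String>> queue, int nThreads) {
    AtomicInteger processed = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(nThreads);
    for (int i = 0; i < nThreads; i++) {
      pool.submit(() -> {
        // poll() returns null once the queue is empty; with a concurrent
        // queue implementation each item is handed to exactly one worker.
        while (queue.poll() != null) {
          processed.incrementAndGet();
        }
      });
    }
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return processed.get();
  }
}
```

Passing a `java.util.concurrent.ConcurrentLinkedQueue` keeps the workers safe without external locking. A plain `LinkedList` would not be safe to share this way.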





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-10 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13124506#comment-13124506
 ] 

Ted Yu commented on HBASE-4552:
---

One solution is to add the following method to HRegionInterface:
{code}
  public void bulkLoadHFile(List<Pair<byte[], String>> familyPaths,
      byte[] regionName)
  throws IOException;
{code}
familyPaths is a list of <family, hfilePath> pairs for the same region 
identified by regionName.

LoadIncrementalHFiles would need to group HFiles for the same region together 
before calling the above method.
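
The grouping step might be sketched as below, assuming each HFile's region has already been resolved. `GroupByRegionSketch` and `HFileItem` are hypothetical, `Map.Entry<String, String>` stands in for a (family, hfilePath) Pair, and strings replace `byte[]` for readability.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of grouping HFiles per region before one call per region.
class GroupByRegionSketch {
  /** One HFile together with its target family and its already-resolved region. */
  static class HFileItem {
    final String regionName;
    final String family;
    final String path;

    HFileItem(String regionName, String family, String path) {
      this.regionName = regionName;
      this.family = family;
      this.path = path;
    }
  }

  /** Returns regionName -> list of (family, hfilePath) pairs, in first-seen order. */
  static Map<String, List<Map.Entry<String, String>>> groupByRegion(List<HFileItem> items) {
    Map<String, List<Map.Entry<String, String>>> groups = new LinkedHashMap<>();
    for (HFileItem item : items) {
      groups.computeIfAbsent(item.regionName, r -> new ArrayList<>())
          .add(new SimpleImmutableEntry<>(item.family, item.path));
    }
    return groups;
  }
}
```

Each map value would then become the familyPaths argument of a single call for that region.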





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13124511#comment-13124511
 ] 

Todd Lipcon commented on HBASE-4552:


yep, that's what I meant, but the implementation isn't quite trivial since we 
have to do the locking at a higher level, so the change is made visible 
atomically.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-10 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13124521#comment-13124521
 ] 

Ted Yu commented on HBASE-4552:
---

For HRegion, we can introduce the following new method:
{code}
  public void bulkLoadHFile(List<Pair<byte[], String>> familyPaths)
  throws IOException {
{code}
where familyPaths is a list of <family, hfilePath> pairs identifying each HFile 
path and the family the HFile should be loaded into.





[jira] [Commented] (HBASE-4552) multi-CF bulk load is not atomic across column families

2011-10-10 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13124660#comment-13124660
 ] 

Ted Yu commented on HBASE-4552:
---

Toward the end of LoadIncrementalHFiles.tryLoad() we can utilize startEndKeys 
of the underlying table to group (possibly split) HFiles by their first row 
keys.
Then the new bulkLoadHFile() method would be called by doBulkLoad().
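
Locating the region for an HFile from the table's start keys can be sketched with a sorted map. This assumes string row keys for readability (HBase compares `byte[]` keys) and an empty start key for the first region. `RegionLocatorSketch` and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: maps a row key to the region whose key range contains it.
class RegionLocatorSketch {
  private final TreeMap<String, String> regionByStartKey = new TreeMap<>();

  void addRegion(String startKey, String regionName) {
    regionByStartKey.put(startKey, regionName);
  }

  /** The owning region is the one with the greatest start key <= rowKey. */
  String regionFor(String rowKey) {
    Map.Entry<String, String> e = regionByStartKey.floorEntry(rowKey);
    if (e == null) {
      throw new IllegalArgumentException("no region covers " + rowKey);
    }
    return e.getValue();
  }

  /** An HFile whose first and last row keys land in different regions must be split. */
  boolean needsSplit(String firstRowKey, String lastRowKey) {
    return !regionFor(firstRowKey).equals(regionFor(lastRowKey));
  }
}
```

Grouping by the first row key alone is enough only when `needsSplit` is false. Otherwise the HFile spans a region boundary and has to be split first.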
