[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2016-01-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120874#comment-15120874
 ] 

Hadoop QA commented on HBASE-11368:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} HBASE-11368 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/latest/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12677543/hbase11368-master.patch |
| JIRA Issue | HBASE-11368 |
| Powered by | Apache Yetus 0.1.0   http://yetus.apache.org |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/331/console |


This message was automatically generated.



> Multi-column family BulkLoad fails if compactions go on too long
> 
>
> Key: HBASE-11368
> URL: https://issues.apache.org/jira/browse/HBASE-11368
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Qiang Tian
> Attachments: hbase-11368-0.98.5.patch, hbase11368-master.patch, 
> key_stacktrace_hbase10882.TXT, performance_improvement_verification_98.5.patch
>
>
> Compactions take a read lock.  If a multi-column family region, before bulk 
> loading, we want to take a write lock on the region.  If the compaction takes 
> too long, the bulk load fails.
> Various recipes include:
> + Making smaller regions (lame)
> + [~victorunique] suggests major compacting just before bulk loading over in 
> HBASE-10882 as a work around.
> Does the compaction need a read lock for that long?  Does the bulk load need 
> a full write lock when multiple column families?  Can we fail more gracefully 
> at least?
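
For readers new to the issue, the failure mode in the description can be reproduced with a toy model. This is only a sketch built on a plain {{java.util.concurrent}} read/write lock; {{RegionLockDemo}}, {{startLongCompaction}}, {{finishCompaction}} and {{tryBulkLoad}} are hypothetical names for illustration and are not HBase classes or methods.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy model of the contention in this issue; not HBase code. */
final class RegionLockDemo {
    private final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
    private final CountDownLatch compactionMayFinish = new CountDownLatch(1);
    private Thread compaction;

    /** Starts a "compaction" that holds the region read lock until finishCompaction(). */
    void startLongCompaction() {
        CountDownLatch readLockHeld = new CountDownLatch(1);
        compaction = new Thread(() -> {
            regionLock.readLock().lock();
            try {
                readLockHeld.countDown();
                compactionMayFinish.await(); // stands in for a long rewrite of store files
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                regionLock.readLock().unlock();
            }
        });
        compaction.start();
        try {
            readLockHeld.await(); // return only once the read lock is really held
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Lets the "compaction" finish and waits for it to release the read lock. */
    void finishCompaction() {
        compactionMayFinish.countDown();
        try {
            compaction.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Multi-CF "bulk load": needs the write lock and gives up after the timeout. */
    boolean tryBulkLoad(long timeoutMillis) {
        boolean locked;
        try {
            locked = regionLock.writeLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (!locked) {
            return false; // this is the point where the bulk load fails
        }
        try {
            return true; // files for all column families would be moved in here
        } finally {
            regionLock.writeLock().unlock();
        }
    }
}
```

While the simulated compaction holds the read lock, {{tryBulkLoad}} times out and returns false. Once the compaction has finished, the same call succeeds.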



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2016-01-14 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15101219#comment-15101219
 ] 

Jerry He commented on HBASE-11368:
--

This ticket can be closed because the only sub-task would fix the problem. 
Right?



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-10-31 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983879#comment-14983879
 ] 

ramkrishna.s.vasudevan commented on HBASE-11368:


This comment 
https://issues.apache.org/jira/browse/HBASE-11368?focusedCommentId=14693166=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14693166
 is being addressed as part of HBASE-13082. Doing that JIRA would mean 
that any ongoing scan will not be able to see bulk loaded hfiles 
that are loaded just after the scan has started. I think that behaviour 
should be acceptable, right?



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-10-31 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984102#comment-14984102
 ] 

Nick Dimiduk commented on HBASE-11368:
--

bq. any current on going scan will not be able to see the bulk loaded hfiles 
which is loaded just after the current scan has started. I think that behaviour 
should be acceptable, right?

I believe this should be correct behavior, yes.



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-10-07 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14947820#comment-14947820
 ] 

Nick Dimiduk commented on HBASE-11368:
--

FYI, opened subtask HBASE-14575 to give [~devaraj]'s idea a spin. Mind having a 
look?



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-09-03 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730105#comment-14730105
 ] 

Nick Dimiduk commented on HBASE-11368:
--

Looks like HBASE-6028 wants to implement the meat of what I've proposed above. 
It also happens to be 2/3 of the work for HBASE-12446. Seems like good bang for 
the buck on this approach.

Chatting with [~enis] and [~devaraj] about this offline. Another idea is we can 
reduce the scope of when the read lock is held during compaction. In theory the 
compactor only needs a region read lock while deciding what files to compact 
and at the time of committing the compaction. We're protected from the case of 
region close events because compactions are checking (between every Cell!) if 
the store has been closed in order to abort in such a case. Is there another 
reason why we would want to hold the read lock for the entire duration of the 
compaction? [~stack] [~lhofhansl]?
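
The narrower lock scope proposed above can be sketched as follows. This is a toy model under stated assumptions, not the HBase compactor: {{NarrowLockCompactor}} and its methods are hypothetical names, a "store file" is just a sorted {{int[]}} of cells, and the store's file list is guarded by its own monitor because the region read lock is shared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy compactor that holds the region read lock only to select and to commit. */
final class NarrowLockCompactor {
    private final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
    private final Object storeMonitor = new Object();      // guards 'files'
    private final List<int[]> files = new ArrayList<>();   // each int[] is one "store file"
    private volatile boolean closed;

    void addFile(int[] sortedCells) {
        synchronized (storeMonitor) {
            files.add(sortedCells.clone());
        }
    }

    /** Region close: the write lock is free unless a select or commit is in flight. */
    void close() {
        regionLock.writeLock().lock();
        try {
            closed = true;
        } finally {
            regionLock.writeLock().unlock();
        }
    }

    /** Returns true if the compaction committed, false if it aborted. */
    boolean compact() {
        List<int[]> selected;
        regionLock.readLock().lock();            // 1. short: decide what to compact
        try {
            if (closed) return false;
            synchronized (storeMonitor) {
                selected = new ArrayList<>(files);
            }
        } finally {
            regionLock.readLock().unlock();
        }
        if (selected.isEmpty()) return false;

        int total = 0;                           // 2. long: no region lock held
        for (int[] f : selected) total += f.length;
        int[] merged = new int[total];
        int n = 0;
        for (int[] f : selected) {
            for (int cell : f) {
                if (closed) return false;        // abort check between every cell
                merged[n++] = cell;
            }
        }
        Arrays.sort(merged);

        regionLock.readLock().lock();            // 3. short: commit the result
        try {
            if (closed) return false;
            synchronized (storeMonitor) {
                files.removeAll(selected);       // int[] equality is identity, so only inputs go
                files.add(merged);
            }
            return true;
        } finally {
            regionLock.readLock().unlock();
        }
    }

    int fileCount() {
        synchronized (storeMonitor) {
            return files.size();
        }
    }
}
```

In this sketch a writer such as a bulk load or a region close only has to wait for the short select or commit phases, never for the long merge in step 2.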



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-09-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730283#comment-14730283
 ] 

stack commented on HBASE-11368:
---

[~ndimiduk] I like the idea of narrowing the lock scope, but I started to look and 
it's a bit of a rat's nest where locks are held (compactions checking on each row 
seems well dodgy...). Yeah, a review of the attempt at undoing scanner locks 
so that only a region-level lock remains sounds like it would help.



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-09-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730140#comment-14730140
 ] 

Lars Hofhansl commented on HBASE-11368:
---

See also discussion in HBASE-13082.



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-09-02 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728155#comment-14728155
 ] 

Nick Dimiduk commented on HBASE-11368:
--

Check my facts:
# the only uses of the write lock are region close and bulkload, two rare 
events.
# the only long-running read lock holders are compactions, frequent events.

What if we allow write lock access requests to interrupt running compactions?
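
One way the question above could look in code, as a rough sketch only: a writer announces itself before blocking on the write lock, and a running compaction polls that signal between cells and gives up its read lock early. {{PreemptibleCompactionRegion}}, {{compact}} and {{withWriteLock}} are hypothetical names, and discarding or retrying the partial compaction is left out.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy region in which a pending write-lock request preempts a running compaction. */
final class PreemptibleCompactionRegion {
    private final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
    private final AtomicInteger pendingWriters = new AtomicInteger();

    /**
     * Processes up to 'cells' units of work under the read lock and stops early
     * as soon as a writer is waiting. Returns the number of units completed.
     */
    int compact(int cells, Runnable perCell) {
        regionLock.readLock().lock();
        try {
            int done = 0;
            while (done < cells && pendingWriters.get() == 0) {
                perCell.run();
                done++;
            }
            return done; // done < cells means the compaction was preempted
        } finally {
            regionLock.readLock().unlock();
        }
    }

    /** Used by the rare write-lock holders (region close, bulk load). */
    void withWriteLock(Runnable action) {
        pendingWriters.incrementAndGet();   // ask running compactions to yield
        regionLock.writeLock().lock();      // blocks until they have released the read lock
        try {
            action.run();
        } finally {
            regionLock.writeLock().unlock();
            pendingWriters.decrementAndGet();
        }
    }
}
```

The trade-off in this sketch is that a preempted compaction throws away its work, which only pays off if write-lock requests really are as rare as the two facts above suggest.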



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-08-20 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705298#comment-14705298
 ] 

Esteban Gutierrez commented on HBASE-11368:
---

Hey [~tianq], are you still working on this? 

Also, I agree with [~enis] that ref-counting might be an alternative, but I 
couldn't find the JIRA for that. Any pointers, [~lhofhansl]?



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-08-14 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696724#comment-14696724
 ] 

Enis Soztutar commented on HBASE-11368:
---

bq. How? Would the following case be true without the bulk load getting the 
region write lock?

We do not do this now, but in theory it can be done similarly to regular 
writes:
 - obtain a new seqId as a write transaction
 - bulk load all files across CFs with that seqId
 - advance the mvcc read point only when all bulk loads are complete

This way the scanners are guaranteed to observe the bulk loaded data 
atomically without the region write lock. 

bq. In the 0.98 code line, we don't have seqid, and the atomicity is still 
guaranteed there.

Yes. It is not worth changing the 0.98 line. 

bq. I think it is being propagated properly to the scanner. Think about the 
same notifyChangedReadersObservers is being used at the end of compaction and 
flushes as well. The reset of the readers should work.

I am not sure about that. Agreed that the cells at the store level will 
actually get re-ordered, but the heap at the region level is never re-ordered. 
So, after a bulk load, the ordering of store scanners at the region level might 
change, but the scanner will miss it, if I understand this correctly. 

bq. Atomicity may be a false blanket considering HBASE-4652 is still unresolved.

Very good point. We need a transactional commit for the BL files. 
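
The seqId / mvcc scheme outlined in this comment (obtain a seqId, load all files with it, advance the read point last) can be sketched as below. This is a simplified model and not HBase's MVCC implementation: {{MvccBulkLoadSketch}} and its methods are hypothetical, there is a single read point, and bulk loads are assumed to complete in seqId order.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

/** Toy model: bulk loaded files become visible only once the read point advances. */
final class MvccBulkLoadSketch {
    private static final class LoadedFile {
        final long seqId;
        final String name;

        LoadedFile(long seqId, String name) {
            this.seqId = seqId;
            this.name = name;
        }
    }

    private final AtomicLong lastSeqId = new AtomicLong();
    private volatile long readPoint; // highest seqId that scanners may see
    private final Map<String, List<LoadedFile>> families = new ConcurrentHashMap<>();

    /** Step 1: obtain a new seqId as a write transaction. */
    long beginBulkLoad() {
        return lastSeqId.incrementAndGet();
    }

    /** Step 2: add a file to one column family, tagged with the load's seqId. */
    void loadFile(String family, String fileName, long seqId) {
        families.computeIfAbsent(family, f -> new CopyOnWriteArrayList<>())
                .add(new LoadedFile(seqId, fileName));
    }

    /** Step 3: advance the read point only after every family has its files. */
    void completeBulkLoad(long seqId) {
        readPoint = seqId;
    }

    /** A scanner remembers the read point that was current when it was opened. */
    long openScanner() {
        return readPoint;
    }

    /** Number of files in 'family' that a scanner with the given read point may read. */
    int visibleFiles(String family, long scannerReadPoint) {
        int visible = 0;
        for (LoadedFile f : families.getOrDefault(family, Collections.emptyList())) {
            if (f.seqId <= scannerReadPoint) visible++;
        }
        return visible;
    }
}
```

A scanner opened while only some families have been loaded sees none of the new files, and a scanner opened after {{completeBulkLoad}} sees all of them, so no region write lock is involved in this model.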



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-08-13 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695115#comment-14695115
 ] 

Enis Soztutar commented on HBASE-11368:
---

I was reading HBASE-4552 and RegionScannerImpl code again to try to understand 
why we need the write lock for multi-CF bulk loads in the first place. It seems 
that it was put there to ensure atomicity, but that should be guaranteed with 
the seqId / mvcc combination and not via region write lock. However, the bulk 
load files obtain a seqId, and acquiring the region write lock will block all 
flushes which may be the reason. On bulk load, we call 
HStore.notifyChangedReadersObservers(), which resets the KVHeap, but we never 
reset the RegionScanner, from my reading of the code. Is this a bug? The current 
scanners should not see the new bulk loaded data (via mvcc), so maybe it is OK? 



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-08-13 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696223#comment-14696223
 ] 

Nick Dimiduk commented on HBASE-11368:
--

Atomicity may be a false blanket considering HBASE-4652 is still unresolved.



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-08-13 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695884#comment-14695884
 ] 

Jerry He commented on HBASE-11368:
--

bq.   but that should be guaranteed with the seqId / mvcc combination and not 
via region write lock.
How?  Would the following case be true without the bulk load getting the region 
write lock?
a. the bulk load obtains a seqId
b. a read request comes in and gets the seqId as mvcc.
c. The read will be able to see the partially loaded data while the bulk load is 
still in progress

In the 0.98 code line, we don't have seqid, and the atomicity is still 
guaranteed there.

bq. On bulk load, we call HStore.notifyChangedReadersObservers(), which resets 
the KVHeap, but we never reset the RegionScanner from my reading of code. Is 
this a bug?
I think it is being propagated properly to the scanner.  Think about the same 
notifyChangedReadersObservers is being used at the end of compaction and 
flushes as well.  The reset of the readers should work.

I think the region write lock is still the only guarantee for bulk load 
atomicity.  At a high level, the region scan and next calls run under the 
region read lock, which is mutually exclusive with the bulk load process, which needs 
the region write lock.  This is heavy.



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-08-13 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696335#comment-14696335
 ] 

Jerry He commented on HBASE-11368:
--

You are right, Nick.  That is the unsolved issue.



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-08-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693166#comment-14693166
 ] 

Enis Soztutar commented on HBASE-11368:
---

My concern with the patch is that it acquires yet another lock per get/scan 
on top of the already existing ones. Agreed that the region close lock is 
abused here for multi-CF bulkloads and has to be fixed. 

I believe the actual long-term solution is to do ref-counting of store 
files in the store, and to make the store file list per scan immutable. Then we do 
not need the costly mechanism for keeping the store files updated between 
KVHeap, scanner and store file list ({{notifyChangedReadersObservers}}). I believe 
leveldb does ref-counting for files. [~lhofhansl], did you have a JIRA for 
this? 
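
The ref-counting idea in this comment might look roughly like the sketch below. It is an illustration under assumptions, not HBase or leveldb code: {{RefCountedStore}}, its nested {{StoreFile}} and all method names are hypothetical, and "deleting" a file is modelled as setting a flag.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy store: every scan gets an immutable, ref-counted snapshot of the file list. */
final class RefCountedStore {
    static final class StoreFile {
        final String name;
        private final AtomicInteger refs = new AtomicInteger(1); // the store's own reference
        private volatile boolean deleted;

        StoreFile(String name) {
            this.name = name;
        }

        void retain() {
            refs.incrementAndGet();
        }

        void release() {
            if (refs.decrementAndGet() == 0) deleted = true; // last user is gone
        }

        boolean isDeleted() {
            return deleted;
        }
    }

    // Never mutated in place: each change installs a new unmodifiable list.
    private List<StoreFile> current = Collections.emptyList();

    synchronized void addFile(StoreFile file) {
        List<StoreFile> next = new ArrayList<>(current);
        next.add(file);
        current = Collections.unmodifiableList(next);
    }

    /** A scan pins the current files and keeps that exact list until it closes. */
    synchronized List<StoreFile> openScan() {
        for (StoreFile f : current) f.retain();
        return current;
    }

    void closeScan(List<StoreFile> snapshot) {
        for (StoreFile f : snapshot) f.release();
    }

    /** Compaction commit: swap inputs for the result and drop the store's references. */
    synchronized void replace(List<StoreFile> compacted, StoreFile result) {
        List<StoreFile> next = new ArrayList<>(current);
        next.removeAll(compacted);
        next.add(result);
        current = Collections.unmodifiableList(next);
        for (StoreFile f : compacted) f.release();
    }
}
```

In this model a compaction or bulk load never has to notify open scanners. They keep reading their pinned snapshot, and the replaced files are only "deleted" when the last scan that references them closes.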



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2015-08-11 Thread Stephen Yuan Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682493#comment-14682493
 ] 

Stephen Yuan Jiang commented on HBASE-11368:


[~tianq] and [~stack], any update or concern on this patch?  We have a customer 
seeing this issue recently.



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186581#comment-14186581
 ] 

Hadoop QA commented on HBASE-11368:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12677543/hbase11368-master.patch
  against trunk revision .
  ATTACHMENT ID: 12677543

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
3781 checkstyle errors (more than the trunk's current 3780 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestHCM
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11489//console

This message is automatically generated.

 Multi-column family BulkLoad fails if compactions go on too long
 

 Key: HBASE-11368
 URL: https://issues.apache.org/jira/browse/HBASE-11368
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Qiang Tian
 Attachments: hbase-11368-0.98.5.patch, hbase11368-master.patch, 
 key_stacktrace_hbase10882.TXT, performance_improvement_verification_98.5.patch


 Compactions take a read lock.  If a multi-column family region, before bulk 
 loading, we want to take a write lock on the region.  If the compaction takes 
 too long, the bulk load fails.
 Various recipes include:
 + Making smaller regions (lame)
 + [~victorunique] suggests major compacting just before bulk loading over in 
 HBASE-10882 as a work around.
 Does the compaction need a read lock for that long?  Does the bulk load need 
 a full write lock when multiple column families?  Can we fail more gracefully 
 at least?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-25 Thread Qiang Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184066#comment-14184066
 ] 

Qiang Tian commented on HBASE-11368:


the attachments:
{{key_stacktrace_hbase10882.TXT}} : the problem stacktrace
{{hbase-11368-0.98.5.patch}} : the fix
{{performance_improvement_verification_98.5.patch}}: the testcase to verify 
performance improvement






[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-24 Thread Qiang Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182647#comment-14182647
 ] 

Qiang Tian commented on HBASE-11368:


Hi [~stack], [~apurtell],
any comments?
thanks!




[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183019#comment-14183019
 ] 

stack commented on HBASE-11368:
---

Is this meant to be in the patch?

LOG.info("###compaction get the closelock, sleep 20s to simulate slow compaction");
try {
  Thread.sleep(20000);
} catch (InterruptedException e) {
  LOG.info("###sleep interrupted");
}

What change did you make, [~tianq]?



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-23 Thread Qiang Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181168#comment-14181168
 ] 

Qiang Tian commented on HBASE-11368:


initial YCSB test:

Env:
---
hadoop 2.2.0
YCSB 1.0.4(Andrew's branch)
3 nodes, 1 master, 2 RS  //ignore cluster details since just to evaluate the 
new lock

Steps:
---
Followed Andrew's steps (see http://search-hadoop.com/m/DHED4hl7pC/).
The seed table has 3 CFs and is pre-split into 20 regions.
Load 1 million rows into CF 'f1' using workloada.
Run 3 iterations each of workloadc and workloada, with these parameters in 
each run:
bq. -p columnfamily=f1 -p operationcount=100 -s -threads 10


Results:
---
0.98.5:
workload c:
[READ], AverageLatency(us), 496.225811
[READ], AverageLatency(us), 510.206831
[READ], AverageLatency(us), 501.256123

workload a:
[READ], AverageLatency(us), 676.4527555821747
[READ], AverageLatency(us), 622.5544771452717
[READ], AverageLatency(us), 628.1365657163067


0.98.5+patch:
workload c:
[READ], AverageLatency(us), 536.334437
[READ], AverageLatency(us), 508.40
[READ], AverageLatency(us), 491.416182


workload a:
[READ], AverageLatency(us), 640.3625218319231
[READ], AverageLatency(us), 642.9719823488798
[READ], AverageLatency(us), 631.7491770928287
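
To compare the runs, the three samples per configuration can be averaged and expressed as a relative change. The sketch below does only that arithmetic on the numbers copied from the results above (the sample {{508.40}} is used exactly as it appears). The class and method names are made up for illustration and are not part of YCSB or HBase.

```java
import java.util.Arrays;

final class LatencySummary {
  /** Arithmetic mean of the given samples. */
  static double mean(double... samples) {
    return Arrays.stream(samples).average().orElse(Double.NaN);
  }

  /** Relative change of the patched mean versus the baseline mean, in percent. */
  static double percentChange(double baseline, double patched) {
    return (patched - baseline) / baseline * 100.0;
  }

  public static void main(String[] args) {
    // [READ] AverageLatency(us) samples copied from the results above.
    double baseC = mean(496.225811, 510.206831, 501.256123);
    double patchC = mean(536.334437, 508.40, 491.416182);
    double baseA = mean(676.4527555821747, 622.5544771452717, 628.1365657163067);
    double patchA = mean(640.3625218319231, 642.9719823488798, 631.7491770928287);

    System.out.printf("workload c: %.2f -> %.2f us (%+.2f%%)%n",
        baseC, patchC, percentChange(baseC, patchC));
    System.out.printf("workload a: %.2f -> %.2f us (%+.2f%%)%n",
        baseA, patchA, percentChange(baseA, patchA));
  }
}
```

With only three samples per configuration, differences of this size may well be within run-to-run noise.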

It looks like there is little performance penalty.

I also ran PE in the cluster. Since the test table has only 1 CF, the new lock 
is actually not used. Interestingly, with the patch the performance is even a 
bit better...



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-14 Thread Qiang Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170649#comment-14170649
 ] 

Qiang Tian commented on HBASE-11368:


It looks to me like the patch only shows its value when there is a long 
compaction plus gets/scans. Not sure if [~victorunique] wants to try it in some 
test env?
Thanks.




[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-10 Thread Qiang Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166466#comment-14166466
 ] 

Qiang Tian commented on HBASE-11368:


Update: 
the idea will cause a deadlock, since bulkload and scanner acquire the bulkload 
lock and StoreScanner.lock in different orders. I will look at whether we could 
lower the granularity of the StoreScanner lock.
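
The lock-order problem described above can be reproduced in isolation. The following is a minimal sketch, not HBase code: two plain {{ReentrantLock}} objects stand in for the bulkload lock and StoreScanner.lock, and all names are hypothetical. Each thread takes its first lock, waits until the other thread holds its own first lock, and then tries the second lock with a timeout. With a blocking {{lock()}} call instead of {{tryLock}}, both threads would wait forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

final class LockOrderSketch {
  /** Returns how many of the two threads failed to get their second lock. */
  static int runOpposingOrders() {
    ReentrantLock bulkLoadLock = new ReentrantLock(); // stand-in for the new bulkload lock
    ReentrantLock scannerLock = new ReentrantLock();  // stand-in for StoreScanner.lock
    CountDownLatch bothHoldFirst = new CountDownLatch(2);
    CountDownLatch bothTried = new CountDownLatch(2);
    AtomicInteger failures = new AtomicInteger();

    // Bulk load path: bulkload lock first, then scanner lock.
    Thread bulkLoad = new Thread(() ->
        acquireInOrder(bulkLoadLock, scannerLock, bothHoldFirst, bothTried, failures));
    // Scanner path: scanner lock first, then bulkload lock (the opposite order).
    Thread scanner = new Thread(() ->
        acquireInOrder(scannerLock, bulkLoadLock, bothHoldFirst, bothTried, failures));
    bulkLoad.start();
    scanner.start();
    try {
      bulkLoad.join();
      scanner.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return failures.get();
  }

  private static void acquireInOrder(ReentrantLock first, ReentrantLock second,
      CountDownLatch bothHoldFirst, CountDownLatch bothTried, AtomicInteger failures) {
    first.lock();
    try {
      bothHoldFirst.countDown();
      bothHoldFirst.await(); // both threads now hold their first lock
      if (second.tryLock(50, TimeUnit.MILLISECONDS)) {
        second.unlock();
      } else {
        failures.incrementAndGet(); // a blocking lock() here would deadlock
      }
      bothTried.countDown();
      bothTried.await(); // keep holding the first lock until the peer has tried too
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      first.unlock();
    }
  }

  public static void main(String[] args) {
    System.out.println("threads that could not get their second lock: "
        + runOpposingOrders());
  }
}
```

The usual remedies are to make both paths take the locks in the same order, or to narrow one lock so that it is never held while the other is requested, which is the direction suggested above.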




[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-09 Thread Qiang Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164957#comment-14164957
 ] 

Qiang Tian commented on HBASE-11368:


Thanks [~jinghe],
is this the right way to run the bulkload test? {{mvn test 
-Dtest=TestHRegionServerBulkLoad}}
The test is supposed to run for 5 minutes, but it exits after only about 
1 minute. Is that expected?



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-09 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165411#comment-14165411
 ] 

Jerry He commented on HBASE-11368:
--

{code}
  /**
   * Atomic bulk load.
   */
  @Test
  public void testAtomicBulkLoad() throws Exception {
    String TABLE_NAME = "atomicBulkLoad";

    int millisToRun = 30000;
{code}

This test case is 30 sec.

{code}
  /**
   * Run test on an HBase instance for 5 minutes. This assumes that the table
   * under test only has a single region.
   */
  public static void main(String args[]) throws Exception {
{code}

main is not invoked during JUnit run.



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-08 Thread Qiang Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163162#comment-14163162
 ] 

Qiang Tian commented on HBASE-11368:


Ideas for lowering the lock granularity (based on the 0.98.5 code base):
1) read/scan
Is it the primary goal of the atomic multi-CF bulkload in HBASE-4552?

After DefaultStoreFileManager#storefiles is updated in HStore#bulkLoadHFile, 
notifyChangedReadersObservers is called to reset the StoreScanner#heap, so 
checkReseek->resetScannerStack will be triggered in the next scan/read to 
recreate the store scanners based on the new storefiles.

So we could introduce a new region-level rwlock, multiCFLock. 
HRegion#bulkLoadHFiles acquires the write lock before the multi-CF 
HStore.bulkLoadHFile calls, and StoreScanner#resetScannerStack acquires the 
read lock. This way the scanners are recreated only after all CFs' store files 
are populated.

2) split region
The region will be closed in SplitTransaction#stepsBeforePONR, which falls into 
the area protected by HRegion#lock. Bulk load still needs to acquire its 
read lock at the start.

3) memstore flush
We flush to a new file, which is not related to the loaded files.

4) compaction
The compaction is performed store by store. If bulkload inserts new files into 
{{storefiles}} during the selectCompaction process, the list of files to be 
compacted might be affected, e.g. the compaction for some CFs might not include 
the newly loaded files while others might. But this does not impact data 
integrity or read behavior?
at the end of compaction,  {{storefiles}} access is still protected by 
HStore#lock if there is bulk load change to the same CF.
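
Idea 1) above can be sketched outside HBase as follows. This is only an illustration under stated assumptions: the class, its map of family name to file names, and the method names are hypothetical stand-ins, with {{multiCFLock}} being the only name taken from the comment. The point is that a scanner view rebuilt under the read lock sees either none or all of the files from one multi-CF load.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class MultiCfLockSketch {
  // Hypothetical region-level lock, named multiCFLock in the comment above.
  private final ReentrantReadWriteLock multiCFLock = new ReentrantReadWriteLock();
  // Column family name -> store file names; a stand-in for the per-store file lists.
  private final Map<String, List<String>> storeFiles = new HashMap<>();

  /** Adds one file per column family while holding the write lock. */
  void bulkLoad(Map<String, String> fileByFamily) {
    multiCFLock.writeLock().lock();
    try {
      for (Map.Entry<String, String> e : fileByFamily.entrySet()) {
        storeFiles.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
      }
    } finally {
      multiCFLock.writeLock().unlock();
    }
  }

  /** Rebuilds a scanner's view under the read lock, so it never sees a half-applied load. */
  Map<String, List<String>> resetScannerView() {
    multiCFLock.readLock().lock();
    try {
      Map<String, List<String>> view = new HashMap<>();
      storeFiles.forEach((family, files) -> view.put(family, new ArrayList<>(files)));
      return view;
    } finally {
      multiCFLock.readLock().unlock();
    }
  }

  public static void main(String[] args) {
    MultiCfLockSketch region = new MultiCfLockSketch();
    Map<String, String> load = new HashMap<>();
    load.put("f1", "hfile-a");
    load.put("f2", "hfile-b");
    region.bulkLoad(load);
    System.out.println(region.resetScannerView());
  }
}
```

Compaction does not touch this lock at all in the sketch, which is what would keep a long compaction from blocking the load.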

comments?
thanks


[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-08 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163697#comment-14163697
 ] 

Jerry He commented on HBASE-11368:
--

Hi, [~tianq]

The idea may be feasible:

Bulk load begins:  acquire region read lock + new bulk load write lock.
Scan/next begins:  acquire region read lock + new bulk load read lock.
Other region operations:  only acquire region read lock  

This will save the compaction and bulk load from blocking each other.
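
A minimal sketch of why this helps, using two {{ReentrantReadWriteLock}} instances as stand-ins (this is not HBase code, and every name in it is hypothetical). The main thread plays a long compaction holding the region read lock. A second thread then attempts a bulk load, first under the current scheme (region write lock) and then under the proposed one (region read lock plus bulk load write lock).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class RegionLockSketch {
  // Stand-in for HRegion#lock, which protects region close.
  final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
  // Hypothetical new lock shared only by bulk loads (write) and scans (read).
  final ReentrantReadWriteLock bulkLoadLock = new ReentrantReadWriteLock();

  /** Current scheme: a multi-CF bulk load needs the region write lock. */
  boolean tryBulkLoadOldScheme(long timeoutMs) throws InterruptedException {
    if (!regionLock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
      return false; // the case that surfaces as a failed bulk load
    }
    regionLock.writeLock().unlock();
    return true;
  }

  /** Proposed scheme: region read lock plus the bulk load write lock. */
  boolean tryBulkLoadNewScheme(long timeoutMs) throws InterruptedException {
    if (!regionLock.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
      return false;
    }
    try {
      if (!bulkLoadLock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
        return false;
      }
      bulkLoadLock.writeLock().unlock();
      return true;
    } finally {
      regionLock.readLock().unlock();
    }
  }

  /** Returns {old scheme succeeded, new scheme succeeded} while a "compaction" runs. */
  static boolean[] demo() {
    RegionLockSketch region = new RegionLockSketch();
    boolean[] result = new boolean[2];
    region.regionLock.readLock().lock(); // simulate a long-running compaction
    try {
      Thread loader = new Thread(() -> {
        try {
          result[0] = region.tryBulkLoadOldScheme(50);
          result[1] = region.tryBulkLoadNewScheme(50);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      loader.start();
      loader.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      region.regionLock.readLock().unlock();
    }
    return result;
  }

  public static void main(String[] args) {
    boolean[] r = demo();
    System.out.println("old scheme acquired: " + r[0] + ", new scheme acquired: " + r[1]);
  }
}
```

The sketch leaves out the scan path; as noted in the proposal, scans would take the region read lock plus the bulk load read lock, so they still wait for an in-flight bulk load but not for a compaction.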

Do you mind drafting a patch and run thru the test suite?





[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-08 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163705#comment-14163705
 ] 

Jerry He commented on HBASE-11368:
--

bq. Other region operations: only acquire region read lock
-- Other region operations: only acquire region read or write lock, no change 
from existing behavior.



[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-10-07 Thread Qiang Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163024#comment-14163024
 ] 

Qiang Tian commented on HBASE-11368:


As [~stack] mentioned in http://search-hadoop.com/m/DHED4NR0wT, the 
HRegion#lock is there to protect region close. The comments in HRegion.java, 
and the fact that only HRegion#doClose takes the write lock (if we do not 
consider HRegion#startBulkRegionOperation), also show that.

So using HRegion#lock to protect the multi-CF bulkload in HBASE-4552 looks too 
heavy-weight?
From the stacktrace in HBASE-10882, all the reads/scans are blocked because 
bulkload is waiting for lock.writeLock, while a compaction has already acquired 
lock.readLock and is reading data, a time-consuming operation.

and related topic is discussed again in http://search-hadoop.com/m/DHED4I11p31. 
perhaps we need another region level lock.


[jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long

2014-06-16 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033409#comment-14033409
 ] 

stack commented on HBASE-11368:
---

This is an old issue, 
http://search-hadoop.com/m/0AGoj1C1AXY/org.apache.hadoop.hbase.RegionTooBusyException%253A+failed+to+get+a+lock+in+6mssubj=Bulk+loading+HFiles+via+LoadIncrementalHFiles+fails+at+a+region+that+is+being+compacted+a+bug+
 
