[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2021-06-25 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17369296#comment-17369296
 ] 

Viraj Jasani commented on HBASE-23349:
--

{quote}Is the issue already fixed in 1.7.0, released on 2021/06/12? It is still 
marked as UNRESOLVED.
{quote}
This Jira is not yet resolved. Any Jira that is resolved is always marked 
"Resolved" with fix versions that indicate which HBase releases the Jira 
fix/improvement has landed on.

[~larry1285] how long have you been facing this issue? Are you getting this log 
for the same HFile over a long time, or for different HFiles? Are you using any 
custom coprocs that might be leaking refCounts? Which HBase version are you using?

> Low refCount preventing archival of compacted away files
> 
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-1, 2.3.0, 1.6.0
>Reporter: Viraj Jasani
>Priority: Major
>
> We have observed that a refCount as low as 1 on compacted-away store files is 
> preventing archival.
> {code:java}
> regionserver.HStore - Can't archive compacted file 
> hdfs://{{root-dir}}/hbase/data/default/t1/12a9e1112e0371955b3db8d3ebb2d298/cf1/73b72f5ddfce4a34a9e01afe7b83c1f9
>  because of either isCompactedAway=true or file has reference, 
> isReferencedInReads=true, refCount=1, skipping for now.
> {code}
> We should come up with core code (run as part of the discharger thread) that 
> gracefully resolves the reader lock issue by resetting ongoing scanners to 
> point to the new store files instead of the compacted-away store files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2021-06-24 Thread Chengliang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17369205#comment-17369205
 ] 

Chengliang commented on HBASE-23349:


Is the issue already fixed in 1.7.0  released on 2021/06/12? It is still marked 
as UNRESOLVED.
I faced exactly the same error.
{code:java}
regionserver.HStore - Can't archive compacted file 
hdfs://{{root-dir}}/hbase/data/default/t1/12a9e1112e0371955b3db8d3ebb2d298/cf1/73b72f5ddfce4a34a9e01afe7b83c1f9
 because of either isCompactedAway=true or file has reference, 
isReferencedInReads=true, refCount=1, skipping for now.{code}
Thank you so much for the clarification.

Best regards,
Larry

> Low refCount preventing archival of compacted away files
> 
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0-alpha-1, 2.3.0, 1.6.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>


[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-02-13 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036467#comment-17036467
 ] 

Andrew Kyle Purtell commented on HBASE-23349:
-

We are releasing 1.6.0 now because of HBASE-23825, so I am moving this to 1.7.0.

> Low refCount preventing archival of compacted away files
> 
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.7.0
>
>


[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-29 Thread ramkrishna.s.vasudevan (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026433#comment-17026433
 ] 

ramkrishna.s.vasudevan commented on HBASE-23349:


I will list out my points here:
-> Generally, the scanners created at the Region level for incoming user 
requests have a lease mechanism, which exists basically to release resources. 
Any long-running scan therefore has an end point: even if the user is not 
responsive or does not close the scan, the lease expires. In either case the 
scanner gets closed and the resources get released.
-> The case where this cannot happen is when CPs create their own scanners. If 
a CP scanner fails or does not release its resources, it may hold on to the 
underlying resources, and even lease recovery will not work.
-> One way to solve this is to have a lease mechanism for CP scans as well, so 
that we don't end up with scans staying alive for a long time.
-> Coming to the benefit of the ref-count-based mechanism: it solves the sync 
block issue that was occurring on every next() call. Beyond that, there are two 
other benefits.
  
  For the first benefit, please refer to these mailing list threads:
  
http://apache-hbase.679495.n3.nabble.com/Extremely-long-flush-times-td4104190.html

  
http://mail-archives.apache.org/mod_mbox/hbase-dev/201208.mbox/%3c6548f17059905b48b2a6f28ce3692baa0ce29...@oaexch4server.oa.oclc.org%3E
 (see the later part of it).

  We are helping readers carry on without being impacted by flushes and, 
conversely, flushes are not blocked by readers. This matters when the scans are 
heavy, either because filters are applied or because there are many 
deletes/versions to skip through.
  In such cases a non-synchronized read path always helps.

  The other benefit is that the current readers need not reset themselves, load 
the new files (which, after a compaction, may not be in the cache), reseek to 
the last fetched row, and then proceed with the scan. Obviously the next scan 
that comes in will have to read from the filesystem anyway and then load into 
the cache, but at least the ongoing scans are not impacted.

-> Finally, I would like to mention that in Phoenix-like cases, where a query 
reads a large amount of data with filtering applied while heavy writes are 
going on, we may well face the issues that the users reported on the mailing 
list.

I am fine if everybody agrees to revert the patch and put back the synchronized 
readers (or any other better solution). I mention this only because my comments 
might otherwise give the impression that I am in favour of the existing 
behaviour and against any changes to it.
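
To make the lease idea for CP scans concrete, here is a minimal sketch. It assumes a registry that tracks an expiry time per scanner and a periodic task that closes expired ones. All names (SketchScannerLeases, register, renew, expire) are hypothetical and are not HBase APIs; the lease handling HBase uses for client scanners is different and is not shown here.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical lease registry for coprocessor-created scanners.
// None of these names come from HBase.
final class SketchScannerLeases {
  private static final class Lease {
    final Runnable closer;   // releases the scanner's resources
    volatile long expiresAt; // absolute time in milliseconds

    Lease(Runnable closer, long expiresAt) {
      this.closer = closer;
      this.expiresAt = expiresAt;
    }
  }

  private final long leaseMillis;
  private final Map<String, Lease> leases = new ConcurrentHashMap<>();

  SketchScannerLeases(long leaseMillis) {
    this.leaseMillis = leaseMillis;
  }

  // Register a scanner together with the action that closes it.
  void register(String scannerId, Runnable closer, long nowMillis) {
    leases.put(scannerId, new Lease(closer, nowMillis + leaseMillis));
  }

  // Each next() call would renew the lease, so an active scan does not expire.
  void renew(String scannerId, long nowMillis) {
    Lease lease = leases.get(scannerId);
    if (lease != null) {
      lease.expiresAt = nowMillis + leaseMillis;
    }
  }

  // Close every scanner whose lease has run out; returns how many were closed.
  int expire(long nowMillis) {
    int closed = 0;
    Iterator<Lease> it = leases.values().iterator();
    while (it.hasNext()) {
      Lease lease = it.next();
      if (lease.expiresAt <= nowMillis) {
        it.remove();
        lease.closer.run();
        closed++;
      }
    }
    return closed;
  }

  public static void main(String[] args) {
    SketchScannerLeases leases = new SketchScannerLeases(60_000L);
    leases.register("cp-scan-1", () -> System.out.println("closed cp-scan-1"), 0L);
    leases.renew("cp-scan-1", 30_000L);         // lease now runs until 90,000 ms
    System.out.println(leases.expire(60_000L)); // nothing has expired yet
    System.out.println(leases.expire(90_000L)); // the lease ran out, scanner closed
  }
}
```

Time is passed in explicitly so the behaviour can be checked without sleeping. With something along these lines, a CP scanner that is abandoned without close() would still release its store file references once the lease runs out.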

> Low refCount preventing archival of compacted away files
> 
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
>


[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-27 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024863#comment-17024863
 ] 

Andrew Kyle Purtell commented on HBASE-23349:
-

Some regions are hot, with readers almost always active, and they also take 
enough writes to generate flush and compaction activity.



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-27 Thread ramkrishna.s.vasudevan (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024855#comment-17024855
 ] 

ramkrishna.s.vasudevan commented on HBASE-23349:


bq. Now the issue is that if there are readers active on a region always it will 
never be allowed to discharge compacted files
Yes, if readers are active the discharger will never come into play. Is there 
any specific reason why readers are always active? 
Are you suggesting we go back to the locking approach only? If not, then I at 
least prefer the approach here, where CPs can make the store scanner reset 
itself on compaction. 



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-27 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024520#comment-17024520
 ] 

Andrew Kyle Purtell commented on HBASE-23349:
-

And just to be clear we are seeing this issue in production so it is not only a 
theoretical concern. There really are regions in production where refcount is 
nonzero for so long that failure to discharge compacted files is a performance 
and operational issue, leading to incidents. 



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-27 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024517#comment-17024517
 ] 

Andrew Kyle Purtell commented on HBASE-23349:
-

Phoenix is a red herring now. 

There was some past issue with leaks which is why we added the metric to make 
ref counts visible. 

Now the issue is that if there are readers active on a region always it will 
never be allowed to discharge compacted files. That is an HBase level problem 
for certain. We looked at solutions and without bringing some locking back none 
of the solutions are safe. So here we are. 
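
The starvation described above can be illustrated with a small sketch. It assumes a discharger that archives a compacted-away file only when its reference count is zero, as the log message in the issue description suggests. SketchStoreFile and SketchDischarger are hypothetical stand-ins, not HBase classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a store file; not HBase's HStoreFile.
final class SketchStoreFile {
  final String name;
  private final AtomicInteger refCount = new AtomicInteger();
  private volatile boolean compactedAway;

  SketchStoreFile(String name) {
    this.name = name;
  }

  void scannerOpened() { refCount.incrementAndGet(); } // a reader starts using the file
  void scannerClosed() { refCount.decrementAndGet(); } // the reader releases it
  void markCompactedAway() { compactedAway = true; }   // compaction replaced the file

  // Archivable only once compaction has replaced the file and no reader holds it.
  boolean canArchive() {
    return compactedAway && refCount.get() == 0;
  }
}

// Hypothetical stand-in for the periodic discharger.
final class SketchDischarger {
  // Archives what it can, removes those files from the list, returns their names.
  static List<String> runOnce(List<SketchStoreFile> compactedFiles) {
    List<String> archived = new ArrayList<>();
    Iterator<SketchStoreFile> it = compactedFiles.iterator();
    while (it.hasNext()) {
      SketchStoreFile file = it.next();
      if (file.canArchive()) {
        archived.add(file.name);
        it.remove();
      } // otherwise the file is skipped for now and retried on the next run
    }
    return archived;
  }

  public static void main(String[] args) {
    SketchStoreFile hot = new SketchStoreFile("hot");
    SketchStoreFile cold = new SketchStoreFile("cold");
    hot.markCompactedAway();
    cold.markCompactedAway();
    hot.scannerOpened(); // a reader that stays active
    List<SketchStoreFile> files = new ArrayList<>(Arrays.asList(hot, cold));
    System.out.println(runOnce(files)); // only the file without readers is archived
    System.out.println(runOnce(files)); // the hot file is skipped again
  }
}
```

As long as at least one reader keeps a reference on the hot file, every run skips it, which is the situation this comment describes for regions whose readers never go idle.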



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-26 Thread ramkrishna.s.vasudevan (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024085#comment-17024085
 ] 

ramkrishna.s.vasudevan commented on HBASE-23349:


[~stack], [~larsh], [~vjasani]
Just my thoughts here. It seems this ref counting issue currently happens 
because Phoenix is not able to close the scanners properly in some cases. It is 
not coming from HBase itself. Correct me if I am wrong here, [~vjasani]. 
The next thing is that even recently a 1.3.0 user faced an issue with sync 
blocks, exactly the issue HBASE-13082 was trying to solve, because the user had 
rows to read with lots of deletes while frequent flushes/compactions were going 
on. For all such cases the non-synchronized approach will help.
Also, I believe some features, such as external compaction, may depend on or 
make use of this asynchronous way of removing the compacted files rather than 
doing it as part of the scan flow.
The code fix may become uglier if we keep adding another boolean and have two 
code paths: either do the scanner reset as part of scans, or just allow the 
scanner to continue as is and remove the files asynchronously. 



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-25 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023559#comment-17023559
 ] 

Anoop Sam John commented on HBASE-23349:


The ref counting way of lazily deleting compacted files was done primarily to 
avoid the synchronized blocks in the read path. On that Jira, the perf test 
results were with all such blocks removed, I believe. Later we had to fix an 
issue with memstore flush during reads, and as part of that a volatile boolean 
lookup came into the read path (during seek, next, etc.). I am not sure whether 
any perf reports were taken after that, and now we were trying to add another 
volatile boolean.
Now it might be good to do a perf test comparing the
current way (no sync blocks, but with a volatile boolean read) vs. a change 
back to the old way with sync blocks, minus ref counting.
Not sure how easy or difficult it will be.
[~vjasani] Do you have some bandwidth for this? If so, let's have a detailed 
offline discussion. I can explain the old way, from before the ref counting 
came in (if you wish).



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-23 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022411#comment-17022411
 ] 

Michael Stack commented on HBASE-23349:
---

[~anoop.hbase] / [~ram_krish] Any comments lads?

What are we going to do w/ this one? Seems nasty. The refcounting is nice. Would 
be a pity to undo it. I wonder if there are repercussions other than this 
issue's when the refcount doesn't go to zero.



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015365#comment-17015365
 ] 

Lars Hofhansl commented on HBASE-23349:
---

Minor nit: This is not lock coarsening. That was the failed attempt I made to 
reduce the frequency of taking memory barriers (the locks were almost never 
contended) by pushing the locking up the stack into the region scanner.
[~ram_krish] and [~anoop.hbase] then came up with an actual solution :), but 
that then required the reference counting.
Note that the numbers on HBASE-13082 were with the lock coarsening, not with 
reference counting.

At this point my concern is just about correctness and the issues we have seen 
with reference counting. It is generally very hard to retrofit reference 
counting into a large, complex system. Ram and Anoop did an awesome job! 
Perhaps HBase is just too complex to add this reliably.



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-13 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014683#comment-17014683
 ] 

Viraj Jasani commented on HBASE-23349:
--

Sure [~apurtell], I will get back to this in a few days, and yes, I agree that 
"the scope of change will be less and reviews will be smoother and risk will be 
lower".



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-13 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014488#comment-17014488
 ] 

Andrew Kyle Purtell commented on HBASE-23349:
-

If you can find an acceptable solution short of reintroducing locking it will 
go better for everyone because the scope of change will be less and reviews 
will be smoother and risk will be lower. So please have at it. That said, if we 
have reached the end of the road with the "storefile lock coarsening" then we 
need to recognize this and avoid the sunk cost fallacy.



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-10 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013344#comment-17013344
 ] 

Viraj Jasani commented on HBASE-23349:
--

I agree that we should consider removing the refCount system, based on your 
suggestions, [~apurtell] [~larsh].

However, I would like to give one chance to an approach that keeps both points:
 # Perf improvement as part of HBASE-13082
 # Scanner reset during compaction if required(config based)

I tried using a volatile enum (NONE, FLUSH, COMPACTION) instead of 2 volatile 
booleans in the Scanner.next() and seek() calls, so that perf does not degrade 
for normal scans. If archival is not happening correctly, we can then notify 
open scanners and reset the KV heap in the following next() or seek() call. 
Whether next() has to reset the KV heap is determined by the volatile enum 
value, which is set while notifying scanners.
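
A minimal sketch of this idea follows, assuming a single scan thread per scanner and a notifier that publishes the new file list before raising the signal. The names SketchSignal and SketchStoreScanner are hypothetical, and this is not the code from the linked PR; it only illustrates how one volatile read on the normal path can decide whether the heap needs a reset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical names; this is not the code from the linked PR.
enum SketchSignal { NONE, FLUSH, COMPACTION }

final class SketchStoreScanner {
  private volatile SketchSignal signal = SketchSignal.NONE;
  private volatile List<String> latestFiles; // published by the notifier
  private List<String> heapFiles;            // touched only by the scan thread
  private int heapResets;

  SketchStoreScanner(List<String> initialFiles) {
    this.heapFiles = new ArrayList<>(initialFiles);
    this.latestFiles = this.heapFiles;
  }

  // Called by the flush or compaction side after the file set has changed.
  void notifyChange(SketchSignal reason, List<String> newFiles) {
    latestFiles = new ArrayList<>(newFiles); // publish the files first
    signal = reason;                         // then raise the signal
  }

  // Called by the scan thread; returns the files the scan currently reads from.
  List<String> next() {
    if (signal != SketchSignal.NONE) { // one volatile read on the normal path
      signal = SketchSignal.NONE;      // clear first, so a later notification is not lost
      heapFiles = latestFiles;         // stands in for resetting the KV heap
      heapResets++;
    }
    return heapFiles;
  }

  int heapResets() {
    return heapResets;
  }

  public static void main(String[] args) {
    SketchStoreScanner scanner = new SketchStoreScanner(Arrays.asList("f1", "f2"));
    System.out.println(scanner.next()); // still reading the original files
    scanner.notifyChange(SketchSignal.COMPACTION, Arrays.asList("f3"));
    System.out.println(scanner.next()); // switched to the compacted file
  }
}
```

Once the scanner has switched to the new files it no longer needs the compacted-away ones, so in a real implementation their reference counts could drop and the discharger could archive them.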

See [https://github.com/apache/hbase/pull/939], which includes some tests for 
scanner reset during compaction and successful archival thereafter.

Since refCount has been present in HBase for some time, someone might have 
started building systems (alerting, recovery, etc.) on top of it. In fact, we 
have also built automatic region reopen on it, and other users might have built 
other use cases too.



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-10 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013255#comment-17013255
 ] 

Andrew Kyle Purtell commented on HBASE-23349:
-

{quote}
I think we should step back and remember why we have the ref counting in the 
first place. This came from a discussion started in HBASE-13082 and 
HBASE-10060, namely too much synchronization.

If any changes we make now needs new synchronization in the scanner.next(...) 
path we're back to where we started and in that case we should remove the ref 
counting and bring back the old notification and scanner switching we had 
before.
{quote}

I made a similar comment in an internal discussion yesterday. If we have to 
walk back the StoreScanner "lock coarsening" work, then let's not be afraid to 
do it. There is a nuanced decision we would have to make, but let's not be 
concerned about sunk costs. 



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-06 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008612#comment-17008612
 ] 

Viraj Jasani commented on HBASE-23349:
--

I have incorporated some review comments in the linked PR (#939). Please review.



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-01 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006485#comment-17006485
 ] 

Lars Hofhansl commented on HBASE-23349:
---

Sure.

[~ram_krish], [~anoop.hbase], FYI. I know you guys invested a lot of time in 
this. In light of the issues, I'm in favor of removing the refcounting code and 
restoring the old behavior. Let's have a discussion.

 



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2019-12-29 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004841#comment-17004841
 ] 

Viraj Jasani commented on HBASE-23349:
--

Thanks [~larsh] 

I will try to see where else refCounts are being used. Would it be better to 
do this in two phases? For now, we can bring back scanner notification for 
compaction, and then, if refCount usage is not that widespread, we can remove 
it as part of a separate Jira.



[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2019-12-22 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001958#comment-17001958
 ] 

Lars Hofhansl commented on HBASE-23349:
---

I think we should step back and remember why we have the ref counting in the 
first place. This came from a discussion started in HBASE-13082 and 
HBASE-10060, namely too much synchronization.

If any change we make now needs new synchronization in the scanner.next(...) 
path, we're back to where we started, and in that case we should remove the ref 
counting and bring back the old notification and scanner switching we had 
before.

My apologies that I triggered the original discussion and then completely 
dropped off (I worked on other stuff) when we attempted to fix it. Reference 
counting is bad (I've never seen it implemented successfully). If we can avoid 
it, we should; a bit of a performance drop is acceptable.

Long story short: if we bring back scanner notification, then let's get rid of 
ref counting completely.
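
To make the alternative concrete, here is a rough sketch of notification-based scanner switching with no reference counts. All names (`SketchStore`, `SketchScanner`, `SketchReadersObserver`) are hypothetical and this is not the code HBase had before; it only shows the general shape, assuming the store tells open scanners about the new file set and each scanner switches over on its next call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical callback; the name and signature are assumptions for this sketch.
interface SketchReadersObserver {
    void readersChanged(List<String> newFiles);
}

// Hypothetical store that tracks its current files and its open scanners.
final class SketchStore {
    private volatile List<String> files;
    private final List<SketchReadersObserver> observers = new CopyOnWriteArrayList<>();

    SketchStore(List<String> initialFiles) {
        this.files = Collections.unmodifiableList(new ArrayList<>(initialFiles));
    }

    List<String> currentFiles() { return files; }

    void register(SketchReadersObserver o) { observers.add(o); }

    void unregister(SketchReadersObserver o) { observers.remove(o); }

    // Publishes the post-compaction file set, then notifies every open scanner.
    void completeCompaction(List<String> newFiles) {
        files = Collections.unmodifiableList(new ArrayList<>(newFiles));
        for (SketchReadersObserver o : observers) {
            o.readersChanged(files);
        }
    }
}

// Hypothetical scanner that switches to the new files on its next call.
final class SketchScanner implements SketchReadersObserver, AutoCloseable {
    private final SketchStore store;
    private List<String> reading;
    private final AtomicReference<List<String>> pending = new AtomicReference<>();

    private SketchScanner(SketchStore store) { this.store = store; }

    // Registers before reading the file list so a concurrent compaction is not missed.
    static SketchScanner open(SketchStore store) {
        SketchScanner s = new SketchScanner(store);
        store.register(s);
        s.reading = store.currentFiles();
        return s;
    }

    @Override
    public void readersChanged(List<String> newFiles) { pending.set(newFiles); }

    // Returns the file set this call reads from, switching first if notified.
    List<String> next() {
        List<String> switched = pending.getAndSet(null);
        if (switched != null) {
            reading = switched;
        }
        return reading;
    }

    @Override
    public void close() { store.unregister(this); }
}
```

Note that even this sketch performs one atomic operation per `next()` call, which is exactly the kind of cost on the scan path that the discussion above is weighing against the ref counting approach.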

 
