[jira] [Commented] (HBASE-25505) ZK watcher threads are daemonized; reconsider

2021-01-13 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17264588#comment-17264588
 ] 

Lars Hofhansl commented on HBASE-25505:
---

In case someone wants to pick this up - as I said above - I tagged the thread
pool with the identifier of the watcher and then checked the hung thread, which
shows this to be a ZK watcher on behalf of the ReplicationLogCleaner.
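
For reference, here is a minimal sketch of that tagging trick (illustrative
names, not the actual patch): give the watcher's notification pool a thread
name that embeds the owning identifier, so a thread dump of a hung shutdown
immediately shows which component the non-daemon thread belongs to.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class NamedPoolSketch {
  // Build the ZK event pool with the watcher's identifier in the thread name.
  static ExecutorService newEventPool(String watcherIdentifier) {
    return Executors.newSingleThreadExecutor(
        r -> new Thread(r, "zk-event-processor-" + watcherIdentifier));
  }

  public static void main(String[] args) {
    // With "replicationLogCleaner" as the identifier, the hung thread shows
    // up in jstack as "zk-event-processor-replicationLogCleaner".
    ExecutorService pool = newEventPool("replicationLogCleaner");
    pool.submit(() -> System.out.println(Thread.currentThread().getName()));
    pool.shutdown();
  }
}
{code}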

 

> ZK watcher threads are daemonized; reconsider
> -
>
> Key: HBASE-25505
> URL: https://issues.apache.org/jira/browse/HBASE-25505
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Andrew Kyle Purtell
>Priority: Major
>
> On HBASE-25279 there was some discussion and difference of opinion about 
> having ZK watcher pool threads be daemonized. This is not necessarily a 
> problem but should be reconsidered. 
> Daemon threads are subject to abrupt termination during JVM shutdown and 
> therefore may be interrupted before state changes are complete or resources 
> are released. 
> As long as ZK watchers are properly closed by shutdown logic the pool threads 
> will be terminated in a controlled manner and the JVM will exit. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25279) Non-daemon thread in ZKWatcher

2021-01-13 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17264587#comment-17264587
 ] 

Lars Hofhansl commented on HBASE-25279:
---

I agree the failure to close the watcher is the bug. We're papering over the
actual problem, and I bet that nobody will fix HBASE-25505 :)

> Non-daemon thread in ZKWatcher
> --
>
> Key: HBASE-25279
> URL: https://issues.apache.org/jira/browse/HBASE-25279
> Project: HBase
>  Issue Type: Bug
>  Components: Zookeeper
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.1
>
>
> ZKWatcher spawns an ExecutorService which doesn't mark its threads as daemons 
> which will prevent clean shut downs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25279) Non-daemon thread in ZKWatcher

2021-01-13 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17264527#comment-17264527
 ] 

Lars Hofhansl commented on HBASE-25279:
---

I just came across this as well. 

Currently, on my 2.4.1 build, the HMaster still hangs upon shutdown. When I
annotated the thread names with the identifier of the ZKWatcher owning that
pool I see it's on behalf of the ReplicationLogCleaner. Following the
life-cycle of HFileLogCleaner, CleanerChore, and ScheduledChore I can't find
anything obviously wrong. (If setConf were called more than once the previous
ZKWatcher would not get closed, but that turned out not to be the problem.)
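
To illustrate the life-cycle hazard I ruled out, here is a sketch of that
double-setConf pattern. It assumes HBase's ZKWatcher(Configuration, String,
Abortable) constructor; the guard shown is what would have been missing if
this were the bug.

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.zookeeper.ZKWatcher;

public class SetConfSketch {
  private ZKWatcher zkw;

  public void setConf(Configuration conf) throws IOException {
    // Without this guard, a second setConf() call would orphan the first
    // watcher, and its event threads would linger past shutdown.
    if (this.zkw != null) {
      this.zkw.close();
    }
    this.zkw = new ZKWatcher(conf, "replicationLogCleaner", null);
  }
}
{code}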

[~apurtell] also tried to reproduce this, but could not.

 

[~elserj], [~vjasani] are you still seeing this problem?

> Non-daemon thread in ZKWatcher
> --
>
> Key: HBASE-25279
> URL: https://issues.apache.org/jira/browse/HBASE-25279
> Project: HBase
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 3.0.0-alpha-1, 2.4.1
>
>
> ZKWatcher spawns an ExecutorService which doesn't mark its threads as daemons 
> which will prevent clean shut downs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-08-02 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169628#comment-17169628
 ] 

Lars Hofhansl commented on HBASE-24742:
---

Re: Tests. This is purely an internal optimization of another optimization (no 
kidding) with no functional impact. For the SEEK optimization we have tests 
that assert the number of SEEKs vs SKIPs during scanning.

I cannot think of any useful additional tests. Lemme perhaps check if there
are SEEK vs SKIP tests with ROWCOL BFs enabled. Or [~bharathv], could you
perhaps have a look, as I'm off for the next few weeks.


> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.1, 1.7.0, 2.4.0, 2.1.10, 2.2.6
>
> Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24637) Reseek regression related to filter SKIP hinting

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159618#comment-17159618
 ] 

Lars Hofhansl commented on HBASE-24637:
---

I see the logic in UserScanQueryMatcher (in mergeFilterResponse()) has changed
to do exactly the logic I described above.

It tries to be smarter in the case where the Filter said SKIP but the SQM said 
SEEK.
In theory SEEK'ing is better, but it looks like it's causing exactly this 
change of behavior.
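
For readers following along, here is a simplified sketch of that merge. The
enum names mirror HBase's Filter.ReturnCode and ScanQueryMatcher.MatchCode,
but the method is an illustration, not the actual mergeFilterResponse() code.

{code:java}
public class MergeSketch {
  enum ReturnCode { INCLUDE, SKIP } // the filter's answer (subset shown)
  enum MatchCode { INCLUDE, SKIP, SEEK_NEXT_COL, SEEK_NEXT_ROW }

  static MatchCode merge(MatchCode sqmCode, ReturnCode filterResponse) {
    if (filterResponse == ReturnCode.SKIP) {
      // The filter only said "skip this cell", but if the matcher already
      // knows the rest of the column/row cannot match, it upgrades the SKIP
      // to the matcher's SEEK hint. In theory cheaper; in practice it floods
      // the scanner with SEEK hints, as discussed above.
      if (sqmCode == MatchCode.SEEK_NEXT_COL
          || sqmCode == MatchCode.SEEK_NEXT_ROW) {
        return sqmCode;
      }
      return MatchCode.SKIP;
    }
    return sqmCode;
  }
}
{code}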


> Reseek regression related to filter SKIP hinting
> 
>
> Key: HBASE-24637
> URL: https://issues.apache.org/jira/browse/HBASE-24637
> Project: HBase
>  Issue Type: Bug
>  Components: Filters, Performance, Scanners
>Affects Versions: 2.2.5
>Reporter: Andrew Kyle Purtell
>Priority: Major
> Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, 
> W-7665966-Instrument-low-level-scan-details-branch-1.patch, 
> W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, 
> parse_call_trace.pl
>
>
> I have been looking into reported performance regressions in HBase 2 relative 
> to HBase 1. Depending on the test scenario, HBase 2 can demonstrate 
> significantly better microbenchmarks in a number of cases, and usually shows 
> improvement in whole cluster benchmarks like YCSB.
> To assist in debugging I added methods to RpcServer for updating per-call 
> metrics that leverage the fact it puts a reference to the current Call into a 
> thread local and that all activity for a given RPC is processed by a single 
> thread context. I then instrumented ScanQueryMatcher (in branch-1) and its 
> various friends (in branch-2.2), StoreScanner, HFileReaderV2 and 
> HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, 
> and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables 
> with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per 
> row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 
> and 2.2 versions under test operated on identical data files in HDFS. For 
> tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to 
> ensure only the server side differed.
> The results for pe --filterAll were revealing. See attached. 
> It appears a refactor to ScanQueryMatcher and friends has disabled the 
> ability of filters to provide meaningful SKIP hints, which disables an 
> optimization that avoids reseeking, leading to a serious and proportional 
> regression in reseek activity and time spent in that code path. So for 
> queries that use filters, there can be a substantial regression.
> Other test cases that did not use filters did not show this regression. If 
> filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was 
> almost identical, as measured by counts of the hint types returned, whether 
> or not column or version trackers are called, and counts of store seeks or 
> reseeks. Regarding micro-timings, there was a 10% variance in my testing and 
> results generally fell within this range, except for the filter all case of 
> course. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159486#comment-17159486
 ] 

Lars Hofhansl commented on HBASE-24637:
---

Yep. And unfortunately whether a SEEK is an advantage depends on many factors.
If there are many versions then seeking to the next column or row is cheaper
than skipping, and having these hints enables that.
However, if there are few versions then SKIP'ing is better, and the
optimization can only figure out so much.
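
As a back-of-envelope illustration (the relative costs below are made up;
real costs depend on encoding, block size, etc.), the break-even point is
simply where stepping over cells costs more than one reseek:

{code:java}
public class SeekVsSkipCost {
  public static void main(String[] args) {
    final int skipCost = 1;  // assumed: decode + compare for one cell
    final int seekCost = 20; // assumed: block-index lookup + repositioning
    for (int versions : new int[] { 1, 3, 10, 50, 100 }) {
      int skipTotal = versions * skipCost; // skipping past all extra versions
      System.out.printf("versions=%3d skip=%3d seek=%3d -> %s%n",
          versions, skipTotal, seekCost,
          skipTotal > seekCost ? "SEEK wins" : "SKIP wins");
    }
  }
}
{code}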

I agree that we should restore the previous behavior. It pains me a bit, since
getting more information from the SQM is a good thing - looks like this was too
much of a good thing :) And it was likely just introduced by accident anyway.


> Filter SKIP hinting regression
> --
>
> Key: HBASE-24637
> URL: https://issues.apache.org/jira/browse/HBASE-24637
> Project: HBase
>  Issue Type: Bug
>  Components: Filters, Performance, Scanners
>Affects Versions: 2.2.5
>Reporter: Andrew Kyle Purtell
>Priority: Major
> Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, 
> W-7665966-Instrument-low-level-scan-details-branch-1.patch, 
> W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, 
> parse_call_trace.pl
>
>
> I have been looking into reported performance regressions in HBase 2 relative 
> to HBase 1. Depending on the test scenario, HBase 2 can demonstrate 
> significantly better microbenchmarks in a number of cases, and usually shows 
> improvement in whole cluster benchmarks like YCSB.
> To assist in debugging I added methods to RpcServer for updating per-call 
> metrics that leverage the fact it puts a reference to the current Call into a 
> thread local and that all activity for a given RPC is processed by a single 
> thread context. I then instrumented ScanQueryMatcher (in branch-1) and its 
> various friends (in branch-2.2), StoreScanner, HFileReaderV2 and 
> HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, 
> and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables 
> with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per 
> row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 
> and 2.2 versions under test operated on identical data files in HDFS. For 
> tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to 
> ensure only the server side differed.
> The results for pe --filterAll were revealing. See attached. 
> It appears a refactor to ScanQueryMatcher and friends has disabled the 
> ability of filters to provide meaningful SKIP hints, which disables an 
> optimization that avoids reseeking, leading to a serious and proportional 
> regression in reseek activity and time spent in that code path. So for 
> queries that use filters, there can be a substantial regression.
> Other test cases that did not use filters did not show this regression. If 
> filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was 
> almost identical, as measured by counts of the hint types returned, whether 
> or not column or version trackers are called, and counts of store seeks or 
> reseeks. Regarding micro-timings, there was a 10% variance in my testing and 
> results generally fell within this range, except for the filter all case of 
> course. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-24637) Filter SKIP hinting regression

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159477#comment-17159477
 ] 

Lars Hofhansl edited comment on HBASE-24637 at 7/16/20, 8:46 PM:
-

I see. SKIP is not a hint as such, though, it's the default. The hint (which
can be ignored) is the SEEK hint. Both are implemented with return codes.

Looks like the SQM is now marking transitions from Column to Column with a 
SEEK-to-next-column hint, and for each row with a SEEK-to-next-row.

Also looking at the numbers, the optimization I mentioned is turning the vast 
majority of SEEKs back into SKIPs (and that check is not free).

As I said, it's not wrong per se (need to look at the code more), but that does 
not mean that there isn't a performance regression - as I have described in the 
previous comment - that we need to fix, possibly by restoring the old behavior.

Edit: Grammar :)


was (Author: lhofhansl):
I see. SKIP is not a hint as such, though, it's the default. The hint (which 
can be ignore) is the SEEK hint. Both are implemented with return codes.

Looks like the SQM is now marking transitions from Column to Column with a 
SEEK-to-next-column hint, and for each row with a SEEK-to-next-row.

Also look at the numbers the optimization I mentioned is turning the vast 
majority back into SKIPs (and that check is not free).

As I said, it's not wrong per se (need to look at the code more), but that does 
not mean that there isn't a performance regression - as I have described in the 
previous comment - that we need to fix, possibly by restoring the old behavior.


> Filter SKIP hinting regression
> --
>
> Key: HBASE-24637
> URL: https://issues.apache.org/jira/browse/HBASE-24637
> Project: HBase
>  Issue Type: Bug
>  Components: Filters, Performance, Scanners
>Affects Versions: 2.2.5
>Reporter: Andrew Kyle Purtell
>Priority: Major
> Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, 
> W-7665966-Instrument-low-level-scan-details-branch-1.patch, 
> W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, 
> parse_call_trace.pl
>
>
> I have been looking into reported performance regressions in HBase 2 relative 
> to HBase 1. Depending on the test scenario, HBase 2 can demonstrate 
> significantly better microbenchmarks in a number of cases, and usually shows 
> improvement in whole cluster benchmarks like YCSB.
> To assist in debugging I added methods to RpcServer for updating per-call 
> metrics that leverage the fact it puts a reference to the current Call into a 
> thread local and that all activity for a given RPC is processed by a single 
> thread context. I then instrumented ScanQueryMatcher (in branch-1) and its 
> various friends (in branch-2.2), StoreScanner, HFileReaderV2 and 
> HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, 
> and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables 
> with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per 
> row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 
> and 2.2 versions under test operated on identical data files in HDFS. For 
> tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to 
> ensure only the server side differed.
> The results for pe --filterAll were revealing. See attached. 
> It appears a refactor to ScanQueryMatcher and friends has disabled the 
> ability of filters to provide meaningful SKIP hints, which disables an 
> optimization that avoids reseeking, leading to a serious and proportional 
> regression in reseek activity and time spent in that code path. So for 
> queries that use filters, there can be a substantial regression.
> Other test cases that did not use filters did not show this regression. If 
> filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was 
> almost identical, as measured by counts of the hint types returned, whether 
> or not column or version trackers are called, and counts of store seeks or 
> reseeks. Regarding micro-timings, there was a 10% variance in my testing and 
> results generally fell within this range, except for the filter all case of 
> course. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159480#comment-17159480
 ] 

Lars Hofhansl commented on HBASE-24637:
---

That is to say: a SEEK is a hint that can be turned into a series of SKIPs,
but a SKIP carries no extra information.
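
Here is a self-contained sketch of that downgrade, loosely modeled on the
StoreScanner-side optimization discussed in this thread (the types are
stand-ins, not HBase's): a SEEK hint is honored only when the seek target
lies beyond the current block's last indexed key; otherwise skipping within
the block is cheaper. The compare below is exactly the "not free" check
mentioned elsewhere.

{code:java}
import java.util.Comparator;

public class OptimizeSketch {
  enum MatchCode { SKIP, SEEK_NEXT_COL, SEEK_NEXT_ROW }

  // seekTarget stands for the key the hint would seek to (in HBase it is
  // derived from the current cell and the hint type); nextIndexedKey is the
  // first key of the next block, taken from the block index.
  static <K> MatchCode optimize(MatchCode qcode, K seekTarget,
      K nextIndexedKey, Comparator<K> cmp) {
    boolean isSeekHint =
        qcode == MatchCode.SEEK_NEXT_COL || qcode == MatchCode.SEEK_NEXT_ROW;
    if (isSeekHint && nextIndexedKey != null
        && cmp.compare(seekTarget, nextIndexedKey) <= 0) {
      // The target is still inside the current block: a reseek buys nothing,
      // so turn the SEEK hint back into a plain SKIP and read linearly.
      return MatchCode.SKIP;
    }
    return qcode;
  }
}
{code}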

> Filter SKIP hinting regression
> --
>
> Key: HBASE-24637
> URL: https://issues.apache.org/jira/browse/HBASE-24637
> Project: HBase
>  Issue Type: Bug
>  Components: Filters, Performance, Scanners
>Affects Versions: 2.2.5
>Reporter: Andrew Kyle Purtell
>Priority: Major
> Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, 
> W-7665966-Instrument-low-level-scan-details-branch-1.patch, 
> W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, 
> parse_call_trace.pl
>
>
> I have been looking into reported performance regressions in HBase 2 relative 
> to HBase 1. Depending on the test scenario, HBase 2 can demonstrate 
> significantly better microbenchmarks in a number of cases, and usually shows 
> improvement in whole cluster benchmarks like YCSB.
> To assist in debugging I added methods to RpcServer for updating per-call 
> metrics that leverage the fact it puts a reference to the current Call into a 
> thread local and that all activity for a given RPC is processed by a single 
> thread context. I then instrumented ScanQueryMatcher (in branch-1) and its 
> various friends (in branch-2.2), StoreScanner, HFileReaderV2 and 
> HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, 
> and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables 
> with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per 
> row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 
> and 2.2 versions under test operated on identical data files in HDFS. For 
> tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to 
> ensure only the server side differed.
> The results for pe --filterAll were revealing. See attached. 
> It appears a refactor to ScanQueryMatcher and friends has disabled the 
> ability of filters to provide meaningful SKIP hints, which disables an 
> optimization that avoids reseeking, leading to a serious and proportional 
> regression in reseek activity and time spent in that code path. So for 
> queries that use filters, there can be a substantial regression.
> Other test cases that did not use filters did not show this regression. If 
> filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was 
> almost identical, as measured by counts of the hint types returned, whether 
> or not column or version trackers are called, and counts of store seeks or 
> reseeks. Regarding micro-timings, there was a 10% variance in my testing and 
> results generally fell within this range, except for the filter all case of 
> course. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159477#comment-17159477
 ] 

Lars Hofhansl commented on HBASE-24637:
---

I see. SKIP is not a hint as such, though, it's the default. The hint (which
can be ignored) is the SEEK hint. Both are implemented with return codes.

Looks like the SQM is now marking transitions from Column to Column with a 
SEEK-to-next-column hint, and for each row with a SEEK-to-next-row.

Also, looking at the numbers, the optimization I mentioned is turning the vast
majority back into SKIPs (and that check is not free).

As I said, it's not wrong per se (need to look at the code more), but that does 
not mean that there isn't a performance regression - as I have described in the 
previous comment - that we need to fix, possibly by restoring the old behavior.


> Filter SKIP hinting regression
> --
>
> Key: HBASE-24637
> URL: https://issues.apache.org/jira/browse/HBASE-24637
> Project: HBase
>  Issue Type: Bug
>  Components: Filters, Performance, Scanners
>Affects Versions: 2.2.5
>Reporter: Andrew Kyle Purtell
>Priority: Major
> Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, 
> W-7665966-Instrument-low-level-scan-details-branch-1.patch, 
> W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, 
> parse_call_trace.pl
>
>
> I have been looking into reported performance regressions in HBase 2 relative 
> to HBase 1. Depending on the test scenario, HBase 2 can demonstrate 
> significantly better microbenchmarks in a number of cases, and usually shows 
> improvement in whole cluster benchmarks like YCSB.
> To assist in debugging I added methods to RpcServer for updating per-call 
> metrics that leverage the fact it puts a reference to the current Call into a 
> thread local and that all activity for a given RPC is processed by a single 
> thread context. I then instrumented ScanQueryMatcher (in branch-1) and its 
> various friends (in branch-2.2), StoreScanner, HFileReaderV2 and 
> HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, 
> and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables 
> with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per 
> row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 
> and 2.2 versions under test operated on identical data files in HDFS. For 
> tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to 
> ensure only the server side differed.
> The results for pe --filterAll were revealing. See attached. 
> It appears a refactor to ScanQueryMatcher and friends has disabled the 
> ability of filters to provide meaningful SKIP hints, which disables an 
> optimization that avoids reseeking, leading to a serious and proportional 
> regression in reseek activity and time spent in that code path. So for 
> queries that use filters, there can be a substantial regression.
> Other test cases that did not use filters did not show this regression. If 
> filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was 
> almost identical, as measured by counts of the hint types returned, whether 
> or not column or version trackers are called, and counts of store seeks or 
> reseeks. Regarding micro-timings, there was a 10% variance in my testing and 
> results generally fell within this range, except for the filter all case of 
> course. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159467#comment-17159467
 ] 

Lars Hofhansl commented on HBASE-24637:
---

Maybe I misunderstood the data in the pdf...? Looks like the SQM is hinting
SEEKs way more often in branch-2 than in branch-1.

> Filter SKIP hinting regression
> --
>
> Key: HBASE-24637
> URL: https://issues.apache.org/jira/browse/HBASE-24637
> Project: HBase
>  Issue Type: Bug
>  Components: Filters, Performance, Scanners
>Affects Versions: 2.2.5
>Reporter: Andrew Kyle Purtell
>Priority: Major
> Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, 
> W-7665966-Instrument-low-level-scan-details-branch-1.patch, 
> W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, 
> parse_call_trace.pl
>
>
> I have been looking into reported performance regressions in HBase 2 relative 
> to HBase 1. Depending on the test scenario, HBase 2 can demonstrate 
> significantly better microbenchmarks in a number of cases, and usually shows 
> improvement in whole cluster benchmarks like YCSB.
> To assist in debugging I added methods to RpcServer for updating per-call 
> metrics that leverage the fact it puts a reference to the current Call into a 
> thread local and that all activity for a given RPC is processed by a single 
> thread context. I then instrumented ScanQueryMatcher (in branch-1) and its 
> various friends (in branch-2.2), StoreScanner, HFileReaderV2 and 
> HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, 
> and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables 
> with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per 
> row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 
> and 2.2 versions under test operated on identical data files in HDFS. For 
> tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to 
> ensure only the server side differed.
> The results for pe --filterAll were revealing. See attached. 
> It appears a refactor to ScanQueryMatcher and friends has disabled the 
> ability of filters to provide meaningful SKIP hints, which disables an 
> optimization that avoids reseeking, leading to a serious and proportional 
> regression in reseek activity and time spent in that code path. So for 
> queries that use filters, there can be a substantial regression.
> Other test cases that did not use filters did not show this regression. If 
> filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was 
> almost identical, as measured by counts of the hint types returned, whether 
> or not column or version trackers are called, and counts of store seeks or 
> reseeks. Regarding micro-timings, there was a 10% variance in my testing and 
> results generally fell within this range, except for the filter all case of 
> course. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-24637) Filter SKIP hinting regression

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159459#comment-17159459
 ] 

Lars Hofhansl edited comment on HBASE-24637 at 7/16/20, 8:31 PM:
-

Hmm... The SQM giving more precise SEEK hints is not necessarily wrong. It's a 
hint that a SEEK is *possible*.

The SKIP vs SEEK optimization I put in place a while ago then decides at the
StoreScanner level whether to follow that hint or not. Now, that optimization
itself is not free; it adds 1 or 2 extra compares.

In HBASE-24742 I managed to remove one compare in most cases. So it might be
better now, but it's still not good if we issue too many SEEK hints, for each
of which we then have to decide whether to follow it or not.



was (Author: lhofhansl):
Hmm... The SQM giving more precise SEEK hints is not necessarily wrong. It's a 
hint that a SEEK is *possible*.

With the SKIP vs SEEK optimization I put in place a while ago then decides at 
the StoreScanner to follow that hint or not. Now, that itself optimization is 
not free, it adds one compare per Cell-version + 1 or 2 extra compares (# 
versions + 1 or 2 in total).

In HBASE-24742 I managed to remove one compare in most cases. So it might 
better now, but it's still not good if we issue too many SEEK hints, for each 
of which we then have to decide to follow it or not.


> Filter SKIP hinting regression
> --
>
> Key: HBASE-24637
> URL: https://issues.apache.org/jira/browse/HBASE-24637
> Project: HBase
>  Issue Type: Bug
>  Components: Filters, Performance, Scanners
>Affects Versions: 2.2.5
>Reporter: Andrew Kyle Purtell
>Priority: Major
> Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, 
> W-7665966-Instrument-low-level-scan-details-branch-1.patch, 
> W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, 
> parse_call_trace.pl
>
>
> I have been looking into reported performance regressions in HBase 2 relative 
> to HBase 1. Depending on the test scenario, HBase 2 can demonstrate 
> significantly better microbenchmarks in a number of cases, and usually shows 
> improvement in whole cluster benchmarks like YCSB.
> To assist in debugging I added methods to RpcServer for updating per-call 
> metrics that leverage the fact it puts a reference to the current Call into a 
> thread local and that all activity for a given RPC is processed by a single 
> thread context. I then instrumented ScanQueryMatcher (in branch-1) and its 
> various friends (in branch-2.2), StoreScanner, HFileReaderV2 and 
> HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, 
> and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables 
> with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per 
> row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 
> and 2.2 versions under test operated on identical data files in HDFS. For 
> tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to 
> ensure only the server side differed.
> The results for pe --filterAll were revealing. See attached. 
> It appears a refactor to ScanQueryMatcher and friends has disabled the 
> ability of filters to provide meaningful SKIP hints, which disables an 
> optimization that avoids reseeking, leading to a serious and proportional 
> regression in reseek activity and time spent in that code path. So for 
> queries that use filters, there can be a substantial regression.
> Other test cases that did not use filters did not show this regression. If 
> filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was 
> almost identical, as measured by counts of the hint types returned, whether 
> or not column or version trackers are called, and counts of store seeks or 
> reseeks. Regarding micro-timings, there was a 10% variance in my testing and 
> results generally fell within this range, except for the filter all case of 
> course. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159459#comment-17159459
 ] 

Lars Hofhansl commented on HBASE-24637:
---

Hmm... The SQM giving more precise SEEK hints is not necessarily wrong. It's a 
hint that a SEEK is *possible*.

The SKIP vs SEEK optimization I put in place a while ago then decides at the
StoreScanner level whether to follow that hint or not. Now, that optimization
itself is not free; it adds one compare per Cell version plus 1 or 2 extra
compares (# versions + 1 or 2 in total).

In HBASE-24742 I managed to remove one compare in most cases. So it might be
better now, but it's still not good if we issue too many SEEK hints, for each
of which we then have to decide whether to follow it or not.


> Filter SKIP hinting regression
> --
>
> Key: HBASE-24637
> URL: https://issues.apache.org/jira/browse/HBASE-24637
> Project: HBase
>  Issue Type: Bug
>  Components: Filters, Performance, Scanners
>Affects Versions: 2.2.5
>Reporter: Andrew Kyle Purtell
>Priority: Major
> Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, 
> W-7665966-Instrument-low-level-scan-details-branch-1.patch, 
> W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, 
> parse_call_trace.pl
>
>
> I have been looking into reported performance regressions in HBase 2 relative 
> to HBase 1. Depending on the test scenario, HBase 2 can demonstrate 
> significantly better microbenchmarks in a number of cases, and usually shows 
> improvement in whole cluster benchmarks like YCSB.
> To assist in debugging I added methods to RpcServer for updating per-call 
> metrics that leverage the fact it puts a reference to the current Call into a 
> thread local and that all activity for a given RPC is processed by a single 
> thread context. I then instrumented ScanQueryMatcher (in branch-1) and its 
> various friends (in branch-2.2), StoreScanner, HFileReaderV2 and 
> HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, 
> and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables 
> with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per 
> row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 
> and 2.2 versions under test operated on identical data files in HDFS. For 
> tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to 
> ensure only the server side differed.
> The results for pe --filterAll were revealing. See attached. 
> It appears a refactor to ScanQueryMatcher and friends has disabled the 
> ability of filters to provide meaningful SKIP hints, which disables an 
> optimization that avoids reseeking, leading to a serious and proportional 
> regression in reseek activity and time spent in that code path. So for 
> queries that use filters, there can be a substantial regression.
> Other test cases that did not use filters did not show this regression. If 
> filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was 
> almost identical, as measured by counts of the hint types returned, whether 
> or not column or version trackers are called, and counts of store seeks or 
> reseeks. Regarding micro-timings, there was a 10% variance in my testing and 
> results generally fell within this range, except for the filter all case of 
> course. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-24742.
---
Resolution: Fixed

Also pushed to branch-2 and master.

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0
>
> Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-24742:
--
Attachment: 24742-master.txt

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0
>
> Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159444#comment-17159444
 ] 

Lars Hofhansl commented on HBASE-24742:
---

Master (and branch-2) patch. It will just apply, as it's the same as the
branch-1 patch.

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0
>
> Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-24742:
--
Fix Version/s: 2.4.0
   3.0.0-alpha-1

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0
>
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-24742:
---

Lemme put this into branch-2 and master as well.

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 1.7.0
>
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159385#comment-17159385
 ] 

Lars Hofhansl edited comment on HBASE-24742 at 7/16/20, 7:50 PM:
-

Merged into branch-1.

I'll look into master/branch-2, but my feeling is that things are quite 
different there.

[~apurtell] (before you yell at me for not looking at branch-2/master) :)


was (Author: lhofhansl):
Merged into branch-1.

I'll look into master/branch-2, but my feeling is that things are quite 
different there.

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 1.7.0
>
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-24742.
---
Resolution: Fixed

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 1.7.0
>
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159385#comment-17159385
 ] 

Lars Hofhansl commented on HBASE-24742:
---

Merged into branch-1.

I'll look into master/branch-2, but my feeling is that things are quite 
different there.

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 1.7.0
>
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-24742:
--
Fix Version/s: 1.7.0

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 1.7.0
>
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-15 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17158497#comment-17158497
 ] 

Lars Hofhansl commented on HBASE-24742:
---

Created a PR for observation #1 above.

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157895#comment-17157895
 ] 

Lars Hofhansl commented on HBASE-24742:
---

Upon second thought: perhaps a seek (not a reseek, but a seek that could
actually go backwards) could make it so that previousIndexedKey and
nextIndexedKey are accidentally the same, and we *still* would have to do the
compare.

In my test the majority of the improvement came from the first change.


> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157890#comment-17157890
 ] 

Lars Hofhansl commented on HBASE-24742:
---

[~zhangduo] I'm with you there. The second part was harder to reason about, and
I feel a bit less easy about it.
In the end it's an optimization to save a comparison; previousIndexedKey and
nextIndexedKey will never accidentally be the same (as in identical), so I
*think* it should be OK.

I'm not sure how we can avoid passing the fake keys up, since that is how it is
designed to handle things in the "upper" heap. At least not without a lot of
refactoring.

[~apurtell] I'll take a look at HBASE-24637 (I'm a bit thinly spread, though).


> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157774#comment-17157774
 ] 

Lars Hofhansl edited comment on HBASE-24742 at 7/15/20, 12:43 AM:
--

There are two observations:
1. We do not need to check for "fake" keys inserted by the ROWCOL BF logic if
there are no ROWCOL BFs (or if they are not used) - a sketch follows below.
2. We can extend the identity-compare of the nextIndexedKey across multiple
calls. It's just an optimization and not needed for correctness.
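
A sketch of #1, under the assumption that only the ROWCOL bloom filter code
path ever seeds such a fake key (the names are illustrative, not the patch's):

{code:java}
public class FakeKeyCheckSketch {
  private final boolean rowColBloomInPlay; // decided once per scan setup

  FakeKeyCheckSketch(boolean rowColBloomInPlay) {
    this.rowColBloomInPlay = rowColBloomInPlay;
  }

  boolean isFakeKey(Object cell) {
    // Only a ROWCOL bloom filter can have inserted a fake key, so when none
    // is in use this check is hoisted out of the tight per-cell scan loop.
    if (!rowColBloomInPlay) {
      return false;
    }
    return checkFakeKeyMarker(cell); // the (not free) per-cell inspection
  }

  private boolean checkFakeKeyMarker(Object cell) {
    return false; // stand-in for the real marker test
  }
}
{code}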



was (Author: lhofhansl):
There are two observations:
1. We do not need to check for "fake" keys inserted by the ROWCOL BF logic if 
there are not ROWCOL BFs (or if they are not used)
2. We can extend the identify compare of the nextIndexedKey across multiple 
calls. It's just an optimization and not for correctness.


> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157774#comment-17157774
 ] 

Lars Hofhansl edited comment on HBASE-24742 at 7/15/20, 12:42 AM:
--

There are two observations:
1. We do not need to check for "fake" keys inserted by the ROWCOL BF logic if
there are no ROWCOL BFs (or if they are not used)
2. We can extend the identity compare of the nextIndexedKey across multiple
calls. It's just an optimization and not needed for correctness.



was (Author: lhofhansl):
There are two observations:
1. We do not need for "fake" keys inserted by the ROWCOL BF logic if there are 
not ROWCOL BFs (or if they are not used)
2. We can extend the identify compare of the nextIndexedKey across multiple 
calls. It's just an optimization and not for correctness.


> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157779#comment-17157779
 ] 

Lars Hofhansl edited comment on HBASE-24742 at 7/15/20, 12:42 AM:
--

Passes the test added in HBASE-19863 and brings the runtime of a test Phoenix 
query from 5.8s to 4.2s.
(This is for a fully compacted table and VERSIONS=1, which represents the worst 
case, where the two linked jiras triple the number of comparisons per Cell).

I'll post a PR tomorrow.


was (Author: lhofhansl):
Passes the test added in HBASE-19863 and brings the runtime of a test Phoenix 
query from 5.8s to 4.2s.
(This is for a fully compacted table, which represents the worst case, where 
the two linked jiras triple the number of comparisons per Cell).

I'll post a PR tomorrow.

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157779#comment-17157779
 ] 

Lars Hofhansl edited comment on HBASE-24742 at 7/15/20, 12:40 AM:
--

Passes the test added in HBASE-19863 and brings the runtime of a test Phoenix 
query from 5.8s to 4.2s.
(This is for a fully compacted table, which represents the worst case, where 
the two linked jiras triple the number of comparisons per Cell).

I'll post a PR tomorrow.


was (Author: lhofhansl):
Passes the test added in HBASE-19863 and brings the runtime of a test Phoenix 
query from 5.8s to 4.2s.


> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157779#comment-17157779
 ] 

Lars Hofhansl commented on HBASE-24742:
---

Passes the test added in HBASE-19863 and brings the runtime of a test Phoenix 
query from 5.8s to 4.2s.


> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reassigned HBASE-24742:
-

Assignee: Lars Hofhansl

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157775#comment-17157775
 ] 

Lars Hofhansl commented on HBASE-24742:
---

Here's a patch.
Please have a careful look, especially at the part that turns 
previousIndexedKey into a member.


> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Priority: Major
> Attachments: hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-24742:
--
Attachment: hbase-24742-branch-1.txt

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Priority: Major
> Attachments: hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157774#comment-17157774
 ] 

Lars Hofhansl commented on HBASE-24742:
---

There are two observations:
1. We do not need to check for "fake" keys inserted by the ROWCOL BF logic if 
there are no ROWCOL BFs (or if they are not used).
2. We can extend the identity compare of the nextIndexedKey across multiple 
calls. It's just an optimization and not needed for correctness.
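
For illustration, a minimal sketch of the idea behind observation 2 (not the 
attached patch; the class and field names are mine). Because the 
nextIndexedKey reference only changes when the scanner crosses an index block 
boundary, an identity check (==) lets us reuse the last byte-wise comparison 
result instead of recomputing it for every cell:
{code:java}
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellComparator;

public final class IndexedKeyCompareCache {
  private Cell previousIndexedKey;  // boundary key seen on the last call
  private Cell previousQueryKey;    // key it was compared against last time
  private int previousResult;       // cached comparison outcome

  public int compare(CellComparator comparator, Cell queryKey, Cell nextIndexedKey) {
    if (nextIndexedKey == previousIndexedKey && queryKey == previousQueryKey) {
      return previousResult;  // same objects as last time: skip the compare
    }
    previousIndexedKey = nextIndexedKey;
    previousQueryKey = queryKey;
    previousResult = comparator.compare(queryKey, nextIndexedKey);
    return previousResult;
  }
}
{code}
The cache invalidates itself implicitly: as soon as the scan moves into the 
next index block, nextIndexedKey is a different object and the comparison 
reruns.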


> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Priority: Major
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)
Lars Hofhansl created HBASE-24742:
-

 Summary: Improve performance of SKIP vs SEEK logic
 Key: HBASE-24742
 URL: https://issues.apache.org/jira/browse/HBASE-24742
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
slowdown in scanning scenarios.

We tracked it back to HBASE-17958 and HBASE-19863.
Both add comparisons to one of the tightest loops HBase has.

[~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015365#comment-17015365
 ] 

Lars Hofhansl commented on HBASE-23349:
---

Minor nit: This is not lock coarsening. That was the failed attempt I made to 
reduce the frequency of taking memory barriers (the locks were almost never 
contended) by pushing the locking up the stack into the region scanner.
[~ram_krish] and [~anoop.hbase] then came up with an actual solution :), but 
that then required the reference counting.
Note that the numbers on HBASE-13082 were with the lock coarsening, not with 
reference counting.

At this point my concern is just about correctness and the issues we have seen 
with reference counting. It is generally very hard to retrofit reference 
counting into a large, complex system. Ram and Anoop did an awesome job! 
Perhaps HBase is just too complex to add this reliably.

> Low refCount preventing archival of compacted away files
> 
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
>
> We have observed that a refCount as low as 1 on compacted-away store files is 
> preventing archival.
> {code:java}
> regionserver.HStore - Can't archive compacted file 
> hdfs://{{root-dir}}/hbase/data/default/t1/12a9e1112e0371955b3db8d3ebb2d298/cf1/73b72f5ddfce4a34a9e01afe7b83c1f9
>  because of either isCompactedAway=true or file has reference, 
> isReferencedInReads=true, refCount=1, skipping for now.
> {code}
> We should come up with core code (run as part of the discharger thread) to 
> gracefully resolve the reader lock issue by resetting ongoing scanners to 
> point to new store files instead of compacted-away store files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23602) TTL Before Which No Data is Purged

2020-01-08 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011414#comment-17011414
 ] 

Lars Hofhansl commented on HBASE-23602:
---

When you set a TTL and KEEP_DELETED_CELLS=TTL and *MIN_VERSIONS* you get that.

Now HBase will keep everything (up to VERSIONS) until the TTL expires; after 
that it keeps MIN_VERSIONS.

At least that's what I had in mind when I added MIN_VERSIONS and 
KEEP_DELETED_CELLS to HBase back in the day. Granted it's a bit convoluted, but 
pretty flexible this way.

Say VERSIONS=MAX_INT, TTL=5days, KEEP_DELETED_CELLS=TTL, MIN_VERSIONS=2. Now 
within 5 days you have everything - all Puts, all Deletes, etc, and you can do 
correct point-in-time queries. After 5 days HBase retains 2 versions only.
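
For concreteness, here is one way that combination could be expressed with the 
HBase 2.x client API (a sketch; the family name "cf" and the class name are 
just for illustration):
{code:java}
import org.apache.hadoop.hbase.KeepDeletedCells;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class PointInTimeRetention {
  // Within 5 days everything is retained (point-in-time reads work);
  // once the TTL has expired, 2 versions per cell are kept.
  public static ColumnFamilyDescriptor descriptor() {
    return ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("cf"))
        .setMaxVersions(Integer.MAX_VALUE)          // VERSIONS = MAX_INT
        .setTimeToLive(5 * 24 * 60 * 60)            // TTL = 5 days, in seconds
        .setKeepDeletedCells(KeepDeletedCells.TTL)  // KEEP_DELETED_CELLS = TTL
        .setMinVersions(2)                          // MIN_VERSIONS = 2
        .build();
  }
}
{code}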

> TTL Before Which No Data is Purged
> --
>
> Key: HBASE-23602
> URL: https://issues.apache.org/jira/browse/HBASE-23602
> Project: HBase
>  Issue Type: New Feature
>Reporter: Geoffrey Jacoby
>Assignee: Geoffrey Jacoby
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
>
> HBase currently offers operators a choice. They can set 
> KEEP_DELETED_CELLS=true and VERSIONS to max value, plus no TTL, and they will 
> always have a complete history of all changes (but high storage costs and 
> penalties to read performance). Or they can have KEEP_DELETED_CELLS=false and 
> VERSIONS/TTL set to some reasonable values, but that means that major 
> compactions can destroy the ability to do a consistent snapshot read of any 
> prior time. (This limits the usefulness and correctness of, for example, 
> Phoenix's SCN lookback feature.) 
> I propose having a new TTL property to give a minimum age that an expired or 
> deleted Cell would have to achieve before it could be purged. (I see that 
> HBASE-10118 already does something similar for the delete markers 
> themselves.) 
> This would allow operators to have a consistent history for some finite 
> amount of recent time while still purging out the "long tail" of obsolete / 
> deleted versions. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23602) TTL Before Which No Data is Purged

2020-01-08 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011089#comment-17011089
 ] 

Lars Hofhansl commented on HBASE-23602:
---

See HBASE-12363 for a (looong) discussion.

> TTL Before Which No Data is Purged
> --
>
> Key: HBASE-23602
> URL: https://issues.apache.org/jira/browse/HBASE-23602
> Project: HBase
>  Issue Type: New Feature
>Reporter: Geoffrey Jacoby
>Assignee: Geoffrey Jacoby
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
>
> HBase currently offers operators a choice. They can set 
> KEEP_DELETED_CELLS=true and VERSIONS to max value, plus no TTL, and they will 
> always have a complete history of all changes (but high storage costs and 
> penalties to read performance). Or they can have KEEP_DELETED_CELLS=false and 
> VERSIONS/TTL set to some reasonable values, but that means that major 
> compactions can destroy the ability to do a consistent snapshot read of any 
> prior time. (This limits the usefulness and correctness of, for example, 
> Phoenix's SCN lookback feature.) 
> I propose having a new TTL property to give a minimum age that an expired or 
> deleted Cell would have to achieve before it could be purged. (I see that 
> HBASE-10118 already does something similar for the delete markers 
> themselves.) 
> This would allow operators to have a consistent history for some finite 
> amount of recent time while still purging out the "long tail" of obsolete / 
> deleted versions. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23602) TTL Before Which No Data is Purged

2020-01-08 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17011087#comment-17011087
 ] 

Lars Hofhansl commented on HBASE-23602:
---

You can set KEEP_DELETED_CELLS=TTL (it can be set to true, false, or TTL) and 
get what you want, I think.

> TTL Before Which No Data is Purged
> --
>
> Key: HBASE-23602
> URL: https://issues.apache.org/jira/browse/HBASE-23602
> Project: HBase
>  Issue Type: New Feature
>Reporter: Geoffrey Jacoby
>Assignee: Geoffrey Jacoby
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
>
> HBase currently offers operators a choice. They can set 
> KEEP_DELETED_CELLS=true and VERSIONS to max value, plus no TTL, and they will 
> always have a complete history of all changes (but high storage costs and 
> penalties to read performance). Or they can have KEEP_DELETED_CELLS=false and 
> VERSIONS/TTL set to some reasonable values, but that means that major 
> compactions can destroy the ability to do a consistent snapshot read of any 
> prior time. (This limits the usefulness and correctness of, for example, 
> Phoenix's SCN lookback feature.) 
> I propose having a new TTL property to give a minimum age that an expired or 
> deleted Cell would have to achieve before it could be purged. (I see that 
> HBASE-10118 already does something similar for the delete markers 
> themselves.) 
> This would allow operators to have a consistent history for some finite 
> amount of recent time while still purging out the "long tail" of obsolete / 
> deleted versions. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2020-01-01 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006485#comment-17006485
 ] 

Lars Hofhansl commented on HBASE-23349:
---

Sure.

[~ram_krish], [~anoop.hbase], FYI. I know you guys invested a lot of time in 
this. In light of the issues I'm in favor of removing the refcounting code and 
restoring the old behavior. Let's have a discussion.

 

> Low refCount preventing archival of compacted away files
> 
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
>
> We have observed that a refCount as low as 1 on compacted-away store files is 
> preventing archival.
> {code:java}
> regionserver.HStore - Can't archive compacted file 
> hdfs://{{root-dir}}/hbase/data/default/t1/12a9e1112e0371955b3db8d3ebb2d298/cf1/73b72f5ddfce4a34a9e01afe7b83c1f9
>  because of either isCompactedAway=true or file has reference, 
> isReferencedInReads=true, refCount=1, skipping for now.
> {code}
> We should come up with core code (run as part of the discharger thread) to 
> gracefully resolve the reader lock issue by resetting ongoing scanners to 
> point to new store files instead of compacted-away store files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-6970) hbase-deamon.sh creates/updates pid file even when that start failed.

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-6970.
--
Resolution: Won't Fix

> hbase-deamon.sh creates/updates pid file even when that start failed.
> -
>
> Key: HBASE-6970
> URL: https://issues.apache.org/jira/browse/HBASE-6970
> Project: HBase
>  Issue Type: Bug
>  Components: Usability
>Reporter: Lars Hofhansl
>Priority: Major
>
We just ran into a strange issue where we could neither start nor stop services 
with hbase-daemon.sh.
> The problem is this:
> {code}
> nohup nice -n $HBASE_NICENESS "$HBASE_HOME"/bin/hbase \
> --config "${HBASE_CONF_DIR}" \
> $command "$@" $startStop > "$logout" 2>&1 < /dev/null &
> echo $! > $pid
> {code}
> So the pid file is created or updated even when the start of the service 
> failed. The next stop command will then fail, because the pid file has the 
> wrong pid in it.
> Edit: Spelling and more spelling errors.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-9272) A parallel, unordered scanner

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-9272.
--
Resolution: Won't Fix

> A parallel, unordered scanner
> -
>
> Key: HBASE-9272
> URL: https://issues.apache.org/jira/browse/HBASE-9272
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Priority: Minor
> Attachments: 9272-0.94-v2.txt, 9272-0.94-v3.txt, 9272-0.94-v4.txt, 
> 9272-0.94.txt, 9272-trunk-v2.txt, 9272-trunk-v3.txt, 9272-trunk-v3.txt, 
> 9272-trunk-v4.txt, 9272-trunk.txt, ParallelClientScanner.java, 
> ParallelClientScanner.java
>
>
> The contract of ClientScanner is to return rows in sort order. That limits 
> the order in which regions can be scanned.
> I propose a simple ParallelScanner that does not have this requirement and 
> queries regions in parallel, returning whatever gets returned first.
> This is generally useful for scans that filter a lot of data on the server, 
> or in cases where the client can very quickly react to the returned data.
> I have a simple prototype (it doesn't do error handling right, and might be a 
> bit heavy on the synchronization side - it uses a BlockingQueue to hand data 
> between the client using the scanner and the threads doing the scanning; it 
> also could potentially starve some scanners long enough to time out at the 
> server).
> On the plus side, it's only about 130 lines of code. :)
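
For reference, a minimal sketch of the approach (not the attached prototype; 
the range construction, queue capacity, and names are illustrative):
{code:java}
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

// Sketch: scan several (e.g. region-aligned) ranges in parallel and hand
// results to the caller in arrival order rather than sort order.
public class ParallelUnorderedScan {
  public static void scan(Connection conn, TableName table, List<Scan> ranges,
      Consumer<Result> consumer) throws InterruptedException {
    BlockingQueue<Result> queue = new LinkedBlockingQueue<>(1024);
    ExecutorService pool = Executors.newFixedThreadPool(ranges.size());
    for (Scan range : ranges) {
      pool.execute(() -> {
        try (Table t = conn.getTable(table);
            ResultScanner scanner = t.getScanner(range)) {
          for (Result r : scanner) {
            queue.put(r);  // blocks if the consumer falls behind
          }
        } catch (Exception e) {
          e.printStackTrace();  // sketch only; real code needs error handling
        } finally {
          try {
            queue.put(Result.EMPTY_RESULT);  // sentinel: this range is done
          } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
          }
        }
      });
    }
    for (int done = 0; done < ranges.size(); ) {
      Result r = queue.take();
      if (r == Result.EMPTY_RESULT) {
        done++;
      } else {
        consumer.accept(r);
      }
    }
    pool.shutdown();
  }
}
{code}
As the description notes, a bounded queue like this can stall a producer long 
enough for its scanner to time out on the server; the prototype had the same 
caveat.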



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-13751) Refactoring replication WAL reading logic as WAL Iterator

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-13751.
---
Resolution: Won't Fix

> Refactoring replication WAL reading logic as WAL Iterator
> -
>
> Key: HBASE-13751
> URL: https://issues.apache.org/jira/browse/HBASE-13751
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> The current replication code is all over the place.
> A simple refactoring that we could consider is to factor out the part that 
> reads from the WALs. Could be a simple iterator interface with one additional 
> wrinkle: The iterator needs to be able to provide the position (file and 
> offset) of the last read edit.
> Once we have this, we can use it as a building block for many other changes 
> in the replication code.
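
A hedged sketch of what that contract could look like (the interface name and 
position accessors are illustrative, not an existing HBase API; WAL.Entry is 
the existing type):
{code:java}
import java.io.Closeable;
import java.io.IOException;
import org.apache.hadoop.hbase.wal.WAL;

// Sketch only: iterate over replication-relevant WAL entries while always
// being able to report the position of the last edit handed out, so
// replication progress can be checkpointed and resumed.
public interface ReplicationWALIterator extends Closeable {
  boolean hasNext() throws IOException;

  WAL.Entry next() throws IOException;

  // Position (file and byte offset) of the last entry returned by next().
  String currentFile();

  long currentOffset();
}
{code}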



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-14014) Explore row-by-row grouping options

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14014.
---
Resolution: Won't Fix

> Explore row-by-row grouping options
> ---
>
> Key: HBASE-14014
> URL: https://issues.apache.org/jira/browse/HBASE-14014
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Reporter: Lars Hofhansl
>Priority: Major
>
> See discussion in parent.
> We need to consider the following attributes of WALKey:
> * The cluster ids
> * Table Name
> * write time (here we could use the latest of any batch)
> * seqNum
> As long as we preserve these we can rearrange the cells between WALEdits. 
> Since seqNum is unique this will be a challenge. Currently it is not used, 
> but we shouldn't design anything that prevents us from providing better 
> ordering guarantees using seqNum.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-14509) Configurable sparse indexes?

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14509.
---
Resolution: Won't Fix

> Configurable sparse indexes?
> 
>
> Key: HBASE-14509
> URL: https://issues.apache.org/jira/browse/HBASE-14509
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> This idea just popped up today and I wanted to record it for discussion:
> What if we kept sparse column indexes per region or HFile or per configurable 
> range?
> I.e. For any given CQ we record the lowest and highest value for a particular 
> range (HFile, Region, or a custom range like the Phoenix guide post).
> By tweaking the size of these ranges we can control the size of the index, vs 
> its selectivity.
> For example if we kept it by HFile we can almost instantly decide whether we 
> need to scan a particular HFile at all to find a particular value in a Cell.
> We can also collect min/max values for each n MB of data, for example when we 
> scan the region the first time. Assuming ranges are large enough we can always 
> keep the index in memory together with the region.
> Kind of a sparse local index. Might be much easier than the buddy region stuff 
> we've been discussing.
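
For illustration, a sketch of the min/max bookkeeping, assuming ranges are 
keyed by HFile name (all names are mine, not HBase code):
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch only: per range (HFile, region, or guide-post interval) we remember
// the lowest and highest value seen for a column qualifier. A lookup can then
// skip any range whose [min, max] interval excludes the wanted value.
public class SparseMinMaxIndex {
  private static final class MinMax {
    final byte[] min;
    final byte[] max;
    MinMax(byte[] min, byte[] max) { this.min = min; this.max = max; }
  }

  private final Map<String, MinMax> byRange = new ConcurrentHashMap<>();

  // Recorded when the range is first scanned (or rewritten by a compaction).
  public void record(String range, byte[] min, byte[] max) {
    byRange.put(range, new MinMax(min, max));
  }

  // True if the range may contain the value and therefore must be scanned.
  public boolean mightContain(String range, byte[] value) {
    MinMax mm = byRange.get(range);
    if (mm == null) {
      return true;  // no index information: be conservative
    }
    return Bytes.compareTo(value, mm.min) >= 0
        && Bytes.compareTo(value, mm.max) <= 0;
  }
}
{code}
Tweaking the range granularity then trades index size against selectivity, as 
described above.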



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files

2019-12-22 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001958#comment-17001958
 ] 

Lars Hofhansl commented on HBASE-23349:
---

I think we should step back and remember why we have the ref counting in the 
first place. This came from a discussion started in HBASE-13082 and 
HBASE-10060, namely too much synchronization.

If any changes we make now need new synchronization in the scanner.next(...) 
path we're back to where we started, and in that case we should remove the ref 
counting and bring back the old notification and scanner switching we had 
before.

My apologies that I had triggered the original discussion, and then completely 
dropped off (worked on other stuff) when we attempted to fix it. Reference 
counting is bad (I've never seen it successfully implemented); if we can avoid 
it we should, even if a bit of performance drop is the price.

Long story short: if we bring back scanner notification then let's get rid of 
ref counting completely.

 

> Low refCount preventing archival of compacted away files
> 
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
>
> We have observed that a refCount as low as 1 on compacted-away store files is 
> preventing archival.
> {code:java}
> regionserver.HStore - Can't archive compacted file 
> hdfs://{{root-dir}}/hbase/data/default/t1/12a9e1112e0371955b3db8d3ebb2d298/cf1/73b72f5ddfce4a34a9e01afe7b83c1f9
>  because of either isCompactedAway=true or file has reference, 
> isReferencedInReads=true, refCount=1, skipping for now.
> {code}
> We should come up with core code (run as part of the discharger thread) to 
> gracefully resolve the reader lock issue by resetting ongoing scanners to 
> point to new store files instead of compacted-away store files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-05 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-23364.
---
Fix Version/s: 1.6.0
   2.3.0
   3.0.0
   Resolution: Fixed

Committed to branch-1, branch-2, and master.

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
> Attachments: 23364-branch-1.txt
>
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other references to the locked objects in the stack dump, 
> but didn't find any.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-04 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988540#comment-16988540
 ] 

Lars Hofhansl commented on HBASE-23364:
---

Thanks for looking, [~vjasani].

I'll fix the long line on commit, and commit to the above-mentioned branches.

 

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: 23364-branch-1.txt
>
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other references to the locked objects in the stack dump, 
> but didn't find any.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-04 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988456#comment-16988456
 ] 

Lars Hofhansl commented on HBASE-23364:
---

Here's a simple patch. Seems to fix the problem. Please have a look.

This is a problem in branch-1, branch-2, and master, but not on any of the 
other branches.

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: 23364-branch-1.txt
>
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other references to the locked objects in the stack dump, 
> but didn't find any.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-04 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reassigned HBASE-23364:
-

Assignee: Lars Hofhansl

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: 23364-branch-1.txt
>
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other references to the locked objects in the stack dump, 
> but didn't find any.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-04 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-23364:
--
Attachment: 23364-branch-1.txt

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Lars Hofhansl
>Priority: Major
> Attachments: 23364-branch-1.txt
>
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other references to the locked objects in the stack dump, 
> but didn't find any.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-04 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-23364:
--
Affects Version/s: 2.3.0
   3.0.0

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Lars Hofhansl
>Priority: Major
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other references to the locked objects in the stack dump, 
> but didn't find any.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-04 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-23364:
--
Affects Version/s: 1.6.0

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.6.0
>Reporter: Lars Hofhansl
>Priority: Major
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other references to the locked objects in the stack dump, 
> but didn't find any.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-03 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987585#comment-16987585
 ] 

Lars Hofhansl edited comment on HBASE-23364 at 12/4/19 6:54 AM:


So back to the theory above. I think I tracked it down to HBASE-23210.

[~apurtell], FYI.

I think it's this change:
{code:java}
 +executor = Executors.newSingleThreadExecutor({code}
That causes the problem. That executor neither has a daemon thread factory, nor 
is it shut down (as far as I can see from looking briefly).

I'll look more tomorrow and provide a fix - if I get time.
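
For reference, one possible shape of the fix (a sketch, not the committed 
change; the helper and thread name are illustrative): give the executor a 
daemon thread factory, in addition to shutting it down properly on stop.
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DaemonExecutors {
  // A daemon thread cannot keep the JVM alive, so even a leaked pool no
  // longer blocks shutdown; an explicit shutdown() remains the clean fix.
  static ExecutorService newDaemonSingleThreadExecutor(String name) {
    return Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, name);
      t.setDaemon(true);
      return t;
    });
  }
}
{code}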


was (Author: lhofhansl):
So back to the theory above. I think I tracked it down to HBASE-23210.

[~apurtell], FYI.

I think it's this change:
{code:java}
 +executor = Executors.newSingleThreadExecutor({code}
That causes the problem. That executor neither has a daemon thread factory, nor 
is it shut down (as far as I can see from looking briefly).

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Priority: Major
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other references to the locked objects in the stack dump, 
> but didn't find any.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23210) Backport HBASE-15519 (Add per-user metrics) to branch-1

2019-12-03 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987587#comment-16987587
 ] 

Lars Hofhansl commented on HBASE-23210:
---

See HBASE-23364 ... I think this causes the region server to "hang" upon 
shutdown.

> Backport HBASE-15519 (Add per-user metrics) to branch-1
> ---
>
> Key: HBASE-23210
> URL: https://issues.apache.org/jira/browse/HBASE-23210
> Project: HBase
>  Issue Type: New Feature
>Reporter: Andrew Kyle Purtell
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 1.6.0
>
>
> We will need HBASE-15519 in branch-1 for eventual backport of HBASE-23065.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-03 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987585#comment-16987585
 ] 

Lars Hofhansl edited comment on HBASE-23364 at 12/4/19 6:50 AM:


So back to the theory above. I think I tracked it down to HBASE-23210.

[~apurtell], FYI.

I think it's this change:
{code:java}
 +executor = Executors.newSingleThreadExecutor({code}
That causes the problem. That executor neither has a daemon thread factory, nor 
is it shut down (as far as I can see from looking briefly).


was (Author: lhofhansl):
So back to the theory above. I think I tracked it down to HBASE-23210.

 

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Priority: Major
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other references to the locked objects in the stack dump, 
> but didn't find any.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-03 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-23364:
--
Description: 
Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
to HBase.



I noticed this only recently. Latest build from HBase's branch-1 and latest 
build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix or 
an HBase issue.

Just filing it here for later reference.

jstack shows this thread as the only non-daemon thread:
{code:java}
"pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
waiting on condition [0x7f213ad68000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00058eafece8> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748){code}
No other information. Somebody created a thread pool somewhere and forgot to 
set the threads to daemon or is not shutting down the pool properly.

Edit: I looked for other references to the locked objects in the stack dump, but 
didn't find any.

 

 

  was:
I noticed this only recently. Latest build from HBase's branch-1 and latest 
build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix or 
an HBase issue.

Just filing it here for later reference.

jstack shows this thread as the only non-daemon thread:
{code:java}
"pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
waiting on condition [0x7f213ad68000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00058eafece8> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748){code}
No other information. Somebody created a thread pool somewhere and forgot to 
set the threads to daemon or is not shutting down the pool properly.

Edit: I looked for other references to the locked objects in the stack dump, but 
didn't find any.

 

 


> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Priority: Major
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at 

[jira] [Moved] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-03 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl moved PHOENIX-5579 to HBASE-23364:


Key: HBASE-23364  (was: PHOENIX-5579)
Project: HBase  (was: Phoenix)

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Priority: Major
>
> I noticed this only recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack shows this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other references to the locked objects in the stack dump, 
> but didn't find any.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HBASE-22457) Harden the HBase HFile reader reference counting

2019-12-02 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reassigned HBASE-22457:
-

Assignee: (was: Lars Hofhansl)

> Harden the HBase HFile reader reference counting
> 
>
> Key: HBASE-22457
> URL: https://issues.apache.org/jira/browse/HBASE-22457
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
> Attachments: 22457-random-1.5.txt
>
>
> The problem is that any coprocessor hook that replaces a passed scanner 
> without closing it can cause an incorrect reference count.
> This was bad and wrong before of course, but now it has pretty bad 
> consequences, since an incorrect reference count will prevent HFiles from 
> being archived indefinitely.
> All hooks that are passed a scanner and return a scanner are suspect, since 
> the returned scanner may or may not close the passed scanner:
> * preCompact
> * preCompactScannerOpen
> * preFlush
> * preFlushScannerOpen
> * preScannerOpen
> * preStoreScannerOpen
> * preStoreFileReaderOpen...? (not sure about this one, it could mess with the 
> reader)
> I sampled the Phoenix and also Tephra code, and found a few instances where 
> this is happening.
> And for those I filed issues: TEPHRA-300, PHOENIX-5291
> (We're not using Tephra)
> The Phoenix ones should be rare. In our case we are seeing readers with 
> refCount > 1000.
> Perhaps there are other issues, e.g. a path where not all exceptions are 
> caught and a scanner is left open that way. (Generally I am not a fan of 
> reference counting in complex systems - it's too easy to miss something. But 
> that's a different discussion. :) ).
> Let's brainstorm some way in which we can harden this.
> [~ram_krish], [~anoop.hbase], [~apurtell]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23279) Switch default block encoding to ROW_INDEX_V1

2019-11-23 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16980983#comment-16980983
 ] 

Lars Hofhansl commented on HBASE-23279:
---

Patch looks good generally. Are the changes to the tests (keeping NONE as the 
encoding) required to have them pass? That would be a bit scary.

Also the size difference is unexpectedly high... I guess it depends on the size 
of the keys relative to the total size of the Cells; in my tests I've seen 
about 3%. I tested with Phoenix:

{{CREATE TABLE  (pk INTEGER PRIMARY key, v1 FLOAT, v2 FLOAT, v3 
INTEGER)}}

So there would be 3 cells per "row" with 4 bytes as the row key, each with a 4 
byte value. I'd expect that to be a pretty bad case for row indexing. Are those 
heap sizes, file sizes, or bucket cache sizes?


> Switch default block encoding to ROW_INDEX_V1
> -
>
> Key: HBASE-23279
> URL: https://issues.apache.org/jira/browse/HBASE-23279
> Project: HBase
>  Issue Type: Wish
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Lars Hofhansl
>Assignee: Viraj Jasani
>Priority: Minor
> Fix For: 3.0.0, 2.3.0
>
> Attachments: HBASE-23279.master.000.patch, 
> HBASE-23279.master.001.patch, HBASE-23279.master.002.patch, 
> HBASE-23279.master.003.patch
>
>
> Currently we set both block encoding and compression to NONE.
> ROW_INDEX_V1 has many advantages and (almost) no disadvantages (the hfiles 
> are slightly larger, about 3% or so). I think that would be a better default 
> than NONE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23309) Add support in ChainWalEntryFilter to filter Entry if all cells get filtered through WalCellFilter

2019-11-21 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979690#comment-16979690
 ] 

Lars Hofhansl commented on HBASE-23309:
---

Let me ask a more radical question: Is it a bug to return an empty WALEdit 
after all Cells have been removed? In other words should we just change the 
behavior?
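
For concreteness, a sketch of what that changed behavior could look like in 
ChainWalEntryFilter#filter, mirroring the snippet quoted below (assuming 
Entry#getEdit and WALEdit#getCells; illustrative, not a committed change):
{code:java}
@Override
public Entry filter(Entry entry) {
  for (WALEntryFilter filter : filters) {
    if (entry == null) {
      return null;
    }
    entry = filter.filter(entry);
  }
  filterCells(entry);
  // Sketch of the proposed change: once every cell has been filtered out,
  // drop the whole entry instead of shipping an empty WALEdit.
  if (entry != null && entry.getEdit().getCells().isEmpty()) {
    return null;
  }
  return entry;
}
{code}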

> Add support in ChainWalEntryFilter to filter Entry if all cells get filtered 
> through WalCellFilter
> --
>
> Key: HBASE-23309
> URL: https://issues.apache.org/jira/browse/HBASE-23309
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.3.6, 2.3.3
>Reporter: Sandeep Pal
>Assignee: Sandeep Pal
>Priority: Major
> Attachments: HBASE-23309.branch-1.patch, HBASE-23309.branch-2.patch, 
> HBASE-23309.patch
>
>
> ChainWalEntryFilter applies the filter on the entry followed by the filter on 
> cells. If the filter on cells removes all the cells from the entry, we should 
> add an option in ChainWalEntryFilter to filter out the entry as well.
> Here is the snippet for the ChainWalEntryFilter filter. After filterCells we 
> should check whether any cells remain in the entry.
> {code:java}
> @Override
> public Entry filter(Entry entry) {
>  for (WALEntryFilter filter : filters) {
>  if (entry == null) {
>  return null;
>  }
>  entry = filter.filter(entry);
>  }
>  filterCells(entry);
>  return entry;
> }{code}
> Custom replication endpoints may use this flag.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-23279) Switch default block encoding to ROW_INDEX_V1

2019-11-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974596#comment-16974596
 ] 

Lars Hofhansl edited comment on HBASE-23279 at 11/15/19 2:44 AM:
-

Thanks all.

So based on the discussion here ... Is it already a problem if I went and 
changed the block encoding to anything other than NONE? Most block encodings 
(like FAST_DIFF, etc) will decrease the size, but even there, there are 
abnormal cases where the size might be increased.

Or in other words any encoding (or compression) will cause the actual size of a 
block to not be a constant.

Other cases are large key values. The block is extended at the end to hold the 
last key value, right?

NM: Read the above again. I think we do not have to change the formula. How 
much bigger the index encoded file is depends on the type of the data.


was (Author: lhofhansl):
Thanks all.

So based on the discussion here ... Is it already a problem if I went and 
changed the block encoding to anything other than NONE? Most block encodings 
(like FAST_DIFF, etc) will decrease the size, but even there, there are 
abnormal cases where the size might be increased.

Or in other words any encoding (or compression) will cause the actual size of a 
block to not be a constant.

Other cases are large key values. The block is extended at the end to hold the 
last key value, right?

> Switch default block encoding to ROW_INDEX_V1
> -
>
> Key: HBASE-23279
> URL: https://issues.apache.org/jira/browse/HBASE-23279
> Project: HBase
>  Issue Type: Wish
>Reporter: Lars Hofhansl
>Assignee: Viraj Jasani
>Priority: Minor
>
> Currently we set both block encoding and compression to NONE.
> ROW_INDEX_V1 has many advantages and (almost) no disadvantages (the hfiles 
> are slightly larger, about 3% or so). I think that would be a better default 
> than NONE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-23279) Switch default block encoding to ROW_INDEX_V1

2019-11-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974596#comment-16974596
 ] 

Lars Hofhansl edited comment on HBASE-23279 at 11/14/19 8:56 PM:
-

Thanks all.

So based on the discussion here ... Is it already a problem if I went and 
changed the block encoding to anything other than NONE? Most block encodings 
(like FAST_DIFF, etc) will decrease the size, but even there, there are 
abnormal cases where the size might be increased.

Or in other words any encoding (or compression) will cause the actual size of a 
block to not be a constant.

Other cases are large key values. The block is extended at the end to hold the 
last key value, right?


was (Author: lhofhansl):
Thanks all.

So based on the discussion here ... Is it already a problem if I went and changed 
the block encoding to anything other than NONE? Most block encodings (like 
FAST_DIFF, etc) will decrease the size, but even there there are abnormal cases 
where the size might be increased.

> Switch default block encoding to ROW_INDEX_V1
> -
>
> Key: HBASE-23279
> URL: https://issues.apache.org/jira/browse/HBASE-23279
> Project: HBase
>  Issue Type: Wish
>Reporter: Lars Hofhansl
>Assignee: Viraj Jasani
>Priority: Minor
>
> Currently we set both block encoding and compression to NONE.
> ROW_INDEX_V1 has many advantages and (almost) no disadvantages (the hfiles 
> are slightly larger, about 3% or so). I think that would be a better default 
> than NONE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23279) Switch default block encoding to ROW_INDEX_V1

2019-11-14 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974596#comment-16974596
 ] 

Lars Hofhansl commented on HBASE-23279:
---

Thanks all.

So based on the discussion here ... Is it already a problem if I went and changed 
the block encoding to anything other than NONE? Most block encodings (like 
FAST_DIFF, etc) will decrease the size, but even there there are abnormal cases 
where the size might be increased.

> Switch default block encoding to ROW_INDEX_V1
> -
>
> Key: HBASE-23279
> URL: https://issues.apache.org/jira/browse/HBASE-23279
> Project: HBase
>  Issue Type: Wish
>Reporter: Lars Hofhansl
>Assignee: Viraj Jasani
>Priority: Minor
>
> Currently we set both block encoding and compression to NONE.
> ROW_INDEX_V1 has many advantages and (almost) no disadvantages (the hfiles 
> are slightly larger, about 3% or so). I think that would be a better default 
> than NONE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23279) Switch default block encoding to ROW_INDEX_V1

2019-11-11 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-23279:
--
Issue Type: Wish  (was: Improvement)

> Switch default block encoding to ROW_INDEX_V1
> -
>
> Key: HBASE-23279
> URL: https://issues.apache.org/jira/browse/HBASE-23279
> Project: HBase
>  Issue Type: Wish
>Reporter: Lars Hofhansl
>Priority: Minor
>
> Currently we set both block encoding and compression to NONE.
> ROW_INDEX_V1 has many advantages and (almost) no disadvantages (the hfiles 
> are slightly larger, about 3% or so). I think that would be a better default 
> than NONE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23279) Switch default block encoding to ROW_INDEX_V1

2019-11-11 Thread Lars Hofhansl (Jira)
Lars Hofhansl created HBASE-23279:
-

 Summary: Switch default block encoding to ROW_INDEX_V1
 Key: HBASE-23279
 URL: https://issues.apache.org/jira/browse/HBASE-23279
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl


Currently we set both block encoding and compression to NONE.

ROW_INDEX_V1 has many advantages and (almost) no disadvantages (the hfiles are 
slightly larger, about 3% or so). I think that would be a better default than NONE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-23240) branch-1 master and regionservers do not start when compiled against Hadoop 3.2.1

2019-11-09 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970942#comment-16970942
 ] 

Lars Hofhansl edited comment on HBASE-23240 at 11/9/19 8:40 PM:


Yeah I had a brief look but ran out of time for this.

I guess Hadoop is not as strict as we are w.r.t. backwards compatibility in 
minor and patch releases.
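
For context, a minimal, self-contained illustration of what I suspect is going 
on (the classic Guava version skew; this is my assumption, not code from either 
project):

{code:java}
import com.google.common.base.Preconditions;

// Compile this against a recent Guava (as Hadoop 3.2.x is), then run it with an
// older Guava on the classpath (as HBase branch-1 ships): the exact
// checkArgument(boolean, String, Object) overload referenced at the call site
// is missing at runtime, producing the NoSuchMethodError seen in the stack
// trace below.
public class GuavaSkewDemo {
  public static void main(String[] args) {
    Preconditions.checkArgument(args != null, "args must not be null: %s", (Object) args);
    System.out.println("a new-enough Guava is on the classpath");
  }
}
{code}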


was (Author: lhofhansl):
Yeah I had a brief look but ran out of time for this.

> branch-1 master and regionservers do not start when compiled against Hadoop 
> 3.2.1
> -
>
> Key: HBASE-23240
> URL: https://issues.apache.org/jira/browse/HBASE-23240
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.5.0
>Reporter: Lars Hofhansl
>Priority: Major
> Fix For: 1.6.0, 1.5.1
>
>
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>  at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1679)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:339)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:572)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:174)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:156)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>  at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23240) branch-1 master and regionservers do not start when compiled against Hadoop 3.2.1

2019-11-09 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970942#comment-16970942
 ] 

Lars Hofhansl commented on HBASE-23240:
---

Yeah I had a brief look but ran out of time for this.

> branch-1 master and regionservers do not start when compiled against Hadoop 
> 3.2.1
> -
>
> Key: HBASE-23240
> URL: https://issues.apache.org/jira/browse/HBASE-23240
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.5.0
>Reporter: Lars Hofhansl
>Priority: Major
> Fix For: 1.6.0, 1.5.1
>
>
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>  at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1679)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:339)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:572)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:174)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:156)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>  at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23240) branch-1 master and regionservers do not start when compiled against Hadoop 3.2.1

2019-10-31 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-23240:
--
Affects Version/s: 1.5.0

> branch-1 master and regionservers do not start when compiled against Hadoop 
> 3.2.1
> -
>
> Key: HBASE-23240
> URL: https://issues.apache.org/jira/browse/HBASE-23240
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.5.0
>Reporter: Lars Hofhansl
>Priority: Major
>
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>  at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1679)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:339)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:572)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:174)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:156)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>  at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23240) branch-1 master and regionservers do not start when compiled against Hadoop 3.2.1

2019-10-31 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-23240:
--
Fix Version/s: 1.5.1
   1.6.0

> branch-1 master and regionservers do not start when compiled against Hadoop 
> 3.2.1
> -
>
> Key: HBASE-23240
> URL: https://issues.apache.org/jira/browse/HBASE-23240
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.5.0
>Reporter: Lars Hofhansl
>Priority: Major
> Fix For: 1.6.0, 1.5.1
>
>
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
>  at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
>  at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1679)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:339)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:572)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.(GenericOptionsParser.java:174)
>  at 
> org.apache.hadoop.util.GenericOptionsParser.(GenericOptionsParser.java:156)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>  at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23240) branch-1 master and regionservers do not start when compiled against Hadoop 3.2.1

2019-10-31 Thread Lars Hofhansl (Jira)
Lars Hofhansl created HBASE-23240:
-

 Summary: branch-1 master and regionservers do not start when 
compiled against Hadoop 3.2.1
 Key: HBASE-23240
 URL: https://issues.apache.org/jira/browse/HBASE-23240
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl


Exception in thread "main" java.lang.NoSuchMethodError: 
com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
 at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
 at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
 at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1679)
 at 
org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:339)
 at 
org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:572)
 at 
org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:174)
 at 
org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:156)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-21856) Consider Causal Replication Ordering

2019-10-10 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16948976#comment-16948976
 ] 

Lars Hofhansl commented on HBASE-21856:
---

[~bharathv] See description (I mention "Serial Replication" there) :) ... The 
thought is that serial replication (i.e. a global ordering) is very expensive 
and not always needed.
I believe that for most use cases the ordering proposed here is sufficient. In 
the end we should have a discussion about what the specific problem is that we 
want to solve.

> Consider Causal Replication Ordering
> 
>
> Key: HBASE-21856
> URL: https://issues.apache.org/jira/browse/HBASE-21856
> Project: HBase
>  Issue Type: Brainstorming
>  Components: Replication
>Reporter: Lars Hofhansl
>Priority: Major
>  Labels: Replication
>
> We've had various efforts to improve the ordering guarantees for HBase 
> replication, most notably Serial Replication.
> I think in many cases guaranteeing a Total Replication Order is not required, 
> but a simpler Causal Replication Order is sufficient.
> Specifically we would guarantee causal ordering for a single Rowkey. Any 
> changes to a Row - Puts, Deletes, etc - would be replicated in the exact 
> order in which they occurred in the source system.
> Unlike total ordering this can be accomplished with only local region server 
> control.
> I don't have a full design in mind, let's discuss here. It should be 
> sufficient to do the following:
> # RegionServers only adopt the replication queues from other RegionServers 
> for regions they (now) own. This requires log splitting for replication.
> # RegionServers ship all edits for queues adopted from other servers before 
> any of their "own" edits are shipped.
> It's probably a bit more involved, but should be much cheaper than the total 
> ordering provided by serial replication.
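> To make the shipping rule concrete, a tiny, self-contained model (plain Java, 
> not HBase code; the queue contents are made up):
> {code:java}
> import java.util.ArrayDeque;
> import java.util.Deque;
>
> public class CausalShippingDemo {
>   public static void main(String[] args) {
>     Deque<String> adopted = new ArrayDeque<>(); // queue adopted from the old owner
>     Deque<String> own = new ArrayDeque<>();     // edits written after the region moved here
>     adopted.add("row1:put@t1");
>     own.add("row1:delete@t2");
>     // Rule 2 above: drain the adopted queue before shipping our own edits,
>     // which preserves per-row (causal) order without any global coordination.
>     while (!adopted.isEmpty()) {
>       System.out.println("ship " + adopted.poll());
>     }
>     while (!own.isEmpty()) {
>       System.out.println("ship " + own.poll());
>     }
>   }
> }{code}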



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23015) branch-1 hbase-server, testing util, and shaded testing util need jackson

2019-09-20 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934529#comment-16934529
 ] 

Lars Hofhansl commented on HBASE-23015:
---

> If you're up for RM'ing a 1.5 release in parallel once this blocker closes 
> that'd be wonderful.

I am. But I'll also check in with [~apurtell], who volunteered earlier, when he's 
back from vacation.

> branch-1 hbase-server, testing util, and  shaded testing util need jackson
> --
>
> Key: HBASE-23015
> URL: https://issues.apache.org/jira/browse/HBASE-23015
> Project: HBase
>  Issue Type: Bug
>  Components: Client, shading
>Affects Versions: 1.5.0, 1.3.6, 1.4.11
>Reporter: Sean Busbey
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 1.5.0, 1.3.6, 1.4.11
>
> Attachments: HBASE-23015.branch-1.3.000.patch, 
> HBASE-23015.branch-1.3.001.patch
>
>
> HBASE-22728 moved out jackson transitive dependencies. Mostly good, but 
> moving jackson2 to provided in hbase-server broke a few things.
> testing-util needs a transitive jackson 2 in order to start the minicluster; it 
> currently fails with a CNFE for {{com.fasterxml.jackson.databind.ObjectMapper}} 
> when trying to initialize the master.
> shaded-testing-util needs a relocated jackson 2 for the same reason.
> It's not used for any of the mapreduce stuff in hbase-server, so 
> {{hbase-shaded-server}} for that purpose should be fine. But it is used by 
> {{WALPrettyPrinter}} and some folks might expect that to work from that 
> artifact since it is present.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-23015) branch-1 hbase-server, testing util, and shaded testing util need jackson

2019-09-19 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934004#comment-16934004
 ] 

Lars Hofhansl edited comment on HBASE-23015 at 9/20/19 3:22 AM:


I meant apply *locally* above. In any case it does not apply cleanly in my 
setup, so it'd be a chunk of work anyway.

And totally agree that non-released branches are at downstream project's own 
risk.

I will say that we have been dragging our feet on 1.5, trying to make it perfect 
instead of just releasing (as long as there are *no new* problems) and fixing 
outstanding issues in the next release (i.e. 1.5.1), which should be only a month 
away.

 


was (Author: lhofhansl):
I meant apply *locally* above. In any case it does not apply cleanly in my 
setup, so it'd be a chunk of work anyway.

> branch-1 hbase-server, testing util, and  shaded testing util need jackson
> --
>
> Key: HBASE-23015
> URL: https://issues.apache.org/jira/browse/HBASE-23015
> Project: HBase
>  Issue Type: Bug
>  Components: Client, shading
>Affects Versions: 1.5.0, 1.3.6, 1.4.11
>Reporter: Sean Busbey
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 1.5.0, 1.3.6, 1.4.11
>
> Attachments: HBASE-23015.branch-1.3.000.patch
>
>
> HBASE-22728 moved out jackson transitive dependencies. Mostly good, but 
> moving jackson2 to provided in hbase-server broke a few things.
> testing-util needs a transitive jackson 2 in order to start the minicluster; it 
> currently fails with a CNFE for {{com.fasterxml.jackson.databind.ObjectMapper}} 
> when trying to initialize the master.
> shaded-testing-util needs a relocated jackson 2 for the same reason.
> It's not used for any of the mapreduce stuff in hbase-server, so 
> {{hbase-shaded-server}} for that purpose should be fine. But it is used by 
> {{WALPrettyPrinter}} and some folks might expect that to work from that 
> artifact since it is present.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23015) branch-1 hbase-server, testing util, and shaded testing util need jackson

2019-09-19 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934004#comment-16934004
 ] 

Lars Hofhansl commented on HBASE-23015:
---

I meant apply *locally* above. In any case it does not apply cleanly in my 
setup, so it'd be a chunk of work anyway.

> branch-1 hbase-server, testing util, and  shaded testing util need jackson
> --
>
> Key: HBASE-23015
> URL: https://issues.apache.org/jira/browse/HBASE-23015
> Project: HBase
>  Issue Type: Bug
>  Components: Client, shading
>Affects Versions: 1.5.0, 1.3.6, 1.4.11
>Reporter: Sean Busbey
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 1.5.0, 1.3.6, 1.4.11
>
> Attachments: HBASE-23015.branch-1.3.000.patch
>
>
> HBASE-22728 moved out jackson transitive dependencies. Mostly good, but 
> moving jackson2 to provided in hbase-server broke a few things.
> testing-util needs a transitive jackson 2 in order to start the minicluster; it 
> currently fails with a CNFE for {{com.fasterxml.jackson.databind.ObjectMapper}} 
> when trying to initialize the master.
> shaded-testing-util needs a relocated jackson 2 for the same reason.
> It's not used for any of the mapreduce stuff in hbase-server, so 
> {{hbase-shaded-server}} for that purpose should be fine. But it is used by 
> {{WALPrettyPrinter}} and some folks might expect that to work from that 
> artifact since it is present.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23015) branch-1 hbase-server, testing util, and shaded testing util need jackson

2019-09-19 Thread Lars Hofhansl (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933851#comment-16933851
 ] 

Lars Hofhansl commented on HBASE-23015:
---

Lemme apply this and see if it fixes the problem I've been seeing.

> branch-1 hbase-server, testing util, and  shaded testing util need jackson
> --
>
> Key: HBASE-23015
> URL: https://issues.apache.org/jira/browse/HBASE-23015
> Project: HBase
>  Issue Type: Bug
>  Components: Client, shading
>Affects Versions: 1.5.0, 1.3.6, 1.4.11
>Reporter: Sean Busbey
>Assignee: Viraj Jasani
>Priority: Blocker
> Fix For: 1.5.0, 1.3.6, 1.4.11
>
> Attachments: HBASE-23015.branch-1.3.000.patch
>
>
> HBASE-22728 moved out jackson transitive dependencies. Mostly good, but 
> moving jackson2 to provided in hbase-server broke a few things.
> testing-util needs a transitive jackson 2 in order to start the minicluster; it 
> currently fails with a CNFE for {{com.fasterxml.jackson.databind.ObjectMapper}} 
> when trying to initialize the master.
> shaded-testing-util needs a relocated jackson 2 for the same reason.
> It's not used for any of the mapreduce stuff in hbase-server, so 
> {{hbase-shaded-server}} for that purpose should be fine. But it is used by 
> {{WALPrettyPrinter}} and some folks might expect that to work from that 
> artifact since it is present.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-21158) Empty qualifier cell should not be returned if it does not match QualifierFilter

2019-05-28 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849352#comment-16849352
 ] 

Lars Hofhansl edited comment on HBASE-21158 at 5/28/19 6:16 AM:


This causes *very* subtle changes, and actually breaks Phoenix secondary 
indexing.

Furthermore, the behavior is different between
 * branch-1.3 (check removed)
 * branch-1.4 (check still present)
 * branch-1 (check removed; took me 3h to track this down)
 * branch-2 (check still present) and
 * master (check removed)

This is bad. And bad that we managed to leave it in different states in 
different HBase branches.

What happened here?

[~apurtell], we should check whether we have this in our HBase. If so it can 
cause subtle index-out-of-sync problems (see the linked Phoenix jira).

Phoenix in this case relies on the fact that a family delete marker (which does 
not have a qualifier) flows through this filter along with all other K/Vs it 
might affect (but limited to a known set of qualifiers).


was (Author: lhofhansl):
This causes *very* subtle changes, and actually breaks Phoenix secondary 
indexing.

Furthermore, the behavior is different between
 * branch-1.3 (check removed)
 * branch-1.4 (check still present)
 * branch-1 (check removed; took me 3h to track this down)
 * branch-2 (check still present) and
 * master (check removed)

This is bad. And bad that we managed to leave it in different states in 
different HBase branches.

What happened here?

[~apurtell],

> Empty qualifier cell should not be returned if it does not match 
> QualifierFilter
> 
>
> Key: HBASE-21158
> URL: https://issues.apache.org/jira/browse/HBASE-21158
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Critical
> Fix For: 3.0.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21158.branch-1.001.patch, 
> HBASE-21158.master.001.patch, HBASE-21158.master.002.patch, 
> HBASE-21158.master.003.patch, HBASE-21158.master.004.patch
>
>
> {code:xml}
> hbase(main):002:0> put 'testTable','testrow','f:testcol1','testvalue1'
> 0 row(s) in 0.0040 seconds
> hbase(main):003:0> put 'testTable','testrow','f:','testvalue2'
> 0 row(s) in 0.0070 seconds
> # get row with empty column f:, result is correct.
> hbase(main):004:0> scan 'testTable',{FILTER => "QualifierFilter (=, 'binary:')"}
> ROW        COLUMN+CELL
>  testrow   column=f:, timestamp=1536218563581, value=testvalue2
> 1 row(s) in 0.0460 seconds
> # get row with column f:testcol1, result is incorrect.
> hbase(main):005:0> scan 'testTable',{FILTER => "QualifierFilter (=, 'binary:testcol1')"}
> ROW        COLUMN+CELL
>  testrow   column=f:, timestamp=1536218563581, value=testvalue2
>  testrow   column=f:testcol1, timestamp=1536218550827, value=testvalue1
> 1 row(s) in 0.0070 seconds
> {code}
> As shown by the operations above, when the row contains an empty-qualifier 
> column, the empty-qualifier cell is always returned when using QualifierFilter.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21158) Empty qualifier cell should not be returned if it does not match QualifierFilter

2019-05-27 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849352#comment-16849352
 ] 

Lars Hofhansl commented on HBASE-21158:
---

This causes *very* subtle changes, and actually breaks Phoenix secondary 
indexing.

Furthermore, the behavior is different between
 * branch-1.3 (check removed)
 * branch-1.4 (check still present)
 * branch-1 (check removed; took me 3h to track this down)
 * branch-2 (check still present) and
 * master (check removed)

This is bad. And bad that we managed to leave it in different states in 
different HBase branches.

What happened here?

[~apurtell],

> Empty qualifier cell should not be returned if it does not match 
> QualifierFilter
> 
>
> Key: HBASE-21158
> URL: https://issues.apache.org/jira/browse/HBASE-21158
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Critical
> Fix For: 3.0.0, 1.3.3, 1.2.8, 2.2.0, 1.4.8, 2.1.1, 2.0.3
>
> Attachments: HBASE-21158.branch-1.001.patch, 
> HBASE-21158.master.001.patch, HBASE-21158.master.002.patch, 
> HBASE-21158.master.003.patch, HBASE-21158.master.004.patch
>
>
> {code:xml}
> hbase(main):002:0> put 'testTable','testrow','f:testcol1','testvalue1'
> 0 row(s) in 0.0040 seconds
> hbase(main):003:0> put 'testTable','testrow','f:','testvalue2'
> 0 row(s) in 0.0070 seconds
> # get row with empty column f:, result is correct.
> hbase(main):004:0> scan 'testTable',{FILTER => "QualifierFilter (=, 'binary:')"}
> ROW        COLUMN+CELL
>  testrow   column=f:, timestamp=1536218563581, value=testvalue2
> 1 row(s) in 0.0460 seconds
> # get row with column f:testcol1, result is incorrect.
> hbase(main):005:0> scan 'testTable',{FILTER => "QualifierFilter (=, 'binary:testcol1')"}
> ROW        COLUMN+CELL
>  testrow   column=f:, timestamp=1536218563581, value=testvalue2
>  testrow   column=f:testcol1, timestamp=1536218550827, value=testvalue1
> 1 row(s) in 0.0070 seconds
> {code}
> As shown by the operations above, when the row contains an empty-qualifier 
> column, the empty-qualifier cell is always returned when using QualifierFilter.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-22457) Harden the HBase HFile reader reference counting

2019-05-22 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846258#comment-16846258
 ] 

Lars Hofhansl edited comment on HBASE-22457 at 5/22/19 10:17 PM:
-

> No, but this is no more or less fast than the close-then-open we do for 
> 'alter' processing. It would be implemented the same way, ideally. 

We came to that conclusion as well in a discussion in the office. Just alter 
some minor thing on the Table/ColumnDescriptor so that all regions are 
closed/reopened, resulting in: "Wouldn't it be nice if we had a tool that could 
do that without forcing us to change something." :)

> Scanner wrapping is a key thing. Without it I don't think Phoenix works. 

Oh, totally agree. Though perhaps it is structurally somehow possible to ensure 
that the passed scanner is either wrapped or closed. I can't think of anything, 
though.
(I.e. we can check after the hook invocation whether the returned scanner is 
different from the passed one... but of course we cannot tell whether it 
wrapped the passed scanner and will eventually close it.)


was (Author: lhofhansl):
> No, but this is no more or less fast than the close-then-open we do for 
> 'alter' processing. It would be implemented the same way, ideally. 

We came to that conclusion as well in a discussion in the office. Just alter 
some minor thing on the Table/ColumnDescriptor so that all regions are 
closed/reopened, resulting in: "Wouldn't it be nice if we had a tool that could 
do that without forcing us to change something." :)

> Scanner wrapping is a key thing. Without it I don't think Phoenix works. 

Oh, totally agree. Though perhaps it is structurally somehow possible to ensure 
that the passed scanner is either wrapped or closed. I can't think of anything, 
though.

> Harden the HBase HFile reader reference counting
> 
>
> Key: HBASE-22457
> URL: https://issues.apache.org/jira/browse/HBASE-22457
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: 22457-random-1.5.txt
>
>
> The problem is that any coprocessor hook that replaces a passed scanner without 
> closing it can cause an incorrect reference count.
> This was bad and wrong before of course, but now it has pretty bad 
> consequences, since an incorrect reference count will prevent HFiles from 
> being archived indefinitely.
> All hooks that are passed a scanner and return a scanner are suspect, since 
> the returned scanner may or may not close the passed scanner:
> * preCompact
> * preCompactScannerOpen
> * preFlush
> * preFlushScannerOpen
> * preScannerOpen
> * preStoreScannerOpen
> * preStoreFileReaderOpen...? (not sure about this one, it could mess with the 
> reader)
> I sampled the Phoenix and also Tephra code, and found a few instances where 
> this is happening.
> And for those I filed issues: TEPHRA-300, PHOENIX-5291
> (We're not using Tephra)
> The Phoenix ones should be rare. In our case we are seeing readers with 
> refCount > 1000.
> Perhaps there are other issues, e.g. a path where not all exceptions are 
> caught and a scanner is left open that way. (Generally I am not a fan of 
> reference counting in complex systems - it's too easy to miss something. But 
> that's a different discussion. :) ).
> Let's brainstorm some way in which we can harden this.
> [~ram_krish], [~anoop.hbase], [~apurtell]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22457) Harden the HBase HFile reader reference counting

2019-05-22 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846258#comment-16846258
 ] 

Lars Hofhansl commented on HBASE-22457:
---

> No, but this is no more or less fast than the close-then-open we do for 
> 'alter' processing. It would be implemented the same way, ideally. 

We came to that conclusion as well in a discussion in the office. Just alter 
some minor thing on the Table/ColumnDescriptor so that all regions are 
closed/reopened, resulting in: "Wouldn't it be nice if we had a tool that could 
do that without forcing us to change something." :)

> Scanner wrapping is a key thing. Without it I don't think Phoenix works. 

Oh, totally agree. Though perhaps it is structurally somehow possible to ensure 
that the passed scanner is either wrapped or closed. I can't think of anything, 
though.

> Harden the HBase HFile reader reference counting
> 
>
> Key: HBASE-22457
> URL: https://issues.apache.org/jira/browse/HBASE-22457
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Attachments: 22457-random-1.5.txt
>
>
> The problem is that any coprocessor hook that replaces a passed scanner without 
> closing it can cause an incorrect reference count.
> This was bad and wrong before of course, but now it has pretty bad 
> consequences, since an incorrect reference count will prevent HFiles from 
> being archived indefinitely.
> All hooks that are passed a scanner and return a scanner are suspect, since 
> the returned scanner may or may not close the passed scanner:
> * preCompact
> * preCompactScannerOpen
> * preFlush
> * preFlushScannerOpen
> * preScannerOpen
> * preStoreScannerOpen
> * preStoreFileReaderOpen...? (not sure about this one, it could mess with the 
> reader)
> I sampled the Phoenix and also Tephra code, and found a few instances where 
> this is happening.
> And for those I filed issues: TEPHRA-300, PHOENIX-5291
> (We're not using Tephra)
> The Phoenix ones should be rare. In our case we are seeing readers with 
> refCount > 1000.
> Perhaps there are other issues, e.g. a path where not all exceptions are 
> caught and a scanner is left open that way. (Generally I am not a fan of 
> reference counting in complex systems - it's too easy to miss something. But 
> that's a different discussion. :) ).
> Let's brainstorm some way in which we can harden this.
> [~ram_krish], [~anoop.hbase], [~apurtell]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22457) Harden the HBase HFile reader reference counting

2019-05-22 Thread Lars Hofhansl (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-22457:
--
Attachment: 22457-random-1.5.txt

> Harden the HBase HFile reader reference counting
> 
>
> Key: HBASE-22457
> URL: https://issues.apache.org/jira/browse/HBASE-22457
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
> Attachments: 22457-random-1.5.txt
>
>
> The problem is that any coprocessor hook that replaces a passed scanner without 
> closing it can cause an incorrect reference count.
> This was bad and wrong before of course, but now it has pretty bad 
> consequences, since an incorrect reference count will prevent HFiles from 
> being archived indefinitely.
> All hooks that are passed a scanner and return a scanner are suspect, since 
> the returned scanner may or may not close the passed scanner:
> * preCompact
> * preCompactScannerOpen
> * preFlush
> * preFlushScannerOpen
> * preScannerOpen
> * preStoreScannerOpen
> * preStoreFileReaderOpen...? (not sure about this one, it could mess with the 
> reader)
> I sampled the Phoenix and also Tephra code, and found a few instances where 
> this is happening.
> And for those I filed issues: TEPHRA-300, PHOENIX-5291
> (We're not using Tephra)
> The Phoenix ones should be rare. In our case we are seeing readers with 
> refCount > 1000.
> Perhaps there are other issues, e.g. a path where not all exceptions are 
> caught and a scanner is left open that way. (Generally I am not a fan of 
> reference counting in complex systems - it's too easy to miss something. But 
> that's a different discussion. :) ).
> Let's brainstorm some way in which we can harden this.
> [~ram_krish], [~anoop.hbase], [~apurtell]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22457) Harden the HBase HFile reader reference counting

2019-05-22 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846172#comment-16846172
 ] 

Lars Hofhansl commented on HBASE-22457:
---

Some random things that didn't quite look right.

> Harden the HBase HFile reader reference counting
> 
>
> Key: HBASE-22457
> URL: https://issues.apache.org/jira/browse/HBASE-22457
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
> Attachments: 22457-random-1.5.txt
>
>
> The problem is that any coprocessor hook that replaces a passed scanner without 
> closing it can cause an incorrect reference count.
> This was bad and wrong before of course, but now it has pretty bad 
> consequences, since an incorrect reference count will prevent HFiles from 
> being archived indefinitely.
> All hooks that are passed a scanner and return a scanner are suspect, since 
> the returned scanner may or may not close the passed scanner:
> * preCompact
> * preCompactScannerOpen
> * preFlush
> * preFlushScannerOpen
> * preScannerOpen
> * preStoreScannerOpen
> * preStoreFileReaderOpen...? (not sure about this one, it could mess with the 
> reader)
> I sampled the Phoenix and also Tephra code, and found a few instances where 
> this is happening.
> And for those I filed issues: TEPHRA-300, PHOENIX-5291
> (We're not using Tephra)
> The Phoenix ones should be rare. In our case we are seeing readers with 
> refCount > 1000.
> Perhaps there are other issues, e.g. a path where not all exceptions are 
> caught and a scanner is left open that way. (Generally I am not a fan of 
> reference counting in complex systems - it's too easy to miss something. But 
> that's a different discussion. :) ).
> Let's brainstorm some way in which we can harden this.
> [~ram_krish], [~anoop.hbase], [~apurtell]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22457) Harden the HBase HFile reader reference counting

2019-05-22 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846161#comment-16846161
 ] 

Lars Hofhansl commented on HBASE-22457:
---

Oh, it's a hack for sure and, as you point out, might hide the actual problem.

I do like your idea. Can we do a fast close without flushing the memstore? 
(Otherwise it might not be "fast" :))

A couple more hacks/ideas:
# detect a run-away refCount by comparing any reader's refCount with the actual 
number of open scanners (which we track for each HRegion); if the refCount is 
larger we know we have a problem.
# (a variation) when we attempt to archive an HFile that still has a refCount, 
check whether there are any open scanners; if not, archive anyway.

For #1, at least, we could enhance the logging and include the number of 
currently open scanners in the log (where we say that we cannot archive an HFile).

What I'm really looking for is a structural fix where a coprocessor cannot mess 
things up. Perhaps that's not possible without severely limiting what coprocessors 
are allowed to do.


> Harden the HBase HFile reader reference counting
> 
>
> Key: HBASE-22457
> URL: https://issues.apache.org/jira/browse/HBASE-22457
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> The problem is that any coprocessor hook that replaces a passed scanner without 
> closing it can cause an incorrect reference count.
> This was bad and wrong before of course, but now it has pretty bad 
> consequences, since an incorrect reference count will prevent HFiles from 
> being archived indefinitely.
> All hooks that are passed a scanner and return a scanner are suspect, since 
> the returned scanner may or may not close the passed scanner:
> * preCompact
> * preCompactScannerOpen
> * preFlush
> * preFlushScannerOpen
> * preScannerOpen
> * preStoreScannerOpen
> * preStoreFileReaderOpen...? (not sure about this one, it could mess with the 
> reader)
> I sampled the Phoenix and also Tephra code, and found a few instances where 
> this is happening.
> And for those I filed issues: TEPHRA-300, PHOENIX-5291
> (We're not using Tephra)
> The Phoenix ones should be rare. In our case we are seeing readers with 
> refCount > 1000.
> Perhaps there are other issues, e.g. a path where not all exceptions are 
> caught and a scanner is left open that way. (Generally I am not a fan of 
> reference counting in complex systems - it's too easy to miss something. But 
> that's a different discussion. :) ).
> Let's brainstorm some way in which we can harden this.
> [~ram_krish], [~anoop.hbase], [~apurtell]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22457) Harden the HBase HFile reader reference counting

2019-05-22 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846075#comment-16846075
 ] 

Lars Hofhansl commented on HBASE-22457:
---

One idea is: Are there any "safepoints"? I.e. points where we know what the 
reference count should be, so we can reset it? An obvious one is when there are 
no scanners running at all; in that case we could reset all refCounts to 0.


> Harden the HBase HFile reader reference counting
> 
>
> Key: HBASE-22457
> URL: https://issues.apache.org/jira/browse/HBASE-22457
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> The problem is that any coprocessor hook that replaces a passed scanner without 
> closing it can cause an incorrect reference count.
> This was bad and wrong before of course, but now it has pretty bad 
> consequences, since an incorrect reference count will prevent HFiles from 
> being archived indefinitely.
> All hooks that are passed a scanner and return a scanner are suspect, since 
> the returned scanner may or may not close the passed scanner:
> * preCompact
> * preCompactScannerOpen
> * preFlush
> * preFlushScannerOpen
> * preScannerOpen
> * preStoreScannerOpen
> * preStoreFileReaderOpen...? (not sure about this one, it could mess with the 
> reader)
> I sampled the Phoenix and also Tephra code, and found a few instances where 
> this is happening.
> And for those I filed issues: TEPHRA-300, PHOENIX-5291
> (We're not using Tephra)
> The Phoenix ones should be rare. In our case we are seeing readers with 
> refCount > 1000.
> Perhaps there are other issues, e.g. a path where not all exceptions are 
> caught and a scanner is left open that way. (Generally I am not a fan of 
> reference counting in complex systems - it's too easy to miss something. But 
> that's a different discussion. :) ).
> Let's brainstorm some way in which we can harden this.
> [~ram_krish], [~anoop.hbase], [~apurtell]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22457) Harden the HBase HFile reader reference counting

2019-05-22 Thread Lars Hofhansl (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-22457:
--
Summary: Harden the HBase HFile reader reference counting  (was: Harden rhe 
HBase HFile reader reference counting)

> Harden the HBase HFile reader reference counting
> 
>
> Key: HBASE-22457
> URL: https://issues.apache.org/jira/browse/HBASE-22457
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> The problem is that any coprocessor hook that replaces a passed scanner without 
> closing it can cause an incorrect reference count.
> This was bad and wrong before of course, but now it has pretty bad 
> consequences, since an incorrect reference count will prevent HFiles from 
> being archived indefinitely.
> All hooks that are passed a scanner and return a scanner are suspect, since 
> the returned scanner may or may not close the passed scanner:
> * preCompact
> * preCompactScannerOpen
> * preFlush
> * preFlushScannerOpen
> * preScannerOpen
> * preStoreScannerOpen
> * preStoreFileReaderOpen...? (not sure about this one, it could mess with the 
> reader)
> I sampled the Phoenix and also Tephra code, and found a few instances where 
> this is happening.
> And for those I filed issues: TEPHRA-300, PHOENIX-5291
> (We're not using Tephra)
> The Phoenix ones should be rare. In our case we are seeing readers with 
> refCount > 1000.
> Perhaps there are other issues, e.g. a path where not all exceptions are 
> caught and a scanner is left open that way. (Generally I am not a fan of 
> reference counting in complex systems - it's too easy to miss something. But 
> that's a different discussion. :) ).
> Let's brainstorm some way in which we can harden this.
> [~ram_krish], [~anoop.hbase], [~apurtell]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22457) Harden rhe HBase HFile reader reference counting

2019-05-22 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-22457:
-

 Summary: Harden rhe HBase HFile reader reference counting
 Key: HBASE-22457
 URL: https://issues.apache.org/jira/browse/HBASE-22457
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl


The problem is that any coprocessor hook that replaces a passed scanner without 
closing it can cause an incorrect reference count.
This was bad and wrong before of course, but now it has pretty bad 
consequences, since an incorrect reference count will prevent HFiles from being 
archived indefinitely.

All hooks that are passed a scanner and return a scanner are suspect, since the 
returned scanner may or may not close the passed scanner:
* preCompact
* preCompactScannerOpen
* preFlush
* preFlushScannerOpen
* preScannerOpen
* preStoreScannerOpen
* preStoreFileReaderOpen...? (not sure about this one, it could mess with the 
reader)

I sampled the Phoenix and also Tephra code, and found a few instances where 
this is happening.
And for those I filed issues: TEPHRA-300, PHOENIX-5291
(We're not using Tephra)

The Phoenix ones should be rare. In our case we are seeing readers with 
refCount > 1000.
Perhaps there are other issues, e.g. a path where not all exceptions are caught 
and a scanner is left open that way. (Generally I am not a fan of reference 
counting in complex systems - it's too easy to miss something. But that's a 
different discussion. :) ).

Let's brainstorm some way in which we can harden this.

[~ram_krish], [~anoop.hbase], [~apurtell]
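
To make the failure mode concrete, a tiny, self-contained model of the refCount 
mechanics (plain Java, not HBase code):

{code:java}
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountDemo {
  // A "reader" stays pinned (cannot be archived) while its refCount is > 0.
  static class Reader {
    final AtomicInteger refCount = new AtomicInteger();
  }

  // Opening a scanner pins the reader; closing it unpins.
  static class StoreScanner implements Closeable {
    final Reader reader;
    StoreScanner(Reader r) { reader = r; r.refCount.incrementAndGet(); }
    @Override public void close() { reader.refCount.decrementAndGet(); }
  }

  // A wrapping scanner, as a coprocessor hook might return.
  static class WrappingScanner implements Closeable {
    final Closeable delegate;
    WrappingScanner(Closeable d) { delegate = d; }
    @Override public void close() throws IOException {
      delegate.close(); // forgetting this line is exactly the bug described above
    }
  }

  public static void main(String[] args) throws IOException {
    Reader reader = new Reader();
    WrappingScanner scanner = new WrappingScanner(new StoreScanner(reader));
    scanner.close();
    // Prints 0; without the delegate.close() it would stay at 1 forever.
    System.out.println("refCount after close: " + reader.refCount.get());
  }
}
{code}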




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22072) High read/write intensive regions may cause long crash recovery

2019-05-22 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846057#comment-16846057
 ] 

Lars Hofhansl commented on HBASE-22072:
---

Lemme file a separate "Discussion" issue.

> High read/write intensive regions may cause long crash recovery
> ---
>
> Key: HBASE-22072
> URL: https://issues.apache.org/jira/browse/HBASE-22072
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, Recovery
>Affects Versions: 2.0.0
>Reporter: Pavel
>Assignee: ramkrishna.s.vasudevan
>Priority: Major
>  Labels: compaction
> Fix For: 2.2.0, 2.3.0, 2.0.6, 2.1.5
>
> Attachments: HBASE-22072.HBASE-21879-v1.patch
>
>
> Compaction of a region under high read load may leave compacted files undeleted 
> because of existing scan references:
> INFO org.apache.hadoop.hbase.regionserver.HStore - Can't archive compacted 
> file hdfs://hdfs-ha/hbase... because of either isCompactedAway=true or file 
> has reference, isReferencedInReads=true, refCount=1, skipping for now
> If the region is also under high write load this happens quite often, and the 
> region may have few storefiles and tons of undeleted compacted hdfs files.
> The region keeps all those files (in my case thousands) until the graceful 
> region closing procedure, which ignores existing references and drops obsolete 
> files. This works fine, apart from consuming some extra hdfs space, but only in 
> the case of a normal region closing. If the region server crashes, then the new 
> region server responsible for that overfilled region reads the hdfs folder and 
> tries to deal with all the undeleted files, producing tons of storefiles and 
> compaction tasks and consuming an abnormal amount of memory, which may lead to 
> an OutOfMemory exception and further region server crashes. This stops writes 
> to the region because the number of storefiles reaches the 
> *hbase.hstore.blockingStoreFiles* limit, forces high GC duty, and may take 
> hours to compact all files into a working set of files.
> A workaround is to periodically check the file count in the hdfs folders and 
> force a region assign for the ones with too many files.
> It would be nice if the regionserver had a setting similar to 
> hbase.hstore.blockingStoreFiles and attempted to drop undeleted 
> compacted files if the number of files reaches this setting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22072) High read/write intensive regions may cause long crash recovery

2019-05-21 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845085#comment-16845085
 ] 

Lars Hofhansl commented on HBASE-22072:
---

> Verified in branch-1 series. This issue does not exist there.

Are you guys sure? We are having an issue where ref counts are in the 1000's and 
HFiles are not removed until we close and reopen the regions in 
question (either by moving regions, disabling/enabling the table, or bouncing 
region servers).


> High read/write intensive regions may cause long crash recovery
> ---
>
> Key: HBASE-22072
> URL: https://issues.apache.org/jira/browse/HBASE-22072
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, Recovery
>Affects Versions: 2.0.0
>Reporter: Pavel
>Assignee: ramkrishna.s.vasudevan
>Priority: Major
>  Labels: compaction
> Fix For: 2.2.0, 2.3.0, 2.0.6, 2.1.5
>
> Attachments: HBASE-22072.HBASE-21879-v1.patch
>
>
> Compaction of a region under high read load may leave compacted files undeleted 
> because of existing scan references:
> INFO org.apache.hadoop.hbase.regionserver.HStore - Can't archive compacted 
> file hdfs://hdfs-ha/hbase... because of either isCompactedAway=true or file 
> has reference, isReferencedInReads=true, refCount=1, skipping for now
> If the region is also under high write load this happens quite often, and the 
> region may have few storefiles and tons of undeleted compacted hdfs files.
> The region keeps all those files (in my case thousands) until the graceful 
> region closing procedure, which ignores existing references and drops obsolete 
> files. This works fine, apart from consuming some extra hdfs space, but only in 
> the case of a normal region closing. If the region server crashes, then the new 
> region server responsible for that overfilled region reads the hdfs folder and 
> tries to deal with all the undeleted files, producing tons of storefiles and 
> compaction tasks and consuming an abnormal amount of memory, which may lead to 
> an OutOfMemory exception and further region server crashes. This stops writes 
> to the region because the number of storefiles reaches the 
> *hbase.hstore.blockingStoreFiles* limit, forces high GC duty, and may take 
> hours to compact all files into a working set of files.
> A workaround is to periodically check the file count in the hdfs folders and 
> force a region assign for the ones with too many files.
> It would be nice if the regionserver had a setting similar to 
> hbase.hstore.blockingStoreFiles and attempted to drop undeleted 
> compacted files if the number of files reaches this setting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22385) Consider "programmatic" HFiles

2019-05-08 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836057#comment-16836057
 ] 

Lars Hofhansl commented on HBASE-22385:
---

In theory yes, but it would not be the goal, nor would this make it easier - I 
think.

> Consider "programmatic" HFiles
> --
>
> Key: HBASE-22385
> URL: https://issues.apache.org/jira/browse/HBASE-22385
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> For various use cases (among others, mass deletes) it would be great 
> if HBase had a mechanism for programmatic HFiles. I.e. HFiles (Reader) that 
> produce KeyValues just like any other old HFile, but where the key values 
> produced are generated by some other means rather than being physically 
> read from some storage medium.
> In fact this could be a generalization for the various HFiles we have: 
> (Normal) HFiles, HFileLinks, HalfStoreFiles, etc.
> A simple way could be to allow for storing a classname in the HFile. Upon 
> reading the HFile, HBase would instantiate an instance of that class, and that 
> instance is responsible for all further interaction with that HFile. For 
> normal HFiles it would just be the normal HFileReaderVx. For that we'd also 
> need to turn StoreFile.Reader into an interface (or a more basic base class) 
> that can be properly implemented.
> (Remember this is Brainstorming :) )
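> A tiny, self-contained model of the classname mechanism (all names 
> hypothetical, not the HBase API):
> {code:java}
> import java.util.HashMap;
> import java.util.Map;
>
> public class ProgrammaticHFileDemo {
>   // Stand-in for a reader that "produces KeyValues" however it likes.
>   public interface FileReader {
>     String next();
>   }
>
>   // A purely generated stream, with no storage medium behind it.
>   public static class GeneratedReader implements FileReader {
>     private int i = 0;
>     @Override public String next() { return "generated-kv-" + (i++); }
>   }
>
>   public static void main(String[] args) throws Exception {
>     // Pretend this map is the HFile's file-info block naming its reader class.
>     Map<String, String> fileInfo = new HashMap<>();
>     fileInfo.put("READER_IMPL", GeneratedReader.class.getName());
>     FileReader reader = Class.forName(fileInfo.get("READER_IMPL"))
>         .asSubclass(FileReader.class).getDeclaredConstructor().newInstance();
>     System.out.println(reader.next()); // generated-kv-0
>   }
> }{code}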



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22385) Consider "programmatic" HFiles

2019-05-08 Thread Lars Hofhansl (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-22385:
--
Description: 
For various use cases (among others, mass deletes) it would be great if 
HBase had a mechanism for programmatic HFiles. I.e. HFiles (Reader) that 
produce KeyValues just like any other old HFile, but where the key values 
produced are generated by some other means rather than being physically read 
from some storage medium.

In fact this could be a generalization for the various HFiles we have: (Normal) 
HFiles, HFileLinks, HalfStoreFiles, etc.

A simple way could be to allow for storing a classname in the HFile. Upon 
reading the HFile, HBase would instantiate an instance of that class, and that 
instance is responsible for all further interaction with that HFile. For normal 
HFiles it would just be the normal HFileReaderVx. For that we'd also need to 
turn StoreFile.Reader into an interface (or a more basic base class) that can be 
properly implemented.

(Remember this is Brainstorming :) )

  was:
For various use cases (among others, mass deletes) it would be great if 
HBase had a mechanism for programmatic HFiles. I.e. HFiles (with HFileScanner 
and Reader) that produce KeyValues just like any other old HFile, but where the 
key values produced are generated by some other means rather than being 
physically read from some storage medium.

In fact this could be a generalization for the various HFiles we have: (Normal) 
HFiles, HFileLinks, HalfStoreFiles, etc.

A simple way could be to allow for storing a classname in the HFile. Upon 
reading the HFile, HBase would instantiate an instance of that class, and that 
instance is responsible for all further interaction with that HFile. For normal 
HFiles it would just be the normal HFileReaderVx.

(Remember this is Brainstorming :) )


> Consider "programmatic" HFiles
> --
>
> Key: HBASE-22385
> URL: https://issues.apache.org/jira/browse/HBASE-22385
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> For various use cases (among others, mass deletes) it would be great 
> if HBase had a mechanism for programmatic HFiles. I.e. HFiles (Reader) that 
> produce KeyValues just like any other old HFile, but where the key values 
> produced are generated by some other means rather than being physically 
> read from some storage medium.
> In fact this could be a generalization for the various HFiles we have: 
> (Normal) HFiles, HFileLinks, HalfStoreFiles, etc.
> A simple way could be to allow for storing a classname in the HFile. Upon 
> reading the HFile, HBase would instantiate an instance of that class, and that 
> instance is responsible for all further interaction with that HFile. For 
> normal HFiles it would just be the normal HFileReaderVx. For that we'd also 
> need to turn StoreFile.Reader into an interface (or a more basic base class) 
> that can be properly implemented.
> (Remember this is Brainstorming :) )



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22385) Consider "programmatic" HFiles

2019-05-08 Thread Lars Hofhansl (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-22385:
--
Description: 
For various use cases (among others, mass deletes) it would be great if 
HBase had a mechanism for programmatic HFiles. I.e. HFiles (with HFileScanner 
and Reader) that produce KeyValues just like any other old HFile, but where the 
key values produced are generated by some other means rather than being 
physically read from some storage medium.

In fact this could be a generalization for the various HFiles we have: (Normal) 
HFiles, HFileLinks, HalfStoreFiles, etc.

A simple way could be to allow for storing a classname in the HFile. Upon 
reading the HFile, HBase would instantiate an instance of that class, and that 
instance is responsible for all further interaction with that HFile. For normal 
HFiles it would just be the normal HFileReaderVx.

(Remember this is Brainstorming :) )

  was:
For various use case (among other there is mass deletes) it would be great if 
HBase had a mechanism for programmatic HFiles. I.e. HFiles (with HFileScanner 
and Reader) that produce KeyValue just like any other old HFile, but the key 
values produced are generated or produced by some other means rather than being 
physically read from some storage medium.

In fact this could be a generalization for the various HFiles we have: (Normal) 
HFiles, HFileLinks, HalfStoreFiles, etc.

A simple way could be to allow for storing a classname into the HFile. Upon 
reading the HFile HBase would instantiate an instance of that class and that 
instance is responsible for all further interaction with that HFile. For normal 
HFiles it would just be the normal HFileReader.

(Remember this is Brainstorming)


> Consider "programmatic" HFiles
> --
>
> Key: HBASE-22385
> URL: https://issues.apache.org/jira/browse/HBASE-22385
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> For various use cases (among others there is mass deletes) it would be great 
> if HBase had a mechanism for programmatic HFiles. I.e. HFiles (with 
> HFileScanner and Reader) that produce KeyValues just like any other old 
> HFile, but the key values produced are generated or produced by some other 
> means rather than being physically read from some storage medium.
> In fact this could be a generalization for the various HFiles we have: 
> (Normal) HFiles, HFileLinks, HalfStoreFiles, etc.
> A simple way could be to allow for storing a classname into the HFile. Upon 
> reading the HFile HBase would instantiate an instance of that class and that 
> instance is responsible for all further interaction with that HFile. For 
> normal HFiles it would just be the normal HFileReaderVx.
> (Remember this is Brainstorming :) )



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22385) Consider "programmatic" HFiles

2019-05-08 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835763#comment-16835763
 ] 

Lars Hofhansl commented on HBASE-22385:
---

[~jisaac], what we chatted about.

> Consider "programmatic" HFiles
> --
>
> Key: HBASE-22385
> URL: https://issues.apache.org/jira/browse/HBASE-22385
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> For various use cases (among others there is mass deletes) it would be great 
> if HBase had a mechanism for programmatic HFiles. I.e. HFiles (with 
> HFileScanner and Reader) that produce KeyValues just like any other old 
> HFile, but the key values produced are generated or produced by some other 
> means rather than being physically read from some storage medium.
> In fact this could be a generalization for the various HFiles we have: 
> (Normal) HFiles, HFileLinks, HalfStoreFiles, etc.
> A simple way could be to allow for storing a classname into the HFile. Upon 
> reading the HFile HBase would instantiate an instance of that class and that 
> instance is responsible for all further interaction with that HFile. For 
> normal HFiles it would just be the normal HFileReader.
> (Remember this is Brainstorming)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22385) Consider "programmatic" HFiles

2019-05-08 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-22385:
-

 Summary: Consider "programmatic" HFiles
 Key: HBASE-22385
 URL: https://issues.apache.org/jira/browse/HBASE-22385
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl


For various use cases (among others there is mass deletes) it would be great if 
HBase had a mechanism for programmatic HFiles. I.e. HFiles (with HFileScanner 
and Reader) that produce KeyValues just like any other old HFile, but the key 
values produced are generated or produced by some other means rather than being 
physically read from some storage medium.

In fact this could be a generalization for the various HFiles we have: (Normal) 
HFiles, HFileLinks, HalfStoreFiles, etc.

A simple way could be to allow for storing a classname into the HFile. Upon 
reading the HFile HBase would instantiate an instance of that class and that 
instance is responsible for all further interaction with that HFile. For normal 
HFiles it would just be the normal HFileReader.

(Remember this is Brainstorming)
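
To make the classname idea concrete, here is a minimal sketch of how the 
dispatch could look. All names here (KeyValueSource, ProgrammaticReaderFactory, 
READER_CLASS) are hypothetical illustrations, not actual HBase APIs:

{code:java}
import java.util.Map;

// Minimal sketch of the classname-dispatch idea. All names are hypothetical
// illustrations, not actual HBase classes.
interface KeyValueSource {
    boolean next();          // advance to the next KeyValue; false at end
    byte[] currentKey();     // key of the current KeyValue
    byte[] currentValue();   // value of the current KeyValue
}

final class ProgrammaticReaderFactory {
    private static final String READER_CLASS_KEY = "READER_CLASS";

    // Falls back to the supplied "normal" reader when the file-info metadata
    // carries no reader class name.
    static KeyValueSource open(Map<String, String> fileInfo,
                               KeyValueSource normalReader)
            throws ReflectiveOperationException {
        String className = fileInfo.get(READER_CLASS_KEY);
        if (className == null) {
            return normalReader; // a plain HFile
        }
        // The stored class takes over all further interaction with the file.
        return (KeyValueSource) Class.forName(className)
                .getDeclaredConstructor()
                .newInstance();
    }
}
{code}

A mass-delete source, for instance, could synthesize delete markers on the fly 
rather than reading them from disk.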



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-11811) Use binary search for seeking into a block

2019-05-06 Thread Lars Hofhansl (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-11811:
--
Resolution: Implemented
Status: Resolved  (was: Patch Available)

This has been implemented now with the ROW_INDEX_V1 block encoding. Closing...

> Use binary search for seeking into a block
> --
>
> Key: HBASE-11811
> URL: https://issues.apache.org/jira/browse/HBASE-11811
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: 11811-wip-v2.txt, 11811-wip-v4.txt, block_index-v2.txt
>
>
> Currently upon every seek (including Gets) we need to linearly look through 
> the block from the beginning until we find the Cell we are looking for.
> It should be possible to build a simple cache of offsets of Cells for each 
> block as it is loaded and then use binary search to find the Cell in question.
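
For reference, a minimal sketch of the per-block offset index described above 
(hypothetical names; the ROW_INDEX_V1 encoding that eventually shipped differs 
in detail):

{code:java}
import java.util.Arrays;

// Sketch of the idea only, not the actual ROW_INDEX_V1 implementation.
final class BlockOffsetIndex {
    private final byte[][] keys;  // cell keys in block order (sorted)
    private final int[] offsets;  // byte offset of each cell within the block

    BlockOffsetIndex(byte[][] keys, int[] offsets) {
        this.keys = keys;
        this.offsets = offsets;
    }

    // Returns the offset of the first cell whose key is >= searchKey,
    // or -1 if every cell in the block sorts before it.
    int seek(byte[] searchKey) {
        int lo = 0, hi = keys.length - 1, found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            // Unsigned lexicographic order, matching HBase's byte comparator.
            if (Arrays.compareUnsigned(keys[mid], searchKey) < 0) {
                lo = mid + 1;
            } else {
                found = offsets[mid]; // candidate; an earlier cell may qualify
                hi = mid - 1;
            }
        }
        return found;
    }
}
{code}

This makes an in-block seek O(log n) in the number of cells instead of O(n), 
at the cost of building the offset array once per block load.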



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17884) Backport HBASE-16217 to branch-1

2019-04-25 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826655#comment-16826655
 ] 

Lars Hofhansl commented on HBASE-17884:
---

This is probably another case where Phoenix reaches too deep into HBase.
This is where it fails:
{code}
private static abstract class CoprocessorOperation<T extends CoprocessorEnvironment>
        extends ObserverContext<T> {
    abstract void call(MetaDataEndpointObserver oserver, ObserverContext<T> ctx)
            throws IOException;

    public void postEnvCall(T env) {}
}
{code}
We could probably add a no-argument constructor to ObserverContext to restore 
compatibility.
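
A minimal sketch of that fix, assuming the only binary break is the removed 
no-arg constructor (abridged, hypothetical class body, not the actual branch-1 
source):

{code:java}
// Abridged, hypothetical sketch. Keeping a no-argument constructor lets
// coprocessor bytecode compiled against 1.4.x, which links <init>()V,
// resolve again after the HBASE-16217 constructor change.
public class ObserverContext<E> {
    private final Object caller; // stands in for the added calling-user field

    public ObserverContext() {
        this(null); // preserves the old <init>()V signature
    }

    public ObserverContext(Object caller) {
        this.caller = caller;
    }

    public Object getCaller() {
        return caller;
    }
}
{code}

Bytecode compiled against the old class links the constructor by exact 
descriptor, so only restoring that exact signature avoids the NoSuchMethodError 
below.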

> Backport HBASE-16217 to branch-1
> 
>
> Key: HBASE-17884
> URL: https://issues.apache.org/jira/browse/HBASE-17884
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors, security
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Major
> Fix For: 1.5.0, 1.4.10
>
> Attachments: HBASE-17884-branch-1.patch, HBASE-17884-branch-1.patch, 
> HBASE-17884.branch-1.001.patch
>
>
> The change to add calling user to ObserverContext in HBASE-16217 should also 
> be applied to branch-1 to avoid use of UserGroupInformation.doAs() for access 
> control checks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-17884) Backport HBASE-16217 to branch-1

2019-04-25 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826651#comment-16826651
 ] 

Lars Hofhansl edited comment on HBASE-17884 at 4/26/19 5:25 AM:


Locally reverted for now. This is for Phoenix which deploys a coprocessor that 
was built against a 1.4.x version of HBase.

Edit: I agree it's a good change. Not sure how we can do that without breaking 
Phoenix (in this case; there are possibly other things broken as well).


was (Author: lhofhansl):
Locally reverted for now. This is for Phoenix which deploys a coprocessor that 
was built against a 1.4.x version of HBase.

> Backport HBASE-16217 to branch-1
> 
>
> Key: HBASE-17884
> URL: https://issues.apache.org/jira/browse/HBASE-17884
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors, security
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Major
> Fix For: 1.5.0, 1.4.10
>
> Attachments: HBASE-17884-branch-1.patch, HBASE-17884-branch-1.patch, 
> HBASE-17884.branch-1.001.patch
>
>
> The change to add calling user to ObserverContext in HBASE-16217 should also 
> be applied to branch-1 to avoid use of UserGroupInformation.doAs() for access 
> control checks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17884) Backport HBASE-16217 to branch-1

2019-04-25 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826651#comment-16826651
 ] 

Lars Hofhansl commented on HBASE-17884:
---

Locally reverted for now. This is for Phoenix which deploys a coprocessor that 
was built against a 1.4.x version of HBase.

> Backport HBASE-16217 to branch-1
> 
>
> Key: HBASE-17884
> URL: https://issues.apache.org/jira/browse/HBASE-17884
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors, security
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Major
> Fix For: 1.5.0, 1.4.10
>
> Attachments: HBASE-17884-branch-1.patch, HBASE-17884-branch-1.patch, 
> HBASE-17884.branch-1.001.patch
>
>
> The change to add calling user to ObserverContext in HBASE-16217 should also 
> be applied to branch-1 to avoid use of UserGroupInformation.doAs() for access 
> control checks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17884) Backport HBASE-16217 to branch-1

2019-04-25 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826645#comment-16826645
 ] 

Lars Hofhansl commented on HBASE-17884:
---

Does this break binary compatibility?!
{code:java}
19/04/25 22:15:04 WARN ipc.CoprocessorRpcChannel: Call failed on IOException
org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: TEST: org.apache.hadoop.hbase.coprocessor.ObserverContext: method <init>()V not found
at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:121)
at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:656)
at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17038)
at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8466)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2276)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2258)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36617)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2380)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.coprocessor.ObserverContext: method <init>()V not found
at org.apache.phoenix.coprocessor.PhoenixMetaDataCoprocessorHost$CoprocessorOperation.<init>(PhoenixMetaDataCoprocessorHost.java:63)
at org.apache.phoenix.coprocessor.PhoenixMetaDataCoprocessorHost$CoprocessorOperation.<init>(PhoenixMetaDataCoprocessorHost.java:63)
at org.apache.phoenix.coprocessor.PhoenixMetaDataCoprocessorHost$1.<init>(PhoenixMetaDataCoprocessorHost.java:157)
at org.apache.phoenix.coprocessor.PhoenixMetaDataCoprocessorHost.preGetTable(PhoenixMetaDataCoprocessorHost.java:157)
at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:621)
... 9 more
{code}

[~apurtell]

> Backport HBASE-16217 to branch-1
> 
>
> Key: HBASE-17884
> URL: https://issues.apache.org/jira/browse/HBASE-17884
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors, security
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>Priority: Major
> Fix For: 1.5.0, 1.4.10
>
> Attachments: HBASE-17884-branch-1.patch, HBASE-17884-branch-1.patch, 
> HBASE-17884.branch-1.001.patch
>
>
> The change to add calling user to ObserverContext in HBASE-16217 should also 
> be applied to branch-1 to avoid use of UserGroupInformation.doAs() for access 
> control checks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22235) OperationStatus.{SUCCESS|FAILURE|NOT_RUN} are not visible to 3rd party coprocessors

2019-04-16 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819480#comment-16819480
 ] 

Lars Hofhansl commented on HBASE-22235:
---

+1

> OperationStatus.{SUCCESS|FAILURE|NOT_RUN} are not visible to 3rd party 
> coprocessors
> ---
>
> Key: HBASE-22235
> URL: https://issues.apache.org/jira/browse/HBASE-22235
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors
>Reporter: Lars Hofhansl
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.4.10, 2.3.0, 1.2.12, 2.1.5, 2.2.1, 1.3.5
>
> Attachments: HBASE-22235-branch-1.patch, HBASE-22235.patch, 
> HBASE-22235.patch, HBASE-22235.patch
>
>
> preBatchMutate is useless for some operations due to this.
> See also TEPHRA-299. This looks like an oversight.
> MiniBatchOperationInProgress has limited visibility for coprocessors. 
> OperationStatus and OperationStatusCode should have the same.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-22235) OperationStatus.{SUCCESS|FAILURE|NOT_RUN} are not visible to 3rd party coprocessors

2019-04-16 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819389#comment-16819389
 ] 

Lars Hofhansl edited comment on HBASE-22235 at 4/16/19 6:42 PM:


The constants need to be public in OperationStatus, which you have in the 
branch-1 patch but not in the master patch - perhaps I'm missing something in 
the master branch...?


was (Author: lhofhansl):
The constants need to be public in OperationStatus.

> OperationStatus.{SUCCESS|FAILURE|NOT_RUN} are not visible to 3rd party 
> coprocessors
> ---
>
> Key: HBASE-22235
> URL: https://issues.apache.org/jira/browse/HBASE-22235
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors
>Reporter: Lars Hofhansl
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-22235-branch-1.patch, HBASE-22235.patch, 
> HBASE-22235.patch
>
>
> preBatchMutate is useless for some operations due to this.
> See also TEPHRA-299. This looks like an oversight.
> MiniBatchOperationInProgress has limited visibility for coprocessors. 
> OperationStatus and OperationStatusCode should have the same.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22235) OperationStatus.{SUCCESS|FAILURE|NOT_RUN} are not visible to 3rd party coprocessors

2019-04-16 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819389#comment-16819389
 ] 

Lars Hofhansl commented on HBASE-22235:
---

The constants need to be public in OperationStatus.
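
For illustration, a self-contained sketch of the visibility change (abridged 
and hypothetical; the real class has more status codes and HBase's 
InterfaceAudience annotations):

{code:java}
// Abridged sketch only, not the real class. Before the fix these shared
// singletons were not visible outside their package, so third-party
// coprocessors could not compare against them in preBatchMutate.
public final class OperationStatus {
    public enum Code { SUCCESS, FAILURE, NOT_RUN }

    public static final OperationStatus SUCCESS = new OperationStatus(Code.SUCCESS);
    public static final OperationStatus FAILURE = new OperationStatus(Code.FAILURE);
    public static final OperationStatus NOT_RUN = new OperationStatus(Code.NOT_RUN);

    private final Code code;

    private OperationStatus(Code code) {
        this.code = code;
    }

    public Code getOperationStatusCode() {
        return code;
    }
}
{code}

A coprocessor can then compare the statuses it sees in preBatchMutate against 
these constants directly.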

> OperationStatus.{SUCCESS|FAILURE|NOT_RUN} are not visible to 3rd party 
> coprocessors
> ---
>
> Key: HBASE-22235
> URL: https://issues.apache.org/jira/browse/HBASE-22235
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors
>Reporter: Lars Hofhansl
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-22235-branch-1.patch, HBASE-22235.patch, 
> HBASE-22235.patch
>
>
> preBatchMutate is useless for some operations due to this.
> See also TEPHRA-299. This looks like an oversight.
> MiniBatchOperationInProgress has limited visibility for coprocessors. 
> OperationStatus and OperationStatusCode should have the same.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

