[jira] [Created] (HBASE-17722) Metrics subsystem stop/start messages add a lot of useless bulk to operational logging

2017-03-02 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-17722:
--

 Summary: Metrics subsystem stop/start messages add a lot of 
useless bulk to operational logging
 Key: HBASE-17722
 URL: https://issues.apache.org/jira/browse/HBASE-17722
 Project: HBase
  Issue Type: Bug
  Components: metrics
Affects Versions: 1.2.4, 1.3.0
Reporter: Andrew Purtell


Metrics subsystem stop/start messages add a lot of useless bulk to operational 
logging. Say you are collecting logs from a fleet of thousands of servers and 
want to have them around for ~month or longer. It adds up. 

I think these should at least be at DEBUG level and ideally at TRACE. They 
don't offer much utility.

{noformat}
 INFO  [] impl.MetricsSystemImpl: HBase metrics system started

 INFO  [] impl.MetricsSystemImpl: Stopping HBase metrics system...

 INFO  [] impl.MetricsSystemImpl: HBase metrics system stopped.

 INFO  [] impl.MetricsConfig: loaded properties from hadoop-metrics2-hbase.properties

 INFO  [] impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
{noformat}
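
To illustrate the effect of the proposed level change, here is a minimal sketch. It makes several assumptions: java.util.logging stands in for the logging stack HBase actually uses, Level.FINE is treated as the rough equivalent of DEBUG, and the class and method names are invented for the example. It shows only that lifecycle messages logged below INFO never reach the handlers of a logger configured at INFO.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Illustration only: java.util.logging is an assumption made to keep the
// example self-contained; FINE plays the role of DEBUG.
class MetricsLogLevelSketch {

    /** Records every message that passes the logger's level check. */
    static final class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override public void publish(LogRecord record) { messages.add(record.getMessage()); }
        @Override public void flush() {}
        @Override public void close() {}
    }

    /**
     * Logs the three lifecycle messages at the given level through a logger
     * configured at INFO, and returns the messages that were actually emitted.
     */
    static List<String> emittedAtInfo(Level lifecycleLevel) {
        Logger log = Logger.getLogger("sketch.MetricsSystemImpl." + lifecycleLevel.getName());
        log.setUseParentHandlers(false); // keep the example quiet on the console
        log.setLevel(Level.INFO);        // typical operational log level
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);
        log.log(lifecycleLevel, "HBase metrics system started");
        log.log(lifecycleLevel, "Stopping HBase metrics system...");
        log.log(lifecycleLevel, "HBase metrics system stopped.");
        log.removeHandler(handler);
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println("logged at INFO: " + emittedAtInfo(Level.INFO).size() + " line(s) emitted");
        System.out.println("logged at FINE: " + emittedAtInfo(Level.FINE).size() + " line(s) emitted");
    }
}
```

Operators who still want the messages could lower the level for that one logger rather than for the whole process.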




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17721) Provide streaming APIs with SSL/TLS

2017-03-02 Thread Alex Araujo (JIRA)
Alex Araujo created HBASE-17721:
---

 Summary: Provide streaming APIs with SSL/TLS
 Key: HBASE-17721
 URL: https://issues.apache.org/jira/browse/HBASE-17721
 Project: HBase
  Issue Type: Umbrella
Reporter: Alex Araujo
Assignee: Alex Araujo
 Fix For: 2.0.0


Umbrella to add optional client/server streaming capabilities to HBase.
This would allow bandwidth to be used more efficiently for certain operations, 
and allow clients to use SSL/TLS for authentication and encryption.

Desired client/server scaffolding:
- HTTP/2 support
- Protocol negotiation (blocking vs streaming, auth, encryption, etc.)
- TLS/SSL support
- Streaming RPC support

Possibilities (and their tradeoffs):
- gRPC: Some initial work and discussion on HBASE-13467 (Prototype using GRPC 
as IPC mechanism)
-- Has most or all of the desired scaffolding
-- Adds additional g* dependencies. Compat story for g* dependencies not always 
ideal
- Custom HTTP/2 based client/server APIs
-- More control over compat story
-- Non-trivial to build scaffolding; might reinvent wheels along the way
- Others?

Related Jiras that might be rolled in as sub-tasks (or closed/replaced with new 
ones):
HBASE-17708 (Expose config to set two-way auth over TLS in HttpServer and add a 
test)
HBASE-8691 (High-Throughput Streaming Scan API)
HBASE-14899 (Create custom Streaming ReplicationEndpoint)





[jira] [Resolved] (HBASE-17720) Possible bug in FlushSnapshotSubprocedure

2017-03-02 Thread Ben Lau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Lau resolved HBASE-17720.
-
Resolution: Duplicate

> Possible bug in FlushSnapshotSubprocedure
> -
>
> Key: HBASE-17720
> URL: https://issues.apache.org/jira/browse/HBASE-17720
> Project: HBase
>  Issue Type: Bug
>  Components: dataloss, snapshots
>Reporter: Ben Lau
>
> I noticed that FlushSnapshotSubprocedure differs from MemStoreFlusher in that 
> it does not appear to explicitly handle a DroppedSnapshotException.  In the 
> primary codepath for flushing memstores (see MemStoreFlusher.flushRegion()), 
> there is a try/catch for DroppedSnapshotException that aborts the 
> regionserver so that WALs are replayed and data loss is avoided.  I don't see 
> this in FlushSnapshotSubprocedure.  Is this an accidental omission, or is 
> there a reason it isn't present?  
> I'm not too familiar with procedure V1 or V2.  I assume that if a participant 
> dies, all other participants will terminate any outstanding operations for 
> the procedure?  If so, and if this lack of RS.abort() for 
> DroppedSnapshotException is a bug, then it can't be fixed naively; otherwise 
> I assume a failed flush on one region server could cause a cascade of RS 
> aborts across the cluster.





[jira] [Created] (HBASE-17720) Possible bug in FlushSnapshotSubprocedure

2017-03-02 Thread Ben Lau (JIRA)
Ben Lau created HBASE-17720:
---

 Summary: Possible bug in FlushSnapshotSubprocedure
 Key: HBASE-17720
 URL: https://issues.apache.org/jira/browse/HBASE-17720
 Project: HBase
  Issue Type: Bug
  Components: dataloss, snapshots
Reporter: Ben Lau


I noticed that FlushSnapshotSubprocedure differs from MemStoreFlusher in that 
it does not appear to explicitly handle a DroppedSnapshotException.  In the 
primary codepath for flushing memstores (see MemStoreFlusher.flushRegion()), 
there is a try/catch for DroppedSnapshotException that aborts the 
regionserver so that WALs are replayed and data loss is avoided.  I don't see 
this in FlushSnapshotSubprocedure.  Is this an accidental omission, or is there 
a reason it isn't present?  

I'm not too familiar with procedure V1 or V2.  I assume that if a participant 
dies, all other participants will terminate any outstanding operations for the 
procedure?  If so, and if this lack of RS.abort() for DroppedSnapshotException 
is a bug, then it can't be fixed naively; otherwise I assume a failed flush on 
one region server could cause a cascade of RS aborts across the cluster.
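
For readers unfamiliar with the pattern being described, the sketch below shows the general shape of "abort the server when a flush drops its snapshot". Every type in it (Region, Server, the nested DroppedSnapshotException, flushRegion) is a hypothetical stand-in written for this example; it is not the HBase source and does not claim to match what MemStoreFlusher.flushRegion() does in detail.

```java
import java.io.IOException;

// Hypothetical stand-ins only: none of these are the real HBase classes.
class FlushAbortSketch {

    /** Stand-in for the exception raised when a memstore snapshot is lost mid-flush. */
    static class DroppedSnapshotException extends IOException {
        DroppedSnapshotException(String msg) { super(msg); }
    }

    /** Stand-in for a region that can be flushed. */
    interface Region {
        void flush() throws IOException;
    }

    /** Stand-in for a region server; a real abort would lead to WAL replay elsewhere. */
    static class Server {
        boolean aborted = false;
        String abortReason = null;

        void abort(String reason, Throwable cause) {
            aborted = true;
            abortReason = reason;
        }
    }

    /**
     * Flushes the region. A dropped snapshot aborts the server so that the
     * edits can be recovered from the WAL; any other I/O failure only fails
     * this flush. Returns true when the flush succeeded.
     */
    static boolean flushRegion(Region region, Server server) {
        try {
            region.flush();
            return true;
        } catch (DroppedSnapshotException e) {
            // The in-memory snapshot is gone, so the only safe copy is the WAL.
            server.abort("dropped snapshot, WAL replay required", e);
            return false;
        } catch (IOException e) {
            // Not a dropped snapshot: report failure without taking the server down.
            return false;
        }
    }

    public static void main(String[] args) {
        Server server = new Server();
        boolean ok = flushRegion(() -> { throw new DroppedSnapshotException("snapshot lost"); }, server);
        System.out.println("flush ok=" + ok + ", server aborted=" + server.aborted);
    }
}
```

The cascade concern raised above is about where such a catch block sits: inside a multi-participant procedure, one participant aborting may cause the others to fail their own outstanding work.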





[jira] [Resolved] (HBASE-17714) Client heartbeats seems to be broken

2017-03-02 Thread Samarth Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain resolved HBASE-17714.
--
Resolution: Not A Bug

> Client heartbeats seems to be broken
> 
>
> Key: HBASE-17714
> URL: https://issues.apache.org/jira/browse/HBASE-17714
> Project: HBase
>  Issue Type: Bug
>Reporter: Samarth Jain
>
> We have a test in Phoenix where we introduce an artificial sleep of 2 times 
> the RPC timeout in the preScannerNext() hook of a coprocessor. 
> {code}
> public static class SleepingRegionObserver extends SimpleRegionObserver {
>     public SleepingRegionObserver() {}
>
>     @Override
>     public boolean preScannerNext(final ObserverContext<RegionCoprocessorEnvironment> c,
>             final InternalScanner s, final List<Result> results,
>             final int limit, final boolean hasMore) throws IOException {
>         try {
>             // Stall the scanner call for twice the RPC timeout, but only for the test table.
>             if (SLEEP_NOW && c.getEnvironment().getRegion().getRegionInfo().getTable()
>                     .getNameAsString().equals(TABLE_NAME)) {
>                 Thread.sleep(RPC_TIMEOUT * 2);
>             }
>         } catch (InterruptedException e) {
>             throw new IOException(e);
>         }
>         return super.preScannerNext(c, s, results, limit, hasMore);
>     }
> }
> {code}
> This test passed up to 1.1.3 but started failing sometime before 1.1.9 with 
> an OutOfOrderScannerNextException. See PHOENIX-3702. [~lhofhansl] mentioned 
> that we have client heartbeats enabled and that this should prevent us from 
> running into issues like this. FYI, this test fails with HBase 1.2.3 too.
> CC [~apurtell], [~jamestaylor]





[jira] [Created] (HBASE-17719) Pre-Emptive Fast Fail does not apply to scanners

2017-03-02 Thread James Moore (JIRA)
James Moore created HBASE-17719:
---

 Summary: Pre-Emptive Fast Fail does not apply to scanners
 Key: HBASE-17719
 URL: https://issues.apache.org/jira/browse/HBASE-17719
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 1.2.0
Reporter: James Moore
Assignee: James Moore


Testing on CDH 5.9.0 revealed that scanners do not make use of Pre-Emptive 
Fast Fail.





Successful: HBase Generate Website

2017-03-02 Thread Apache Jenkins Server
Build status: Successful

If successful, the website and docs have been generated. To update the live 
site, follow the instructions below. If failed, skip to the bottom of this 
email.

Use the following commands to download the patch and apply it to a clean branch 
based on origin/asf-site. If you prefer to keep the hbase-site repo around 
permanently, you can skip the clone step.

  git clone https://git-wip-us.apache.org/repos/asf/hbase-site.git

  cd hbase-site
  wget -O- https://builds.apache.org/job/hbase_generate_website/504/artifact/website.patch.zip \
    | funzip > 697a55a8782d940aa4f1287c2ef4a45ba516cac1.patch
  git fetch
  git checkout -b asf-site-697a55a8782d940aa4f1287c2ef4a45ba516cac1 origin/asf-site
  git am --whitespace=fix 697a55a8782d940aa4f1287c2ef4a45ba516cac1.patch

At this point, you can preview the changes by opening index.html or any of the 
other HTML pages in your local 
asf-site-697a55a8782d940aa4f1287c2ef4a45ba516cac1 branch.

There are lots of spurious changes, such as timestamps and CSS styles in 
tables, so a generic git diff is not very useful. To see a list of files that 
have been added, deleted, renamed, changed type, or are otherwise interesting, 
use the following command:

  git diff --name-status --diff-filter=ADCRTXUB origin/asf-site

To see only files that had 100 or more lines changed:

  git diff --stat origin/asf-site | grep -E '[1-9][0-9]{2,}'
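
As a rough illustration of what that grep expression selects, the sketch below applies the same pattern in Java. The class name and the sample stat lines are made up for the example; only the regular expression comes from the command above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DiffStatFilterSketch {

    // Same expression as the grep -E argument: a non-zero digit followed by
    // two or more further digits, i.e. a number of at least 100.
    static final Pattern THREE_OR_MORE_DIGITS = Pattern.compile("[1-9][0-9]{2,}");

    /** Keeps the lines in which the pattern matches anywhere, as grep does. */
    static List<String> filter(List<String> statLines) {
        List<String> kept = new ArrayList<>();
        for (String line : statLines) {
            if (THREE_OR_MORE_DIGITS.matcher(line).find()) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical "git diff --stat" lines, invented for this example.
        List<String> sample = Arrays.asList(
                " book.html | 250 +++---",
                " index.html | 12 +-",
                " docs/v100/a.html | 3 +");
        for (String line : filter(sample)) {
            System.out.println(line);
        }
    }
}
```

Because the match is not anchored to the change-count column, a path that itself contains a three-digit number is also kept, so the output is a quick filter rather than an exact one.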

When you are satisfied, publish your changes to origin/asf-site using these 
commands:

  git commit --allow-empty -m "Empty commit"  # to work around a current ASF INFRA bug
  git push origin asf-site-697a55a8782d940aa4f1287c2ef4a45ba516cac1:asf-site
  git checkout asf-site
  git branch -D asf-site-697a55a8782d940aa4f1287c2ef4a45ba516cac1

Changes take a couple of minutes to be propagated. You can verify whether they 
have been propagated by looking at the Last Published date at the bottom of 
http://hbase.apache.org/. It should match the date in the index.html on the 
asf-site branch in Git.

As a courtesy, reply-all to this email to let other committers know you pushed 
the site.



If failed, see https://builds.apache.org/job/hbase_generate_website/504/console

[jira] [Reopened] (HBASE-17595) Add partial result support for small/limited scan

2017-03-02 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reopened HBASE-17595:
---

hasMoreCellsInRow is not stable, as KeyValueHeap.peek does not apply the filter.

> Add partial result support for small/limited scan
> -
>
> Key: HBASE-17595
> URL: https://issues.apache.org/jira/browse/HBASE-17595
> Project: HBase
>  Issue Type: Sub-task
>  Components: asyncclient, Client, scan
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-17595-branch-1.patch, HBASE-17595.patch, 
> HBASE-17595-v1.patch
>
>
> Partial result support was marked as a 'TODO' when implementing 
> HBASE-17045. While implementing HBASE-17508, we found that if small scans 
> share the same logic as general scans, scan requests other than the one that 
> opens the scanner will not carry the small flag, so the server may return a 
> partial result to the client and cause strange behavior. That was solved by 
> modifying the server-side logic, but this means a 1.4.x client cannot safely 
> talk to an earlier 1.x server. So we should address the problem on the 
> client side. Marked as blocker because this issue should be finished before 
> any 2.x or 1.4.x release.
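
A rough sketch of what handling partial results on the client side can look like. Chunk, its fields, and assemble() are hypothetical names invented for this example and are not the HBase client API. The sketch also assumes that chunks of one row arrive contiguously and that the "more cells in row" flag is accurate, which the reopen comment above says is not guaranteed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical types for illustration; not the HBase client classes.
class PartialResultSketch {

    /** One response fragment: some cells of a row, plus a "more to come" flag. */
    static final class Chunk {
        final String row;
        final List<String> cells;
        final boolean moreCellsInRow;

        Chunk(String row, List<String> cells, boolean moreCellsInRow) {
            this.row = row;
            this.cells = cells;
            this.moreCellsInRow = moreCellsInRow;
        }
    }

    /**
     * Buffers cells until a chunk reports that its row is finished, then emits
     * the whole row as "row=cell,cell,...". Callers therefore never see a
     * half-returned row.
     */
    static List<String> assemble(List<Chunk> chunks) {
        List<String> completeRows = new ArrayList<>();
        List<String> pendingCells = new ArrayList<>();
        for (Chunk chunk : chunks) {
            pendingCells.addAll(chunk.cells);
            if (!chunk.moreCellsInRow) {
                completeRows.add(chunk.row + "=" + String.join(",", pendingCells));
                pendingCells.clear();
            }
        }
        return completeRows;
    }

    public static void main(String[] args) {
        List<Chunk> chunks = Arrays.asList(
                new Chunk("r1", Arrays.asList("a", "b"), true),
                new Chunk("r1", Arrays.asList("c"), false),
                new Chunk("r2", Arrays.asList("x"), false));
        for (String row : assemble(chunks)) {
            System.out.println(row);
        }
    }
}
```

If the flag can be wrong, a client along these lines would need another way to detect row boundaries, for example comparing the row key of the next chunk.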


