[jira] [Created] (HBASE-24619) Try compact the recovered hfiles firstly after region online

2020-06-22 Thread Guanghao Zhang (Jira)
Guanghao Zhang created HBASE-24619:
--

 Summary: Try compact the recovered hfiles firstly after region 
online
 Key: HBASE-24619
 URL: https://issues.apache.org/jira/browse/HBASE-24619
 Project: HBase
  Issue Type: Bug
Reporter: Guanghao Zhang


As discussed in HBASE-23739, there may be many recovered hfiles. We should find 
a better way to compact them first after the region comes online.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23055) Alter hbase:meta

2020-06-22 Thread Michael Stack (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Stack resolved HBASE-23055.
---
Resolution: Fixed

Pushed addendum on branch-2.3+

> Alter hbase:meta
> 
>
> Key: HBASE-23055
> URL: https://issues.apache.org/jira/browse/HBASE-23055
> Project: HBase
>  Issue Type: Task
>  Components: meta
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> hbase:meta is currently hardcoded. Its schema cannot be changed.
> This issue is about allowing edits to the hbase:meta schema. It will let us 
> set encodings such as block-with-indexes, which will help quell CPU usage 
> on the host carrying hbase:meta. A dynamic hbase:meta is the first step on 
> the road to being able to split meta.





[jira] [Resolved] (HBASE-24605) Break long region names in the web UI

2020-06-22 Thread Guangxu Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangxu Cheng resolved HBASE-24605.
---
Fix Version/s: 2.2.6
   2.4.0
   2.3.1
   3.0.0-alpha-1
   Resolution: Fixed

Pushed to branch-2.2+. Thanks for your contribution, [~songxincun].

> Break long region names in the web UI
> -
>
> Key: HBASE-24605
> URL: https://issues.apache.org/jira/browse/HBASE-24605
> Project: HBase
>  Issue Type: Improvement
>  Components: UI
>Affects Versions: 3.0.0-alpha-1
>Reporter: song XinCun
>Assignee: song XinCun
>Priority: Minor
> Fix For: 3.0.0-alpha-1, 2.3.1, 2.4.0, 2.2.6
>
> Attachments: image-2020-06-21-20-18-37-041.png, 
> image-2020-06-21-20-19-25-183.png, image-2020-06-21-20-20-02-782.png, 
> image-2020-06-21-20-27-23-474.png, image-2020-06-21-20-28-36-464.png, 
> image-2020-06-21-20-29-07-819.png
>
>
> Before this patch, a long region name pushes the UI content off the screen, 
> making it unreadable, like this:
> !image-2020-06-21-20-18-37-041.png|width=542,height=50!
> !image-2020-06-21-20-19-25-183.png|width=531,height=23!
> !image-2020-06-21-20-20-02-782.png|width=542,height=146!
>  
> After this patch, a long region name breaks onto a new line, like this:
> !image-2020-06-21-20-27-23-474.png|width=529,height=35!
> !image-2020-06-21-20-28-36-464.png|width=533,height=33!
> !image-2020-06-21-20-29-07-819.png|width=531,height=117!





[jira] [Created] (HBASE-24618) Backport HBASE-21204 to branch-1

2020-06-22 Thread Abhishek Singh Chouhan (Jira)
Abhishek Singh Chouhan created HBASE-24618:
--

 Summary: Backport HBASE-21204 to branch-1
 Key: HBASE-24618
 URL: https://issues.apache.org/jira/browse/HBASE-24618
 Project: HBase
  Issue Type: Improvement
Reporter: Abhishek Singh Chouhan
Assignee: Abhishek Singh Chouhan








[jira] [Created] (HBASE-24617) Enable injecting build start time value as part of build

2020-06-22 Thread Matthew Foley (Jira)
Matthew Foley created HBASE-24617:
-

 Summary: Enable injecting build start time value as part of build
 Key: HBASE-24617
 URL: https://issues.apache.org/jira/browse/HBASE-24617
 Project: HBase
  Issue Type: Improvement
  Components: create-release, UI
Affects Versions: 3.0.0-alpha-1, 2.3.0
Reporter: Matthew Foley
Assignee: Matthew Foley


The HBase build's creation time is presented in the HBase UI, and made 
available through Java, via the {{org.apache.hadoop.hbase.Version}} class's 
{{date}} value, which is generated at build time by 
{{hbase-common/src/saveVersion.sh}}. The script just invokes the shell command 
{{date}} and captures its result as a string.

The problem is, this occurs every time hbase-common is built. And, for good and 
sufficient reason, when making a release via dev-support/create-release, the 
task for building and deploying hbase jars as maven libraries and the task for 
building binary release artifacts as tarballs, EACH do a {{clean}} build. Thus, 
the build time found in the libs is different from the build time found in the 
release tarballs.

There is value in keeping the two tasks independent, and able to run fully each 
by themselves. And there is value in doing a {{clean}} at the start of such 
processes, to make sure you're releasing binaries that exactly match the source 
code.

So to keep these benefits, but enable the start time to be determined once and 
used for a couple of builds in a row in a given environment, I propose to allow 
injecting the desired value. Specifically, I want to change saveVersion.sh to 
look for an existing value of the env var HBASE_BUILD_TIME and, if it exists, 
use it instead of calling {{date}}. One would of course set it as part of the 
build process (in create-release) and clear it by unsetting the environment 
variable when done with the build.
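
The lookup order proposed above can be sketched as follows. This is only an 
illustration in Java; the actual change would be made in the shell script 
saveVersion.sh. The class name BuildTimeResolver, the resolve method, and the 
choice to treat an empty value as unset are assumptions made for this sketch, 
not part of the proposal.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; the real change would live in saveVersion.sh. */
final class BuildTimeResolver {
  /** Name of the environment variable proposed in the issue. */
  static final String ENV_VAR = "HBASE_BUILD_TIME";

  private BuildTimeResolver() {}

  /**
   * Returns the injected build time if ENV_VAR is present and non-empty
   * (treating empty as unset is an assumption of this sketch); otherwise
   * asks the fallback, which stands in for invoking `date`.
   */
  static String resolve(Map<String, String> env, Supplier<String> fallback) {
    String injected = env.get(ENV_VAR);
    if (injected != null && !injected.isEmpty()) {
      return injected;
    }
    return fallback.get();
  }

  public static void main(String[] args) {
    // With HBASE_BUILD_TIME exported, every build in the same environment
    // reports the same value; without it, each build gets its own timestamp.
    String buildTime = resolve(System.getenv(),
        () -> ZonedDateTime.now().format(DateTimeFormatter.RFC_1123_DATE_TIME));
    System.out.println("Build time: " + buildTime);
  }
}
```

Passing the environment in as a map keeps the lookup testable without touching 
the real process environment.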





Re: [DISCUSS] VisibleForTesting annotation as it pertains to our API compatibility guidelines

2020-06-22 Thread Bharath Vissapragada
Sorry, I should've been clearer. It's the former. My point is, any method
tagged with @VisibleForTesting is only intended for testing purposes and
should _not_ be considered public; its visibility scope is wider than
necessary only because some test method needed it. That's how I'd
interpret it (actually, that's what I thought you meant, now I'm confused
:-)).

On Mon, Jun 22, 2020 at 4:02 PM Nick Dimiduk  wrote:

> On Mon, Jun 22, 2020 at 3:45 PM Bharath Vissapragada wrote:
>
> > I share the same opinion. In fact, Hadoop (from which our annotations are
> > derived, I believe) talks about this: "Also, certain APIs are annotated as
> > @VisibleForTesting (from com.google.common.annotations.VisibleForTesting)
> > - these are meant to be used strictly for unit tests and should be treated
> > as “Private” APIs."
> >
> > https://hadoop.apache.org/docs/r3.1.2/hadoop-project-dist/hadoop-common/InterfaceClassification.html
>
> Sorry Bharath, I don't follow. Are you saying "I share the opinion that the
> VisibleForTesting annotation should be considered as defining a method as
> IA.Private," and this is an omission from our community guidelines
> document? Or are you saying "no, it does not count as an interface audience
> marker," and we are obliged to treat methods such as in this example as
> public API?
>
> Thanks,
> Nick
>
> > On Mon, Jun 22, 2020 at 10:15 AM Sean Busbey wrote:
> >
> > > Yeah I would say no as well. We should make clear on our dev guide that
> > > you also should be marking those things with an Interface Audience
> > > marking if you don't intend them to be at the downstream API visibility
> > > of the parent class.
> > >
> > > (IIRC we also use VisibleForTesting in IA.Private classes to proactively
> > > explain why some internal looking member is at a wider Java access
> > > scope.)
> > >
> > > On Mon, Jun 22, 2020, 11:39 Nick Dimiduk wrote:
> > >
> > > > Hello,
> > > >
> > > > This came up over on the 2.3.0RC0 thread, so let's open it for proper
> > > > discussion. In that context, we observe method signature changes to a
> > > > method marked with the Guava VisibleForTesting annotation. The method
> > > > is a protected method on an IA.Public class. There is no method-level
> > > > IA annotation.
> > > >
> > > > Do we consider the VisibleForTesting annotation as a specifier for our
> > > > compatibility guidelines?
> > > >
> > > > I am of the opinion that no, it is not an InterfaceAudience
> > > > annotation, and so it is not applicable for defining our public API.
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > > Nick

[jira] [Created] (HBASE-24616) Remove BoundedRecoveredHFilesOutputSink dependency on a TableDescriptor

2020-06-22 Thread Michael Stack (Jira)
Michael Stack created HBASE-24616:
-

 Summary: Remove BoundedRecoveredHFilesOutputSink  dependency on a 
TableDescriptor
 Key: HBASE-24616
 URL: https://issues.apache.org/jira/browse/HBASE-24616
 Project: HBase
  Issue Type: Bug
  Components: HFile, MTTR
Reporter: Michael Stack


BoundedRecoveredHFilesOutputSink wants to read the TableDescriptor so that it 
writes the particular hfile format specified by a table's schema. Getting the 
table schema can be tough at various points of operation, especially around 
startup. HBASE-23739 tried to read from the fs if unable to read the 
TableDescriptor from the Master. This approach works generally but fails in 
standalone mode: there, we will have given up our startup attempt BEFORE the 
request to the Master for the TableDescriptor times out, so the read from the 
fs is never attempted.

The suggested patch here does away with reading the TableDescriptor and just 
has BoundedRecoveredHFilesOutputSink write generic hfiles.





[jira] [Resolved] (HBASE-15161) Umbrella: Miscellaneous improvements from production usage

2020-06-22 Thread Nick Dimiduk (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk resolved HBASE-15161.
--
Fix Version/s: 2.3.0
   3.0.0-alpha-1
   Resolution: Fixed

> Umbrella: Miscellaneous improvements from production usage
> --
>
> Key: HBASE-15161
> URL: https://issues.apache.org/jira/browse/HBASE-15161
> Project: HBase
>  Issue Type: Improvement
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> We use HBase (mainly) to build the index for our search engine at Alibaba. 
> We are currently upgrading our online cluster from 0.98.12 to 1.x and I'd 
> like to take the opportunity to contribute a bunch of our private patches to 
> the community (better late than never, I hope :-)). This is an umbrella to 
> track this effort.





Re: [DISCUSS] VisibleForTesting annotation as it pertains to our API compatibility guidelines

2020-06-22 Thread Nick Dimiduk
On Mon, Jun 22, 2020 at 3:45 PM Bharath Vissapragada wrote:

> I share the same opinion. In fact, Hadoop (from which our annotations are
> derived, I believe) talks about this: "Also, certain APIs are annotated as
> @VisibleForTesting (from com.google.common.annotations.VisibleForTesting)
> - these are meant to be used strictly for unit tests and should be treated
> as “Private” APIs."
>
> https://hadoop.apache.org/docs/r3.1.2/hadoop-project-dist/hadoop-common/InterfaceClassification.html
>

Sorry Bharath, I don't follow. Are you saying "I share the opinion that the
VisibleForTesting annotation should be considered as defining a method as
IA.Private," and this is an omission from our community guidelines
document? Or are you saying "no, it does not count as an interface audience
marker," and we are obliged to treat methods such as in this example as
public API?

Thanks,
Nick

> On Mon, Jun 22, 2020 at 10:15 AM Sean Busbey wrote:
>
> > Yeah I would say no as well. We should make clear on our dev guide that you
> > also should be marking those things with an Interface Audience marking if
> > you don't intend them to be at the downstream API visibility of the parent
> > class.
> >
> > (IIRC we also use VisibleForTesting in IA.Private classes to proactively
> > explain why some internal looking member is at a wider Java access scope.)
> >
> > On Mon, Jun 22, 2020, 11:39 Nick Dimiduk wrote:
> >
> > > Hello,
> > >
> > > This came up over on the 2.3.0RC0 thread, so let's open it for proper
> > > discussion. In that context, we observe method signature changes to a
> > > method marked with the Guava VisibleForTesting annotation. The method is
> > > a protected method on a IA.Public class. There is no method-level IA
> > > annotation.
> > >
> > > Do we consider the VisibleForTesting annotation as a specifier for our
> > > compatibility guidelines?
> > >
> > > I am of the opinion that no, it is not an InterfaceAudience annotation,
> > > and so it is not applicable for defining our public API.
> > >
> > > What do you think?
> > >
> > > Thanks,
> > > Nick

Re: [DISCUSS] VisibleForTesting annotation as it pertains to our API compatibility guidelines

2020-06-22 Thread Bharath Vissapragada
I share the same opinion. In fact, Hadoop (from which our annotations are
derived, I believe) talks about this: "Also, certain APIs are annotated as
@VisibleForTesting (from com.google.common.annotations.VisibleForTesting)
- these are meant to be used strictly for unit tests and should be treated
as “Private” APIs."

https://hadoop.apache.org/docs/r3.1.2/hadoop-project-dist/hadoop-common/InterfaceClassification.html

On Mon, Jun 22, 2020 at 10:15 AM Sean Busbey wrote:

> Yeah I would say no as well. We should make clear on our dev guide that you
> also should be marking those things with an Interface Audience marking if
> you don't intend them to be at the downstream API visibility of the parent
> class.
>
> (IIRC we also use VisibleForTesting in IA.Private classes to proactively
> explain why some internal looking member is at a wider Java access scope.)
>
> On Mon, Jun 22, 2020, 11:39 Nick Dimiduk wrote:
>
> > Hello,
> >
> > This came up over on the 2.3.0RC0 thread, so let's open it for proper
> > discussion. In that context, we observe method signature changes to a
> > method marked with the Guava VisibleForTesting annotation. The method is a
> > protected method on a IA.Public class. There is no method-level IA
> > annotation.
> >
> > Do we consider the VisibleForTesting annotation as a specifier for our
> > compatibility guidelines?
> >
> > I am of the opinion that no, it is not an InterfaceAudience annotation, and
> > so it is not applicable for defining our public API.
> >
> > What do you think?
> >
> > Thanks,
> > Nick


[jira] [Created] (HBASE-24615) MutableRangeHistogram#updateSnapshotRangeMetrics doesn't calculate the distribution for last bucket.

2020-06-22 Thread Rushabh Shah (Jira)
Rushabh Shah created HBASE-24615:


 Summary: MutableRangeHistogram#updateSnapshotRangeMetrics doesn't 
calculate the distribution for last bucket.
 Key: HBASE-24615
 URL: https://issues.apache.org/jira/browse/HBASE-24615
 Project: HBase
  Issue Type: Bug
  Components: metrics
Affects Versions: 1.3.7
Reporter: Rushabh Shah


We are not processing the distribution for the last bucket.

https://github.com/apache/hbase/blob/master/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics2/lib/MutableRangeHistogram.java#L70

{code:java}
  public void updateSnapshotRangeMetrics(MetricsRecordBuilder metricsRecordBuilder,
      Snapshot snapshot) {
    long priorRange = 0;
    long cumNum = 0;

    final long[] ranges = getRanges();
    final String rangeType = getRangeType();
    // The bug lies here: the loop stops before the last entry of ranges,
    // so we are not processing the last bucket.
    for (int i = 0; i < ranges.length - 1; i++) {
      long val = snapshot.getCountAtOrBelow(ranges[i]);
      if (val - cumNum > 0) {
        metricsRecordBuilder.addCounter(
            Interns.info(name + "_" + rangeType + "_" + priorRange + "-" + ranges[i], desc),
            val - cumNum);
      }
      priorRange = ranges[i];
      cumNum = val;
    }
    long val = snapshot.getCount();
    if (val - cumNum > 0) {
      metricsRecordBuilder.addCounter(
          Interns.info(name + "_" + rangeType + "_" + ranges[ranges.length - 1] + "-inf", desc),
          val - cumNum);
    }
  }
{code}
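
The effect of the loop bound can be reproduced with a small standalone model. 
This is a simplified sketch, not HBase code: RangeBuckets, countAtOrBelow over 
a plain array, and the loopBound parameter are assumptions made for 
illustration. Visiting every entry of ranges is shown as one possible change; 
the report above does not say which fix will be applied.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified model of the range bucketing; not the HBase implementation. */
final class RangeBuckets {
  private RangeBuckets() {}

  /** Number of samples less than or equal to the bound. */
  static long countAtOrBelow(long[] samples, long bound) {
    return Arrays.stream(samples).filter(v -> v <= bound).count();
  }

  /**
   * Buckets samples by range. loopBound is how many entries of ranges the
   * loop visits: ranges.length - 1 mirrors the code in the report, while
   * ranges.length visits every range.
   */
  static Map<String, Long> bucket(long[] samples, long[] ranges, int loopBound) {
    Map<String, Long> counts = new LinkedHashMap<>();
    long priorRange = 0;
    long cumNum = 0;
    for (int i = 0; i < loopBound; i++) {
      long val = countAtOrBelow(samples, ranges[i]);
      if (val - cumNum > 0) {
        counts.put(priorRange + "-" + ranges[i], val - cumNum);
      }
      priorRange = ranges[i];
      cumNum = val;
    }
    // Everything not yet accounted for is reported as the open-ended bucket.
    long total = samples.length;
    if (total - cumNum > 0) {
      counts.put(ranges[ranges.length - 1] + "-inf", total - cumNum);
    }
    return counts;
  }

  public static void main(String[] args) {
    long[] samples = {1, 5, 50, 500};
    long[] ranges = {10, 100};
    System.out.println("as reported: " + bucket(samples, ranges, ranges.length - 1));
    System.out.println("all ranges:  " + bucket(samples, ranges, ranges.length));
  }
}
```

With these inputs the first call yields {0-10=2, 100-inf=2}: the sample 50, 
which belongs in 10-100, is folded into the 100-inf bucket. The second call 
yields {0-10=2, 10-100=1, 100-inf=1}.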






Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Esteban Gutierrez
+1 to removing master-slave terminology and favoring the proposals that Andrew
mentions. This is the right thing to do as a community.

Thanks,
Esteban.
--
Cloudera, Inc.



On Mon, Jun 22, 2020 at 4:28 PM Zach York 
wrote:

> While reading the definitions, I think it is pretty clear that the
> definition HBase is intending is somewhere under the #2 definition link.
> HMaster is not a teacher (which implies learning on the "student" side),
> but rather orders the RS to do a task.
> I think master in this context is still worth changing to coordinator since
> in my mind this is actually a more clear definition of what HMaster does.
>
> On Mon, Jun 22, 2020 at 2:09 PM Geoffrey Jacoby 
> wrote:
>
> > For most of the proposals (slave -> worker, blacklist -> denylist,
> > whitelist-> allowlist), I'm +1 (nonbinding). Denylist and acceptlist even
> > have the advantage of being clearer than the terms they're replacing.
> >
> > However, I'm not convinced about changing "master" to "coordinator", or
> > something similar. Unlike "slave", which is negative in any context,
> > "master" has many definitions, including some common ones which do not
> > appear problematic. See
> https://www.merriam-webster.com/dictionary/master
> > for
> > examples. In particular, the progression of an artisan was from
> > "apprentice" to "journeyman" to "master". A master smith, carpenter, or
> > artist would run a shop managing lots of workers and apprentices who
> would
> > hope to become masters of their own someday. So "master" and "worker" can
> > still go together.
> >
> > Since it's the least problematic term, and by far the hardest term to
> > change (both within HBase and with effects on downstream projects such as
> > Ambari), I'm -0 (nonbinding) on changing "master".
> >
> > Geoffrey
> >
> > On Mon, Jun 22, 2020 at 1:32 PM Rushabh Shah
> >  wrote:
> >
> > > +1 to renaming.
> > >
> > >
> > > Rushabh Shah
> > >
> > >- Software Engineering SMTS | Salesforce
> > >-
> > >   - Mobile: 213 422 9052
> > >
> > >
> > >
> > > On Mon, Jun 22, 2020 at 1:18 PM Josh Elser  wrote:
> > >
> > > > +1
> > > >
> > > > On 6/22/20 4:03 PM, Sean Busbey wrote:
> > > > > We should change our use of these terms. We can be equally or more
> > > clear
> > > > in
> > > > > what we are trying to convey where they are present.
> > > > >
> > > > > That they have been used historically is only useful if the
> advantage
> > > we
> > > > > gain from using them through that shared context outweighs the
> > > potential
> > > > > friction they add. They make me personally less enthusiastic about
> > > > > contributing. That's enough friction for me to advocate removing
> > them.
> > > > >
> > > > > AFAICT reworking our replication stuff in terms of "active" and
> > > "passive"
> > > > > clusters did not result in a big spike of folks asking new
> questions
> > > > about
> > > > > where authority for state was.
> > > > >
> > > > > On Mon, Jun 22, 2020, 13:39 Andrew Purtell 
> > > wrote:
> > > > >
> > > > >> In response to renewed attention at the Foundation toward
> addressing
> > > > >> culturally problematic language and terms often used in technical
> > > > >> documentation and discussion, several projects have begun
> > discussions,
> > > > or
> > > > >> made proposals, or started work along these lines.
> > > > >>
> > > > >> The HBase PMC began its own discussion on private@ on June 9,
> 2020
> > > > with an
> > > > >> observation of this activity and this suggestion:
> > > > >>
> > > > >> There is a renewed push back against classic technology industry
> > terms
> > > > that
> > > > >> have negative modern connotations.
> > > > >>
> > > > >> In the case of HBase, the following substitutions might be
> proposed:
> > > > >>
> > > > >> - Coordinator instead of master
> > > > >>
> > > > >> - Worker instead of slave
> > > > >>
> > > > >> Recommendations for these additional substitutions also come up in
> > > this
> > > > >> type of discussion:
> > > > >>
> > > > >> - Accept list instead of white list
> > > > >>
> > > > >> - Deny list instead of black list
> > > > >>
> > > > >> Unfortunately we have Master all over our code base, baked into
> > > various
> > > > >> APIs and configuration variable names, so for us the necessary
> > changes
> > > > >> amount to a new major release and deprecation cycle. It could well
> > be
> > > > worth
> > > > >> it in the long run. We exist only as long as we draw a willing and
> > > > >> sufficient contributor community. It also wouldn’t be great to
> have
> > an
> > > > >> activist fork appear somewhere, even if unlikely to be successful.
> > > > >>
> > > > >> Relevant JIRAs are:
> > > > >>
> > > > >> - HBASE-12677 <
> > https://issues.apache.org/jira/browse/HBASE-12677
> > > >:
> > > > >> Update replication docs to clarify terminology
> > > > >> - HBASE-13852 <
> > https://issues.apache.org/jira/browse/HBASE-13852
> > > >:
> > > > >> Replace master-slave terminology in book, site, and javadoc
> > with a
>

[jira] [Reopened] (HBASE-24144) Update docs from master

2020-06-22 Thread Nick Dimiduk (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk reopened HBASE-24144:
--

Reopening so as to include HBASE-24231.

> Update docs from master
> ---
>
> Key: HBASE-24144
> URL: https://issues.apache.org/jira/browse/HBASE-24144
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.3.0
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
>Priority: Major
> Fix For: 2.3.0
>
>
> Take a pass updating the docs. Have a look at what's on branch-2.2 and add 
> whatever updates we need from master. Consider refreshing branch-2 as well, 
> since it's been a while.





Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Mich Talebzadeh
Let us look at what *slave* means.

According to Merriam-Webster:

https://www.merriam-webster.com/dictionary/slave

Definition of *slave*

 (Entry 1 of 4)
1: a person held in servitude as the chattel of another
2: one that is completely subservient to a dominating influence
3: a device (such as the printer of a computer) that is directly responsive
to another
4: DRUDGE, TOILER

So in the context of HBase, definition *3* is valid: a component that is
directly responsive to another, the other being the *master*.










On Mon, 22 Jun 2020 at 22:09, Geoffrey Jacoby  wrote:

> For most of the proposals (slave -> worker, blacklist -> denylist,
> whitelist-> allowlist), I'm +1 (nonbinding). Denylist and acceptlist even
> have the advantage of being clearer than the terms they're replacing.
>
> However, I'm not convinced about changing "master" to "coordinator", or
> something similar. Unlike "slave", which is negative in any context,
> "master" has many definitions, including some common ones which do not
> appear problematic. See https://www.merriam-webster.com/dictionary/master
> for
> examples. In particular, the progression of an artisan was from
> "apprentice" to "journeyman" to "master". A master smith, carpenter, or
> artist would run a shop managing lots of workers and apprentices who would
> hope to become masters of their own someday. So "master" and "worker" can
> still go together.
>
> Since it's the least problematic term, and by far the hardest term to
> change (both within HBase and with effects on downstream projects such as
> Ambari), I'm -0 (nonbinding) on changing "master".
>
> Geoffrey
>
> On Mon, Jun 22, 2020 at 1:32 PM Rushabh Shah
>  wrote:
>
> > +1 to renaming.
> >
> >
> > Rushabh Shah
> >
> >- Software Engineering SMTS | Salesforce
> >-
> >   - Mobile: 213 422 9052
> >
> >
> >
> > On Mon, Jun 22, 2020 at 1:18 PM Josh Elser  wrote:
> >
> > > +1
> > >
> > > On 6/22/20 4:03 PM, Sean Busbey wrote:
> > > > We should change our use of these terms. We can be equally or more
> > clear
> > > in
> > > > what we are trying to convey where they are present.
> > > >
> > > > That they have been used historically is only useful if the advantage
> > we
> > > > gain from using them through that shared context outweighs the
> > potential
> > > > friction they add. They make me personally less enthusiastic about
> > > > contributing. That's enough friction for me to advocate removing
> them.
> > > >
> > > > AFAICT reworking our replication stuff in terms of "active" and
> > "passive"
> > > > clusters did not result in a big spike of folks asking new questions
> > > about
> > > > where authority for state was.
> > > >
> > > > On Mon, Jun 22, 2020, 13:39 Andrew Purtell 
> > wrote:
> > > >
> > > >> In response to renewed attention at the Foundation toward addressing
> > > >> culturally problematic language and terms often used in technical
> > > >> documentation and discussion, several projects have begun
> discussions,
> > > or
> > > >> made proposals, or started work along these lines.
> > > >>
> > > >> The HBase PMC began its own discussion on private@ on June 9, 2020
> > > with an
> > > >> observation of this activity and this suggestion:
> > > >>
> > > >> There is a renewed push back against classic technology industry
> terms
> > > that
> > > >> have negative modern connotations.
> > > >>
> > > >> In the case of HBase, the following substitutions might be proposed:
> > > >>
> > > >> - Coordinator instead of master
> > > >>
> > > >> - Worker instead of slave
> > > >>
> > > >> Recommendations for these additional substitutions also come up in
> > this
> > > >> type of discussion:
> > > >>
> > > >> - Accept list instead of white list
> > > >>
> > > >> - Deny list instead of black list
> > > >>
> > > >> Unfortunately we have Master all over our code base, baked into
> > various
> > > >> APIs and configuration variable names, so for us the necessary
> changes
> > > >> amount to a new major release and deprecation cycle. It could well
> be
> > > worth
> > > >> it in the long run. We exist only as long as we draw a willing and
> > > >> sufficient contributor community. It also wouldn’t be great to have
> an
> > > >> activist fork appear somewhere, even if unlikely to be successful.
> > > >>
> > > >> Relevant JIRAs are:
> > > >>
> > > 

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Zach York
While reading the definitions, I think it is pretty clear that the
definition HBase intends is somewhere under the #2 definition link.
HMaster is not a teacher (which implies learning on the "student" side),
but rather orders the RS to do a task.
I think master in this context is still worth changing to coordinator, since
in my mind this is actually a clearer description of what HMaster does.

On Mon, Jun 22, 2020 at 2:09 PM Geoffrey Jacoby wrote:

> For most of the proposals (slave -> worker, blacklist -> denylist,
> whitelist -> allowlist), I'm +1 (nonbinding). Denylist and acceptlist even
> have the advantage of being clearer than the terms they're replacing.
>
> However, I'm not convinced about changing "master" to "coordinator", or
> something similar. Unlike "slave", which is negative in any context,
> "master" has many definitions, including some common ones which do not
> appear problematic. See https://www.merriam-webster.com/dictionary/master
> for examples. In particular, the progression of an artisan was from
> "apprentice" to "journeyman" to "master". A master smith, carpenter, or
> artist would run a shop managing lots of workers and apprentices who would
> hope to become masters of their own someday. So "master" and "worker" can
> still go together.
>
> Since it's the least problematic term, and by far the hardest term to
> change (both within HBase and with effects on downstream projects such as
> Ambari), I'm -0 (nonbinding) on changing "master".
>
> Geoffrey
>
> On Mon, Jun 22, 2020 at 1:32 PM Rushabh Shah wrote:
>
> > +1 to renaming.
> >
> >
> > Rushabh Shah
> >
> >- Software Engineering SMTS | Salesforce
> >-
> >   - Mobile: 213 422 9052
> >
> >
> >
> > On Mon, Jun 22, 2020 at 1:18 PM Josh Elser wrote:
> >
> > > +1
> > >
> > > On 6/22/20 4:03 PM, Sean Busbey wrote:
> > > > We should change our use of these terms. We can be equally or more
> > > > clear in what we are trying to convey where they are present.
> > > >
> > > > That they have been used historically is only useful if the advantage
> > > > we gain from using them through that shared context outweighs the
> > > > potential friction they add. They make me personally less enthusiastic
> > > > about contributing. That's enough friction for me to advocate removing
> > > > them.
> > > >
> > > > AFAICT reworking our replication stuff in terms of "active" and
> > > > "passive" clusters did not result in a big spike of folks asking new
> > > > questions about where authority for state was.
> > > >
> > > > On Mon, Jun 22, 2020, 13:39 Andrew Purtell wrote:
> > > >
> > > >> In response to renewed attention at the Foundation toward addressing
> > > >> culturally problematic language and terms often used in technical
> > > >> documentation and discussion, several projects have begun
> > > >> discussions, or made proposals, or started work along these lines.
> > > >>
> > > >> The HBase PMC began its own discussion on private@ on June 9, 2020
> > > >> with an observation of this activity and this suggestion:
> > > >>
> > > >> There is a renewed push back against classic technology industry
> > > >> terms that have negative modern connotations.
> > > >>
> > > >> In the case of HBase, the following substitutions might be proposed:
> > > >>
> > > >> - Coordinator instead of master
> > > >>
> > > >> - Worker instead of slave
> > > >>
> > > >> Recommendations for these additional substitutions also come up in
> > > >> this type of discussion:
> > > >>
> > > >> - Accept list instead of white list
> > > >>
> > > >> - Deny list instead of black list
> > > >>
> > > >> Unfortunately we have Master all over our code base, baked into
> > > >> various APIs and configuration variable names, so for us the
> > > >> necessary changes amount to a new major release and deprecation
> > > >> cycle. It could well be worth it in the long run. We exist only as
> > > >> long as we draw a willing and sufficient contributor community. It
> > > >> also wouldn’t be great to have an activist fork appear somewhere,
> > > >> even if unlikely to be successful.
> > > >>
> > > >> Relevant JIRAs are:
> > > >>
> > > >> - HBASE-12677 <https://issues.apache.org/jira/browse/HBASE-12677>:
> > > >> Update replication docs to clarify terminology
> > > >> - HBASE-13852 <https://issues.apache.org/jira/browse/HBASE-13852>:
> > > >> Replace master-slave terminology in book, site, and javadoc with a
> > > >> more modern vocabulary
> > > >> - HBASE-24576 <https://issues.apache.org/jira/browse/HBASE-24576>:
> > > >> Changing "whitelist" and "blacklist" in our docs and project
> > > >>
> > > >> In response to this proposal, a member of the PMC asked if the term
> > > >> 'master' used by itself would be fine, because we only have use of
> > > >> 'slave' in replication documentation and that is easily addressed. In
> > > >> response to
> > > >> thi

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Andrew Purtell
Regarding "slave", it is a stretch to point to an esoteric technical
definition and ask someone to pretend like all the other pejorative
meanings relating to power relationship are somehow not meaningful. If we
were to be accused of "turning a blind eye", that charge would stick, in my
opinion.



On Mon, Jun 22, 2020 at 2:23 PM Mich Talebzadeh 
wrote:

> Let us look at what *slave* means.
>
> According to Merriam-Webster:
>
> https://www.merriam-webster.com/dictionary/slave
>
> Definition of *slave*
>
>  (Entry 1 of 4)
> 1: a person held in servitude as the chattel of another
> 2: one that is completely subservient to a dominating influence
> 3: a device (such as the printer of a computer) that is directly responsive
> to another
> 4: DRUDGE , TOILER
> 
> So in the context of HBase, number *3* is valid: in other words, a
> component which is directly responsive to another, that other being the *master*.
>
>
> 
>
>
>
> LinkedIn *
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> *
>
>
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Mon, 22 Jun 2020 at 22:09, Geoffrey Jacoby  wrote:
>
> > For most of the proposals (slave -> worker, blacklist -> denylist,
> > whitelist-> allowlist), I'm +1 (nonbinding). Denylist and acceptlist even
> > have the advantage of being clearer than the terms they're replacing.
> >
> > However, I'm not convinced about changing "master" to "coordinator", or
> > something similar. Unlike "slave", which is negative in any context,
> > "master" has many definitions, including some common ones which do not
> > appear problematic. See
> https://www.merriam-webster.com/dictionary/master
> > for
> > examples. In particular, the progression of an artisan was from
> > "apprentice" to "journeyman" to "master". A master smith, carpenter, or
> > artist would run a shop managing lots of workers and apprentices who
> would
> > hope to become masters of their own someday. So "master" and "worker" can
> > still go together.
> >
> > Since it's the least problematic term, and by far the hardest term to
> > change (both within HBase and with effects on downstream projects such as
> > Ambari), I'm -0 (nonbinding) on changing "master".
> >
> > Geoffrey
> >
> > On Mon, Jun 22, 2020 at 1:32 PM Rushabh Shah
> >  wrote:
> >
> > > +1 to renaming.
> > >
> > >
> > > Rushabh Shah
> > >
> > >- Software Engineering SMTS | Salesforce
> > >-
> > >   - Mobile: 213 422 9052
> > >
> > >
> > >
> > > On Mon, Jun 22, 2020 at 1:18 PM Josh Elser  wrote:
> > >
> > > > +1
> > > >
> > > > On 6/22/20 4:03 PM, Sean Busbey wrote:
> > > > > We should change our use of these terms. We can be equally or more
> > > clear
> > > > in
> > > > > what we are trying to convey where they are present.
> > > > >
> > > > > That they have been used historically is only useful if the
> advantage
> > > we
> > > > > gain from using them through that shared context outweighs the
> > > potential
> > > > > friction they add. They make me personally less enthusiastic about
> > > > > contributing. That's enough friction for me to advocate removing
> > them.
> > > > >
> > > > > AFAICT reworking our replication stuff in terms of "active" and
> > > "passive"
> > > > > clusters did not result in a big spike of folks asking new
> questions
> > > > about
> > > > > where authority for state was.
> > > > >
> > > > > On Mon, Jun 22, 2020, 13:39 Andrew Purtell 
> > > wrote:
> > > > >
> > > > >> In response to renewed attention at the Foundation toward
> addressing
> > > > >> culturally problematic language and terms often used in technical
> > > > >> documentation and discussion, several projects have begun
> > discussions,
> > > > or
> > > > >> made proposals, or started work along these lines.
> > > > >>
> > > > >> The HBase PMC began its own discussion on private@ on June 9,
> 2020
> > > > with an
> > > > >> observation of this activity and this suggestion:
> > > > >>
> > > > >> There is a renewed push back against classic technology industry
> > terms
> > > > that
> > > > >> have negative modern connotations.
> > > > >>
> > > > >> In the case of HBase, the following substitutions might be
> proposed:
> > > > >>
> > > > >> - Coordinator instead of master
> > > > >>
> > > > >> - Worker instead of slave
> > > > >>
> > > > >> Recommendations for these additional substitutions also come up in
> > > this
> > > > >> type of discussion:
> > > > >>
> > > > >> - Accept list instead of w

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Andrew Purtell
Since something like voting is happening on this thread to express
alignment or not, it might be helpful to first come up with a list of
terms to change (if any), and then propose replacements individually. So
far we might break this apart into four proposals:

1. Replace "master"/"hmaster" with ??? ("coordinator" is one option). This
one has by far the most significant impact, and both opinion and
interpretation on it are mixed.

2. Replace "slave" with "follower". This seems to impact the cross cluster
replication subsystem only.

3. Replace "black list" with "deny list".

4. Replace "white list" with "accept list".

Perhaps if you are inclined to respond with a +1/-1/+0/-0, it would be
useful to give such an indication for each line item above. Or, offer
alternative proposals. Or, if you have a singular opinion, that's fine too.



On Mon, Jun 22, 2020 at 2:09 PM Geoffrey Jacoby  wrote:

> For most of the proposals (slave -> worker, blacklist -> denylist,
> whitelist-> allowlist), I'm +1 (nonbinding). Denylist and acceptlist even
> have the advantage of being clearer than the terms they're replacing.
>
> However, I'm not convinced about changing "master" to "coordinator", or
> something similar. Unlike "slave", which is negative in any context,
> "master" has many definitions, including some common ones which do not
> appear problematic. See https://www.merriam-webster.com/dictionary/master
> for
> examples. In particular, the progression of an artisan was from
> "apprentice" to "journeyman" to "master". A master smith, carpenter, or
> artist would run a shop managing lots of workers and apprentices who would
> hope to become masters of their own someday. So "master" and "worker" can
> still go together.
>
> Since it's the least problematic term, and by far the hardest term to
> change (both within HBase and with effects on downstream projects such as
> Ambari), I'm -0 (nonbinding) on changing "master".
>
> Geoffrey
>
> On Mon, Jun 22, 2020 at 1:32 PM Rushabh Shah
>  wrote:
>
> > +1 to renaming.
> >
> >
> > Rushabh Shah
> >
> >- Software Engineering SMTS | Salesforce
> >-
> >   - Mobile: 213 422 9052
> >
> >
> >
> > On Mon, Jun 22, 2020 at 1:18 PM Josh Elser  wrote:
> >
> > > +1
> > >
> > > On 6/22/20 4:03 PM, Sean Busbey wrote:
> > > > We should change our use of these terms. We can be equally or more
> > clear
> > > in
> > > > what we are trying to convey where they are present.
> > > >
> > > > That they have been used historically is only useful if the advantage
> > we
> > > > gain from using them through that shared context outweighs the
> > potential
> > > > friction they add. They make me personally less enthusiastic about
> > > > contributing. That's enough friction for me to advocate removing
> them.
> > > >
> > > > AFAICT reworking our replication stuff in terms of "active" and
> > "passive"
> > > > clusters did not result in a big spike of folks asking new questions
> > > about
> > > > where authority for state was.
> > > >
> > > > On Mon, Jun 22, 2020, 13:39 Andrew Purtell 
> > wrote:
> > > >
> > > >> In response to renewed attention at the Foundation toward addressing
> > > >> culturally problematic language and terms often used in technical
> > > >> documentation and discussion, several projects have begun
> discussions,
> > > or
> > > >> made proposals, or started work along these lines.
> > > >>
> > > >> The HBase PMC began its own discussion on private@ on June 9, 2020
> > > with an
> > > >> observation of this activity and this suggestion:
> > > >>
> > > >> There is a renewed push back against classic technology industry
> terms
> > > that
> > > >> have negative modern connotations.
> > > >>
> > > >> In the case of HBase, the following substitutions might be proposed:
> > > >>
> > > >> - Coordinator instead of master
> > > >>
> > > >> - Worker instead of slave
> > > >>
> > > >> Recommendations for these additional substitutions also come up in
> > this
> > > >> type of discussion:
> > > >>
> > > >> - Accept list instead of white list
> > > >>
> > > >> - Deny list instead of black list
> > > >>
> > > >> Unfortunately we have Master all over our code base, baked into
> > various
> > > >> APIs and configuration variable names, so for us the necessary
> changes
> > > >> amount to a new major release and deprecation cycle. It could well
> be
> > > worth
> > > >> it in the long run. We exist only as long as we draw a willing and
> > > >> sufficient contributor community. It also wouldn’t be great to have
> an
> > > >> activist fork appear somewhere, even if unlikely to be successful.
> > > >>
> > > >> Relevant JIRAs are:
> > > >>
> > > > >> - HBASE-12677 <https://issues.apache.org/jira/browse/HBASE-12677>:
> > > > >> Update replication docs to clarify terminology
> > > > >> - HBASE-13852 <https://issues.apache.org/jira/browse/HBASE-13852>:
> > > >> Replace master-slave terminology in book, site, and javadoc
> with a
> > > more
> > > >> modern voc

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Geoffrey Jacoby
For most of the proposals (slave -> worker, blacklist -> denylist,
whitelist -> allowlist), I'm +1 (nonbinding). Denylist and acceptlist even
have the advantage of being clearer than the terms they're replacing.

However, I'm not convinced about changing "master" to "coordinator", or
something similar. Unlike "slave", which is negative in any context,
"master" has many definitions, including some common ones which do not
appear problematic. See https://www.merriam-webster.com/dictionary/master for
examples. In particular, the progression of an artisan was from
"apprentice" to "journeyman" to "master". A master smith, carpenter, or
artist would run a shop managing lots of workers and apprentices who would
hope to become masters of their own someday. So "master" and "worker" can
still go together.

Since it's the least problematic term, and by far the hardest term to
change (both within HBase and with effects on downstream projects such as
Ambari), I'm -0 (nonbinding) on changing "master".

Geoffrey

On Mon, Jun 22, 2020 at 1:32 PM Rushabh Shah
 wrote:

> +1 to renaming.
>
>
> Rushabh Shah
>
>- Software Engineering SMTS | Salesforce
>-
>   - Mobile: 213 422 9052
>
>
>
> On Mon, Jun 22, 2020 at 1:18 PM Josh Elser  wrote:
>
> > +1
> >
> > On 6/22/20 4:03 PM, Sean Busbey wrote:
> > > We should change our use of these terms. We can be equally or more
> clear
> > in
> > > what we are trying to convey where they are present.
> > >
> > > That they have been used historically is only useful if the advantage
> we
> > > gain from using them through that shared context outweighs the
> potential
> > > friction they add. They make me personally less enthusiastic about
> > > contributing. That's enough friction for me to advocate removing them.
> > >
> > > AFAICT reworking our replication stuff in terms of "active" and
> "passive"
> > > clusters did not result in a big spike of folks asking new questions
> > about
> > > where authority for state was.
> > >
> > > On Mon, Jun 22, 2020, 13:39 Andrew Purtell 
> wrote:
> > >
> > >> In response to renewed attention at the Foundation toward addressing
> > >> culturally problematic language and terms often used in technical
> > >> documentation and discussion, several projects have begun discussions,
> > or
> > >> made proposals, or started work along these lines.
> > >>
> > >> The HBase PMC began its own discussion on private@ on June 9, 2020
> > with an
> > >> observation of this activity and this suggestion:
> > >>
> > >> There is a renewed push back against classic technology industry terms
> > that
> > >> have negative modern connotations.
> > >>
> > >> In the case of HBase, the following substitutions might be proposed:
> > >>
> > >> - Coordinator instead of master
> > >>
> > >> - Worker instead of slave
> > >>
> > >> Recommendations for these additional substitutions also come up in
> this
> > >> type of discussion:
> > >>
> > >> - Accept list instead of white list
> > >>
> > >> - Deny list instead of black list
> > >>
> > >> Unfortunately we have Master all over our code base, baked into
> various
> > >> APIs and configuration variable names, so for us the necessary changes
> > >> amount to a new major release and deprecation cycle. It could well be
> > worth
> > >> it in the long run. We exist only as long as we draw a willing and
> > >> sufficient contributor community. It also wouldn’t be great to have an
> > >> activist fork appear somewhere, even if unlikely to be successful.
> > >>
> > >> Relevant JIRAs are:
> > >>
> > > >> - HBASE-12677:
> > > >> Update replication docs to clarify terminology
> > > >> - HBASE-13852:
> > > >> Replace master-slave terminology in book, site, and javadoc with a
> > > >> more modern vocabulary
> > > >> - HBASE-24576:
> > > >> Changing "whitelist" and "blacklist" in our docs and project
> > >>
> > >> In response to this proposal, a member of the PMC asked if the term
> > >> 'master' used by itself would be fine, because we only have use of
> > 'slave'
> > >> in replication documentation and that is easily addressed. In response
> > to
> > >> this question, others on the PMC suggested that even if only 'master'
> is
> > >> used, in this context it is still a problem.
> > >>
> > >> For folks who are surprised or lacking context on the details of this
> > >> discussion, one PMC member offered a link to this draft RFC as
> > background:
> > >> https://tools.ietf.org/id/draft-knodel-terminology-00.html
> > >>
> > >> There was general support for removing the term "master" / "hmaster"
> > from
> > >> our code base and using the terms "coordinator" or "leader" instead.
> In
> > the
> > >> context of replication, "worker" makes less sense and perhaps
> > "destination"
> > >> or "follower" would be more appropriate terms.
> > >>
> > >> One PMC member's tho

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Mich Talebzadeh
In mitigation, we should only do the revision if the community feels:


   1. There is a need to revise historical context.
   2. We will, by virtue of accepting the changes, make a better team.
   3. It will have little or no impact on the current functionality.
   4. Given that most products in production lag a few versions behind, in
   all likelihood it may take a few years before the changes materialise.
   5. If there is a clear majority message that we ought to change, then a
   sensible roadmap should be prepared with timelines.
   6. We should not change things because it is fashionable. Those who
   have visited LinkedIn recently will have noticed that a lot of
   companies have, rightly or wrongly, come out in support of the
   current trends and have equally attracted a lot of criticism for, quote,
   "not being sincere".


I know I am not making myself popular but I think we ought to weigh things
up in true context.


HTH




LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*





*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 22 Jun 2020 at 20:14, Mich Talebzadeh 
wrote:

>
> Hi,
>
> Thank you for the proposals.
>
> I am afraid I have to agree to differ. The terms master and slave, commonly
> used in big data tools (not confined to HBase), are BAU and historical
> and bear no resemblance to anything recent.
>
> Additionally, both whitelist and blacklist simply refer to a
> proposal which is accepted and a proposal which is struck out (black
> pencil line).
>
> So in a scientific context these are the terminologies used. Terminologies
> become offensive if they are used "in the incorrect context". I don't think
> anyone in the HBase or Spark community will have objections if these
> terminologies are used as before. Spark used the master/slave terms in
> Standalone mode, if I recall correctly.
>
> Changing something for the sake of "now being in the limelight" does not
> make it right. So I beg to differ on this. Having said that it is indeed a
> sign of a civilised mind to entertain an idea without accepting it so
> whatever the community wishes.
>
> HTH
>
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> *
>
>
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Mon, 22 Jun 2020 at 19:39, Andrew Purtell  wrote:
>
>> In response to renewed attention at the Foundation toward addressing
>> culturally problematic language and terms often used in technical
>> documentation and discussion, several projects have begun discussions, or
>> made proposals, or started work along these lines.
>>
>> The HBase PMC began its own discussion on private@ on June 9, 2020 with
>> an
>> observation of this activity and this suggestion:
>>
>> There is a renewed push back against classic technology industry terms
>> that
>> have negative modern connotations.
>>
>> In the case of HBase, the following substitutions might be proposed:
>>
>> - Coordinator instead of master
>>
>> - Worker instead of slave
>>
>> Recommendations for these additional substitutions also come up in this
>> type of discussion:
>>
>> - Accept list instead of white list
>>
>> - Deny list instead of black list
>>
>> Unfortunately we have Master all over our code base, baked into various
>> APIs and configuration variable names, so for us the necessary changes
>> amount to a new major release and deprecation cycle. It could well be
>> worth
>> it in the long run. We exist only as long as we draw a willing and
>> sufficient contributor community. It also wouldn’t be great to have an
>> activist fork appear somewhere, even if unlikely to be successful.
>>
>> Relevant JIRAs are:
>>
>> - HBASE-12677:
>> Update replication docs to clarify terminology
>> - HBASE-13852:
>> Replace master-slave terminology in book, site, and javadoc with a more
>> modern vocabulary
>> - HBASE-24576:
>> Changing "whitelist" and "blacklist" in our docs and project
>>
>> In response to this proposal, a member of the PMC asked if the term
>> 'mast

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Rushabh Shah
+1 to renaming.


Rushabh Shah

   - Software Engineering SMTS | Salesforce
   -
  - Mobile: 213 422 9052



On Mon, Jun 22, 2020 at 1:18 PM Josh Elser  wrote:

> +1
>
> On 6/22/20 4:03 PM, Sean Busbey wrote:
> > We should change our use of these terms. We can be equally or more clear
> in
> > what we are trying to convey where they are present.
> >
> > That they have been used historically is only useful if the advantage we
> > gain from using them through that shared context outweighs the potential
> > friction they add. They make me personally less enthusiastic about
> > contributing. That's enough friction for me to advocate removing them.
> >
> > AFAICT reworking our replication stuff in terms of "active" and "passive"
> > clusters did not result in a big spike of folks asking new questions
> about
> > where authority for state was.
> >
> > On Mon, Jun 22, 2020, 13:39 Andrew Purtell  wrote:
> >
> >> In response to renewed attention at the Foundation toward addressing
> >> culturally problematic language and terms often used in technical
> >> documentation and discussion, several projects have begun discussions,
> or
> >> made proposals, or started work along these lines.
> >>
> >> The HBase PMC began its own discussion on private@ on June 9, 2020
> with an
> >> observation of this activity and this suggestion:
> >>
> >> There is a renewed push back against classic technology industry terms
> that
> >> have negative modern connotations.
> >>
> >> In the case of HBase, the following substitutions might be proposed:
> >>
> >> - Coordinator instead of master
> >>
> >> - Worker instead of slave
> >>
> >> Recommendations for these additional substitutions also come up in this
> >> type of discussion:
> >>
> >> - Accept list instead of white list
> >>
> >> - Deny list instead of black list
> >>
> >> Unfortunately we have Master all over our code base, baked into various
> >> APIs and configuration variable names, so for us the necessary changes
> >> amount to a new major release and deprecation cycle. It could well be
> worth
> >> it in the long run. We exist only as long as we draw a willing and
> >> sufficient contributor community. It also wouldn’t be great to have an
> >> activist fork appear somewhere, even if unlikely to be successful.
> >>
> >> Relevant JIRAs are:
> >>
> > >> - HBASE-12677:
> > >> Update replication docs to clarify terminology
> > >> - HBASE-13852:
> > >> Replace master-slave terminology in book, site, and javadoc with a
> > >> more modern vocabulary
> > >> - HBASE-24576:
> > >> Changing "whitelist" and "blacklist" in our docs and project
> >>
> >> In response to this proposal, a member of the PMC asked if the term
> >> 'master' used by itself would be fine, because we only have use of
> 'slave'
> >> in replication documentation and that is easily addressed. In response
> to
> >> this question, others on the PMC suggested that even if only 'master' is
> >> used, in this context it is still a problem.
> >>
> >> For folks who are surprised or lacking context on the details of this
> >> discussion, one PMC member offered a link to this draft RFC as
> background:
> >> https://tools.ietf.org/id/draft-knodel-terminology-00.html
> >>
> >> There was general support for removing the term "master" / "hmaster"
> from
> >> our code base and using the terms "coordinator" or "leader" instead. In
> the
> >> context of replication, "worker" makes less sense and perhaps
> "destination"
> >> or "follower" would be more appropriate terms.
> >>
> > >> One PMC member's thoughts on language and non-native English speakers are
> > >> worth including in their entirety:
> > >>
> > >> While words like blacklist/whitelist/slave clearly have those negative
> > >> references, word master might not have the same impact for non native
> > >> English speakers like myself where the literal translation to my mother
> > >> tongue does not have this same bad connotation. Replacing all references
> > >> for word *master* on our docs/codebase is a huge effort, I guess such a
> > >> decision would be more suitable for native English speakers folks, and
> > >> maybe we should consider the opinion of contributors from that ethnic
> > >> minority as well?
> >>
> >> These are good questions for public discussion.
> >>
> >> We have a consensus in the PMC, at this time, that is supportive of
> making
> >> the above discussed terminology changes. However, we also have concerns
> >> about what it would take to accomplish meaningful changes. Several on
> the
> >> PMC offered support in the form of cycles to review pull requests and
> >> patches, and two PMC members offered  personal bandwidth for creating
> and
> >> releasing new code lines as needed to complete a deprecation cycle.
> >>
> >> Unfortunately, the terms "master" and "hmaster" appear throughout our
> code
> >> base in c

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Josh Elser

+1

On 6/22/20 4:03 PM, Sean Busbey wrote:

We should change our use of these terms. We can be equally or more clear in
what we are trying to convey where they are present.

That they have been used historically is only useful if the advantage we
gain from using them through that shared context outweighs the potential
friction they add. They make me personally less enthusiastic about
contributing. That's enough friction for me to advocate removing them.

AFAICT reworking our replication stuff in terms of "active" and "passive"
clusters did not result in a big spike of folks asking new questions about
where authority for state was.

On Mon, Jun 22, 2020, 13:39 Andrew Purtell  wrote:


In response to renewed attention at the Foundation toward addressing
culturally problematic language and terms often used in technical
documentation and discussion, several projects have begun discussions, or
made proposals, or started work along these lines.

The HBase PMC began its own discussion on private@ on June 9, 2020 with an
observation of this activity and this suggestion:

There is a renewed push back against classic technology industry terms that
have negative modern connotations.

In the case of HBase, the following substitutions might be proposed:

- Coordinator instead of master

- Worker instead of slave

Recommendations for these additional substitutions also come up in this
type of discussion:

- Accept list instead of white list

- Deny list instead of black list

Unfortunately we have Master all over our code base, baked into various
APIs and configuration variable names, so for us the necessary changes
amount to a new major release and deprecation cycle. It could well be worth
it in the long run. We exist only as long as we draw a willing and
sufficient contributor community. It also wouldn’t be great to have an
activist fork appear somewhere, even if unlikely to be successful.

Relevant JIRAs are:

- HBASE-12677:
Update replication docs to clarify terminology
- HBASE-13852:
Replace master-slave terminology in book, site, and javadoc with a more
modern vocabulary
- HBASE-24576:
Changing "whitelist" and "blacklist" in our docs and project

In response to this proposal, a member of the PMC asked if the term
'master' used by itself would be fine, because we only have use of 'slave'
in replication documentation and that is easily addressed. In response to
this question, others on the PMC suggested that even if only 'master' is
used, in this context it is still a problem.

For folks who are surprised or lacking context on the details of this
discussion, one PMC member offered a link to this draft RFC as background:
https://tools.ietf.org/id/draft-knodel-terminology-00.html

There was general support for removing the term "master" / "hmaster" from
our code base and using the terms "coordinator" or "leader" instead. In the
context of replication, "worker" makes less sense and perhaps "destination"
or "follower" would be more appropriate terms.

One PMC member's thoughts on language and non-native English speakers are
worth including in their entirety:

While words like blacklist/whitelist/slave clearly have those negative
references, word master might not have the same impact for non native
English speakers like myself where the literal translation to my mother
tongue does not have this same bad connotation. Replacing all references
for word *master* on our docs/codebase is a huge effort, I guess such a
decision would be more suitable for native English speakers folks, and
maybe we should consider the opinion of contributors from that ethnic
minority as well?

These are good questions for public discussion.

We have a consensus in the PMC, at this time, that is supportive of making
the above discussed terminology changes. However, we also have concerns
about what it would take to accomplish meaningful changes. Several on the
PMC offered support in the form of cycles to review pull requests and
patches, and two PMC members offered  personal bandwidth for creating and
releasing new code lines as needed to complete a deprecation cycle.

Unfortunately, the terms "master" and "hmaster" appear throughout our code
base in class names, user facing API subject to our project compatibility
guidelines, and configuration variable names, which are also implicated by
compatibility guidelines given the impact of changes to operators and
operations. The changes being discussed are not backwards compatible
changes and cannot be executed with swiftness while simultaneously
preserving compatibility. There must be a deprecation cycle. First, we must
tag all implicated public API and configuration variables as deprecated,
and release HBase 3 with these deprecations in place. Then, we must
undertake rename and removal as appropriate, and release the resul

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Nick Dimiduk
From my perspective, we gain nothing as a project or as a community by
willfully retaining use of language that is well understood to be
problematic or hurtful, even if that terminology has precedent in the
technology domain. On the contrary, we have much to gain by encouraging
contributions from as many people as possible. Regardless of one's personal
opinions on the context that encourages this change, I think it has
benefits for the project and benefits for the community. For me, I think
it's about time, well overdue.

Thanks,
Nick

On Mon, Jun 22, 2020 at 12:18 PM Andrew Purtell 
wrote:

> Thank you Mich.
>
> Hopefully it is clear that there is no community consensus yet, and all
> voices are welcome on the topic.
>
> > On Jun 22, 2020, at 12:15 PM, Mich Talebzadeh 
> wrote:
> >
> > Hi,
> >
> > Thank you for the proposals.
> >
> > I am afraid I have to agree to differ. The terms master and slave, commonly
> > used in big data tools (not confined to HBase), are BAU and historical
> > and bear no resemblance to anything recent.
> >
> > Additionally, both whitelist and blacklist simply refer to a proposal which
> > is accepted and a proposal which is struck out (black pencil line).
> >
> > So in a scientific context these are the terminologies used. Terminologies
> > become offensive if they are used "in the incorrect context". I don't think
> > anyone in the HBase or Spark community will have objections if these
> > terminologies are used as before. Spark used the master/slave terms in
> > Standalone mode, if I recall correctly.
> >
> > Changing something for the sake of "now being in the limelight" does not
> > make it right. So I beg to differ on this. Having said that, it is indeed a
> > sign of a civilised mind to entertain an idea without accepting it, so
> > whatever the community wishes.
> >
> > HTH
> >
> >
> >
> >
> > LinkedIn *
> > https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> > *
> >
> >
> >
> >
> >
> > *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> > loss, damage or destruction of data or any other property which may arise
> > from relying on this email's technical content is explicitly disclaimed.
> > The author will in no case be liable for any monetary damages arising
> from
> > such loss, damage or destruction.
> >
> >
> >
> >
> >> On Mon, 22 Jun 2020 at 19:39, Andrew Purtell wrote:
> >>
> >> In response to renewed attention at the Foundation toward addressing
> >> culturally problematic language and terms often used in technical
> >> documentation and discussion, several projects have begun discussions, or
> >> made proposals, or started work along these lines.
> >>
> >> The HBase PMC began its own discussion on private@ on June 9, 2020 with an
> >> observation of this activity and this suggestion:
> >>
> >> There is a renewed push back against classic technology industry terms that
> >> have negative modern connotations.
> >>
> >> In the case of HBase, the following substitutions might be proposed:
> >>
> >> - Coordinator instead of master
> >>
> >> - Worker instead of slave
> >>
> >> Recommendations for these additional substitutions also come up in this
> >> type of discussion:
> >>
> >> - Accept list instead of white list
> >>
> >> - Deny list instead of black list
> >>
> >> Unfortunately we have Master all over our code base, baked into various
> >> APIs and configuration variable names, so for us the necessary changes
> >> amount to a new major release and deprecation cycle. It could well be worth
> >> it in the long run. We exist only as long as we draw a willing and
> >> sufficient contributor community. It also wouldn’t be great to have an
> >> activist fork appear somewhere, even if unlikely to be successful.
> >>
> >> Relevant JIRAs are:
> >>
> >>   - HBASE-12677 :
> >>   Update replication docs to clarify terminology
> >>   - HBASE-13852 :
> >>   Replace master-slave terminology in book, site, and javadoc with a more
> >>   modern vocabulary
> >>   - HBASE-24576 :
> >>   Changing "whitelist" and "blacklist" in our docs and project
> >>
> >> In response to this proposal, a member of the PMC asked if the term
> >> 'master' used by itself would be fine, because we only have use of 'slave'
> >> in replication documentation and that is easily addressed. In response to
> >> this question, others on the PMC suggested that even if only 'master' is
> >> used, in this context it is still a problem.
> >>
> >> For folks who are surprised or lacking context on the details of this
> >> discussion, one PMC member offered a link to this draft RFC as background:
> >> https://tools.ietf.org/id/draft-knodel-terminology-00.html
> >>
> >> There was general support for 

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Sean Busbey
We should change our use of these terms. We can be equally or more clear in
what we are trying to convey where they are present.

That they have been used historically is only useful if the advantage we
gain from using them through that shared context outweighs the potential
friction they add. They make me personally less enthusiastic about
contributing. That's enough friction for me to advocate removing them.

AFAICT reworking our replication stuff in terms of "active" and "passive"
clusters did not result in a big spike of folks asking new questions about
where authority for state was.

On Mon, Jun 22, 2020, 13:39 Andrew Purtell  wrote:

> In response to renewed attention at the Foundation toward addressing
> culturally problematic language and terms often used in technical
> documentation and discussion, several projects have begun discussions, or
> made proposals, or started work along these lines.
>
> The HBase PMC began its own discussion on private@ on June 9, 2020 with an
> observation of this activity and this suggestion:
>
> There is a renewed push back against classic technology industry terms that
> have negative modern connotations.
>
> In the case of HBase, the following substitutions might be proposed:
>
> - Coordinator instead of master
>
> - Worker instead of slave
>
> Recommendations for these additional substitutions also come up in this
> type of discussion:
>
> - Accept list instead of white list
>
> - Deny list instead of black list
>
> Unfortunately we have Master all over our code base, baked into various
> APIs and configuration variable names, so for us the necessary changes
> amount to a new major release and deprecation cycle. It could well be worth
> it in the long run. We exist only as long as we draw a willing and
> sufficient contributor community. It also wouldn’t be great to have an
> activist fork appear somewhere, even if unlikely to be successful.
>
> Relevant JIRAs are:
>
>- HBASE-12677 :
>Update replication docs to clarify terminology
>- HBASE-13852 :
>Replace master-slave terminology in book, site, and javadoc with a more
>modern vocabulary
>- HBASE-24576 :
>Changing "whitelist" and "blacklist" in our docs and project
>
> In response to this proposal, a member of the PMC asked if the term
> 'master' used by itself would be fine, because we only have use of 'slave'
> in replication documentation and that is easily addressed. In response to
> this question, others on the PMC suggested that even if only 'master' is
> used, in this context it is still a problem.
>
> For folks who are surprised or lacking context on the details of this
> discussion, one PMC member offered a link to this draft RFC as background:
> https://tools.ietf.org/id/draft-knodel-terminology-00.html
>
> There was general support for removing the term "master" / "hmaster" from
> our code base and using the terms "coordinator" or "leader" instead. In the
> context of replication, "worker" makes less sense and perhaps "destination"
> or "follower" would be more appropriate terms.
>
> One PMC member's thoughts on language and non-native English speakers is
> worth including in its entirety:
>
> While words like blacklist/whitelist/slave clearly have those negative
> references, word master might not have the same impact for non native
> English speakers like myself where the literal translation to my mother
> tongue does not have this same bad connotation. Replacing all references
> for word *master *on our docs/codebase is a huge effort, I guess such a
> decision would be more suitable for native English speakers folks, and
> maybe we should consider the opinion of contributors from that ethnic
> minority as well?
>
> These are good questions for public discussion.
>
> We have a consensus in the PMC, at this time, that is supportive of making
> the above discussed terminology changes. However, we also have concerns
> about what it would take to accomplish meaningful changes. Several on the
> PMC offered support in the form of cycles to review pull requests and
> patches, and two PMC members offered personal bandwidth for creating and
> releasing new code lines as needed to complete a deprecation cycle.
>
> Unfortunately, the terms "master" and "hmaster" appear throughout our code
> base in class names, user facing API subject to our project compatibility
> guidelines, and configuration variable names, which are also implicated by
> compatibility guidelines given the impact of changes to operators and
> operations. The changes being discussed are not backwards compatible
> changes and cannot be executed with swiftness while simultaneously
> preserving compatibility. There must be a deprecation cycle. First, we must
> tag all implicated public API and configuration variables as deprecated,
> and release HBase 3 with these dep

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Mich Talebzadeh
Hi,

Thank you for the proposals.

I am afraid I have to agree to differ. The term master and slave (commonly
used in Big data tools (not confined to HBase only) is BAU and historical)
and bears no resemblance to anything recent.

Additionally, both whitelist and blacklist simply refer to a proposal which
is accepted and a proposal which is struck out (black pencil line).

So in scientific context these are terminologies used. Terminologies become
offensive if they are used "in the incorrect context". I don't think anyone
in HBase or Spark community will have objections if these terminologies are
used as before. Spark used the term in master/slave in Standalone mode if i
recall correctly.

Changing something for the sake of "now being in the limelight" does not
make it right. So I beg to differ on this. Having said that it is indeed a
sign of a civilised mind to entertain an idea without accepting it so
whatever the community wishes.

HTH




LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*





*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 22 Jun 2020 at 19:39, Andrew Purtell  wrote:

> In response to renewed attention at the Foundation toward addressing
> culturally problematic language and terms often used in technical
> documentation and discussion, several projects have begun discussions, or
> made proposals, or started work along these lines.
>
> The HBase PMC began its own discussion on private@ on June 9, 2020 with an
> observation of this activity and this suggestion:
>
> There is a renewed push back against classic technology industry terms that
> have negative modern connotations.
>
> In the case of HBase, the following substitutions might be proposed:
>
> - Coordinator instead of master
>
> - Worker instead of slave
>
> Recommendations for these additional substitutions also come up in this
> type of discussion:
>
> - Accept list instead of white list
>
> - Deny list instead of black list
>
> Unfortunately we have Master all over our code base, baked into various
> APIs and configuration variable names, so for us the necessary changes
> amount to a new major release and deprecation cycle. It could well be worth
> it in the long run. We exist only as long as we draw a willing and
> sufficient contributor community. It also wouldn’t be great to have an
> activist fork appear somewhere, even if unlikely to be successful.
>
> Relevant JIRAs are:
>
>- HBASE-12677 :
>Update replication docs to clarify terminology
>- HBASE-13852 :
>Replace master-slave terminology in book, site, and javadoc with a more
>modern vocabulary
>- HBASE-24576 :
>Changing "whitelist" and "blacklist" in our docs and project
>
> In response to this proposal, a member of the PMC asked if the term
> 'master' used by itself would be fine, because we only have use of 'slave'
> in replication documentation and that is easily addressed. In response to
> this question, others on the PMC suggested that even if only 'master' is
> used, in this context it is still a problem.
>
> For folks who are surprised or lacking context on the details of this
> discussion, one PMC member offered a link to this draft RFC as background:
> https://tools.ietf.org/id/draft-knodel-terminology-00.html
>
> There was general support for removing the term "master" / "hmaster" from
> our code base and using the terms "coordinator" or "leader" instead. In the
> context of replication, "worker" makes less sense and perhaps "destination"
> or "follower" would be more appropriate terms.
>
> One PMC member's thoughts on language and non-native English speakers is
> worth including in its entirety:
>
> While words like blacklist/whitelist/slave clearly have those negative
> references, word master might not have the same impact for non native
> English speakers like myself where the literal translation to my mother
> tongue does not have this same bad connotation. Replacing all references
> for word *master *on our docs/codebase is a huge effort, I guess such a
> decision would be more suitable for native English speakers folks, and
> maybe we should consider the opinion of contributors from that ethnic
> minority as well?
>
> These are good questions for public discussion.
>
> We have a consensus in the PMC, at this time, that is supportive of making
> the above discussed terminology changes. However, we also have concerns
> about what it would take to accomplish meaning

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Andrew Purtell
Thank you Mich. 

Hopefully it is clear that there is no community consensus yet, and all voices 
are welcome on the topic. 

> On Jun 22, 2020, at 12:15 PM, Mich Talebzadeh  
> wrote:
> 
> Hi,
> 
> Thank you for the proposals.
> 
> I am afraid I have to agree to differ. The term master and slave (commonly
> used in Big data tools (not confined to HBase only) is BAU and historical)
> and bears no resemblance to anything recent.
> 
> Additionally, both whitelist and blacklist simply refer to a proposal which
> is accepted and a proposal which is struck out (black pencil line).
> 
> So in scientific context these are terminologies used. Terminologies become
> offensive if they are used "in the incorrect context". I don't think anyone
> in HBase or Spark community will have objections if these terminologies are
> used as before. Spark used the term in master/slave in Standalone mode if i
> recall correctly.
> 
> Changing something for the sake of "now being in the limelight" does not
> make it right. So I beg to differ on this. Having said that it is indeed a
> sign of a civilised mind to entertain an idea without accepting it so
> whatever the community wishes.
> 
> HTH
> 
>> On Mon, 22 Jun 2020 at 19:39, Andrew Purtell  wrote:
>> 
>> In response to renewed attention at the Foundation toward addressing
>> culturally problematic language and terms often used in technical
>> documentation and discussion, several projects have begun discussions, or
>> made proposals, or started work along these lines.
>> 
>> The HBase PMC began its own discussion on private@ on June 9, 2020 with an
>> observation of this activity and this suggestion:
>> 
>> There is a renewed push back against classic technology industry terms that
>> have negative modern connotations.
>> 
>> In the case of HBase, the following substitutions might be proposed:
>> 
>> - Coordinator instead of master
>> 
>> - Worker instead of slave
>> 
>> Recommendations for these additional substitutions also come up in this
>> type of discussion:
>> 
>> - Accept list instead of white list
>> 
>> - Deny list instead of black list
>> 
>> Unfortunately we have Master all over our code base, baked into various
>> APIs and configuration variable names, so for us the necessary changes
>> amount to a new major release and deprecation cycle. It could well be worth
>> it in the long run. We exist only as long as we draw a willing and
>> sufficient contributor community. It also wouldn’t be great to have an
>> activist fork appear somewhere, even if unlikely to be successful.
>> 
>> Relevant JIRAs are:
>> 
>>   - HBASE-12677 :
>>   Update replication docs to clarify terminology
>>   - HBASE-13852 :
>>   Replace master-slave terminology in book, site, and javadoc with a more
>>   modern vocabulary
>>   - HBASE-24576 :
>>   Changing "whitelist" and "blacklist" in our docs and project
>> 
>> In response to this proposal, a member of the PMC asked if the term
>> 'master' used by itself would be fine, because we only have use of 'slave'
>> in replication documentation and that is easily addressed. In response to
>> this question, others on the PMC suggested that even if only 'master' is
>> used, in this context it is still a problem.
>> 
>> For folks who are surprised or lacking context on the details of this
>> discussion, one PMC member offered a link to this draft RFC as background:
>> https://tools.ietf.org/id/draft-knodel-terminology-00.html
>> 
>> There was general support for removing the term "master" / "hmaster" from
>> our code base and using the terms "coordinator" or "leader" instead. In the
>> context of replication, "worker" makes less sense and perhaps "destination"
>> or "follower" would be more appropriate terms.
>> 
>> One PMC member's thoughts on language and non-native English speakers is
>> worth including in its entirety:
>> 
>> While words like blacklist/whitelist/slave clearly have those negative
>> references, word master might not have the same impact for non native
>> English speakers like myself where the literal translation to my mother
>> tongue does not have this same bad connotation. Replacing all references
>> for word *master *on our docs/codebase is a huge effort, I guess such a
>> decision would be more suitable for native English speakers 

[DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Andrew Purtell
In response to renewed attention at the Foundation toward addressing
culturally problematic language and terms often used in technical
documentation and discussion, several projects have begun discussions, or
made proposals, or started work along these lines.

The HBase PMC began its own discussion on private@ on June 9, 2020 with an
observation of this activity and this suggestion:

There is a renewed push back against classic technology industry terms that
have negative modern connotations.

In the case of HBase, the following substitutions might be proposed:

- Coordinator instead of master

- Worker instead of slave

Recommendations for these additional substitutions also come up in this
type of discussion:

- Accept list instead of white list

- Deny list instead of black list

Unfortunately we have Master all over our code base, baked into various
APIs and configuration variable names, so for us the necessary changes
amount to a new major release and deprecation cycle. It could well be worth
it in the long run. We exist only as long as we draw a willing and
sufficient contributor community. It also wouldn’t be great to have an
activist fork appear somewhere, even if unlikely to be successful.

Relevant JIRAs are:

   - HBASE-12677 :
   Update replication docs to clarify terminology
   - HBASE-13852 :
   Replace master-slave terminology in book, site, and javadoc with a more
   modern vocabulary
   - HBASE-24576 :
   Changing "whitelist" and "blacklist" in our docs and project

In response to this proposal, a member of the PMC asked if the term
'master' used by itself would be fine, because we only have use of 'slave'
in replication documentation and that is easily addressed. In response to
this question, others on the PMC suggested that even if only 'master' is
used, in this context it is still a problem.

For folks who are surprised or lacking context on the details of this
discussion, one PMC member offered a link to this draft RFC as background:
https://tools.ietf.org/id/draft-knodel-terminology-00.html

There was general support for removing the term "master" / "hmaster" from
our code base and using the terms "coordinator" or "leader" instead. In the
context of replication, "worker" makes less sense and perhaps "destination"
or "follower" would be more appropriate terms.

One PMC member's thoughts on language and non-native English speakers are
worth including in its entirety:

While words like blacklist/whitelist/slave clearly have those negative
references, word master might not have the same impact for non native
English speakers like myself where the literal translation to my mother
tongue does not have this same bad connotation. Replacing all references
for word *master *on our docs/codebase is a huge effort, I guess such a
decision would be more suitable for native English speakers folks, and
maybe we should consider the opinion of contributors from that ethnic
minority as well?

These are good questions for public discussion.

We have a consensus in the PMC, at this time, that is supportive of making
the above discussed terminology changes. However, we also have concerns
about what it would take to accomplish meaningful changes. Several on the
PMC offered support in the form of cycles to review pull requests and
patches, and two PMC members offered personal bandwidth for creating and
releasing new code lines as needed to complete a deprecation cycle.

Unfortunately, the terms "master" and "hmaster" appear throughout our code
base in class names, user facing API subject to our project compatibility
guidelines, and configuration variable names, which are also implicated by
compatibility guidelines given the impact of changes to operators and
operations. The changes being discussed are not backwards compatible
changes and cannot be executed with swiftness while simultaneously
preserving compatibility. There must be a deprecation cycle. First, we must
tag all implicated public API and configuration variables as deprecated,
and release HBase 3 with these deprecations in place. Then, we must
undertake rename and removal as appropriate, and release the result as
HBase 4.
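
To make the deprecation step concrete, here is a minimal sketch of how a
renamed configuration key could keep honoring its old name for one release.
It is an illustration only, not existing HBase code: the class
RenamedKeyLookup and the new key name "hbase.coordinator.port" are
hypothetical, and a plain Map stands in for the real configuration object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper, not part of HBase: resolves a renamed configuration key. */
final class RenamedKeyLookup {
  // Maps each new key to the deprecated key it replaces.
  private final Map<String, String> newToOld = new HashMap<>();

  /** Registers that newKey replaces oldKey; oldKey stays readable while deprecated. */
  void rename(String oldKey, String newKey) {
    newToOld.put(newKey, oldKey);
  }

  /** Prefers the value under the new key and falls back to the deprecated key. */
  Optional<String> get(Map<String, String> conf, String newKey) {
    String value = conf.get(newKey);
    if (value != null) {
      return Optional.of(value);
    }
    String oldKey = newToOld.get(newKey);
    if (oldKey != null && conf.get(oldKey) != null) {
      // Warn operators so they can migrate before the old key is removed.
      System.err.println("WARN: " + oldKey + " is deprecated, use " + newKey);
      return Optional.of(conf.get(oldKey));
    }
    return Optional.empty();
  }
}
```

Under this scheme the release that carries the deprecations would ship both
names, and the following major release would drop the fallback branch.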

One PMC member raised a question in this context included here in entirety:

Are we willing to commit to rolling through the major versions at a pace
that's necessary to make this transition as swift as
reasonably possible?

This is a question for all of us. For the PMC, who would supervise the
effort, perhaps contribute to it, and certainly vote on the release
candidates. For contributors and potential contributors, who would provide
the necessary patches. For committers, who would be required to review and
commit the relevant changes.

Although there has been some initial discussion, there is no singular
proposal, or plan, or set of decisions made at this time. Wrestling with

Re: [VOTE] First release candidate for HBase 2.3.0 (RC0) is available

2020-06-22 Thread Wellington Chevreuil
I have submitted an addendum PR for HBASE-21773. For HBASE-24221, we may try
the same.
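
For readers following along, the kind of addendum discussed in this thread
(re-adding the original signature as deprecated and delegating to the new
one) might look roughly like the sketch below. SnapshotSpec is a hypothetical
stand-in, not the real SnapshotDescription, and its fields are assumptions
chosen for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a public API class; not the real SnapshotDescription. */
final class SnapshotSpec {
  private final String name;
  private final String table;
  private final Map<String, Object> properties;

  /** Newer constructor that added the properties argument. */
  SnapshotSpec(String name, String table, Map<String, Object> properties) {
    this.name = name;
    this.table = table;
    this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
  }

  /**
   * Original signature, restored so that existing callers still compile.
   * @deprecated use the three-argument constructor instead.
   */
  @Deprecated
  SnapshotSpec(String name, String table) {
    // Delegate to the new constructor with an empty property map.
    this(name, table, Collections.<String, Object>emptyMap());
  }

  String getName() { return name; }
  String getTable() { return table; }
  Map<String, Object> getProperties() { return properties; }
}
```

Callers compiled against the old two-argument form keep working and see a
deprecation warning, while new code uses the three-argument form.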

On Mon, Jun 22, 2020 at 17:45, Nick Dimiduk wrote:

> I have reopened HBASE-22504, HBASE-21773, HBASE-23055, and HBASE-24102 for
> addendums based on this thread. I also started a [DISCUSS] thread re:
> VisibleForTesting annotation.
>
> Thanks,
> Nick
>
> On Mon, Jun 22, 2020 at 9:22 AM Nick Dimiduk  wrote:
>
> > > On NettyRpcClientConfigHelper I think it is fine. It is designed to be
> > an 'util' class, so in HBASE-23956 we made it final and added a private
> > constructor. It only has static methods and is not expected to be extended
> > or instantiated by end users.
> >
> > Very well, let's keep this change.
> >
> > > On ByteBufferUtils, it is IA.Private on master branch?
> >
> > It is IA.Public on 2.2.0, the point of reference.
> >
> > > On the replication related classes, all the constructors are marked as
> > IA.Private, so I think they are all fine. Anyway, we should have a better
> > design, maybe something like the ClusterMetrics, where we introduce an
> > interface get the metrics.
> >
> > Ah indeed, the constructors are marked IA.Private. That's not very kind to
> > our users, but I guess it works.
> >
> > > For the
> > org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.tryAtomicRegionLoad
> > change added by HBASE-24221, the method is marked with @VisibleForTesting
> > and javadoc says "Protected for testing", so maybe it's fine?
> >
> > To the best of my knowledge, we do not discuss the @VisibleForTesting
> > annotation in our compatibility guidelines, thus I think it's a violation.
> >
> > > For "org.apache.hadoop.hbase.mapreduce.RowCounter.createSubmittableJob
> > -- Method parameter removed via HBASE-21773" and
> > "org.apache.hadoop.hbase.client.SnapshotDescription -- Additional
> > constructor arguments added in HBASE-22648" I guess we could push amending
> > PRs re-adding the methods with original signature as deprecated?
> >
> > Yes, sounds good for both of them.
> >
> > > adding property map argument in SnapshotDescription was my doing, let me
> > open up Jira to bring back original signature as deprecated (since I am
> > familiar with it). I can also help look into other changes if required.
> >
> > Thank you!
> >
> > > Moreover, reg test failure for ReplicationStatusSink, opened Jira
> > HBASE-24594 to have it separate cluster pair setup and not share resources
> > with TestReplicationStatus. This is committed, I will keep an eye on flaky
> > and nightly builds.
> >
> > And thank you again :)
> >
> > On Sun, Jun 21, 2020 at 10:52 AM Viraj Jasani wrote:
> >
> >> Nick,
> >>
> >> I just went through above methods and I agree with Duo and Wellington reg
> >> @IA.Private, @VisibleForTesting methods and also the fact that we should
> >> add original signature for IA.Public methods making them deprecated and
> >> internally using new methods. e.g adding property map argument in
> >> SnapshotDescription was my doing, let me open up Jira to bring back
> >> original signature as deprecated (since I am familiar with it). I can also
> >> help look into other changes if required.
> >>
> >> Moreover, reg test failure for ReplicationStatusSink, opened Jira
> >> HBASE-24594 to have it separate cluster pair setup and not share resources
> >> with TestReplicationStatus. This is committed, I will keep an eye on flaky
> >> and nightly builds. I wish I had taken junit xml output when it failed,
> >> apologies. However, I am glad that this is not reported flaky as such and
> >> with separate resource allocation, this should go even smoother.
> >>
> >> Overall, I am hopeful that we should be able to take care of all relevant
> >> source incompatibilities sooner.
> >>
> >>
> >> On 2020/06/19 09:51:43, Wellington Chevreuil <
> >> wellington.chevre...@gmail.com> wrote:
> >> > I agree with Duo regarding the methods that were already marked as
> >> > IA.private.
> >> >
> >> > For the
> >> > org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.tryAtomicRegionLoad
> >> > change added by HBASE-24221, the method is marked with @VisibleForTesting
> >> > and javadoc says "Protected for testing", so maybe it's fine?
> >> >
> >> > For "org.apache.hadoop.hbase.mapreduce.RowCounter.createSubmittableJob
> >> > --Method parameter removed via HBASE-21773" and
> >> > "org.apache.hadoop.hbase.client.SnapshotDescription -- Additional
> >> > constructor arguments added in HBASE-22648" I guess we could
> >> > push amending PRs re-adding the methods with original signature as
> >> > deprecated?
> >> >
> >> >
> >> >
> >> > On Fri, Jun 19, 2020 at 03:01, 张铎(Duo Zhang) <palomino...@gmail.com>
> >> > wrote:
> >> >
> >> > > On NettyRpcClientConfigHelper I think it is fine. It is designed to be an
> >> > > 'util' class, so in HBASE-23956 we made it final and added a private
> >> > > constructor. It only has static methods and is not expected to be extended
> >> > > or instantiated by

[jira] [Resolved] (HBASE-24102) RegionMover should exclude draining/decommissioning nodes from target RSs

2020-06-22 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-24102.
--
Resolution: Fixed

Addendum is committed to master, branch-2, 2.3, 2.2, 2.1.

> RegionMover should exclude draining/decommissioning nodes from target RSs
> -
>
> Key: HBASE-24102
> URL: https://issues.apache.org/jira/browse/HBASE-24102
> Project: HBase
>  Issue Type: Improvement
>Reporter: Anoop Sam John
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0, 2.4.0, 2.1.10, 2.2.5
>
>
> When using RegionMover tool to unload the regions from a given RS, it decides 
> the list of destination RSs by 
> {code}
> List<ServerName> regionServers = new ArrayList<>();
> regionServers.addAll(admin.getRegionServers());
> // Remove the host Region server from target Region Servers list
> ServerName server = stripServer(regionServers, hostname, port);
> .
> // Remove RS present in the exclude file
> stripExcludes(regionServers);
> stripMaster(regionServers);
> {code}
> Yes, it removes the RSs listed in the exclude file.
> Even when the RegionMover user does not specify an exclude list, it would be
> better to also exclude the draining/decommissioning RSs, which are available via
> Admin#listDecommissionedRegionServers()
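
The filtering proposed in the description could be sketched as follows. This
is a simplified illustration and not the committed patch: TargetServerFilter
and selectTargets are hypothetical names, and plain strings stand in for
ServerName objects. In the real tool the decommissioned list would presumably
come from Admin#listDecommissionedRegionServers(), as the issue suggests.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; plain strings stand in for HBase ServerName objects. */
final class TargetServerFilter {
  /**
   * Returns the servers that may receive regions: every known server except
   * the one being unloaded, those in the exclude file, and those that are
   * draining or decommissioned.
   */
  static List<String> selectTargets(Collection<String> allServers, String sourceServer,
      Collection<String> excluded, Collection<String> decommissioned) {
    // LinkedHashSet keeps the original ordering while removing entries.
    Set<String> targets = new LinkedHashSet<>(allServers);
    targets.remove(sourceServer);
    targets.removeAll(excluded);
    // Applied even when the exclude list is empty.
    targets.removeAll(decommissioned);
    return new ArrayList<>(targets);
  }
}
```

The decommissioned servers are removed unconditionally, so the user does not
need to supply an exclude file to avoid them.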



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] VisibleForTesting annotation as it pertains to our API compatibility guidelines

2020-06-22 Thread Sean Busbey
Yeah I would say no as well. We should make clear in our dev guide that such
members also need an InterfaceAudience annotation if you don't intend them to
have the downstream API visibility of the parent class.

(IIRC we also use VisibleForTesting in IA.Private classes to proactively
explain why some internal looking member is at a wider Java access scope.)
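
As a rough illustration of the position above (it is an opinion voiced in
this thread, not a settled project rule), the sketch below treats only an
audience marker as relevant to the API surface and ignores the
test-visibility marker. All names here are hypothetical: AudiencePublic,
AudiencePrivate and TestVisible are local stand-ins for the InterfaceAudience
markers and Guava's VisibleForTesting, declared inline to keep the example
self-contained.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical local stand-ins for the audience and test-visibility markers.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface AudiencePublic {}

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface AudiencePrivate {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface TestVisible {}

/** Hypothetical public-audience class with members widened for tests. */
@AudiencePublic
class BulkLoadTool {
  public int load(String path) {
    return tryLoad(path, retryLimit());
  }

  // The audience marker, not TestVisible, removes this from the downstream API.
  @TestVisible
  @AudiencePrivate
  protected int tryLoad(String path, int attempts) {
    return path.isEmpty() ? 0 : attempts;
  }

  // Only TestVisible and no audience marker: still counted as downstream API.
  @TestVisible
  protected int retryLimit() {
    return 3;
  }
}

final class ApiSurface {
  /** True if the named method is reachable by subclasses or callers and not marked private-audience. */
  static boolean isDownstreamApi(Class<?> owner, String name, Class<?>... params) {
    Method m;
    try {
      m = owner.getDeclaredMethod(name, params);
    } catch (NoSuchMethodException e) {
      throw new IllegalArgumentException(e);
    }
    int mod = m.getModifiers();
    boolean reachable = Modifier.isPublic(mod) || Modifier.isProtected(mod);
    return reachable
        && owner.isAnnotationPresent(AudiencePublic.class)
        && !m.isAnnotationPresent(AudiencePrivate.class);
  }
}
```

Under this reading, a signature change to retryLimit would count as an API
change while a change to tryLoad would not.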

On Mon, Jun 22, 2020, 11:39 Nick Dimiduk  wrote:

> Hello,
>
> This came up over on the 2.3.0RC0 thread, so let's open it for proper
> discussion. In that context, we observe method signature changes to a
> method marked with the Guava VisibleForTesting annotation. The method is a
> protected method on a IA.Public class. There is no method-level IA
> annotation.
>
> Do we consider the VisibleForTesting annotation as a specifier for our
> compatibility guidelines?
>
> I am of the opinion that no, it is not an InterfaceAudience annotation, and
> so it is not applicable for defining our public API.
>
> What do you think?
>
> Thanks,
> Nick
>


Re: [VOTE] First release candidate for HBase 2.3.0 (RC0) is available

2020-06-22 Thread Nick Dimiduk
I have reopened HBASE-22504, HBASE-21773, HBASE-23055, and HBASE-24102 for
addendums based on this thread. I also started a [DISCUSS] thread re:
VisibleForTesting annotation.

Thanks,
Nick

On Mon, Jun 22, 2020 at 9:22 AM Nick Dimiduk  wrote:

> > On NettyRpcClientConfigHelper I think it is fine. It is designed to be
> an 'util' class, so in HBASE-23956 we made it final and added a private
> constructor. It only has static methods and is not expected to be extended
> or instantiated by end users.
>
> Very well, let's keep this change.
>
> > On ByteBufferUtils, it is IA.Private on master branch?
>
> It is IA.Public on 2.2.0, the point of reference.
>
> > On the replication related classes, all the constructors are marked as
> IA.Private, so I think they are all fine. Anyway, we should have a better
> design, maybe something like the ClusterMetrics, where we introduce an
> interface get the metrics.
>
> Ah indeed, the constructors are marked IA.Private. That's not very kind to
> our users, but I guess it works.
>
> > For the
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.tryAtomicRegionLoad
> change added by HBASE-24221, the method is marked with @VisibleForTesting
> and javadoc says "Protected for testing", so maybe it's fine?
>
> To the best of my knowledge, we do not discuss the @VisibleForTesting
> annotation in our compatibility guidelines, thus I think it's a violation.
>
> > For "org.apache.hadoop.hbase.mapreduce.RowCounter.createSubmittableJob
> -- Method parameter removed via HBASE-21773" and
> "org.apache.hadoop.hbase.client.SnapshotDescription -- Additional
> constructor arguments added in HBASE-22648" I guess we could push amending
> PRs re-adding the methods with original signature as deprecated?
>
> Yes, sounds good for both of them.
>
> > adding property map argument in SnapshotDescription was my doing, let me
> open up Jira to bring back original signature as deprecated (since I am
> familiar with it). I can also help look into other changes if required.
>
> Thank you!
>
> > Moreover, reg test failure for ReplicationStatusSink, opened Jira
> HBASE-24594 to have it separate cluster pair setup and not share resources
> with TestReplicationStatus. This is committed, I will keep an eye on flaky
> and nightly builds.
>
> And thank you again :)
>
> On Sun, Jun 21, 2020 at 10:52 AM Viraj Jasani  wrote:
>
>> Nick,
>>
>> I just went through above methods and I agree with Duo and Wellington reg
>> @IA.Private, @VisibleForTesting methods and also the fact that we should
>> add original signature for IA.Public methods making them deprecated and
>> internally using new methods. e.g adding property map argument in
>> SnapshotDescription was my doing, let me open up Jira to bring back
>> original signature as deprecated (since I am familiar with it). I can also
>> help look into other changes if required.
>>
>> Moreover, reg test failure for ReplicationStatusSink, opened Jira
>> HBASE-24594 to have it separate cluster pair setup and not share resources
>> with TestReplicationStatus. This is committed, I will keep an eye on flaky
>> and nightly builds. I wish I had taken junit xml output when it failed,
>> apologies. However, I am glad that this is not reported flaky as such and
>> with separate resource allocation, this should go even smoother.
>>
>> Overall, I am hopeful that we should be able to take care of all relevant
>> source incompatibilities sooner.
>>
>>
>> On 2020/06/19 09:51:43, Wellington Chevreuil <
>> wellington.chevre...@gmail.com> wrote:
>> > I agree with Duo regarding the methods that were already marked as
>> > IA.private.
>> >
>> > For the
>> > org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.tryAtomicRegionLoad
>> > change added by HBASE-24221, the method is marked with @VisibleForTesting
>> > and javadoc says "Protected for testing", so maybe it's fine?
>> >
>> > For "org.apache.hadoop.hbase.mapreduce.RowCounter.createSubmittableJob
>> > --Method parameter removed via HBASE-21773" and
>> > "org.apache.hadoop.hbase.client.SnapshotDescription -- Additional
>> > constructor arguments added in HBASE-22648" I guess we could
>> > push amending PRs re-adding the methods with original signature as
>> > deprecated?
>> >
>> >
>> >
>> > On Fri, Jun 19, 2020 at 03:01, 张铎(Duo Zhang) <palomino...@gmail.com>
>> > wrote:
>> >
>> > > On NettyRpcClientConfigHelper I think it is fine. It is designed to be an
>> > > 'util' class, so in HBASE-23956 we made it final and added a private
>> > > constructor. It only has static methods and is not expected to be extended
>> > > or instantiated by end users.
>> > >
>> > > On ByteBufferUtils, it is IA.Private on master branch?
>> > >
>> > > On the replication related classes, all the constructors are marked as
>> > > IA.Private, so I think they are all fine. Anyway, we should have a better
>> > > design, maybe something like the ClusterMetrics, where we introduce an
>> > > interface get the metrics.
>> > >
>> > >
>> > > Nick Dimid

[jira] [Reopened] (HBASE-24102) RegionMover should exclude draining/decommissioning nodes from target RSs

2020-06-22 Thread Nick Dimiduk (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk reopened HBASE-24102:
--

This issue removed visibility of 6 fields on {{o.a.h.h.util.RegionMover}} 
without a deprecation cycle. Reopening.

> RegionMover should exclude draining/decommissioning nodes from target RSs
> -
>
> Key: HBASE-24102
> URL: https://issues.apache.org/jira/browse/HBASE-24102
> Project: HBase
>  Issue Type: Improvement
>Reporter: Anoop Sam John
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0, 2.4.0, 2.1.10, 2.2.5
>
>
> When using RegionMover tool to unload the regions from a given RS, it decides 
> the list of destination RSs by 
> {code}
> List<ServerName> regionServers = new ArrayList<>();
> regionServers.addAll(admin.getRegionServers());
> // Remove the host Region server from target Region Servers list
> ServerName server = stripServer(regionServers, hostname, port);
> .
> // Remove RS present in the exclude file
> stripExcludes(regionServers);
> stripMaster(regionServers);
> {code}
> Yes, it removes the RSs listed in the exclude file.
> Even when the RegionMover user does NOT supply an exclude list, it would be 
> better to also exclude the draining/decommissioning RSs returned by
> Admin#listDecommissionedRegionServers()
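
The filtering proposed in the description can be sketched roughly as follows. This is only an illustration, not the RegionMover code: plain strings stand in for ServerName, and the class and method names (TargetServerFilter, selectTargets) are hypothetical. In HBase itself the decommissioned list would presumably come from the Admin#listDecommissionedRegionServers() call mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; server names are plain strings for illustration only.
final class TargetServerFilter {

  /**
   * Returns the servers that may receive regions: every known server except
   * the source server and any server reported as draining/decommissioned.
   * The input lists are not modified.
   */
  static List<String> selectTargets(List<String> allServers, String sourceServer,
      List<String> decommissioned) {
    Set<String> excluded = new HashSet<>(decommissioned);
    excluded.add(sourceServer);
    List<String> targets = new ArrayList<>();
    for (String server : allServers) {
      if (!excluded.contains(server)) {
        targets.add(server);
      }
    }
    return targets;
  }

  public static void main(String[] args) {
    List<String> all = Arrays.asList("rs1", "rs2", "rs3");
    // rs1 is being unloaded, rs3 is decommissioned, so only rs2 remains.
    System.out.println(selectTargets(all, "rs1", Arrays.asList("rs3"))); // prints [rs2]
  }
}
```

The same exclusion set could also absorb the entries from an exclude file, so both sources of excluded servers would be applied in one pass.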





[jira] [Reopened] (HBASE-23055) Alter hbase:meta

2020-06-22 Thread Nick Dimiduk (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk reopened HBASE-23055:
--

This issue drops {{HConstants#HBASE_NON_USER_TABLE_DIRS}} without a deprecation 
cycle. Reopening.

> Alter hbase:meta
> 
>
> Key: HBASE-23055
> URL: https://issues.apache.org/jira/browse/HBASE-23055
> Project: HBase
>  Issue Type: Task
>  Components: meta
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> hbase:meta is currently hardcoded. Its schema cannot be changed.
> This issue is about allowing edits to the hbase:meta schema. That would let us 
> set encodings such as block-with-indexes, which will help 
> quell CPU usage on the host carrying hbase:meta. A dynamic hbase:meta is the first 
> step on the road to being able to split meta.





[jira] [Reopened] (HBASE-21773) rowcounter utility should respond to pleas for help

2020-06-22 Thread Nick Dimiduk (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-21773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk reopened HBASE-21773:
--

This issue changes the method signature of 
o.a.h.h.mapreduce.RowCounter#createSubmittableJob in a way that violates our 
compatibility guidelines.

> rowcounter utility should respond to pleas for help
> ---
>
> Key: HBASE-21773
> URL: https://issues.apache.org/jira/browse/HBASE-21773
> Project: HBase
>  Issue Type: Bug
>  Components: tooling
>Affects Versions: 2.1.0
>Reporter: Sean Busbey
>Assignee: Wellington Chevreuil
>Priority: Critical
> Fix For: 3.0.0-alpha-1, 2.3.0
>
> Attachments: HBASE-21773.master.001.patch, 
> HBASE-21773.master.002.patch, HBASE-21773.master.003.patch, 
> HBASE-21773.master.004.patch
>
>
> {{hbase rowcounter}} does not respond to reasonable requests for help, i.e. 
> {{--help}}, {{-h}}, or {{-?}}
> {code}
> [systest@busbey-training-1 root]$ hbase rowcounter -?
> OpenJDK 64-Bit Server VM warning: Using incremental CMS is deprecated and 
> will likely be removed in a future release
> 19/01/24 12:30:00 INFO client.RMProxy: Connecting to ResourceManager at 
> busbey-training-1.gce.cloudera.com/172.31.116.31:8032
> 19/01/24 12:30:01 INFO hdfs.DFSClient: Created token for systest: 
> HDFS_DELEGATION_TOKEN owner=syst...@gce.cloudera.com, renewer=yarn, 
> realUser=, issueDate=1548361801519, maxDate=1548966601519, sequenceNumber=3, 
> masterKeyId=8 on 172.31.116.31:8020
> 19/01/24 12:30:01 INFO kms.KMSClientProvider: Getting new token from 
> https://busbey-training-3.gce.cloudera.com:16000/kms/v1/, 
> renewer:yarn/busbey-training-1.gce.cloudera@gce.cloudera.com
> 19/01/24 12:30:02 INFO kms.KMSClientProvider: New token received: (Kind: 
> kms-dt, Service: 172.31.116.52:16000, Ident: (kms-dt owner=systest, 
> renewer=yarn, realUser=, issueDate=1548361801965, maxDate=1548966601965, 
> sequenceNumber=5, masterKeyId=17))
> 19/01/24 12:30:02 INFO security.TokenCache: Got dt for 
> hdfs://busbey-training-1.gce.cloudera.com:8020; Kind: HDFS_DELEGATION_TOKEN, 
> Service: 172.31.116.31:8020, Ident: (token for systest: HDFS_DELEGATION_TOKEN 
> owner=syst...@gce.cloudera.com, renewer=yarn, realUser=, 
> issueDate=1548361801519, maxDate=1548966601519, sequenceNumber=3, 
> masterKeyId=8)
> 19/01/24 12:30:02 INFO security.TokenCache: Got dt for 
> hdfs://busbey-training-1.gce.cloudera.com:8020; Kind: kms-dt, Service: 
> 172.31.116.52:16000, Ident: (kms-dt owner=systest, renewer=yarn, realUser=, 
> issueDate=1548361801965, maxDate=1548966601965, sequenceNumber=5, 
> masterKeyId=17)
> 19/01/24 12:30:02 INFO kms.KMSClientProvider: Getting new token from 
> https://busbey-training-4.gce.cloudera.com:16000/kms/v1/, 
> renewer:yarn/busbey-training-1.gce.cloudera@gce.cloudera.com
> 19/01/24 12:30:02 INFO kms.KMSClientProvider: New token received: (Kind: 
> kms-dt, Service: 172.31.116.50:16000, Ident: (kms-dt owner=systest, 
> renewer=yarn, realUser=, issueDate=1548361802363, maxDate=1548966602363, 
> sequenceNumber=6, masterKeyId=18))
> 19/01/24 12:30:02 INFO security.TokenCache: Got dt for 
> hdfs://busbey-training-1.gce.cloudera.com:8020; Kind: kms-dt, Service: 
> 172.31.116.50:16000, Ident: (kms-dt owner=systest, renewer=yarn, realUser=, 
> issueDate=1548361802363, maxDate=1548966602363, sequenceNumber=6, 
> masterKeyId=18)
> 19/01/24 12:30:02 INFO mapreduce.JobResourceUploader: Disabling Erasure 
> Coding for path: /user/systest/.staging/job_1548349234632_0003
> 19/01/24 12:30:03 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
> /user/systest/.staging/job_1548349234632_0003
> Exception in thread "main" java.lang.IllegalArgumentException: Illegal first 
> character <45> at 0. User-space table qualifiers can only start with 
> 'alphanumeric characters' from any language: -?
> at 
> org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:193)
> at 
> org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:156)
> at org.apache.hadoop.hbase.TableName.<init>(TableName.java:346)
> at 
> org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:382)
> at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:469)
> at 
> org.apache.hadoop.hbase.mapreduce.TableInputFormat.initialize(TableInputFormat.java:198)
> at 
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:243)
> at 
> org.apache.hadoop.hbase.mapreduce.TableInputFormat.getSplits(TableInputFormat.java:254)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
> at 
> org.apache.ha

[DISCUSS] VisibleForTesting annotation as it pertains to our API compatibility guidelines

2020-06-22 Thread Nick Dimiduk
Hello,

This came up over on the 2.3.0RC0 thread, so let's open it for proper
discussion. In that context, we observe method signature changes to a
method marked with the Guava VisibleForTesting annotation. The method is a
protected method on an IA.Public class. There is no method-level IA
annotation.

Do we consider the VisibleForTesting annotation as a specifier for our
compatibility guidelines?

I am of the opinion that no, it is not an InterfaceAudience annotation, and
so it is not applicable for defining our public API.

What do you think?

Thanks,
Nick


[jira] [Reopened] (HBASE-22504) Optimize the MultiByteBuff#get(ByteBuffer, offset, len)

2020-06-22 Thread Nick Dimiduk (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-22504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk reopened HBASE-22504:
--

From the 2.3.0RC0, it looks like this change introduced a breach of our 
compatibility guidelines. The method {{findCommonPrefix}} needs to be restored 
(and optionally marked as deprecated).

> Optimize the MultiByteBuff#get(ByteBuffer, offset, len)
> ---
>
> Key: HBASE-22504
> URL: https://issues.apache.org/jira/browse/HBASE-22504
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
> Attachments: HBASE-22504.HBASE-21879.v01.patch
>
>
> In HBASE-22483, we saw that the BucketCacheWriter thread was quite busy 
> [^BucketCacheWriter-is-busy.png]; the flame graph also indicated that 
> ByteBufferArray#internalTransfer cost ~6% CPU (see 
> [async-prof-pid-25042-cpu-1.svg|https://issues.apache.org/jira/secure/attachment/12970294/async-prof-pid-25042-cpu-1.svg]).
> Because we used hbase.ipc.server.allocator.buffer.size=64KB, each 
> HFileBlock is backed by a MultiByteBuff: one 64KB offheap ByteBuffer 
> and one small heap ByteBuffer.
> The path now depends on MultiByteBuff#get(ByteBuffer, offset, len): 
> {code:java}
> RAMQueueEntry#writeToCache
> |--> ByteBufferIOEngine#write
> |--> ByteBufferArray#internalTransfer
> |--> ByteBufferArray$WRITER
> |--> MultiByteBuff#get(ByteBuffer, offset, len)
> {code}
> The MultiByteBuff#get implementation is currently simple and crude; we can 
> optimize it:
> {code:java}
>   @Override
>   public void get(ByteBuffer out, int sourceOffset, int length) {
>     checkRefCount();
>     // Not used from real read path actually. So not going with
>     // optimization
>     for (int i = 0; i < length; ++i) {
>       out.put(this.get(sourceOffset + i));
>     }
>   }
> {code}
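
One way the byte-by-byte loop above could be replaced is with a bulk copy per underlying buffer. The following is a minimal sketch under stated assumptions, not the patch attached to this issue: MultiBufferCopy and its static get method are hypothetical, the segments are passed in explicitly rather than read from MultiByteBuff's internal state, and each segment is treated as holding data in the range [0, limit).

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical stand-in for the multi-buffer read path; not HBase code.
final class MultiBufferCopy {

  /**
   * Copies length bytes, starting at logical offset sourceOffset across the
   * concatenated segments, into out. Each segment contributes at most one
   * bulk put; the segments' own positions and limits are left untouched.
   */
  static void get(ByteBuffer[] segments, ByteBuffer out, int sourceOffset, int length) {
    if (sourceOffset < 0 || length < 0) {
      throw new IllegalArgumentException("offset and length must be non-negative");
    }
    long total = 0;
    for (ByteBuffer segment : segments) {
      total += segment.limit();
    }
    if ((long) sourceOffset + length > total) {
      throw new IndexOutOfBoundsException("requested range exceeds buffer contents");
    }
    int offset = sourceOffset;
    int remaining = length;
    for (ByteBuffer segment : segments) {
      if (remaining == 0) {
        break;
      }
      int size = segment.limit();
      if (offset >= size) {
        offset -= size; // requested range starts in a later segment
        continue;
      }
      int toCopy = Math.min(remaining, size - offset);
      ByteBuffer view = segment.duplicate(); // independent position/limit
      view.limit(offset + toCopy);
      view.position(offset);
      out.put(view); // bulk transfer of toCopy bytes
      remaining -= toCopy;
      offset = 0; // later segments are read from their start
    }
  }

  public static void main(String[] args) {
    ByteBuffer a = ByteBuffer.wrap(new byte[] {0, 1, 2, 3});
    ByteBuffer b = ByteBuffer.wrap(new byte[] {4, 5, 6, 7});
    ByteBuffer out = ByteBuffer.allocate(4);
    get(new ByteBuffer[] {a, b}, out, 2, 4);
    System.out.println(Arrays.toString(out.array())); // prints [2, 3, 4, 5]
  }
}
```

With one 64KB offheap buffer plus one small heap buffer per block, as described above, this would mean at most two bulk puts per block instead of one put per byte.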
>  





Re: [VOTE] First release candidate for HBase 2.3.0 (RC0) is available

2020-06-22 Thread Nick Dimiduk
> On NettyRpcClientConfigHelper I think it is fine. It is designed to be an
> 'util' class, so in HBASE-23956 we made it final and added a private
> constructor. It only has static methods and is not expected to be extended
> or instantiated by end users.

Very well, let's keep this change.

> On ByteBufferUtils, it is IA.Private on master branch?

It is IA.Public on 2.2.0, the point of reference.

> On the replication related classes, all the constructors are marked as
> IA.Private, so I think they are all fine. Anyway, we should have a better
> design, maybe something like the ClusterMetrics, where we introduce an
> interface get the metrics.

Ah indeed, the constructors are marked IA.Private. That's not very kind to
our users, but I guess it works.

> For the
> org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.tryAtomicRegionLoad
> change added by HBASE-24221, the method is marked with @VisibleForTesting
> and javadoc says "Protected for testing", so maybe it's fine?

To the best of my knowledge, we do not discuss the @VisibleForTesting
annotation in our compatibility guidelines, thus I think it's a violation.

> For "org.apache.hadoop.hbase.mapreduce.RowCounter.createSubmittableJob --
> Method parameter removed via HBASE-21773" and
> "org.apache.hadoop.hbase.client.SnapshotDescription -- Additional
> constructor arguments added in HBASE-22648" I guess we could push amending
> PRs re-adding the methods with original signature as deprecated?

Yes, sounds good for both of them.

> adding property map argument in SnapshotDescription was my doing, let me
> open up Jira to bring back original signature as deprecated (since I am
> familiar with it). I can also help look into other changes if required.

Thank you!

> Moreover, reg test failure for ReplicationStatusSink, opened Jira
> HBASE-24594 to have it separate cluster pair setup and not share resources
> with TestReplicationStatus. This is committed, I will keep an eye on flaky
> and nightly builds.

And thank you again :)

On Sun, Jun 21, 2020 at 10:52 AM Viraj Jasani  wrote:

> Nick,
>
> I just went through above methods and I agree with Duo and Wellington reg
> @IA.Private, @VisibleForTesting methods and also the fact that we should
> add original signature for IA.Public methods making them deprecated and
> internally using new methods. e.g adding property map argument in
> SnapshotDescription was my doing, let me open up Jira to bring back
> original signature as deprecated (since I am familiar with it). I can also
> help look into other changes if required.
>
> Moreover, reg test failure for ReplicationStatusSink, opened Jira
> HBASE-24594 to have it separate cluster pair setup and not share resources
> with TestReplicationStatus. This is committed, I will keep an eye on flaky
> and nightly builds. I wish I had taken junit xml output when it failed,
> apologies. However, I am glad that this is not reported flaky as such and
> with separate resource allocation, this should go even smoother.
>
> Overall, I am hopeful that we should be able to take care of all relevant
> source incompatibilities sooner.
>
>
> On 2020/06/19 09:51:43, Wellington Chevreuil <
> wellington.chevre...@gmail.com> wrote:
> > I agree with Duo regarding the methods that were already marked as
> > IA.private.
> >
> > For the
> > org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.tryAtomicRegionLoad
> > change added by HBASE-24221, the method is marked with @VisibleForTesting
> > and javadoc says "Protected for testing", so maybe it's fine?
> >
> > For "org.apache.hadoop.hbase.mapreduce.RowCounter.createSubmittableJob
> > --Method parameter removed via HBASE-21773" and
> > "org.apache.hadoop.hbase.client.SnapshotDescription -- Additional
> > constructor arguments added in HBASE-22648" I guess we could
> > push amending PRs re-adding the methods with original signature as
> > deprecated?
> >
> >
> >
> > On Fri, Jun 19, 2020 at 03:01, 张铎(Duo Zhang) <palomino...@gmail.com>
> > wrote:
> >
> > > On NettyRpcClientConfigHelper I think it is fine. It is designed to be an
> > > 'util' class, so in HBASE-23956 we made it final and added a private
> > > constructor. It only has static methods and is not expected to be extended
> > > or instantiated by end users.
> > >
> > > On ByteBufferUtils, it is IA.Private on master branch?
> > >
> > > On the replication related classes, all the constructors are marked as
> > > IA.Private, so I think they are all fine. Anyway, we should have a better
> > > design, maybe something like the ClusterMetrics, where we introduce an
> > > interface get the metrics.
> > >
> > >
> > > On Fri, Jun 19, 2020 at 6:26 AM, Nick Dimiduk wrote:
> > >
> > > > I've done some accounting of the source-incompatible changes. I'm not
> > > > listing every item here, only the ones that I think might raise eyebrows or
> > > > warrant further discussion. Here are my findings.
> > > >
> > > > I think these problems sink the RC. I plan to reopen the various tickets
> > > > here and start a discussion with

[jira] [Created] (HBASE-24614) o.a.h.h.tool.LoadIncrementalHFiles deprecation comment does not span an entire release

2020-06-22 Thread Nick Dimiduk (Jira)
Nick Dimiduk created HBASE-24614:


 Summary: o.a.h.h.tool.LoadIncrementalHFiles deprecation comment 
does not span an entire release
 Key: HBASE-24614
 URL: https://issues.apache.org/jira/browse/HBASE-24614
 Project: HBase
  Issue Type: Task
  Components: community
Affects Versions: 2.2.0, 3.0.0-alpha-1, 2.3.0
Reporter: Nick Dimiduk


While evaluating interface compatibilities for 2.3.0RC0, I noticed the 
deprecation comment for {{o.a.h.h.tool.LoadIncrementalHFiles}} says

{noformat}
 * @deprecated since 2.2.0, will be removed in 3.0.0. Use {@link BulkLoadHFiles} instead.
{noformat}

This is contrary to an explicit example in our 
[book|https://hbase.apache.org/book.html#hbase.versioning], which states:

{quote}
An API needs to be deprecated for a whole major version before we will 
change/remove it.

An example: An API was deprecated in 2.0.1 and will be marked for deletion in 
4.0.0. On the other hand, an API deprecated in 2.0.0 can be removed in 3.0.0.
{quote}

There may be other comments like this that need to be addressed. We should grep 
the codebase and make appropriate adjustments.
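
The rule quoted from the book can be turned into a small check. The sketch below encodes one reading of that rule (an API deprecated exactly at X.0.0 may be removed in (X+1).0.0, anything deprecated later in the X line waits until (X+2).0.0). The class and method names are hypothetical, and qualifiers such as "-alpha-1" are not handled.

```java
// Hypothetical helper encoding one reading of the versioning rule quoted above.
final class DeprecationCycle {

  /**
   * Given the version an API was deprecated in ("major.minor.patch"), returns
   * the earliest major release in which it may be removed so that the API has
   * been deprecated for a whole major version.
   */
  static String earliestRemoval(String deprecatedSince) {
    String[] parts = deprecatedSince.split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("expected major.minor.patch: " + deprecatedSince);
    }
    int major = Integer.parseInt(parts[0]);
    int minor = Integer.parseInt(parts[1]);
    int patch = Integer.parseInt(parts[2]);
    // Only a deprecation in the very first release of a major line covers
    // that whole major version.
    boolean startOfMajorLine = minor == 0 && patch == 0;
    int removalMajor = startOfMajorLine ? major + 1 : major + 2;
    return removalMajor + ".0.0";
  }

  public static void main(String[] args) {
    // The comment under discussion says "since 2.2.0, will be removed in 3.0.0".
    System.out.println(earliestRemoval("2.2.0")); // prints 4.0.0
  }
}
```

Under this reading, a "deprecated since 2.2.0" comment should name 4.0.0 rather than 3.0.0, which is the discrepancy this issue reports.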





[jira] [Created] (HBASE-24613) Doesn't work SplitPolicy attributs for table

2020-06-22 Thread Aleksandr (Jira)
Aleksandr created HBASE-24613:
-

 Summary: Doesn't work SplitPolicy attributs for table
 Key: HBASE-24613
 URL: https://issues.apache.org/jira/browse/HBASE-24613
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 1.4.8
Reporter: Aleksandr
 Attachments: image-2020-06-22-18-04-27-155.png

SplitPolicy settings at the table level do not work. Regions are split at a 
very small size.

Table describe: 

 
{code:java}
test_split, {TABLE_ATTRIBUTES => {MAX_FILESIZE => '2147483648', METADATA => 
{'SPLIT_POLICY' => 
'org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy'}} COLUMN 
FAMILIES DESCRIPTION {NAME => 'data', BLOOMFILTER => 'ROW', VERSIONS => '1', 
IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 
'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', 
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} {NAME => 
'number', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', 
KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', 
COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => 
'65536', REPLICATION_SCOPE => '0'} {NAME => 'uuid', BLOOMFILTER => 'ROW', 
VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', 
DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', 
MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', 
REPLICATION_SCOPE => '0'}{code}
 

Global settings that affect split:
 * Maximum Region File Size - 10GB
 * Memstore Flush Size - 128MB
 * hbase.regionserver.region.split.policy - 
org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy 

As a result, regions are split at 400-600MB, but we expect a region size of 
2GB. 

!image-2020-06-22-18-04-27-155.png!

Why does this happen? Is this a bug?

 





[jira] [Created] (HBASE-24612) Consider allowing a separate EventLoopGroup for accepting new connections.

2020-06-22 Thread Mark Robert Miller (Jira)
Mark Robert Miller created HBASE-24612:
--

 Summary: Consider allowing a separate EventLoopGroup for accepting 
new connections.
 Key: HBASE-24612
 URL: https://issues.apache.org/jira/browse/HBASE-24612
 Project: HBase
  Issue Type: Improvement
Reporter: Mark Robert Miller


Netty applications often set a separate thread pool for accepting connections 
rather than sharing a single pool for accepting new connections and the work 
those connections do.

It would be interesting to allow configuring a separation of pools to allow 
users to experiment with a pool dedicated to accepting new connections.





[jira] [Resolved] (HBASE-24604) Remove the stable-1 notice on our download page

2020-06-22 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-24604.
--
Fix Version/s: 3.0.0-alpha-1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Merged PR to master branch.

> Remove the stable-1 notice on our download page
> ---
>
> Key: HBASE-24604
> URL: https://issues.apache.org/jira/browse/HBASE-24604
> Project: HBase
>  Issue Type: Task
>Reporter: Duo Zhang
>Assignee: niuyulin
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> We have already removed it from our dist release directory.





[jira] [Resolved] (HBASE-24611) Bring back old constructor of SnapshotDescription

2020-06-22 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-24611.
--
Resolution: Fixed

Applied addendum to branch-2 and branch-2.3.

> Bring back old constructor of SnapshotDescription
> -
>
> Key: HBASE-24611
> URL: https://issues.apache.org/jira/browse/HBASE-24611
> Project: HBase
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> As part of HBASE-22648 (Snapshot TTL), one of the SnapshotDescription constructors 
> was modified with an additional argument, which raises source 
> compatibility concerns for minor releases. We need to bring back the old 
> constructor, mark it deprecated, and have it delegate to the new constructor with 
> null/empty snapshotProps.
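
The pattern described above (restore the old signature, deprecate it, delegate to the new one) might look like the following. This is a simplified, hypothetical stand-in, not the real SnapshotDescription: the class name SnapshotInfo, its fields, and the constructor parameters are assumptions chosen for illustration; only the snapshotProps argument name comes from the description.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a class whose constructor gained an argument.
final class SnapshotInfo {
  private final String name;
  private final String table;
  private final Map<String, Object> snapshotProps;

  /** New signature: carries the additional property map. */
  SnapshotInfo(String name, String table, Map<String, Object> snapshotProps) {
    this.name = name;
    this.table = table;
    // A null map is normalized to an empty one so callers never see null.
    this.snapshotProps = snapshotProps == null
        ? Collections.<String, Object>emptyMap()
        : Collections.unmodifiableMap(new HashMap<>(snapshotProps));
  }

  /**
   * Original signature, restored so existing callers keep compiling.
   *
   * @deprecated use the constructor that accepts snapshotProps.
   */
  @Deprecated
  SnapshotInfo(String name, String table) {
    this(name, table, null); // delegate with null/empty properties
  }

  String getName() {
    return name;
  }

  String getTable() {
    return table;
  }

  Map<String, Object> getSnapshotProps() {
    return snapshotProps;
  }

  public static void main(String[] args) {
    SnapshotInfo legacy = new SnapshotInfo("snap1", "t1");
    System.out.println(legacy.getSnapshotProps().isEmpty()); // prints true
  }
}
```

Keeping the old constructor as a thin delegate means code compiled against the earlier minor release continues to build, while new callers are steered to the extended signature by the deprecation warning.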





[jira] [Reopened] (HBASE-24611) Bring back old constructor of SnapshotDescription

2020-06-22 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reopened HBASE-24611:
--

> Bring back old constructor of SnapshotDescription
> -
>
> Key: HBASE-24611
> URL: https://issues.apache.org/jira/browse/HBASE-24611
> Project: HBase
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
>
> As part of HBASE-22648 (Snapshot TTL), one of the SnapshotDescription constructors 
> was modified with an additional argument, which raises source 
> compatibility concerns for minor releases. We need to bring back the old 
> constructor, mark it deprecated, and have it delegate to the new constructor with 
> null/empty snapshotProps.


