Re: Considering deprecation and removal of XZ compression (hbase-compression-xz)

2024-04-09 Thread Andrew Purtell
Let's remove in 2.6.0. I will submit a PR.

On Tue, Apr 2, 2024 at 7:50 PM 张铎(Duo Zhang)  wrote:

> For me I've never seen people actually use the xz compression.
>
> For size, usually people will choose gzip, and for speed, in the past
> people chose lzo and now they choose snappy or zstd.
>
> So I would prefer that we deprecate the xz compression immediately
> and remove it in 2.6.0.
>
> Thanks.
>
> On Tue, Apr 2, 2024 at 08:02, Andrew Purtell wrote:
> >
> > Red Hat filed CVE-2024-3094 late last week on 2024-03-29. This implicates
> > recent releases of the native liblzma library as a vector for malicious
> > code.
> >
> > This is not the pure Java version that we depend upon for HBase's support
> > for the LZMA algorithm (
> > https://github.com/apache/hbase/tree/master/hbase-compression/hbase-compression-xz).
> > We depend on version 1.9 of xz-java, which was published in 2021, well
> > before maintenance changes in the project and the involvement of a person
> > who is now believed to be a malicious actor. Projects like HBase that
> > depend on xz-java have no reason to be concerned about the issues affecting
> > the native xz library.
> >
> > How the backdoor was introduced calls into question the trustworthiness and
> > viability of the XZ project. GitHub has disabled all repositories related
> > to XZ and liblzma, even xz-java. The webpage for XZ and xz-java is down.
> > The open source software community is responding vigorously. CVE-2024-3094
> > has a CVSS score of 10, the highest possible score. Your security team may
> > become interested in HBase because of hbase-compression-xz's dependency on
> > xz-java. It is likely any discovered dependency on any LZMA implementation
> > will at least raise questions.
> >
> > For now xz-java remains available in Maven central. (See
> > https://central.sonatype.com/artifact/org.tukaani/xz/versions) We may have
> > no choice but to immediately remove hbase-compression-xz if Maven blocks or
> > drops xz-java too, because that will break our builds.
> >
> > There is no immediate cause for concern. Still, we believe XZ compression
> > provides little to no value over more modern alternatives, like ZStandard,
> > that can also achieve similar compression ratios. XZ, and alternatives like
> > ZStandard with the compression level set to a high value, are also suitable
> > only for archival use cases and unsuitable for compression of flush files
> > or for use in minor compactions. Given how niche any use of XZ compression
> > could be, we are wondering if there are actually any users of it.
> >
> > If we have no users of hbase-compression-xz, then it provides little to no
> > value and continued maintenance of hbase-compression-xz given the issues
> > with its dependency does not make sense.
> >
> > Do you use XZ compression, or are you planning to?
> >
> > If we deprecate XZ compression immediately and then remove it in 2.6, would
> > this present a problem? In a private discussion we reached consensus on
> > this approach, but, of course, that is not yet a plan, and something that
> > could easily change based on feedback.
> >
> > From https://nvd.nist.gov/vuln/detail/CVE-2024-3094:
> > "Malicious code was discovered in the upstream tarballs of xz, starting
> > with version 5.6.0. Through a series of complex obfuscations, the liblzma
> > build process extracts a prebuilt object file from a disguised test file
> > existing in the source code, which is then used to modify specific
> > functions in the liblzma code. This results in a modified liblzma library
> > that can be used by any software linked against this library, intercepting
> > and modifying the data interaction with this library."
> >
> > --
> > Best regards,
> > Andrew
>


-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse


Re: [ANNOUNCE] New HBase committer Istvan Toth

2024-04-03 Thread Andrew Purtell
Congratulations and welcome, Istvan!


On Tue, Apr 2, 2024 at 4:23 AM Duo Zhang  wrote:

> On behalf of the Apache HBase PMC, I am pleased to announce that
> Istvan Toth(stoty)
> has accepted the PMC's invitation to become a committer on the
> project. We appreciate all
> of Istvan Toth's generous contributions thus far and look forward to
> his continued involvement.
>
> Congratulations and welcome, Istvan Toth!
>
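> On behalf of the Apache HBase PMC, I am very pleased to announce that
> Istvan Toth has accepted our invitation to become a committer on the
> Apache HBase project. We thank Istvan Toth for his contributions to the
> HBase project so far and look forward to him taking on more responsibility
> in the future.
>
> Welcome, Istvan Toth!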
>


-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse


Considering deprecation and removal of XZ compression (hbase-compression-xz)

2024-04-01 Thread Andrew Purtell
Red Hat filed CVE-2024-3094 late last week on 2024-03-29. This implicates
recent releases of the native liblzma library as a vector for malicious
code.

This is not the pure Java version that we depend upon for HBase's support
for the LZMA algorithm (
https://github.com/apache/hbase/tree/master/hbase-compression/hbase-compression-xz).
We depend on version 1.9 of xz-java, which was published in 2021, well
before maintenance changes in the project and the involvement of a person
who is now believed to be a malicious actor. Projects like HBase that
depend on xz-java have no reason to be concerned about the issues affecting
the native xz library.

How the backdoor was introduced calls into question the trustworthiness and
viability of the XZ project. GitHub has disabled all repositories related
to XZ and liblzma, even xz-java. The webpage for XZ and xz-java is down.
The open source software community is responding vigorously. CVE-2024-3094
has a CVSS score of 10, the highest possible score. Your security team may
become interested in HBase because of hbase-compression-xz's dependency on
xz-java. It is likely any discovered dependency on any LZMA implementation
will at least raise questions.

For now xz-java remains available in Maven central. (See
https://central.sonatype.com/artifact/org.tukaani/xz/versions) We may have
no choice but to immediately remove hbase-compression-xz if Maven blocks or
drops xz-java too, because that will break our builds.

There is no immediate cause for concern. Still, we believe XZ compression
provides little to no value over more modern alternatives, like ZStandard,
that can also achieve similar compression ratios. XZ, and alternatives like
ZStandard with the compression level set to a high value, are also suitable
only for archival use cases and unsuitable for compression of flush files
or for use in minor compactions. Given how niche any use of XZ compression
could be, we are wondering if there are actually any users of it.

If we have no users of hbase-compression-xz, then it provides little to no
value and continued maintenance of hbase-compression-xz given the issues
with its dependency does not make sense.

Do you use XZ compression, or are you planning to?

If we deprecate XZ compression immediately and then remove it in 2.6, would
this present a problem? In a private discussion we reached consensus on
this approach, but, of course, that is not yet a plan, and something that
could easily change based on feedback.

From https://nvd.nist.gov/vuln/detail/CVE-2024-3094:
"Malicious code was discovered in the upstream tarballs of xz, starting
with version 5.6.0. Through a series of complex obfuscations, the liblzma
build process extracts a prebuilt object file from a disguised test file
existing in the source code, which is then used to modify specific
functions in the liblzma code. This results in a modified liblzma library
that can be used by any software linked against this library, intercepting
and modifying the data interaction with this library."

--
Best regards,
Andrew


[ANNOUNCE] Apache HBase 2.5.8 is now available for download

2024-03-13 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.5.8.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.5.8 is the latest patch release in the HBase 2.5.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/2.5.8-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[ANNOUNCE] Apache HBase 2.5.7 is now available for download

2024-01-03 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.5.7.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.5.7 is the latest patch release in the HBase 2.5.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/2.5.7-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[ANNOUNCE] Apache HBase 2.5.6 is now available for download

2023-10-23 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.5.6.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.5.6 is the latest patch release in the HBase 2.5.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/2.5.6-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[ANNOUNCE] Apache HBase 2.5.5 is now available for download

2023-06-13 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.5.5.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.5.5 is the latest patch release in the HBase 2.5.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/2.5.5-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


Re: [ANNOUNCE] New HBase committer Nihal Jain

2023-05-03 Thread Andrew Purtell
Congratulations and welcome, Nihal!

On Wed, May 3, 2023 at 5:12 AM Nick Dimiduk  wrote:

> Hello!
>
> On behalf of the Apache HBase PMC, I am pleased to announce that Nihal Jain
> has accepted the PMC's invitation to become a committer on the project. We
> appreciate all of Nihal's generous contributions thus far and look forward
> to his continued involvement.
>
> Congratulations and welcome, Nihal Jain!
>
> Thanks,
> Nick
>


-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse


[ANNOUNCE] Apache HBase 2.5.4 is now available for download

2023-04-14 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.5.4.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.5.4 is the latest patch release in the HBase 2.5.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/2.5.4-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


Re: [DISCUSS] How to deal with the disabling of public sign ups for jira.a.o(enable github issues?)

2023-03-06 Thread Andrew Purtell
Approved the PR.
I see the self service portal is already mailing us about requests. I
approved one a few minutes ago.

On Mon, Mar 6, 2023 at 2:02 AM 张铎(Duo Zhang)  wrote:

> https://selfserve.apache.org/jira-account.html
>
> The INFRA team has delivered a self-service tool for requesting a jira
> account, so contributors can now request a jira account on their own.
>
> I filed HBASE-27689 for updating the README.md to mention this change and
> the PR is ready
>
> https://github.com/apache/hbase/pull/5088
>
> PTAL.
>
> Thanks.
>
> On Wed, Dec 7, 2022 at 12:35, Guanghao Zhang wrote:
>
> > Did other projects have the same solution for this, sync github issues to
> > jira issues? Github issues will be useful to get more feedback.
> >
> > On Tue, Dec 6, 2022 at 00:13, 张铎(Duo Zhang) wrote:
> >
> > > The PR for HBASE-27513 is available
> > >
> > > https://github.com/apache/hbase/pull/4913
> > >
> > > Let's at least tell our users to send email to private@hbase for
> > > acquiring a jira account.
> > >
> > > Thanks.
> > >
> > > On Fri, Dec 2, 2022 at 12:46, 张铎(Duo Zhang) wrote:
> > > >
> > > > Currently all the comments on github PRs are sent to issues@hbase,
> > > > like this one
> > > >
> > > > https://lists.apache.org/thread/jbfm269b4m24xl2r82l8b0t3pmqr44hr
> > > >
> > > > But I think this can only be used as an archive, to make sure that all
> > > > discussions are recorded on asf infrastructure.
> > > >
> > > > For github issues, I'm afraid we can only do the same thing. As the
> > > > format of github comment is different, it will be hard to read if we
> > > > just sync the message to jira...
> > > >
> > > > Thanks.
> > > >
> > > > On Thu, Dec 1, 2022 at 21:30, Bryan Beaudreault wrote:
> > > > >
> > > > > Should we have them sent to private@? Just thinking in terms of reducing
> > > > > spam to users who put their email and full name on a public list.
> > > > >
> > > > > One thought I had about bug tracking is whether we could use some sort of
> > > > > github -> jira sync. I've seen them used before, where it automatically
> > > > > syncs issues and comments between the two systems. It's definitely not
> > > > > ideal, but maybe an option? I'm guessing it would require INFRA help.
> > > > >
> > > > > On Thu, Dec 1, 2022 at 5:47 AM 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
> > > > >
> > > > > > I've filed HBASE-27513 for changing the readme on github.
> > > > > >
> > > > > > At least let's reuse the existing mailing list for acquiring jira account.
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > On Tue, Nov 29, 2022 at 22:34, 张铎(Duo Zhang) wrote:
> > > > > >
> > > > > > >
> > > > > > > Bump and also send this to user@hbase.
> > > > > > >
> > > > > > > We need to find a way to deal with the current situation where
> > > > > > > contributors can not create a Jira account on their own...
> > > > > > >
> > > > > > > At least, we need to change the readme on github page, web site and
> > > > > > > also the ref guide to tell users how to acquire a jira account...
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > > On Sun, Nov 27, 2022 at 22:06, 张铎(Duo Zhang) wrote:
> > > > > > > >
> > > > > > > > For me, I think most developers already have a github account, so
> > > > > > > > enabling it could help us get more feedback. For lots of younger
> > > > > > > > Chinese developers, they rarely use email in their daily life...
> > > > > > > > No doubt later we need to modify our readme on github. If we just let
> > > > > > > > users go to github issues on the readme, they will soon open an issue
> > > > > > > > there. But if we ask users to first send an email to a mailing list,
> > > > > > > > for acquiring a jira account, and then wait for a PMC member to submit
> > > > > > > > the request, and receive the email response, set up their account, and
> > > > > > > > then they can finally open an issue on jira. I'm afraid lots of users
> > > > > > > > will just give up, it is not very friendly...
> > > > > > > >
> > > > > > > > And I do not mean separate issue systems for users and devs. Users can
> > > > > > > > still open jira issues or ask in the mailing list if they want, github
> > > > > > > > issues is just another channel. If a user asks something in the
> > > > > > > > mailing list and we think it is a bug, we will ask the user to file an
> > > > > > > > issue or we will file an issue for it. It is just the same with github
> > > > > > > > issues.
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > On Thu, Nov 24, 2022 at 15:44, Nick Dimiduk wrote:
> > > > > > > > >
> > > > > > > > > This new situation around JIRA seems very similar to the existing situation
> > > > > > > > > around Slack. A new community member currently must acquire a Slack invite
> > > > > > > > > somehow, usually by emailing one of the lists. Mailing lists themselves
> > > > > > > > > involve a 

Re: [ANNOUNCE] Please welcome Tak Lon (Stephen) Wu to the HBase PMC

2023-01-30 Thread Andrew Purtell
Congratulations and welcome, Stephen!

On Sun, Jan 29, 2023 at 6:50 PM Duo Zhang  wrote:

> On behalf of the Apache HBase PMC I am pleased to announce that
> Tak Lon (Stephen) Wu has accepted our invitation to become a PMC member
> on the Apache HBase project. We appreciate Tak Lon (Stephen) Wu stepping
> up to take more responsibility in the HBase project.
>
> Please join me in welcoming Tak Lon (Stephen) Wu to the HBase PMC!
>
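> On behalf of the Apache HBase PMC, I am very pleased to announce that
> Tak Lon (Stephen) Wu has accepted our invitation to become a PMC member
> of the Apache HBase project. We thank Tak Lon (Stephen) Wu for being
> willing to take on greater responsibility in the HBase project.
>
> Welcome, Tak Lon (Stephen) Wu!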
>


-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse


Re: [DISCUSS] Allow namespace admins to clone snapshots created by them

2022-12-31 Thread Andrew Purtell
+1 

If this is needed soon in a release we could start on 2.6.0?

(How is TLS RPC coming along? - that would be the big ticket item.)

> On Dec 23, 2022, at 7:06 AM, 张铎  wrote:
> 
> This is a behavior change: it allows non-admin users to clone snapshots.
> 
> For me I do not think we should include changes like this in a patch
> release, unless it is considered as a critical bug which must be
> fixed.
> 
> Thanks.
> 
> On Wed, Nov 30, 2022 at 00:06, Szabolcs Bukros wrote:
>> 
>> This should not break any existing use case so I see no reason to not add
>> this to branch-2.5 and branch-2.4.
>> 
>>> On Thu, Nov 24, 2022 at 3:03 AM 张铎(Duo Zhang) wrote:
>>> 
>>> I'm OK with this change.
>>> 
>>> But maybe we still need to determine which branches we can apply this
>>> change to? Is it OK to include this change for branch-2.5 and
>>> branch-2.4?
>>> 
>>> On Tue, Nov 22, 2022 at 06:31, Tak Lon (Stephen) Wu wrote:
>>>> 
>>>> FYI the PR is https://github.com/apache/hbase/pull/4885 and
>>>> https://issues.apache.org/jira/browse/HBASE-27493.
>>>> 
>>>> the proposal seems to be, should we allow cloning snapshot to any
>>>> namespace if they're not the global admin.
>>>> 
>>>> logically, it should be fine because they're the admin for the
>>>> namespace, and should be able to do whatever within that namespace.
>>>> 
>>>> Thanks,
>>>> Stephen
>>>> 
>>>> 
>>>> On Mon, Nov 21, 2022 at 11:38 AM Szabolcs Bukros wrote:
>>>>> 
>>>>> Hi Everyone,
>>>>> 
>>>>> Creating a snapshot requires table admin permissions. But cloning it
>>>>> requires global admin permissions unless the user owns the snapshot and
>>>>> wants to recreate the original table the snapshot was based on using the
>>>>> same table name. This puts unnecessary load on the few users having global
>>>>> admin permissions on the cluster. I would like to relax this rule a bit and
>>>>> allow the owner of the snapshot to clone it into any namespace where they
>>>>> have admin permissions regardless of the table name used.
>>>>> 
>>>>> Please let me know what you think about this proposal. And if you find it
>>>>> acceptable which branch do you think this could land on.
>>>>> 
>>>>> Thanks,
>>>>> Szabolcs Bukros


Re: [ANNOUNCE] New HBase committer Rushabh Shah

2022-12-15 Thread Andrew Purtell
Congratulations and welcome, Rushabh!

On Wed, Dec 14, 2022 at 10:57 PM 张铎(Duo Zhang) wrote:

> On behalf of the Apache HBase PMC, I am pleased to announce that
> Rushabh Shah(shahrs87)
> has accepted the PMC's invitation to become a committer on the
> project. We appreciate all
> of Rushabh's generous contributions thus far and look forward to his
> continued involvement.
>
> Congratulations and welcome, Rushabh Shah!
>
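> On behalf of the Apache HBase PMC, I am very pleased to announce that
> Rushabh Shah has accepted our invitation to become a committer on the
> Apache HBase project. We thank Rushabh Shah for his contributions to the
> HBase project so far and look forward to him taking on more responsibility
> in the future.
>
> Welcome, Rushabh Shah!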
>


-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse


Re: [ANNOUNCE] New HBase Committer Liangjun He

2022-12-05 Thread Andrew Purtell
Congratulations, and welcome!

On Sat, Dec 3, 2022 at 5:51 AM Yu Li  wrote:

> Hi All,
>
> On behalf of the Apache HBase PMC, I am pleased to announce that Liangjun
> He (heliangjun) has accepted the PMC's invitation to become a committer on
> the project. We appreciate all of Liangjun's generous contributions thus
> far and look forward to his continued involvement.
>
> Congratulations and welcome, Liangjun!
>
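> On behalf of the Apache HBase PMC, I am very pleased to announce that
> Liangjun He (何良均) has accepted our invitation to become a committer on
> the Apache HBase project. We thank Liangjun He for his contributions to the
> HBase project so far and look forward to him taking on more responsibility
> in the future.
>
> Welcome, Liangjun!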
>
> Best Regards,
> Yu
>


-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse


[ANNOUNCE] Apache HBase 2.5.1 is now available for download

2022-10-28 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.5.1.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.5.1 is the latest patch release in the HBase 2.5.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/hbase-2.5.1-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[ANNOUNCE] Apache HBase 2.4.15 is now available for download

2022-10-28 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.15.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.15 is the latest patch release in the HBase 2.4.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/hbase-2.4.15-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


Re: [ANNOUNCE] Changes to Jira Account Creation (issues.a.o/jira)

2022-10-22 Thread Andrew Purtell
The Yetus tools invoked by create-release use JIRA as the source of truth for
changelog and release notes, and there is no alternative that I am aware of,
although perhaps there is one. So, we can look at that: Can Yetus, or
create-release, be modified to generate changelogs and release notes from
GitHub issues instead? If so, we could abandon Apache JIRA and avoid this new
burden. That would make referring to anything historical on current
issues inconvenient, so it is not ideal, in my opinion, but it would return us
to a low touch process.

Let me observe that once a contributor is onboarded to JIRA by one of us using 
this “self service” tool, how that user interacts with the project is the same 
as now. The one change here is the GitHub user cannot sign up for the JIRA 
account on their own. It might not be too bad. Perhaps there is a way to
pre-fill our GitHub issues and PRs with template text that requests the user’s JIRA
ID. We would document this new JIRA signup process in the template so users can 
proactively determine if they require onboarding to JIRA or not. Then one of us 
(PMC) has to do it, one time for each unique contributor. This is burdensome. 
It would delay our response that first time from the perspective of the new 
contributor while the JIRA onboarding is pending. But if it’s the new policy it 
is not avoidable. 

Like today, if the GitHub user does not want to onboard to JIRA we ignore the 
issue or PR. That doesn’t represent a change of policy. 

I have a feeling there will be changes to this announced policy. Only those 
projects where commercial entities have their employees on the PMC getting paid 
to do user onboarding administrivia will not see some sort of impact. The long 
tail of everyone else will see declines that will be reflected in statistics 
that Whimsy generates for the Board reports.


> On Oct 21, 2022, at 7:58 PM, 张铎  wrote:
> 
> Because of spam users, the infra team plans to shut down self
> registration of Jira accounts and suggests that ASF projects make use of
> GitHub Issues for tracking customer-facing questions/bugs.
> 
> What should we do?
> 
> -- Forwarded message -
> From: fluxo
> Date: Sat, Oct 22, 2022 at 09:02
> Subject: [ANNOUNCE] Changes to Jira Account Creation (issues.a.o/jira)
> To: 
> 
> 
> Hello PMC members,
> 
> As I'm sure most of you are aware, the spam issues on Jira are getting
> worse. We are seeing spam user creation of over 10,000 accounts per
> year, and receive many requests per month from project members for
> help addressing spam complaints. Infra is taking steps to disable
> public Jira signups.
> 
> Infra has developed a self-service tool by which folks on a PMC can
> request a Jira account for non-ASF contributors:
> 
> 
> https://selfserve.apache.org/
> 
> 
> Click "Create a Jira user account" to go to:
> 
> 
> https://selfserve.apache.org/jira-acct.html
> 
> 
> You need to enter a username for the new Jira account. We will reject
> the request if there is an existing account with that username. If
> this person may ultimately become a committer, Infra recommends that
> they choose a username that they can also use for their LDAP username.
> 
> Next, the tool asks you to enter their Display Name. This is the
> "public name" which will appear on all their Jira posts and comments.
> 
> Last, the tool asks you to enter the user's email address. We expect
> the PMC to exercise due diligence in making sure the contributor's
> email works. If it does not, they will not get the password reset
> mail.
> 
> 
> Infra knows this process change places an increasing burden on PMC
> members for managing contributors, and makes it harder for people to
> contribute bug reports. We suggest projects consider using GitHub
> Issues for customer-facing questions/bug reports/etc., while
> maintaining development issues on Jira. You can enable GitHub Issues
> for your repository via
> https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Repositoryfeatures
> 
> 
> Infra has targeted 6 November for the date we switch off public
> signups for issues.apache.org/jira . Please let us know if this will
> place any significant burden on your teams. We are following an
> aggressive timeline because of the serious impact spam users have on
> the safety and stability of our infrastructure.
> 
> As always, if you have any questions or comments about this, please let us 
> know!
> 
> -Chris (fluxo)
> 
> --
> @fluxo
> Chris Lambertus
> ASF Infrastructure


Re: HBase 2.4.x + Spark 3.3

2022-10-19 Thread Andrew Purtell
No, that is insufficient. HBase must be recompiled against Hadoop 3 first

cd /path/to/hbase
mvn clean install assembly:single -DskipTests -Dhadoop.profile=3.0 \
  -Dhadoop-three.version=XXX

Then once the results are in your local maven cache or nexus instance, you
can compile Spark as indicated.
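
A minimal sketch of that ordering (HBase first, then the connector), written as
small shell helpers that only print the Maven commands. The function names are
made up for illustration; the flags are the ones quoted in this thread, and
every version value is a placeholder supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers print the Maven commands instead of running them.
# Function names are hypothetical; version values are placeholders from the caller.

# Step 1: rebuild HBase against Hadoop 3 (run from the HBase source checkout).
hbase_hadoop3_build_cmd() {
  local hadoop_version="$1"
  printf 'mvn clean install assembly:single -DskipTests -Dhadoop.profile=3.0 -Dhadoop-three.version=%s\n' \
    "$hadoop_version"
}

# Step 2: build the Spark connector against the HBase artifacts installed by
# step 1 (run from the connector source checkout).
spark_connector_build_cmd() {
  local spark="$1" scala="$2" scala_binary="$3" hadoop="$4" hbase="$5"
  printf 'mvn -Dspark.version=%s -Dscala.version=%s -Dhadoop-three.version=%s -Dscala.binary.version=%s -Dhbase.version=%s clean package\n' \
    "$spark" "$scala" "$hadoop" "$scala_binary" "$hbase"
}
```

Calling hbase_hadoop3_build_cmd XXX prints the first command with XXX
substituted; whatever spark_connector_build_cmd prints should only be run after
that first build has finished.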


On Tue, Oct 18, 2022 at 11:39 PM Lars Francke wrote:

> Hi Andrew,
>
> thanks for the reply.
> I should have been more specific: We only tried to compile the "client"
> part that's used in Spark itself and we used the proper versions
>
> mvn -Dspark.version=XXX -Dscala.version=XXX -Dhadoop-three.version=XXX
> -Dscala.binary.version=XXX -Dhbase.version=XXX clean package
>
> I assume that should pull in the correct dependencies but I have to admit
> that I didn't check, took it straight from the readme.
> We wanted to try the server bit for the RegionServers afterwards but didn't
> even get to it yet.
>
> We have this on our radar though and might try to work through those issues
> at some point.
> If we get started on that I'll ping the list.
>
> Cheers,
> Lars
>
> On Wed, Oct 19, 2022 at 1:41 AM Andrew Purtell wrote:
>
> > Out of the box use is going to be problematic without recompiling HBase
> > for Hadoop 3. Spark 3.3 ships with Hadoop 3.3.2. Apache HBase 2.4.x (and
> > all 2.x) releases are compiled against Hadoop 2. Link errors
> > (ClassNotFound, NoClassDef, etc) I think are to be expected because the
> > class hierarchies of various Hadoop things have been incompatibly changed
> > in 3.x releases relative to 2.x. This is not unreasonable. Semantic
> > versioning suggests breaking changes can be expected in a major version
> > increment.
> >
> > Users probably need to do a holistic (or hermetic, if you prefer) build
> > of their bill of materials before testing it or certainly before shipping
> > it. Build your HBase for the version of Hadoop you are actually shipping
> > it with, as opposed to whatever the upstream project picks as a default
> > build target. They are called "convenience binaries" by the project and
> > the Foundation for a reason. Convenience may vary according to your
> > circumstances. When HBase finally ships builds compiled against Hadoop 3
> > by default, anyone still using 2.x in production will face the same
> > problem (in reverse). The Phoenix project also faces this issue for what
> > it's worth. Their readme and build instructions walk users through
> > rebuilding HBase using -Dhadoop.profile=3.0 as a first step as well.
> >
> >
> > On Mon, Oct 17, 2022 at 1:52 PM Lars Francke wrote:
> >
> > > Hi everyone,
> > >
> > > we've just recently tried getting the HBase Spark connector running
> > > against Spark 3.3 and HBase 2.4.x and failed miserably. It was a mess of
> > > Scala and Java issues, classpath, NoClassDef etc.
> > >
> > > The trauma is too recent for me to dig up the details but if needed I
> > > can ;-)
> > >
> > > For now I'm just wondering if anyone has succeeded using this
> > > combination?
> > >
> > > Cheers,
> > > Lars
> > >
> >
> >
> > --
> > Best regards,
> > Andrew
> >
> >
>


-- 
Best regards,
Andrew



Re: HBase 2.4.x + Spark 3.3

2022-10-18 Thread Andrew Purtell
Out of the box use is going to be problematic without recompiling HBase for
Hadoop 3. Spark 3.3 ships with Hadoop 3.3.2. Apache HBase 2.4.x (and all
2.x) releases are compiled against Hadoop 2. Link errors (ClassNotFound,
NoClassDef, etc) I think are to be expected because the class hierarchies
of various Hadoop things have been incompatibly changed in 3.x releases
relative to 2.x. This is not unreasonable. Semantic versioning suggests
breaking changes can be expected in a major version increment.

Users probably need to do a holistic (or hermetic, if you prefer) build of
their bill of materials before testing it or certainly before shipping it.
Build your HBase for the version of Hadoop you are actually shipping it
with, as opposed to whatever the upstream project picks as a default build
target. They are called "convenience binaries" by the project and the
Foundation for a reason. Convenience may vary according to your
circumstances. When HBase finally ships builds compiled against Hadoop 3 by
default, anyone still using 2.x in production will face the same problem
(in reverse). The Phoenix project also faces this issue for what it's
worth. Their readme and build instructions walk users through rebuilding
HBase using -Dhadoop.profile=3.0 as a first step as well.


On Mon, Oct 17, 2022 at 1:52 PM Lars Francke  wrote:

> Hi everyone,
>
> we've just recently tried getting the HBase Spark connector running against
> Spark 3.3 and HBase 2.4.x and failed miserably. It was a mess of Scala and
> Java issues, classpath, NoClassDef etc.
>
> The trauma is too recent for me to dig up the details but if needed I can
> ;-)
>
> For now I'm just wondering if anyone has succeeded using this combination?
>
> Cheers,
> Lars
>


-- 
Best regards,
Andrew



Re: If anyone has experience in changing the regex engine in RegexStringComparator to joni?

2022-07-16 Thread Andrew Purtell
Please do file an issue on our issue tracker, https://issues.apache.org/jira .
The project name is HBASE, of course.

I think we may have bigger issues here because joni was recently flagged by
static analysis tools we use at my employer to determine compliance with
various government requirements. I would assume a CVE has been filed regarding
joni. I plan to dig in here soon. A required upgrade of joni could by extension
provoke an upgrade of JRuby. Sean, I recall you recently landed some changes in
that regard, but only back to branch-2. If so, this encoding issue would by
comparison be a smaller detail to address at the same time. In any case,
let’s track the problem.

> On Jul 16, 2022, at 10:43 AM, Sean Busbey  wrote:
> 
> That sounds reasonable. Could you file an issue in our issue tracker? Are
> you up for working on a PR?
> 
> 
>> On Wed, Jul 13, 2022 at 2:27 AM Minwoo Kang 
>> wrote:
>> 
>> Hello,
>> 
>> I checked whether JONI can be used in RegexStringComparator.
>> After changing the engine of RegexStringComparator to JONI, when a regex
>> filter request was sent, the heap memory usage spiked and the RegionServer
>> did not work due to GC.
>> 
>> When I checked the reason, it is reported that when using UTF8Encoding, an
>> infinite loop can occur if invalid UTF-8 is entered.[1]
>> Trino, for example, uses NonStrictUTF8Encoding instead of UTF8Encoding.
>> 
>> After changing the encoding of JoniRegexEngine to NonStrictUTF8Encoding in
>> RegexStringComparator, it was confirmed that the heap memory usage spike
>> was gone.[2]
>> 
>> In HBase, like trino, it seems to be necessary to use NonStrictUTF8Encoding
>> instead of UTF8Encoding for JoniRegexEngine's encoding.
>> What do you think about changing JoniRegexEngine's encoding to
>> NonStrictUTF8Encoding?
>> 
>> Best Regards,
>> Minwoo
>> 
>>> On 2022/06/27 04:41:41 Minwoo Kang wrote:
>>> (I sent the mail title in Korean for the first time. I'm so sorry.)
>>> 
>>> Hello,
>>> 
>>> Recently, java.util.regex in the Regex filter (RegexStringComparator) had
>>> been running forever.
>>> It is said that java.util.regex can run forever or stack overflow in the
>>> worst case.
>>> 
>>> Looking at RegexStringComparator, I saw that two regex implementations
>>> (java, joni) were provided.
>>> I was wondering if anyone has experience in changing the regex engine
>>> in RegexStringComparator to joni and operating it.
>>> 
>>> Best Regards,
>>> Minwoo
>>> 
>>> On 2022/06/27 04:37:11 Minwoo Kang wrote:
>>>> Hello,
>>>>
>>>> Recently, java.util.regex in the Regex filter (RegexStringComparator) had
>>>> been running forever.
>>>> It is said that java.util.regex can run forever or stack overflow in the
>>>> worst case.
>>>>
>>>> Looking at RegexStringComparator, I saw that two regex implementations
>>>> (java, joni) were provided.
>>>> I was wondering if anyone has experience in changing the regex engine
>>>> in RegexStringComparator to joni and operating it.
>>>>
>>>> Best Regards,
>>>> Minwoo
>>>>
>>> 
>> 


[ANNOUNCE] Apache HBase 2.4.13 is now available for download

2022-07-01 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.13.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.13 is the thirteenth patch release in the HBase 2.4.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/hbase-2.4.13-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


Re: [ANNOUNCE] Please welcome Xiaolin Ha(哈晓琳) to the HBase PMC

2022-04-09 Thread Andrew Purtell
Congratulations and welcome, Xiaolin, and thank you for the help so far in 
release voting. 

> On Apr 9, 2022, at 6:12 AM, 张铎  wrote:
> 
> On behalf of the Apache HBase PMC I am pleased to announce that Xiaolin Ha
> has accepted our invitation to become a PMC member on the Apache HBase
> project. We appreciate Xiaolin Ha stepping up to take more responsibility in
> the HBase project.
> 
> Please join me in welcoming Xiaolin Ha to the HBase PMC!
> 
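> On behalf of the Apache HBase PMC, I am pleased to announce that Xiaolin Ha (哈晓琳)
> has accepted our invitation to become a PMC member of the Apache HBase project. We
> thank Xiaolin Ha for being willing to take on greater responsibility in the HBase
> project.
>
> Welcome, Xiaolin Ha!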


Re: [ANNOUNCE] New HBase committer Bryan Beaudreault

2022-04-09 Thread Andrew Purtell
Congratulations and welcome, Bryan!

> On Apr 9, 2022, at 4:45 AM, 张铎  wrote:
> 
> On behalf of the Apache HBase PMC, I am pleased to announce that Bryan
> Beaudreault(bbeaudreault) has accepted the PMC's invitation to become a
> committer on the project. We appreciate all of Bryan's generous
> contributions thus far and look forward to his continued involvement.
> 
> Congratulations and welcome, Bryan Beaudreault!
> 
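> On behalf of the Apache HBase PMC, I am pleased to announce that Bryan Beaudreault
> has accepted our invitation to become a committer on the Apache HBase project. We
> thank Bryan Beaudreault for his contributions to the HBase project so far and look
> forward to him continuing to take on more responsibility in the future.
>
> Welcome, Bryan Beaudreault!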


[ANNOUNCE] Apache HBase 2.4.11 is now available for download

2022-03-19 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.11.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.11 is the eleventh patch release in the HBase 2.4.x line. The
full list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/hbase-2.4.11-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[ANNOUNCE] Apache HBase 2.4.10 is now available for download

2022-03-04 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.10.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.10 is the tenth patch release in the HBase 2.4.x line. The full
list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/hbase-2.4.10-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


Re: [ANNOUNCE] New HBase committer Lei Cheng(程磊)

2022-03-03 Thread Andrew Purtell
Congratulations! And welcome.

On Tue, Mar 1, 2022 at 11:50 PM 张铎(Duo Zhang)  wrote:

> On behalf of the Apache HBase PMC, I am pleased to announce that Lei
> Cheng(comnetwork) has accepted the PMC's invitation to become a committer
> on the project. We appreciate all of Lei's generous contributions thus far
> and look forward to his continued involvement.
>
> Congratulations and welcome, Lei Cheng!
>
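> On behalf of the Apache HBase PMC, I am pleased to announce that Lei Cheng (程磊)
> has accepted our invitation to become a committer on the Apache HBase project. We
> thank Lei Cheng for his contributions to the HBase project so far and look forward
> to him continuing to take on more responsibility in the future.
>
> Welcome, Lei Cheng!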
>


-- 
Best regards,
Andrew



Re: [DISCUSS] Keep EOL HBase and Hadoop versions in our support matrix for longer time

2022-01-09 Thread Andrew Purtell
That sounds like a pretty good plan to me. What do others think?

On Sun, Jan 9, 2022 at 4:04 PM 张铎(Duo Zhang)  wrote:

> That's also a possible solution, I mean purging the docs from branches
> other than master.
>
> So what should be done will be:
> 1. Remove all docs from branches other than master.
> 2. See how to skip the doc generating when docs are not there so we do not
> break the publishing release process.
> 3. Modify the documentation to also include the information about EOL
> releases, instead of just purging them from the documentation.
>
> WDYT?
>
> Thanks.
>
> On Mon, Jan 10, 2022 at 01:28, Andrew Purtell wrote:
>
> > Our branch-1 release process included a step to check out master and
> > overwrite root src/ to do just that, manually synchronize docs for all
> > releases.
> >
> > I do not believe we can keep docs in branches effectively synchronized.
> > Periodic manual sweeps could work but is still error prone. It would be
> > better to remove docs from all branches other than master and then there
> > will be one copy only to maintain … that will be feasible.
> >
> > > On Jan 9, 2022, at 4:39 AM, 张铎  wrote:
> > >
> > > Ouch. IIRC there are lots of documentation improvements that only get
> > > merged to the master branch...
> > >
> > > I think the fact for now is that, it is a bit hard for us to keep the
> > > documentation for each minor release in sync correctly...
> > > Since we will publish the ref guide as part of our releases, let's
> > > just introduce a simple rule, keep the ref guide for all active
> > > branches always the same, and once a release line gets EOL, we will
> > > not update its ref guide any more.
> > > At least for me this rule is not very hard to follow...
> > >
> > > Thanks.
> > >
> > > On Sun, Jan 9, 2022 at 17:15, Sean Busbey wrote:
> > >>
> > >> We already publish version specific reference guides as a part of our
> > >> binary tarballs. My understanding is that's a big part of why they're
> > >> still in the main source repository; that we get docs built as a part
> > >> of our assembly.
> > >>
> > >> e.g. just picking a 2.4 release I have handy:
> > >>
> > >> (base) sbusbey@seans-mbp ~ % tar tzf Downloads/hbase-2.4.8-bin.tar.gz | grep pdf
> > >> hbase-2.4.8/docs/apache_hbase_reference_guide.pdf
> > >> (base) sbusbey@seans-mbp ~ % tar tzf Downloads/hbase-2.4.8-bin.tar.gz | grep book.html
> > >> hbase-2.4.8/docs/book.html
> > >>
> > >> Historically these do get maintained for each minor version. My
> > >> understanding is that the diligence on backporting across both
> > >> contributors to docs and release managers has varied considerably over
> > >> time.
> > >>
> > >> If we're only going to have one version of the docs, then we should
> > >> break the docs out of the main repo entirely so that we can simplify
> > >> their generation.
> > >>
> > >> IIRC we have only gone through the trouble of publishing to the
> > >> website version specific API docs / ref guides for release lines that
> > >> made it to be the "stable" branch.
> > >>
> > >>> On Sat, Jan 8, 2022 at 9:37 AM 张铎(Duo Zhang) wrote:
> > >>>
> > >>> I agree with Andrew that maybe it is beyond our ability to maintain
> > >>> ref guides for different release lines and keep them all in sync...
> > >>>
> > >>> What about this, we just make the ref guide for the master branch in
> > >>> sync, which contains all release lines. We will still remove
> > >>> information of the EOL releases as needed, to keep the ref guide
> > >>> clean, but as said in the title, less aggressive.
> > >>> And when we decide to EOL a release line, we copy the ref guide of
> > >>> the current master branch to the specific branch, and generate the
> > >>> ref guide for that release line for the last time.
> > >>> In this way the release manager does not need to always think of
> > >>> keeping the ref guide in sync, we all just need to consider the
> > >>> master one, only one extra work needs to be done when EOLing a
> > >>> release line.
> > >>>
> > >>> WDYT?
> > >>>
> > >>> Thanks.
> > >>>
> > >>> Andrew Purtell  于2022

Re: [DISCUSS] Keep EOL HBase and Hadoop versions in our support matrix for longer time

2022-01-09 Thread Andrew Purtell
Our branch-1 release process included a step to check out master and overwrite
root src/ to do just that: manually synchronize docs for all releases.

I do not believe we can keep docs in branches effectively synchronized.
Periodic manual sweeps could work but would still be error prone. It would be
better to remove docs from all branches other than master; then there will be
only one copy to maintain, and that will be feasible.

> On Jan 9, 2022, at 4:39 AM, 张铎  wrote:
> 
> Ouch. IIRC there are lots of documentation improvements that only get
> merged to the master branch...
> 
> I think the fact for now is that, it is a bit hard for us to keep the
> documentation for each minor release in sync correctly...
> Since we will publish the ref guide as part of our releases, let's
> just introduce a simple rule, keep the ref guide for all active
> branches always the same, and once a release line gets EOL, we will
> not update its ref guide any more.
> At least for me this rule is not very hard to follow...
> 
> Thanks.
> 
> On Sun, Jan 9, 2022 at 17:15, Sean Busbey wrote:
>> 
>> We already publish version specific reference guides as a part of our
>> binary tarballs. My understanding is that's a big part of why they're
>> still in the main source repository; that we get docs built as a part
>> of our assembly.
>> 
>> e.g. just picking a 2.4 release I have handy:
>> 
>> (base) sbusbey@seans-mbp ~ % tar tzf Downloads/hbase-2.4.8-bin.tar.gz | grep pdf
>> hbase-2.4.8/docs/apache_hbase_reference_guide.pdf
>> (base) sbusbey@seans-mbp ~ % tar tzf Downloads/hbase-2.4.8-bin.tar.gz | grep book.html
>> hbase-2.4.8/docs/book.html
>> 
>> Historically these do get maintained for each minor version. My
>> understanding is that the diligence on backporting across both
>> contributors to docs and release managers has varied considerably over
>> time.
>> 
>> If we're only going to have one version of the docs, then we should
>> break the docs out of the main repo entirely so that we can simplify
>> their generation.
>> 
>> IIRC we have only gone through the trouble of publishing to the
>> website version specific API docs / ref guides for release lines that
>> made it to be the "stable" branch.
>> 
>>> On Sat, Jan 8, 2022 at 9:37 AM 张铎(Duo Zhang)  wrote:
>>> 
>>> I agree with Andrew that maybe it is beyond our ability to maintain
>>> ref guides for different release lines and keep them all in sync...
>>> 
>>> What about this, we just make the ref guide for the master branch in
>>> sync, which contains all release lines. We will still remove
>>> information of the EOL releases as needed, to keep the ref guide
>>> clean, but as said in the title, less aggressive.
>>> And when we decide to EOL a release line, we copy the ref guide of the
>>> current master branch to the specific branch, and generate the ref
>>> guide for that release line for the last time.
>>> In this way the release manager does not need to always think of
>>> keeping the ref guide in sync, we all just need to consider the master
>>> one, only one extra work needs to be done when EOLing a release line.
>>> 
>>> WDYT?
>>> 
>>> Thanks.
>>> 
>>> On Sat, Jan 8, 2022 at 07:16, Andrew Purtell wrote:
>>>> 
>>>> There are some challenges with respect to keeping multiple versions of the
>>>> documentation around. Each minor release needs a new version? Each RM
>>>> managing one or more code line(s) needs to update trunk docs and also
>>>> backport to branch docs (or not, depending)? There's been a JIRA open
>>>> forever for a 2.4 version of the book and honestly I haven't found time to
>>>> do it, because it seems both low priority and nontrivial, and there's never
>>>> enough time for everything... although that may be a personal failing.
>>>> 
>>>> On Fri, Jan 7, 2022 at 9:05 AM Sean Busbey  wrote:
>>>> 
>>>>> We should have a version specific version of the ref guide that
>>>>> contains that information.
>>>>> 
>>>>> e.g.
>>>>> 
>>>>> https://hbase.apache.org/1.4/book.html#hadoop
>>>>> 
>>>>> https://hbase.apache.org/2.3/book.html#hadoop
>>>>> 
>>>>> Can we do a better job of making these discoverable to folks rather
>>>>> than keeping stuff around?
>>>>> 
>>>>> On Fri, Jan 7, 2022 at 1:14 AM 张铎(Duo Zhang) 
>>>>> wrote:
>>>>>> 
>>>>>> Recently we've seen several emails on the user list asking whether
>>>>>> some hbase versions support specific hadoop versions, usually it will
>>>>>> be some versions which are already EOL, so there is no information in
>>>>>> our ref guide.
>>>>>> 
>>>>>> For me I think we could leave the EOL versions in the support matrix
>>>>>> for a bit longer. It will be useful for our users.
>>>>>> 
>>>>>> Thanks.
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best regards,
>>>> Andrew
>>>> 
>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>>> decrepit hands
>>>>   - A23, Crosstalk


Re: [DISCUSS] Keep EOL HBase and Hadoop versions in our support matrix for longer time

2022-01-09 Thread Andrew Purtell
As you say, the docs on branches are not updated in any principled way.

Extracting the Javadoc from a matrix of branch builds in an automated way for
publishing up to the website would be feasible. We can add this to the build
website Jenkins job.

On the other hand, this thread’s topic, a support matrix, would need to be 
updated by hand whenever a change of consequence is backported. We can expect 
that committers backporting a change breaking old dependencies won’t know about 
or recall the branch support matrixes. History teaches us they might not even 
be aware the change breaks support. The RM won’t catch it with the default 
build. We would need a matrix of compatibility test jobs to catch the cases 
that fall through the cracks. 
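
To make that last point concrete, here is a rough shell sketch of how such a
matrix could be enumerated. It is only an illustration: compat_matrix is a
made-up helper, the branch names and Hadoop versions are whatever the caller
passes in, and the Maven flags are the Hadoop 3 profile flags that have come up
on this list before. It prints one candidate command per (branch, Hadoop
version) pair and runs nothing.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: list one build-and-test command per (branch, Hadoop version).
# Branch names and versions are examples chosen by the caller, not a real job config.

# Print "branch<TAB>command" for every combination of two space-separated lists.
compat_matrix() {
  local branches="$1" hadoop_versions="$2" branch version
  for branch in $branches; do
    for version in $hadoop_versions; do
      printf '%s\tmvn clean install -Dhadoop.profile=3.0 -Dhadoop-three.version=%s\n' \
        "$branch" "$version"
    done
  done
}
```

A CI job could feed each printed line to a separate build after checking out
the named branch; how that is wired into Jenkins is left open here.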

> On Jan 9, 2022, at 1:15 AM, Sean Busbey  wrote:
> 
> We already publish version specific reference guides as a part of our
> binary tarballs. My understanding is that's a big part of why they're
> still in the main source repository; that we get docs built as a part
> of our assembly.
> 
> e.g. just picking a 2.4 release I have handy:
> 
> (base) sbusbey@seans-mbp ~ % tar tzf Downloads/hbase-2.4.8-bin.tar.gz | grep pdf
> hbase-2.4.8/docs/apache_hbase_reference_guide.pdf
> (base) sbusbey@seans-mbp ~ % tar tzf Downloads/hbase-2.4.8-bin.tar.gz | grep book.html
> hbase-2.4.8/docs/book.html
> 
> Historically these do get maintained for each minor version. My
> understanding is that the diligence on backporting across both
> contributors to docs and release managers has varied considerably over
> time.
> 
> If we're only going to have one version of the docs, then we should
> break the docs out of the main repo entirely so that we can simplify
> their generation.
> 
> IIRC we have only gone through the trouble of publishing to the
> website version specific API docs / ref guides for release lines that
> made it to be the "stable" branch.
> 
>> On Sat, Jan 8, 2022 at 9:37 AM 张铎(Duo Zhang)  wrote:
>> 
>> I agree with Andrew that maybe it is beyond our ability to maintain
>> ref guides for different release lines and keep them all in sync...
>> 
>> What about this, we just make the ref guide for the master branch in
>> sync, which contains all release lines. We will still remove
>> information of the EOL releases as needed, to keep the ref guide
>> clean, but as said in the title, less aggressive.
>> And when we decide to EOL a release line, we copy the ref guide of the
>> current master branch to the specific branch, and generate the ref
>> guide for that release line for the last time.
>> In this way the release manager does not need to always think of
>> keeping the ref guide in sync, we all just need to consider the master
>> one, only one extra work needs to be done when EOLing a release line.
>> 
>> WDYT?
>> 
>> Thanks.
>> 
>> On Sat, Jan 8, 2022 at 7:16 AM Andrew Purtell wrote:
>>> 
>>> There are some challenges with respect to keeping multiple versions of the
>>> documentation around. Each minor release needs a new version? Each RM
>>> managing one or more code line(s) needs to update trunk docs and also
>>> backport to branch docs (or not, depending)? There's been a JIRA open
>>> forever for a 2.4 version of the book and honestly I haven't found time to
>>> do it, because it seems both low priority and nontrivial, and there's never
>>> enough time for everything... although that may be a personal failing.
>>> 
>>>> On Fri, Jan 7, 2022 at 9:05 AM Sean Busbey  wrote:
>>> 
>>>> We should have a version specific version of the ref guide that
>>>> contains that information.
>>>> 
>>>> e.g.
>>>> 
>>>> https://hbase.apache.org/1.4/book.html#hadoop
>>>> 
>>>> https://hbase.apache.org/2.3/book.html#hadoop
>>>> 
>>>> Can we do a better job of making these discoverable to folks rather
>>>> than keeping stuff around?
>>>> 
>>>> On Fri, Jan 7, 2022 at 1:14 AM 张铎(Duo Zhang) 
>>>> wrote:
>>>>> 
>>>>> Recently we've seen several emails on the user list asking whether
>>>>> some hbase versions support specific hadoop versions, usually it will
>>>>> be some versions which are already EOL, so there is no information in
>>>>> our ref guide.
>>>>> 
>>>>> For me I think we could leave the EOL versions in the support matrix
>>>>> for a bit longer. It will be useful for our users.
>>>>> 
>>>>> Thanks.
>>>> 
>>> 
>>> 
>>> --
>>> Best regards,
>>> Andrew
>>> 
>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>> decrepit hands
>>>   - A23, Crosstalk


Re: [DISCUSS] Keep EOL HBase and Hadoop versions in our support matrix for longer time

2022-01-07 Thread Andrew Purtell
There are some challenges with respect to keeping multiple versions of the
documentation around. Each minor release needs a new version? Each RM
managing one or more code line(s) needs to update trunk docs and also
backport to branch docs (or not, depending)? There's been a JIRA open
forever for a 2.4 version of the book and honestly I haven't found time to
do it, because it seems both low priority and nontrivial, and there's never
enough time for everything... although that may be a personal failing.

On Fri, Jan 7, 2022 at 9:05 AM Sean Busbey  wrote:

> We should have a version specific version of the ref guide that
> contains that information.
>
> e.g.
>
> https://hbase.apache.org/1.4/book.html#hadoop
>
> https://hbase.apache.org/2.3/book.html#hadoop
>
> Can we do a better job of making these discoverable to folks rather
> than keeping stuff around?
>
> On Fri, Jan 7, 2022 at 1:14 AM 张铎(Duo Zhang) 
> wrote:
> >
> > Recently we've seen several emails on the user list asking whether
> > some hbase versions support specific hadoop versions, usually it will
> > be some versions which are already EOL, so there is no information in
> > our ref guide.
> >
> > For me I think we could leave the EOL versions in the support matrix
> > for a bit longer. It will be useful for our users.
> >
> > Thanks.
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: HBase master unable to recover with error "Cannot seek after EOF"

2022-01-07 Thread Andrew Purtell
We run Hadoop (HDFS, YARN) 2.10 at my employer, with HBase 2.4 and 1.6/1.7,
so I can give feedback about functionality and compatibility with respect to
this combination.

Unfortunately I do not have experience running Hadoop 3 in production and
am not directly familiar with anyone who does, although I do believe some
in the community are, and would welcome their experience (and versions) in
this thread if they would like to write in. We can change that statement if
someone can attest to 'sufficiently tested' with some 3.x version. This
change may be due (or not).

I do have some non-prod experience with Hadoop 3.1 and HBase 2.4 with root
filesystem on S3 (with HBOSS) and WAL on HDFS. I did not run into any WAL
related issues like the problems you mention on that thread, but qualifying
such a configuration for production would take more effort, so such issues
might still turn up under more intensive scenarios. I have not tried Hadoop
3.2 or later.

I'm sorry I could not be more helpful.



On Fri, Jan 7, 2022 at 11:28 AM Claude M  wrote:

> Thanks for your reply.  What about the following statement that is in the
> documentation, is this still true?
>
> Hadoop 3.x is still in early access releases and has not yet been
> sufficiently tested by the HBase community for production use cases.
>
> When I tested HBase 2.3.5 w/ hadoop 3.2.2, I was encountering a problem w/
> Hadoop described here:
> https://www.mail-archive.com/user@hadoop.apache.org/msg24265.html.  When I
> changed it to use Hadoop 2.10.0, I did not have the problem.
>
>
> On Fri, Jan 7, 2022 at 1:13 PM Andrew Purtell 
> wrote:
>
> > The functional compatibility is the same with 2.3 and 2.4 with respect to
> > Hadoop 2.10. The omission in the compatibility chart is a documentation
> > bug. There is an existing JIRA for that omission that will be
> > reprioritized.
> >
> > > On Jan 7, 2022, at 9:38 AM, Claude M  wrote:
> > >
> > > Has HBase 2.4 been tested to be fully functional w/ Hadoop 2.10.0?  I
> > don't
> > > see it in the compatibility chart.
> > >
> > >> On Fri, Jan 7, 2022 at 12:37 AM 张铎(Duo Zhang) 
> > wrote:
> > >>
> > >> You can try to upgrade to 2.4.x, it should be rolling upgradable.
> > >>
> > >> On Tue, Jan 4, 2022 at 11:24 PM Claude M wrote:
> > >>>
> > >>> I don't want to rebuild HBase.  According to the attached
> HBase/Hadoop
> > >> compatibility chart, the latest version of HBase that has been
> verified
> > w/
> > >> Hadoop is 2.3.x.
> > >>> The fix was put into branch 2.3 on 11/21 but there is not going to
> be a
> > >> 2.3.8 release since it is mentioned that branch 2.3 is EOL.  Is there
> > not
> > >> another way around this?
> > >>>
> > >>> On Fri, Dec 24, 2021 at 12:53 AM 张铎(Duo Zhang) <
> palomino...@gmail.com>
> > >> wrote:
> > >>>>
> > >>>> Ah, thanks Yulin Niu for the pointer. HBASE-26053 should be the
> > problem.
> > >>>>
> > >>>> On Sun, Dec 19, 2021 at 10:41 AM Yulin Niu wrote:
> > >>>>>
> > >>>>> https://issues.apache.org/jira/browse/HBASE-25053
> > >>>>> This seems to be the bug described in that issue. You can try
> > >>>>> cherry-picking the patch, Claude M.
> > >>>>>
> > >>>>> On Sun, Dec 19, 2021 at 2:17 AM Viraj Jasani wrote:
> > >>>>>
> > >>>>>>> Your fix is a bit dangerous since you may lose some ongoing
> > >> procedures,
> > >>>>>> but
> > >>>>>>> if you did not experience any inconsistency on your cluster, for
> > >> example,
> > >>>>>>> some regions are not online, then it is OK.
> > >>>>>>
> > >>>>>> Duo, out of curiosity, even if some regions are offline and/or
> some
> > >> servers
> > >>>>>> go offline, wouldn't master failover re-trigger SCPs and TRSPs to
> > >> bring all
> > >>>>>> regions ONLINE?
> > >>>>>> I have played around with removal of MasterProcWAL on hbase1 only
> > >> (WAL proc
> > >>>>>> store) and have seen new SCPs getting triggered i.e. AM doesn
> bring
> > >> all
> > >>>>>> regions ONLINE eventually.
> > >>>>>>
> > >>>>>>
> > >>>>>> On Thu, Dec 16, 2021 at 9:57 PM 张铎(Duo Zhang) <
> > >> palomino...@gmail.com>

Re: HBase master unable to recover with error "Cannot seek after EOF"

2022-01-07 Thread Andrew Purtell
The functional compatibility is the same with 2.3 and 2.4 with respect to 
Hadoop 2.10. The omission in the compatibility chart is a documentation bug. 
There is an existing JIRA for that omission that will be reprioritized. 

> On Jan 7, 2022, at 9:38 AM, Claude M  wrote:
> 
> Has HBase 2.4 been tested to be fully functional w/ Hadoop 2.10.0?  I don't
> see it in the compatibility chart.
> 
>> On Fri, Jan 7, 2022 at 12:37 AM 张铎(Duo Zhang)  wrote:
>> 
>> You can try to upgrade to 2.4.x, it should be rolling upgradable.
>> 
>> On Tue, Jan 4, 2022 at 11:24 PM Claude M wrote:
>>> 
>>> I don't want to rebuild HBase.  According to the attached HBase/Hadoop
>> compatibility chart, the latest version of HBase that has been verified w/
>> Hadoop is 2.3.x.
>>> The fix was put into branch 2.3 on 11/21 but there is not going to be a
>> 2.3.8 release since it is mentioned that branch 2.3 is EOL.  Is there not
>> another way around this?
>>> 
>>> On Fri, Dec 24, 2021 at 12:53 AM 张铎(Duo Zhang) 
>> wrote:
 
 Ah, thanks Yulin Niu for the pointer. HBASE-26053 should be the problem.
 
 On Sun, Dec 19, 2021 at 10:41 AM Yulin Niu wrote:
> 
> https://issues.apache.org/jira/browse/HBASE-25053
> This seems to be the bug described in that issue. You can try cherry-picking
> the patch, Claude M.
> 
> On Sun, Dec 19, 2021 at 2:17 AM Viraj Jasani wrote:
> 
>>> Your fix is a bit dangerous since you may lose some ongoing
>> procedures,
>> but
>>> if you did not experience any inconsistency on your cluster, for
>> example,
>>> some regions are not online, then it is OK.
>> 
>> Duo, out of curiosity, even if some regions are offline and/or some
>> servers
>> go offline, wouldn't master failover re-trigger SCPs and TRSPs to
>> bring all
>> regions ONLINE?
>> I have played around with removal of MasterProcWAL on hbase1 only
>> (WAL proc
>> store) and have seen new SCPs getting triggered i.e. AM doesn bring
>> all
>> regions ONLINE eventually.
>> 
>> 
>> On Thu, Dec 16, 2021 at 9:57 PM 张铎(Duo Zhang) <
>> palomino...@gmail.com>
>> wrote:
>> 
>>> I guess this should be a bug. For the master local region we do
>> not
>> handle
>>> broken WAL files which do not even have a valid header.
>>> 
>>> Will take a look at the code tomorrow to confirm whether this is
>> the
>> case.
>>> 
>>> Your fix is a bit dangerous since you may lose some ongoing
>> procedures,
>> but
>>> if you did not experience any inconsistency on your cluster, for
>> example,
>>> some regions are not online, then it is OK.
>>> 
>>> Thanks for reporting.
>>> 
>>> On Thu, Dec 16, 2021 at 3:37 AM Claude M wrote:
>>> 
 Hello,
 
 I have the following installed:
 
   - Hadoop 3.2.2
   - HBase 2.3.5
 
 
 When all the datanodes in Hadoop are stopped but the HBase
>> cluster is
 still running, the HBase master crashes w/ the attached
>> exception and
>> is
 not recoverable.
 
 If I delete the contents under the following directories in
>> hdfs, the
 master will then recover:
 
   - /hbase/MasterData/WALs/
   - /hbase/MasterData/data/master/store/*/recovered.wals/
 
 Is this an appropriate way to resolve the issue?  If not, what
>> should
>> be
 done?
 
 
 Thanks
 
>>> 
>> 
>> 


[ANNOUNCE] Apache HBase 2.4.9 is now available for download

2021-12-24 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.9.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.9 is the ninth patch release in the HBase 2.4.x line. The full
list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://issues.apache.org/jira/projects/HBASE/versions/12350709

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[ANNOUNCE] Apache HBase 2.4.8 is now available for download

2021-11-03 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.8.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.8 is the eighth patch release in the HBase 2.4.x line. The full
list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/hbase-2.4.8-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[ANNOUNCE] Apache HBase 2.4.7 is now available for download

2021-10-19 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.7.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.7 is the seventh patch release in the HBase 2.4.x line. The full
list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/hbase-2.4.7-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


Re: [ANNOUNCE] New HBase committer Zhuoyue Huang(GeorryHuang)

2021-10-14 Thread Andrew Purtell
Congratulations and welcome! 

> On Oct 14, 2021, at 12:14 AM, Guanghao Zhang  wrote:
> 
> Folks,
> 
> On behalf of the Apache HBase PMC I am pleased to announce that Zhuoyue
> Huang has accepted the PMC's invitation to become a committer on the
> project.
> 
> We appreciate all of the great contributions Zhuoyue Huang has made to the
> community thus far and we look forward to his continued involvement.
> 
> Allow me to be the first to congratulate Zhuoyue Huang on his new role!
> 
> Thanks.


Re: [DISCUSS] Removing problematic terms from our project

2021-09-29 Thread Andrew Purtell
Just to be clear Bryan there is no issue with you asking questions or
reviving discussion. I was just trying to summarize for you because I do
believe we had a fairly clear outcome.

Because there has been no additional comment in a long time -- i.e. it's
now lazy consensus -- I felt I could be more definitive in the restatement
of the summary.

We can certainly reopen the discussion if I have mischaracterized the
consensus, or if there isn't actually a consensus, or if anyone has changed
their position in the intervening months.

Otherwise, we are just waiting for patch contribution...


On Wed, Sep 29, 2021 at 11:02 AM Bryan Beaudreault
 wrote:

> Sorry Andrew, I think I misinterpreted aspects of your last summary. It
> seemed like maybe there were still open questions and I was mostly just
> curious if something had been (or should be) captured anywhere else. Your
> new summary helps clarify the conclusion, thanks for providing it.
>
> On Wed, Sep 29, 2021 at 1:34 PM Andrew Purtell 
> wrote:
>
> > Bryan,
> >
> > Let me paraphrase the resolution of this discussion from the PMC
> > perspective: We are, broadly speaking, supportive of changes to improve
> > conscious language choices. Our project uses some words with known
> > controversial context. Unfortunately one word in particular, "master",
> does
> > not have a consensus that it is or isn't a valid term of art, and in any
> > case is deeply embedded in API and configuration contexts. Other terms,
> > like "slave", have consensus on removal. We would, generally speaking,
> > welcome for review any patches that change conscious language choices for
> > the better. The proposer of the patch can explain the context of the
> change
> > to help make the case it should be applied. The PMC would also provide
> > support, in the form of release management and voting, for necessary
> > deprecation-release-removal-release cycles where terminology changes
> impact
> > one or more of our compatibility guidelines.
> >
> > What has been missing since this thread closed with this conclusion?
> >
> > Actual patches.
> >
> > It's quite easy to advocate someone *else* make language changes.
> >
> >
> >
> > On Wed, Sep 29, 2021 at 5:26 AM Bryan Beaudreault
> >  wrote:
> >
> > > Sorry to revive a very old thread, but I just stumbled across this and
> > > don't see a clear resolution. I wonder if we should create a JIRA from
> > > Andrew's summary and treat that as an umbrella encompassing the
> original
> > 3
> > > JIRAs? I'm also cognizant of the fact that there are rumblings of doing
> > an
> > > initial 3.0 release, and I see above there was a proposal to deprecate
> > in 3
> > > and release in 4. I imagine we're slowly running out of time to make
> that
> > > change.
> > >
> > > If I missed a JIRA somewhere, maybe we can put a link here for
> posterity.
> > >
> > > On Fri, Jun 26, 2020 at 2:35 PM Andrew Purtell 
> > > wrote:
> > >
> > > > Circling back after more inputs, if we use this as a description of
> the
> > > > proposals:
> > > >
> > > > 1. Replace "master"/"hmaster" with ???, this one has by far the most
> > > > significant impact and both opinion and interpretation on this one is
> > > > mixed.
> > > >
> > > > 2. Replace "slave" with "follower", seems to impact the cross cluster
> > > > replication subsystem only.
> > > >
> > > > 3. Replace "black list" with "deny list".
> > > >
> > > > 4. Replace "white list" with "accept list".
> > > >
> > > > Then by my read of the responses we have consensus to do #2, #3, and
> > #4.
> > > > They were not controversial. JIRAs and patches will be welcome. Seems
> > > > pretty clear committers and PMC will approve and do what is needed to
> > > > complete any necessary deprecation cycle.
> > > >
> > > > Regarding #1, opinion is mixed. By my read I also think committers
> and
> > > PMC
> > > > will approve patches and do what is needed to complete any necessary
> > > > deprecation cycle for this one too. Enough PMC members expressed
> > support
> > > to
> > > > successfully vote on a release (although not if there were to be
> > opposing
> > > > votes). If a contributor were to open a JIRA and provide patches for
> > > this,
> > > > there would b

Re: [DISCUSS] Removing problematic terms from our project

2021-09-29 Thread Andrew Purtell
Bryan,

Let me paraphrase the resolution of this discussion from the PMC
perspective: We are, broadly speaking, supportive of changes to improve
conscious language choices. Our project uses some words with known
controversial context. Unfortunately one word in particular, "master", does
not have a consensus that it is or isn't a valid term of art, and in any
case is deeply embedded in API and configuration contexts. Other terms,
like "slave", have consensus on removal. We would, generally speaking,
welcome for review any patches that change conscious language choices for
the better. The proposer of the patch can explain the context of the change
to help make the case it should be applied. The PMC would also provide
support, in the form of release management and voting, for necessary
deprecation-release-removal-release cycles where terminology changes impact
one or more of our compatibility guidelines.

What has been missing since this thread closed with this conclusion?

Actual patches.

It's quite easy to advocate someone *else* make language changes.



On Wed, Sep 29, 2021 at 5:26 AM Bryan Beaudreault
 wrote:

> Sorry to revive a very old thread, but I just stumbled across this and
> don't see a clear resolution. I wonder if we should create a JIRA from
> Andrew's summary and treat that as an umbrella encompassing the original 3
> JIRAs? I'm also cognizant of the fact that there are rumblings of doing an
> initial 3.0 release, and I see above there was a proposal to deprecate in 3
> and release in 4. I imagine we're slowly running out of time to make that
> change.
>
> If I missed a JIRA somewhere, maybe we can put a link here for posterity.
>
> On Fri, Jun 26, 2020 at 2:35 PM Andrew Purtell 
> wrote:
>
> > Circling back after more inputs, if we use this as a description of the
> > proposals:
> >
> > 1. Replace "master"/"hmaster" with ???, this one has by far the most
> > significant impact and both opinion and interpretation on this one is
> > mixed.
> >
> > 2. Replace "slave" with "follower", seems to impact the cross cluster
> > replication subsystem only.
> >
> > 3. Replace "black list" with "deny list".
> >
> > 4. Replace "white list" with "accept list".
> >
> > Then by my read of the responses we have consensus to do #2, #3, and #4.
> > They were not controversial. JIRAs and patches will be welcome. Seems
> > pretty clear committers and PMC will approve and do what is needed to
> > complete any necessary deprecation cycle.
> >
> > Regarding #1, opinion is mixed. By my read I also think committers and
> PMC
> > will approve patches and do what is needed to complete any necessary
> > deprecation cycle for this one too. Enough PMC members expressed support
> to
> > successfully vote on a release (although not if there were to be opposing
> > votes). If a contributor were to open a JIRA and provide patches for
> this,
> > there would be more discussion. There is no consensus, yet, on what
> > replacement term is best. Personally, I can accept Zheng's recent
> > suggestion of "controller". I can see how syllable count matters.
> >
> > I don't mean this summary to close the conversation. It is only a
> > checkpoint.
> >
> > If anyone reading this has an opinion they do not wish to express
> > publicly, you are welcome to write to priv...@hbase.apache.org to
> state
> > your opinion and the PMC will of course respectfully listen to it.
> >
> >
> >
> > On Thu, Jun 25, 2020 at 7:47 PM zheng wang <18031...@qq.com> wrote:
> >
> > > I like the controller.
> > >
> > >
> > > Coordinator is a bit long for me to write and speak.
> > > Manager and Admin is used somewhere yet in HBase.
> > >
> > >
> > >
> > >
> > > --Original Message--
> > > From: "Andrew Purtell"
> > > Sent: Friday, June 26, 2020, 9:08 AM
> > > To: "Hbase-User"
> > > Cc: "dev"
> > > Subject: Re: [DISCUSS] Removing problematic terms from our project
> > >
> > >
> > >
> > >  - AdminServer (as you already have AdminClient to talk to it).
> > >
> > > Oh... I like AdminServer. AdminServer (serving admin functions) and
> > > RegionServer (serving region data).
> > >
> > > On Thu, Jun 25, 2020 at 4:46 PM Andrey Elenskiy
> > >  > >
> > >   Is there a word that's not "master" and not "coordinator"
> that
> > > is clear
> > >  and
> > >

[ANNOUNCE] Apache HBase 2.4.6 is now available for download

2021-09-17 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.6.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.6 is the sixth patch release in the HBase 2.4.x line. The full
list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/hbase-2.4.6-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[ANNOUNCE] Apache HBase 2.4.5 is now available for download

2021-07-30 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.5.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.5 is the fifth patch release in the HBase 2.4.x line. The full
list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/hbase-2.4.5-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


Re: [ANNOUNCE] New HBase PMC Bharath Vissapragada

2021-07-30 Thread Andrew Purtell
Congratulations and welcome, Bharath!

On Fri, Jul 30, 2021 at 6:26 AM Viraj Jasani  wrote:
>
> On behalf of the Apache HBase PMC I am pleased to announce that Bharath
> Vissapragada has accepted our invitation to become a PMC member on the
> HBase project. We appreciate Bharath stepping up to take more
> responsibility for the project.
>
> Congratulations and welcome, Bharath!



-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [NOTICE] HBaseTestingUtility is now deprecated, starting from 3.0.0

2021-07-19 Thread Andrew Purtell
It wasn’t clear until your response that HBTU was retained. The NOTICE didn’t 
mention it and gave the impression it was removed. (“Renamed”)

Glad to hear HBTU was deprecated, marked LP, and copied to src/. Resolves any 
concerns I might have had. 


> On Jul 19, 2021, at 6:56 PM, 张铎  wrote:
> 
> Please see my last sentence in the first email
> 
> We can keep improving it if the current API set is not enough.
>> 
> 
> If you have any concern on the current API design, for example, does not
> support passing a Configuration when constructing, then just open an issue,
> we can add the support, no problem.
> 
> And the original HBTU is still marked as IA.LimitedPrivate("Phoenix"), the
> phoenix project can still use it if the new TestingHBaseCluster is not
> enough.
> 
> And the deprecated HBTU will keep for the whole 3.x lifecycle, we copied it
> to the hbase-testing-util module, under the src/main directory. You can
> still use it for several years...
> 
> Thanks.
> 
> On Tue, Jul 20, 2021 at 3:26 AM Andrew Purtell wrote:
> 
>>> Just leaving a reference to the old, lower-level HBTU as a public
>> property of the new
>>> interface seems lower-risk to me. What are the gains from hiding the
>>> existing HBTU?
>> 
>> This would be similar to the strategy we adopted for the Admin
>> interface: Admin is there for new users and as a migration target, while
>> HBaseAdmin remains available with deprecation annotations.
>> 
>> I guess the question is if the consensus is HBTU was meant to be
>> adopted and consumed by downstream projects.
>> 
>> In my opinion, nothing in any project's test/* source directories
>> should be considered public, supported, and supportable. Test
>> resources within a project exist to test that project, including its
>> private internals.
>> 
>> What I would recommend, fwiw is two things:
>> 
>> 1. Explicitly release a supported hbase-testing-util artifact, with
>> Public and LimitedPrivate interfaces, with code in src/ not test/.
>> 2. Bring back HBTU, but as a compatibility shim. Provide deprecated
>> access to HBTU for Phoenix, marked LP(Phoenix), with this deprecated
>> accessor to be removed in HBase 4.0, along with the HBTU interface.
>> 
>>> On Mon, Jul 19, 2021 at 12:15 PM Geoffrey Jacoby 
>>> wrote:
>>> 
>>> Can this be a [DISCUSS] rather than a [NOTICE]? The implications for
>>> downstream projects (both Phoenix and many internal projects) are large,
>>> and it seems like something that needs broader discussion before being
>> set
>>> in stone. The HBaseTestingUtility is used extensively in Phoenix, as well
>>> as in many internal projects at my day job (some directly and some through
>>> Phoenix's BaseTest wrapping of HBTU) -- it's quite useful.
>>> 
>>> The idea of a better-encapsulated, easier-to-use HBase testing utility
>> is a
>>> good one, and the TestingHBaseCluster interface looks like a definite
>>> improvement. However, I notice at least one large gap right away: there
>>> doesn't appear to be a way to inject a custom Configuration object into
>> the
>>> test cluster, which is a very common pattern. (Example: run a test suite
>>> twice with a new minicluster each time, once with a flag off and then
>> on.)
>>> This seems like a simple fix.
>>> 
>>> More concerning is the underlying assumption of the change, that only the
>>> HBase project, and perhaps Phoenix, will ever need to write a test of
>>> server-side components. That's simply not the case, because HBase has
>> many
>>> integration points that allow downstream developed code to run in
>>> server-side processes.
>>> 
>>> These include:
>>> Coprocessor Observers and Endpoints
>>> Replication Endpoints
>>> MapReduce integration (which acts as a client from HBase's perspective
>> but
>>> runs within YARN services)
>>> 
>>> In addition, Phoenix supports user-defined functions (UDFs) which I
>> believe
>>> can run server-side within a coproc in certain query plans.
>>> 
>>> The change assumes that no one will ever need direct access to the
>> testing
>>> utility's internal ZooKeeper, MR, or DFS services, but this seems
>> relevant
>>> to failure scenario tests of both Replication Endpoints and MapReduce
>> jobs.
>>> The Admin API may be able to replace quite a lot of existing logic going
>>> forward, and many existing tests already use it rather than the test
>>> utility directly.  But there are 

Re: [NOTICE] HBaseTestingUtility is now deprecated, starting from 3.0.0

2021-07-19 Thread Andrew Purtell
> Just leaving a reference to the old, lower-level HBTU as a public property of 
> the new
> interface seems lower-risk to me. What are the gains from hiding the
> existing HBTU?

This would be similar to the strategy we adopted for the Admin
interface: Admin is there for new users and as a migration target, while
HBaseAdmin remains available with deprecation annotations.

I guess the question is if the consensus is HBTU was meant to be
adopted and consumed by downstream projects.

In my opinion, nothing in any project's test/* source directories
should be considered public, supported, and supportable. Test
resources within a project exist to test that project, including its
private internals.

What I would recommend, fwiw, is two things:

1. Explicitly release a supported hbase-testing-util artifact, with
Public and LimitedPrivate interfaces, with code in src/ not test/.
2. Bring back HBTU, but as a compatibility shim. Provide deprecated
access to HBTU for Phoenix, marked LP(Phoenix), with this deprecated
accessor to be removed in HBase 4.0, along with the HBTU interface.
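
To make the second point concrete, here is a minimal, language-neutral
sketch of a deprecated accessor, written in Python rather than Java purely
for brevity. Every name in it (SupportedTestCluster, LegacyTestingUtility,
legacy_utility) is hypothetical and is not part of any HBase API; it only
illustrates the "new interface plus deprecated escape hatch" pattern.

```python
import warnings


class LegacyTestingUtility:
    """Hypothetical stand-in for the old, lower-level utility (HBTU)."""

    def mini_cluster_name(self):
        return "legacy-mini-cluster"


class SupportedTestCluster:
    """Hypothetical stand-in for the new, supported testing interface."""

    def __init__(self):
        self._legacy = LegacyTestingUtility()

    def start(self):
        # The supported path: callers should need nothing beyond this API.
        return "started"

    @property
    def legacy_utility(self):
        # Deprecated accessor: it still works, but warns on every access so
        # downstream projects can find and migrate their call sites before
        # the accessor is removed in a later major release.
        warnings.warn(
            "legacy_utility is deprecated and will be removed; "
            "use the SupportedTestCluster API instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._legacy
```

The point of the design is that existing downstream tests keep compiling
and running while the deprecation warning gives them a clear migration
signal and a removal deadline.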

On Mon, Jul 19, 2021 at 12:15 PM Geoffrey Jacoby  wrote:
>
> Can this be a [DISCUSS] rather than a [NOTICE]? The implications for
> downstream projects (both Phoenix and many internal projects) are large,
> and it seems like something that needs broader discussion before being set
> in stone. The HBaseTestingUtility is used extensively in Phoenix, as well
> as in many internal projects at my dayjob (some directly and some through
> Phoenix's BaseTest wrapping of HBTU) -- it's quite useful.
>
> The idea of a better-encapsulated, easier-to-use HBase testing utility is a
> good one, and the TestingHBaseCluster interface looks like a definite
> improvement. However, I notice at least one large gap right away: there
> doesn't appear to be a way to inject a custom Configuration object into the
> test cluster, which is a very common pattern. (Example: run a test suite
> twice with a new minicluster each time, once with a flag off and then on.)
> This seems like a simple fix.
>
> More concerning is the underlying assumption of the change, that only the
> HBase project, and perhaps Phoenix, will ever need to write a test of
> server-side components. That's simply not the case, because HBase has many
> integration points that allow downstream developed code to run in
> server-side processes.
>
> These include:
> Coprocessor Observers and Endpoints
> Replication Endpoints
> MapReduce integration (which acts as a client from HBase's perspective but
> runs within YARN services)
>
> In addition, Phoenix supports user-defined functions (UDFs) which I believe
> can run server-side within a coproc in certain query plans.
>
> The change assumes that no one will ever need direct access to the testing
> utility's internal ZooKeeper, MR, or DFS services, but this seems relevant
> to failure scenario tests of both Replication Endpoints and MapReduce jobs.
> The Admin API may be able to replace quite a lot of existing logic going
> forward, and many existing tests already use it rather than the test
> utility directly.  But there are literally thousands of downstream tests to
> analyze across many different organizations and institutions to verify that
> nothing important is being lost, and that will take time. Just leaving a
> reference to the old, lower-level HBTU as a public property of the new
> interface seems lower-risk to me. What are the gains from hiding the
> existing HBTU?
>
> Geoffrey
>
>
>
> On Sun, Jul 18, 2021 at 9:44 PM 张铎(Duo Zhang)  wrote:
>
> > Please see the discussion in
> > https://issues.apache.org/jira/browse/HBASE-13126
> >
> > And final work is done in
> > https://issues.apache.org/jira/browse/HBASE-26081
> > https://github.com/apache/hbase/pull/3478
> >
> > The original HBaseTestingUtility has been renamed to HBaseTestingUtil, and
> > MiniHBaseCluster has been renamed to SingleProcessHBaseCluster. They are no
> > longer expected to be used by end users. We marked them as
> > IA.LimitedPrivate("Phoenix"), since the Phoenix project may still need
> > to test something internal to HBase.
> >
> > Anyway, we encourage every downstream project (including Phoenix) to try to
> > make use of the new TestingHBaseCluster introduced in
> > https://issues.apache.org/jira/browse/HBASE-26080
> >
> > We can keep improving it if the current API set is not enough.
> >
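> >  Brief notice, translated from the Chinese version (not a literal translation)
> >
> > HBaseTestingUtility has been marked as Deprecated in 3.0.0. All users are
> > encouraged to try the TestingHBaseCluster introduced in HBASE-26080.
> > Please send us feedback on any needs at any time; we will keep improving it.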
> >



-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [ANNOUNCE] New HBase committer Baiqiang Zhao

2021-07-10 Thread Andrew Purtell
Congratulations and welcome!

> On Jul 10, 2021, at 12:19 PM, Nick Dimiduk  wrote:
> 
> Hi everyone,
> 
> On behalf of the Apache HBase PMC I am pleased to announce that Baiqiang
> Zhao has accepted the PMC's invitation to become a committer on the project!
> 
> We appreciate all of the great contributions Baiqiang has made to
> the community thus far and we look forward to his continued involvement.
> 
> Allow me to be the first to congratulate Baiqiang on his new role!
> 
> Thanks,
> Nick


[ANNOUNCE] Apache HBase 2.4.4 is now available for download

2021-06-14 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.4.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.4 is the fourth patch release in the HBase 2.4.x line. The full
list of issues can be found in the included CHANGES and RELEASENOTES,
or via our issue tracker:

https://s.apache.org/hbase-2.4.4-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


Re: EOL branch-1 and all 1.x ?

2021-05-31 Thread Andrew Purtell
It would be good to do the performance work at least, if you are up for it. 
There are always going to be consequences for the kind of significant evolution 
that 2.x represents over 1.x. 

Regarding performance, a change always has positive and negative consequences. 
It is important to understand them both, informed by real world use cases. My 
guess is you have real world use cases, Reid. Your results will be meaningful. 

Synthetic benchmarks are less interesting unless the regression is obvious and 
more like a bug than a consequence. Sure they will report positive and negative 
changes, but does that actually mean anything? It depends. Sometimes it will 
only mean something if we care about supporting the synthetic benchmark as a 
first class use case. (Usually we don’t; but universal cross system bench tools 
like YCSB are exceptions.)


> On May 31, 2021, at 9:25 AM, Reid Chan  wrote:
> 
> Thanks to Andrew and Sean's help, I managed to release the first candidate
> of 1.7.0 (at least it is a beginning, and I am no longer a complete novice).
> BTW, the [VOTE]:
> <https://lists.apache.org/thread.html/r0b96b6596fc423e17ff648633e5ea76fd897d9afb8a03ae6e09cdb8f%40%3Cdev.hbase.apache.org%3E>
> 
> The following are my thoughts:
> I'm willing to continue branch-1's life as an RM.
> Before the EOL of branch-1, I need to announce the EOL of branch-1.4.
> While maintaining branch-1, I will also run some benchmarks between 1.7+
> and 2.4+ (the latest). If 2.4+ is better, cool. Otherwise, I'm willing to
> spend some time diving in.
> After the performance work is done, I need to review the upgrade path from
> 1.x to 2.x. I remember someone wrote it up, but HBASE-25902 already seems
> to reveal some problems.
> I will announce the EOL of branch-1 once the items listed above are done.
> 
> By my estimate this will probably take more than a year if I have to do it
> all alone. The most time-consuming parts should be the performance
> investigation (if one is needed) and the upgrade review.
> 
> Any thoughts are appreciated.
> 
> 
> ---
> Best regards,
> R.C
> 
> 
> 
> 
>> On Tue, Apr 20, 2021 at 12:13 AM Reid Chan  wrote:
>> 
>> 
>> FYI, a JDK issue when I was making the 1.7.0 release.
>> 
>> 
>> https://lists.apache.org/thread.html/r118b08134676d9234362a28898249186fe73a1fb08535d6eec6a91d3%40%3Cdev.hbase.apache.org%3E
>> 
>> 
>> ---
>> Best Regards,
>> R.C
>> 
>>> On Thu, Apr 1, 2021 at 6:03 AM Andrew Purtell  wrote:
>>> 
>>> Is it time to consider EOL of branch-1 and all 1.x releases ?
>>> 
>>> There doesn't seem to be much developer interest in branch-1 beyond
>>> occasional maintenance. This is understandable. Per our compatibility
>>> guidelines, branch-1 commits must be compatible with Java 7, and the range
>>> of acceptable versions of third party dependencies is also restricted due
>>> to Java 7 compatibility requirements. Most developers are writing code with
>>> Java 8+ idioms these days. For that reason and because the branch-1 code
>>> base is generally aged at this point, all but trivial (or lucky!) backports
>>> require substantial changes in order to integrate adequately. Let me also
>>> observe that branch-1 artifacts are not fully compatible with Java 11 or
>>> later. (The shell is a good example of such issues: The version of
>>> jruby-complete required by branch-1 is not compatible with Java 11 and
>>> upgrading to the version used by branch-2 causes shell commands to error
>>> out due to Ruby language changes.)
>>> 
>>> We can a priori determine there is insufficient motivation for production
>>> of release artifacts for the PMC to vote upon. Otherwise, someone would
>>> have done it. We had 12 releases from branch-2 derived code in 2019, 13
>>> releases from branch-2 derived code in 2020, and so far we have had 3
>>> releases from branch-2 derived code in 2021. In contrast, we had 8 releases
>>> from branch-1 derived code in 2019, 0 releases from branch-1 in 2020, and
>>> so far 0 releases from branch-1 in 2021.
>>> 
>>>          2021  2020  2019
>>>   1.x       0     2     8
>>>   2.x       3    13    12
>>> 
>>> If there is someone interested in continuing branch-1, now is the time to
>>> commit. However let me be clear that simply expressing an abstract desire
>>> to see continued branch-1 releases will not be that useful. It will be
>>> noted, but will not have much real world impact. Apache is a do-ocracy. In
>>> the absence of intrinsic motivation of project participants, which is what
>>> we seem to have here, you will need to do something: Fix the compatibility
>>> issues, if any bet

Re: [ANNOUNCE] New HBase Committer Xiaolin Ha(哈晓琳)

2021-05-15 Thread Andrew Purtell
Welcome, Xiaolin Ha.

On Sat, May 15, 2021 at 7:11 AM 张铎(Duo Zhang)  wrote:

> On behalf of the Apache HBase PMC, I am pleased to announce that Xiaolin
> Ha(sunhelly) has accepted the PMC's invitation to become a committer on the
> project. We appreciate all of Xiaolin's generous contributions thus far and
> look forward to her continued involvement.
>
> Congratulations and welcome, Xiaolin Ha!
>
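> On behalf of the Apache HBase PMC, I am pleased to announce that Xiaolin Ha
> has accepted our invitation to become a committer on the Apache HBase
> project. Thank you, Xiaolin Ha, for your contributions to the HBase project
> so far; we look forward to her taking on more responsibility in the future.
>
> Welcome, Xiaolin Ha!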
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [ANNOUNCE] New HBase PMC Huaxiang Sun

2021-04-13 Thread Andrew Purtell
Congratulations and welcome, Huaxiang!

On Tue, Apr 13, 2021 at 12:39 AM Viraj Jasani  wrote:

> On behalf of the Apache HBase PMC I am pleased to announce that Huaxiang
> Sun has accepted our invitation to become a PMC member on the HBase
> project. We appreciate Huaxiang stepping up to take more responsibility for
> the project.
>
> Congratulations and welcome, Huaxiang!
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [ANNOUNCE] New HBase committer Geoffrey Jacoby

2021-04-09 Thread Andrew Purtell
Congratulations and welcome, Geoffrey!


On Fri, Apr 9, 2021 at 4:24 AM Viraj Jasani  wrote:

> On behalf of the Apache HBase PMC I am pleased to announce that Geoffrey
> Jacoby has accepted the PMC's invitation to become a committer on the
> project.
>
> Thanks so much for the work you've been contributing. We look forward to
> your continued involvement.
>
> Congratulations and welcome, Geoffrey!
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: EOL branch-1 and all 1.x ?

2021-03-31 Thread Andrew Purtell
EOL of branch-1 doesn’t mean we take down the 1.6.0 release. It would be fine
to leave that in place. That can be a separate, future discussion, although if
branch-1 becomes EOL its eventual removal would be certain. The question is
really whether we plan to maintain branch-1 going forward. Based on the lack of
interest in and demand for releasing it, there does not seem to be a reason to.


> On Mar 31, 2021, at 7:51 PM, Reid Chan  wrote:
> 
> My only concern is about the performance, once in a while there'll be
> some emails like "2.x.y is slower than 1.x.y".
> 
> 
>> On Thu, Apr 1, 2021 at 6:03 AM Andrew Purtell  wrote:
>> 
>> Is it time to consider EOL of branch-1 and all 1.x releases ?
>> 
>> There doesn't seem to be much developer interest in branch-1 beyond
>> occasional maintenance. This is understandable. Per our compatibility
>> guidelines, branch-1 commits must be compatible with Java 7, and the range
>> of acceptable versions of third party dependencies is also restricted due
>> to Java 7 compatibility requirements. Most developers are writing code with
>> Java 8+ idioms these days. For that reason and because the branch-1 code
>> base is generally aged at this point, all but trivial (or lucky!) backports
>> require substantial changes in order to integrate adequately. Let me also
>> observe that branch-1 artifacts are not fully compatible with Java 11 or
>> later. (The shell is a good example of such issues: The version of
>> jruby-complete required by branch-1 is not compatible with Java 11 and
>> upgrading to the version used by branch-2 causes shell commands to error
>> out due to Ruby language changes.)
>> 
>> We can a priori determine there is insufficient motivation for production
>> of release artifacts for the PMC to vote upon. Otherwise, someone would
>> have done it. We had 12 releases from branch-2 derived code in 2019, 13
>> releases from branch-2 derived code in 2020, and so far we have had 3
>> releases from branch-2 derived code in 2021. In contrast, we had 8 releases
>> from branch-1 derived code in 2019, 0 releases from branch-1 in 2020, and
>> so far 0 releases from branch-1 in 2021.
>> 
>>          2021  2020  2019
>>   1.x       0     2     8
>>   2.x       3    13    12
>> 
>> If there is someone interested in continuing branch-1, now is the time to
>> commit. However let me be clear that simply expressing an abstract desire
>> to see continued branch-1 releases will not be that useful. It will be
>> noted, but will not have much real world impact. Apache is a do-ocracy. In
>> the absence of intrinsic motivation of project participants, which is what
>> we seem to have here, you will need to do something: Fix the compatibility
>> issues, if any, between the last release of 1.x and the current branch-1
>> head; fix any failing and flaky unit tests; produce release artifacts; and
>> submit those artifacts to the PMC for voting. Or, convince someone with
>> commit rights and/or PMC membership to undertake these actions on your
>> behalf.
>> 
>> Otherwise, I respectfully submit for your consideration, it is time to
>> declare  branch-1 and all 1.x code lines EOL, simply acknowledging what has
>> effectively already happened.
>> 
>> --
>> Best regards,
>> Andrew
>> 
>> Words like orphans lost among the crosstalk, meaning torn from truth's
>> decrepit hands
>>   - A23, Crosstalk
>> 


EOL branch-1 and all 1.x ?

2021-03-31 Thread Andrew Purtell
Is it time to consider EOL of branch-1 and all 1.x releases ?

There doesn't seem to be much developer interest in branch-1 beyond
occasional maintenance. This is understandable. Per our compatibility
guidelines, branch-1 commits must be compatible with Java 7, and the range
of acceptable versions of third party dependencies is also restricted due
to Java 7 compatibility requirements. Most developers are writing code with
Java 8+ idioms these days. For that reason and because the branch-1 code
base is generally aged at this point, all but trivial (or lucky!) backports
require substantial changes in order to integrate adequately. Let me also
observe that branch-1 artifacts are not fully compatible with Java 11 or
later. (The shell is a good example of such issues: The version of
jruby-complete required by branch-1 is not compatible with Java 11 and
upgrading to the version used by branch-2 causes shell commands to error
out due to Ruby language changes.)

We can a priori determine there is insufficient motivation for production
of release artifacts for the PMC to vote upon. Otherwise, someone would
have done it. We had 12 releases from branch-2 derived code in 2019, 13
releases from branch-2 derived code in 2020, and so far we have had 3
releases from branch-2 derived code in 2021. In contrast, we had 8 releases
from branch-1 derived code in 2019, 0 releases from branch-1 in 2020, and
so far 0 releases from branch-1 in 2021.

         2021  2020  2019
  1.x       0     2     8
  2.x       3    13    12

If there is someone interested in continuing branch-1, now is the time to
commit. However let me be clear that simply expressing an abstract desire
to see continued branch-1 releases will not be that useful. It will be
noted, but will not have much real world impact. Apache is a do-ocracy. In
the absence of intrinsic motivation of project participants, which is what
we seem to have here, you will need to do something: Fix the compatibility
issues, if any, between the last release of 1.x and the current branch-1
head; fix any failing and flaky unit tests; produce release artifacts; and
submit those artifacts to the PMC for voting. Or, convince someone with
commit rights and/or PMC membership to undertake these actions on your
behalf.

Otherwise, I respectfully submit for your consideration, it is time to
declare  branch-1 and all 1.x code lines EOL, simply acknowledging what has
effectively already happened.

-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [DISCUSS] Updating the 'stable' pointer to 2.4.2

2021-03-22 Thread Andrew Purtell
Cool, this is my point of view as well. I filed HBASE-25690
<https://issues.apache.org/jira/browse/HBASE-25690> for specifying and
documenting the criteria (whatever they are) for moving the 'stable' pointer.


On Mon, Mar 22, 2021 at 9:27 AM Stack  wrote:

> On Thu, Mar 18, 2021 at 1:07 PM Andrew Purtell wrote:
>
> > Would they do that before or after we designate it stable? Asking, not
> > trying to be difficult. Kind of a chicken and egg problem?
> >
> >
> Before the release was designated stable.
>
>
> > It would be fine, I think, to consider reported experience when and if it
> > happens, but it can't be the primary criterion because it has nothing
> > directly to do with our PMC or project. We need criteria that we as a
> > project and PMC can achieve and implement effectively, and IMHO "one of our
> > project devs has it running" does not meet that requirement, because this
> > depends on third party organizations (a dev's employer, and such) and
> > idiosyncratic criteria.
> >
> >
> That's fair.
>
> It would be better if we spec'd what a 'stable release' is and then ran
> candidates through the hoops.
>
> S
>
>
> >
> > > On Mar 18, 2021, at 12:47 PM, Stack  wrote:
> > >
> > > On Thu, Mar 18, 2021 at 11:55 AM Andrew Purtell <andrew.purt...@gmail.com>
> > > wrote:
> > >
> > >> And how would we know we have one? We don't track usage telemetry.
> > >>
> > >>
> > > Someone of us w/ standing volunteers that they have made the move (was what
> > > I was thinking).
> > > S
> > >
> > >
> > >
> > >
> > >>
> > >>>> On Mar 18, 2021, at 11:29 AM, Stack  wrote:
> > >>>
> > >>> On Wed, Mar 17, 2021 at 1:49 PM Andrew Purtell wrote:
> > >>>
> > >>>> I would like to propose we update the 'stable' release pointer,
> > >>>> currently pointing at 2.3.4, to 2.4.2.
> > >>>>
> > >>>> In my testing with aggressive chaos and ITBLL (but, unfortunately,
> > >>>> due to resource constraints, only in small cluster settings of
> > >>>> approximately 10 nodes) 2.4.2 is very stable.
> > >>>>
> > >>>> Our sister project Phoenix has updated their build system to support
> > >>>> building against 2.4.1 and later, and the stability of their unit and
> > >>>> integration test suite is not impacted by any known HBase issue.
> > >>>>
> > >>>> If there are other criteria that should be considered, I'd like for
> > >>>> us to discuss them. Does there need to be public acknowledgement of a
> > >>>> production user? At scale? (How would we know?) Would you like me to
> > >>>> attempt an at-scale test? On the order of 100 nodes might be possible?
> > >>>> If so, what should be the test scenario and criteria for success? What
> > >>>> distinguishes 2.3.x (2.3.4) from 2.4.x (2.4.2) at this point? What
> > >>>> would be the area(s) of concern with respect to moving the stable
> > >>>> pointer forward?
> > >>>>
> > >>>>
> > >>> I suggest a happy production deploy as a prerequisite to moving the
> > >>> pointer.
> > >>> S
> > >>>
> > >>>
> > >>>
> > >>>> --
> > >>>> Best regards,
> > >>>> Andrew
> > >>>>
> > >>>> Words like orphans lost among the crosstalk, meaning torn from truth's
> > >>>> decrepit hands
> > >>>>  - A23, Crosstalk
> > >>>>
> > >>
> >
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [DISCUSS] Updating the 'stable' pointer to 2.4.2

2021-03-18 Thread Andrew Purtell
Would they do that before or after we designate it stable? Asking, not trying 
to be difficult. Kind of a chicken and egg problem?

It would be fine, I think, to consider reported experience when and if it
happens, but it can't be the primary criterion because it has nothing directly
to do with our PMC or project. We need criteria that we as a project and PMC
can achieve and implement effectively, and IMHO "one of our project devs has it
running" does not meet that requirement, because this depends on third party
organizations (a dev's employer, and such) and idiosyncratic criteria.


> On Mar 18, 2021, at 12:47 PM, Stack  wrote:
> 
> On Thu, Mar 18, 2021 at 11:55 AM Andrew Purtell wrote:
> 
>> And how would we know we have one? We don't track usage telemetry.
>> 
>> 
> Someone of us w/ standing volunteers that they have made the move (was what
> I was thinking).
> S
> 
> 
> 
> 
>> 
>>>> On Mar 18, 2021, at 11:29 AM, Stack  wrote:
>>> 
>>> On Wed, Mar 17, 2021 at 1:49 PM Andrew Purtell wrote:
>>> 
>>>> I would like to propose we update the 'stable' release pointer, currently
>>>> pointing at 2.3.4, to 2.4.2.
>>>> 
>>>> In my testing with aggressive chaos and ITBLL (but, unfortunately, due
>>>> to resource constraints, only in small cluster settings of approximately
>>>> 10 nodes) 2.4.2 is very stable.
>>>> 
>>>> Our sister project Phoenix has updated their build system to support
>>>> building against 2.4.1 and later, and the stability of their unit and
>>>> integration test suite is not impacted by any known HBase issue.
>>>> 
>>>> If there are other criteria that should be considered, I'd like for us to
>>>> discuss them. Does there need to be public acknowledgement of a production
>>>> user? At scale? (How would we know?) Would you like me to attempt an
>>>> at-scale test? On the order of 100 nodes might be possible? If so, what
>>>> should be the test scenario and criteria for success? What distinguishes
>>>> 2.3.x (2.3.4) from 2.4.x (2.4.2) at this point? What would be the area(s)
>>>> of concern with respect to moving the stable pointer forward?
>>>> 
>>>> 
>>> I suggest a happy production deploy as a prerequisite to moving the pointer.
>>> S
>>> 
>>> 
>>> 
>>>> --
>>>> Best regards,
>>>> Andrew
>>>> 
>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>>> decrepit hands
>>>>  - A23, Crosstalk
>>>> 
>> 


Re: [DISCUSS] Updating the 'stable' pointer to 2.4.2

2021-03-18 Thread Andrew Purtell
And how would we know we have one? We don't track usage telemetry.


> On Mar 18, 2021, at 11:29 AM, Stack  wrote:
> 
> On Wed, Mar 17, 2021 at 1:49 PM Andrew Purtell  wrote:
> 
>> I would like to propose we update the 'stable' release pointer, currently
>> pointing at 2.3.4, to 2.4.2.
>> 
>> In my testing with aggressive chaos and ITBLL (but, unfortunately, due
>> to resource constraints, only in small cluster settings of approximately
>> 10 nodes) 2.4.2 is very stable.
>> 
>> Our sister project Phoenix has updated their build system to support
>> building against 2.4.1 and later, and the stability of their unit and
>> integration test suite is not impacted by any known HBase issue.
>> 
>> If there are other criteria that should be considered, I'd like for us to
>> discuss them. Does there need to be public acknowledgement of a production
>> user? At scale? (How would we know?) Would you like me to attempt an
>> at-scale test? On the order of 100 nodes might be possible? If so, what
>> should be the test scenario and criteria for success? What distinguishes
>> 2.3.x (2.3.4) from 2.4.x (2.4.2) at this point? What would be the area(s)
>> of concern with respect to moving the stable pointer forward?
>> 
>> 
> I suggest a happy production deploy as a prerequisite to moving the pointer.
> S
> 
> 
> 
>> --
>> Best regards,
>> Andrew
>> 
>> Words like orphans lost among the crosstalk, meaning torn from truth's
>> decrepit hands
>>   - A23, Crosstalk
>> 


[DISCUSS] Updating the 'stable' pointer to 2.4.2

2021-03-17 Thread Andrew Purtell
I would like to propose we update the 'stable' release pointer, currently
pointing at 2.3.4, to 2.4.2.

In my testing with aggressive chaos and ITBLL (but, unfortunately, due
to resource constraints, only in small cluster settings of approximately
10 nodes) 2.4.2 is very stable.

Our sister project Phoenix has updated their build system to support
building against 2.4.1 and later, and the stability of their unit and
integration test suite is not impacted by any known HBase issue.

If there are other criteria that should be considered, I'd like for us to
discuss them. Does there need to be public acknowledgement of a production
user? At scale? (How would we know?) Would you like me to attempt an
at-scale test? On the order of 100 nodes might be possible? If so, what
should be the test scenario and criteria for success? What distinguishes
2.3.x (2.3.4) from 2.4.x (2.4.2) at this point? What would be the area(s)
of concern with respect to moving the stable pointer forward?

-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


[ANNOUNCE] Apache HBase 2.4.2 is now available for download

2021-03-17 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.2.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.2 is the second patch release in the HBase 2.4.x line, which aims
to improve the stability and reliability of the 2.4 release. The full list
of issues can be found in the included CHANGES.md and RELEASENOTES.md,
or via our issue tracker:

https://s.apache.org/hbase-2.4.2-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[ANNOUNCE] Apache HBase 2.4.1 is now available for download

2021-01-26 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.1.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.1 is the first patch release in the HBase 2.4.x line, which aims
to improve the stability and reliability of the 2.4 release. The full list
of issues can be found in the included CHANGES.md and RELEASENOTES.md,
or via our issue tracker:

https://s.apache.org/hbase-2.4.1-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


[ANNOUNCE] Apache HBase 2.4.0 is now available for download

2020-12-16 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of HBase
2.4.0.

Apache HBase™ is an open-source, distributed, versioned, non-relational
database.

Apache HBase gives you low latency random access to billions of rows with
millions of columns atop non-specialized hardware. To learn more about
HBase, see https://hbase.apache.org/.

HBase 2.4.0 is the fifth minor release in the HBase 2.x line, which aims
to improve the stability and reliability of HBase. The full list of issues
can be found in the included CHANGES.md and RELEASENOTES.md,
or via our issue tracker:

https://s.apache.org/hbase-2.4.0-jira

To download please follow the links and instructions on our website:

https://hbase.apache.org/downloads.html

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org.

Thanks to all who contributed and made this release possible.

Cheers,
The HBase Dev Team


Re: [ANNOUNCE] New HBase committer Yulin Niu

2020-12-04 Thread Andrew Purtell
Congratulations and welcome!

On Thu, Dec 3, 2020 at 1:11 AM Guanghao Zhang  wrote:

> Folks,
>
> On behalf of the Apache HBase PMC I am pleased to announce that Yulin Niu
> has accepted the PMC's invitation to become a committer on the project.
>
> We appreciate all of the great contributions Yulin has made to the
> community thus far and we look forward to his continued involvement.
>
> Allow me to be the first to congratulate Yulin on his new role!
>
> Thanks.
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


[ANNOUNCE] Please welcome Viraj Jasani to the Apache HBase PMC

2020-10-05 Thread Andrew Purtell
On behalf of the Apache HBase PMC I am pleased to announce that
Viraj Jasani has accepted our invitation to become a PMC member on the
HBase project. We appreciate Viraj stepping up to take more
responsibility for the project.

Please join me in welcoming Viraj to the HBase PMC!


As a reminder, if anyone would like to nominate another person as a
committer or PMC member, even if you are not currently a committer or
PMC member, you can always drop a note to priv...@hbase.apache.org
to let us know.

-- 
Best regards,
Andrew


Re: [ANNOUNCE] New HBase Committer Zheng Wang(王正)

2020-09-27 Thread Andrew Purtell
Congratulations and welcome!

> On Sep 23, 2020, at 7:25 PM, 张铎(Duo Zhang)  wrote:
> 
> On behalf of the Apache HBase PMC, I am pleased to announce that Zheng Wang
> has accepted the PMC's invitation to become a committer on the project. We
> appreciate all of Zheng's generous contributions thus far and look forward
> to his continued involvement.
> 
> Congratulations and welcome, Zheng Wang!
> 
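> On behalf of the Apache HBase PMC, I am pleased to announce that Zheng Wang
> has accepted our invitation to become a committer on the Apache HBase
> project. Thank you, Zheng Wang, for your contributions to the HBase project
> so far; we look forward to his taking on more responsibility in the future.
> 
> Welcome, Zheng Wang!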


Re: [VOTE] The third HBase 2.2.6 release candidate (RC2) is available

2020-09-10 Thread Andrew Purtell
[resending to dev@, sorry]

+1 (binding)

* Signature: ok
* Checksum : ok
* Rat check (1.8.0_272): ok
 - mvn clean apache-rat:check
* Built from source (1.8.0_272): ok
 - mvn clean install -DskipTests
* Unit tests pass (1.8.0_272): ok
 - mvn package -P runAllTests
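
For anyone repeating the "Checksum: ok" step by hand, here is a minimal Python sketch. It assumes the release publishes a SHA-512 digest next to each artifact and that the digest is passed in as a bare hex string. The helper names `sha512_hex` and `checksum_matches` are made up for this illustration, and parsing of the published digest file is left out.

```python
import hashlib


def sha512_hex(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large release tarballs are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare the computed digest with a published one.

    Whitespace and letter case in `expected_hex` are ignored, since published
    digests are sometimes grouped or upper-cased (an assumption, not a rule).
    """
    normalized = "".join(expected_hex.split()).lower()
    return sha512_hex(path) == normalized
```

A call would look like `checksum_matches(artifact_path, published_hex)`. A `False` result should be treated as a failed check on the candidate, not something to retry around.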

On Thu, Sep 3, 2020 at 8:32 PM Guanghao Zhang  wrote:
>
>> Please vote on this release candidate (RC2) for Apache HBase 2.2.6.
>>
>> The VOTE will remain open for at least 72 hours.
>>
>> [ ] +1 Release this package as Apache HBase 2.2.6
>> [ ] -1 Do not release this package because ...
>>
>> The tag to be voted on is 2.2.6RC2. The release files, including
>> signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC2/
>>
>> Maven artifacts are available in a staging repository at:
>> https://repository.apache.org/content/repositories/orgapachehbase-1407
>>
>> Signatures used for HBase RCs can be found in this file:
>> https://dist.apache.org/repos/dist/release/hbase/KEYS
>>
>> The list of bug fixes going into 2.2.6 can be found in included
>> CHANGES.md and RELEASENOTES.md available here:
>> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC2/CHANGES.md
>> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC2/RELEASENOTES.md
>>
>> A detailed source and binary compatibility report for this release is
>> available at:
>>
>> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC2/api_compare_2.2.6RC2_to_2.2.5.html
>>
>> To learn more about Apache HBase, please see http://hbase.apache.org/
>>
>> Thanks,
>> Guanghao Zhang
>>
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>- A23, Crosstalk
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [VOTE] The third HBase 2.2.6 release candidate (RC2) is available

2020-09-10 Thread Andrew Purtell
+1 (binding)

* Signature: ok
* Checksum : ok
* Rat check (1.8.0_272): ok
 - mvn clean apache-rat:check
* Built from source (1.8.0_272): ok
 - mvn clean install -DskipTests
* Unit tests pass (1.8.0_272): ok
 - mvn package -P runAllTests


On Thu, Sep 3, 2020 at 8:32 PM Guanghao Zhang  wrote:

> Please vote on this release candidate (RC2) for Apache HBase 2.2.6.
>
> The VOTE will remain open for at least 72 hours.
>
> [ ] +1 Release this package as Apache HBase 2.2.6
> [ ] -1 Do not release this package because ...
>
> The tag to be voted on is 2.2.6RC2. The release files, including
> signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC2/
>
> Maven artifacts are available in a staging repository at:
> https://repository.apache.org/content/repositories/orgapachehbase-1407
>
> Signatures used for HBase RCs can be found in this file:
> https://dist.apache.org/repos/dist/release/hbase/KEYS
>
> The list of bug fixes going into 2.2.6 can be found in included
> CHANGES.md and RELEASENOTES.md available here:
> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC2/CHANGES.md
> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC2/RELEASENOTES.md
>
> A detailed source and binary compatibility report for this release is
> available at:
>
> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC2/api_compare_2.2.6RC2_to_2.2.5.html
>
> To learn more about Apache HBase, please see http://hbase.apache.org/
>
> Thanks,
> Guanghao Zhang
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [VOTE] The second HBase 2.2.6 release candidate (RC1) is available

2020-08-27 Thread Andrew Purtell
+1 (binding)

* Signature: ok
* Checksum : ok
* Rat check (1.8.0_262): ok
 - mvn clean apache-rat:check
* Built from source (1.8.0_262): ok
 - mvn clean install -DskipTests
* Unit tests pass (1.8.0_262): ok
 - mvn package -P runAllTests


On Wed, Aug 26, 2020 at 4:04 AM Guanghao Zhang  wrote:

> Please vote on this release candidate (RC) for Apache HBase 2.2.6.
>
> The VOTE will remain open for at least 72 hours.
>
> [ ] +1 Release this package as Apache HBase 2.2.6
> [ ] -1 Do not release this package because ...
>
> The tag to be voted on is 2.2.6RC1. The release files, including
> signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/
>
> Maven artifacts are available in a staging repository at:
> https://repository.apache.org/content/repositories/orgapachehbase-1406/
>
> Signatures used for HBase RCs can be found in this file:
> https://dist.apache.org/repos/dist/release/hbase/KEYS
>
> The list of bug fixes going into 2.2.6 can be found in included
> CHANGES.md and RELEASENOTES.md available here:
> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/CHANGES.md
> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/RELEASENOTES.md
>
> A detailed source and binary compatibility report for this release is
> available at:
>
> https://dist.apache.org/repos/dist/dev/hbase/2.2.6RC1/api_compare_2.2.6RC1_to_2.2.5.html
>
> To learn more about Apache HBase, please see http://hbase.apache.org/
>
> Thanks,
> Guanghao Zhang
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: [DISCUSS] Removing problematic terms from our project

2020-06-26 Thread Andrew Purtell
Circling back after more inputs, if we use this as a description of the
proposals:

1. Replace "master"/"hmaster" with ???. This one has by far the most
significant impact, and both opinion and interpretation on it are mixed.

2. Replace "slave" with "follower", seems to impact the cross cluster
replication subsystem only.

3. Replace "black list" with "deny list".

4. Replace "white list" with "accept list".

Then by my read of the responses we have consensus to do #2, #3, and #4.
They were not controversial. JIRAs and patches will be welcome. Seems
pretty clear committers and PMC will approve and do what is needed to
complete any necessary deprecation cycle.

Regarding #1, opinion is mixed. By my read I also think committers and PMC
will approve patches and do what is needed to complete any necessary
deprecation cycle for this one too. Enough PMC members expressed support to
successfully vote on a release (although not if there were to be opposing
votes). If a contributor were to open a JIRA and provide patches for this,
there would be more discussion. There is no consensus, yet, on what
replacement term is best. Personally, I can accept Zheng's recent
suggestion of "controller". I can see how syllable count matters.

I don't mean this summary to close the conversation. It is only a
checkpoint.

If anyone reading this has an opinion they do not wish to express
publicly, you are welcome to write to priv...@hbase.apache.org to state
your opinion and the PMC will of course respectfully listen to it.



On Thu, Jun 25, 2020 at 7:47 PM zheng wang <18031...@qq.com> wrote:

> I like the controller.
>
>
> Coordinator is a bit long for me to write and speak.
> Manager and Admin are already used elsewhere in HBase.
>
>
>
>
> -- Original Message --
> From: "Andrew Purtell"
> Sent: Friday, June 26, 2020, 9:08 AM
> To: "Hbase-User"
> Cc: "dev"
> Subject: Re: [DISCUSS] Removing problematic terms from our project
>
>
>
> > - AdminServer (as you already have AdminClient to talk to it).
>
> Oh... I like AdminServer. AdminServer (serving admin functions) and
> RegionServer (serving region data).
>
> On Thu, Jun 25, 2020 at 4:46 PM Andrey Elenskiy wrote:
>
> > > Is there a word that's not "master" and not "coordinator" that is
> > > clear and suitable for (diverse, polyglot) community?
> >
> > There are also:
> > - captain (sounds pretty close to "master" without the negative side
> >   and it should be relatable around the world)
> > - conductor (as in orchestra)
> > - controller (in kafka controller assigns partitions)
> > - RegionDriver (more relevant to what it's actually doing in hbase and
> >   borrowed from PlacementDriver of TiKV)
> > - AdminServer (as you already have AdminClient to talk to it).
> >
> > On Thu, Jun 25, 2020 at 3:49 PM Sean Busbey wrote:
> >
> > > How about "manager"?
> > >
> > > (It would help me if folks could explain what is lacking in
> > > "coordinator".)
> > >
> > > On Thu, Jun 25, 2020, 13:32 Nick Dimiduk wrote:
> > >
> > > > On Wed, Jun 24, 2020 at 10:14 PM 张铎(Duo Zhang)
> > > > <palomino...@gmail.com> wrote:
> > > >
> > > > > -0/+1/+1/+1
> > > > >
> > > > > I’m the one who asked whether ‘master’ is safe to use without
> > > > > ‘slave’ in the private list.
> > > > >
> > > > > I’m still not convinced that it is really necessary and I do not
> > > > > think other words like ‘coordinator’ can fully describe the role
> > > > > of HMaster in HBase. HBase is more than 10 years old. In the
> > > > > context of HBase, the word ‘HMaster’ has its own meaning.
> > > > > Changing the name will hurt our users and make them confused,
> > > > > especially for us non native English speakers...
> > > >
> > > > Is there a word that's not "master" and not "coordinator" that is
> > > > clear and suitable for (diverse, polyglot) community?
>   
>Stack 
> > +1/+1/+1/+1 where hbase3 adds the deprecation and hbase4 follows hbase3
> > soon after sounds good to me. I'm up for working on this.
> > S
> >
> > On Wed, Jun 24, 2020 at 2:26 PM Xu Cang <xuc...@apache.org> wrote:
> >
> > > Strongly agree with what Nick said here:
> > >
> > > "From my perspective, we gain nothing as a project or as a community
> > > by willfully retaining use of language that is well understood to be
> > > problematic or hurtful. On the contrary, we have much to gain by
> > > encouraging contributions from as many people as possible."
> > >
> > > +1 to Andrew's proposal.
> >
>

Re: [DISCUSS] Removing problematic terms from our project

2020-06-25 Thread Andrew Purtell
ould have daemons called “coordinator” and “region
> > > server”.
> > > > > > >
> > > > > > > To me, “master” as in “master branch” does not carry the same
> > > > > > > baggage, but I’m also in favor changing the name of our
> > > > > > > default branch to a word that is less conflicted. I see
> > > > > > > nothing that we gain as a community by continuing to use this
> > > > > > > word.
> > > > > > > > >
> > > > > > > > > It seems to me we have, broadly speaking, consensus around
> > > > > > > > > making *some* changes. I haven't seen a strong push for
> > > > > > > > > "break everything in the name of expediency" (I would
> > > > > > > > > personally be fine with this). So barring additional
> > > > > > > > > discussion that favors breaking changes, current
> > > > > > > > > approaches should comport with our existing project
> > > > > > > > > compatibility goals.
> > > > > > > > >
> > > > > > > > > Maybe we could stop talking about what-ifs and look at
> > > > > > > > > actual practical examples? If anyone is currently up for
> > > > > > > > > doing the work of a PR we can look at for one of these?
> > > > > > > > >
> > > > > > > > > If folks would prefer we e.g. just say "we should break
> > > > > > > > > whatever we need to in 3.0.0 to make this happen" then it
> > > > > > > > > would be good to speak up. Otherwise likely we would be
> > > > > > > > > done with needed changes circa hbase 4, probably late
> > > > > > > > > 2021 or 2022.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Jun 23, 2020, 03:03 zheng wang <18031...@qq.com> wrote:
> > > > > > > >
> > > > > > > > > IMO, master is ok if not used with slave together.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > -1/+1/+1/+1
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > -- Original Message --
> > > > > > > > > > From: "Andrew Purtell"
> > > > > > > > > > Sent: Tuesday, June 23, 2020, 5:24 AM
> > > > > > > > > > To: "Hbase-User"
> > > > > > > > > > Cc: "dev"
> > > > > > > > > > Subject: Re: [DISCUSS] Removing problematic terms from our project
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > In observing something like voting happening on this
> > > > > > > > > > thread to express alignment or not, it might be helpful
> > > > > > > > > > to first, come up with a list of terms to change (if
> > > > > > > > > > any), and then propose replacements, individually. So
> > > > > > > > > > far we might break this apart into four proposals:
> > > > > > > > > >
> > > > > > > > > > 1. Replace "master"/"hmaster" with ??? ("coordinator" is
> > > > > > > > > > one option), this one has by far the most significant
> > > > > > > > > > impact and both opinion and interpretation on this one
> > > > > > > > > > is mixed.
> > > > > > > > > >
> > > > > > > > > > 2. Replace "slave" with "follower", seems to impact the
> > > > > > > > > > cross

Re: HBase 2 slower than HBase 1?

2020-06-25 Thread Andrew Purtell
BlockEncoding=FAST_DIFF
Compress=NONE
Filter=ALL

On Thu, Jun 11, 2020 at 5:08 PM Andrew Purtell  wrote:

> I used PE to generate 10M row tables with one family with either 1, 10,
> 20, 50, or 100 values per row (unique column-qualifiers). An increase in
> wall clock time was noticeable, for example:
>
> 1.6.0
>
> time ./bin/hbase pe --rows=500 --table=TestTable_f1_c20 --columns=20
> --nomapred scan 2
> real 1m20.179s
> user 0m45.470s
> sys 0m16.572s
>
> 2.2.5
>
> time ./bin/hbase pe --rows=500 --table=TestTable_f1_c20 --columns=20
> --nomapred scan 2
> real 1m31.016s
> user 0m48.658s
> sys 0m16.965s
>
> It didn't really make a difference if I used 1 thread or 4 or 10, the delta
> was about the same, proportionally. I picked two threads in the end so I'd
> have enough time to launch async-profiler twice in another shell, to
> capture flamegraph and call tree output, respectively. async-profiler
> captured 10 seconds at steady state per test case. Upon first inspection
> what jumps out is an increasing proportion of CPU time spent in GC in 2.2.5
> vs 1.6.0. The difference increases as the number of columns increases.
> There is little apparent difference at 1 column, but a 2x or more
> difference at 20 columns, and a 10x or more difference at 100 columns,
> eyeballing the charts, flipping back and forth between browser windows.
> This seems more than coincidental but obviously calls for capture and
> analysis of GC trace, with JFR. Will do that next.
>
> JVM: openjdk version "1.8.0_232" OpenJDK Runtime Environment (Zulu
> 8.42.0.21-CA-macosx) (build 1.8.0_232-b18) OpenJDK 64-Bit Server VM (Zulu
> 8.42.0.21-CA-macosx) (build 25.232-b18, mixed mode)
>
> Regionserver JVM flags: -Xms10g -Xmx10g -XX:+UseG1GC -XX:+AlwaysPreTouch
> -XX:+UseNUMA -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>
>
> On Thu, Jun 11, 2020 at 7:06 AM Jan Van Besien  wrote:
>
>> This is promising, thanks a lot. Testing with hbase 2.2.5 shows an
>> improvement, but we're not there yet.
>>
>> As reported earlier, hbase 2.1.0 was about 60% slower than hbase 1.2.0
>> in a test that simply scans all the regions in parallel without any
>> filter. A test with hbase 2.2.5 shows it to be about 40% slower than
>> 1.2.0. So that is better than 2.1.0, but still substantially slower
>> than what hbase 1.2.0 was.
>>
>> As before, I tested this both on a 3 node cluster as well as with a
>> unit test using HBaseTestingUtility. Both tests show very similar
>> relative differences.
>>
>> Jan
>>
>> On Thu, Jun 11, 2020 at 2:16 PM Anoop John  wrote:
>> >
>> > In another mail thread Zheng Hu brought up an important Jira fix
>> > https://issues.apache.org/jira/browse/HBASE-21657
>> > Can u pls check with this once?
>> >
>> > Anoop
>> >
>> >
>> > On Tue, Jun 9, 2020 at 8:08 PM Jan Van Besien  wrote:
>> >
>> > > On Sun, Jun 7, 2020 at 7:49 AM Anoop John wrote:
>> > > > As per the above configs, it looks like Bucket Cache is not being
>> > > > used. Only on heap LRU cache in use.
>> > >
>> > > True (but it is large enough to hold everything, so I don't think it
>> > > matters).
>> > >
>> > > > @Jan - Is it possible for you to test with off heap Bucket Cache?
>> > > > Config bucket cache off heap mode with size ~7.5 GB
>> > >
>> > > I did a quick test but it seems not to make a measurable difference.
>> > > If anything, it is actually slightly slower even. I see 100% hit ratio
>> > > in the L1
>> > > LruBlockCache and effectively also 100% in the L2 BucketCache (hit
>> > > ratio is not yet at 100% but hits increase with every test and misses
>> > > do not).
>> > >
>> > > Given that the LruBlockCache was already large enough to cache all the
>> > > data anyway, I did not expect this to help either, to be honest.
>> > >
>> > > > Do you have any DataBlockEncoding enabled on the CF?
>> > >
>> > > Yes, FAST_DIFF. But this is of course true in both the tests with
>> > > hbase2 and hbase1.
>> > >
>> > > Jan
>> > >
>>
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>- A23, Crosstalk
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
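
As a rough check on the `real` times quoted above for the 20-column scan (1m20.179s on 1.6.0, 1m31.016s on 2.2.5), the sketch below converts both to seconds and computes the relative increase. The helper names are illustrative only, and this is a single pair of runs, so the result is indicative at best.

```python
def parse_wall_clock(text):
    """Convert a `time` value such as '1m20.179s' into seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def percent_slower(baseline, candidate):
    """Relative wall-clock increase of `candidate` over `baseline`, in percent."""
    base = parse_wall_clock(baseline)
    return (parse_wall_clock(candidate) - base) / base * 100.0


# 'real' times quoted for the scan test: 1.6.0 first, then 2.2.5.
print(f"{percent_slower('1m20.179s', '1m31.016s'):.1f}% slower")
```

For these two runs that works out to roughly 13.5% more wall-clock time on 2.2.5, which is the delta the GC analysis would need to account for.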


Re: [DISCUSS] Removing problematic terms from our project

2020-06-23 Thread Andrew Purtell
If we were going to do this such that replacements for "master" land in
HBase 4, then we should commit to doing it by 2021. That gives us
approximately six months. Contributors who feel this is a priority can do
the work on trunk and everyone else can continue as they like, with the
understanding they may have higher than average work at each rebase. There
would be a stabilization effort on trunk (yeah...) to prepare for an HBase
3 release with all necessary deprecations in place. That would be the bulk
of the time. Immediately after cutting HBase 3 we could make the
substitutions and release HBase 4 with no other changes. If we do it this
way the release of HBase 4 would follow 3 by at most a few weeks. There is
no reason to think we need much of 2021 for this, never mind 2022.



On Tue, Jun 23, 2020 at 1:11 PM Sean Busbey  wrote:

> I would like to make sure I am emphatically clear that "master" by itself
> is not okay if the context is the same as what would normally be a
> master/slave context. Furthermore our use of master is clearly such a
> context.
>
> It seems to me we have, broadly speaking, consensus around making *some*
> changes. I haven't seen a strong push for "break everything in the name of
> expediency" (I would personally be fine with this). So barring additional
> discussion that favors breaking changes, current approaches should comport
> with our existing project compatibility goals.
>
> Maybe we could stop talking about what-ifs and look at actual practical
> examples? If anyone is currently up for doing the work of a PR we can look
> at for one of these?
>
> If folks would prefer we e.g. just say "we should break whatever we need to
> in 3.0.0 to make this happen" then it would be good to speak up. Otherwise
> likely we would be done with needed changes circa hbase 4, probably late
> 2021 or 2022.
>
>
> On Tue, Jun 23, 2020, 03:03 zheng wang <18031...@qq.com> wrote:
>
> > IMO, master is ok if not used with slave together.
> >
> >
> > -1/+1/+1/+1
> >
> >
> > -- Original Message --
> > From: "Andrew Purtell"
> > Sent: Tuesday, June 23, 2020, 5:24 AM
> > To: "Hbase-User"
> > Cc: "dev"
> > Subject: Re: [DISCUSS] Removing problematic terms from our project
> >
> >
> >
> > In observing something like voting happening on this thread to express
> > alignment or not, it might be helpful to first, come up with a list of
> > terms to change (if any), and then propose replacements, individually. So
> > far we might break this apart into four proposals:
> >
> > 1. Replace "master"/"hmaster" with ??? ("coordinator" is one option),
> > this one has by far the most significant impact and both opinion and
> > interpretation on this one is mixed.
> >
> > 2. Replace "slave" with "follower", seems to impact the cross cluster
> > replication subsystem only.
> >
> > 3. Replace "black list" with "deny list".
> >
> > 4. Replace "white list" with "accept list".
> >
> > Perhaps if you are inclined to respond with a +1/-1/+0/-0, it would be
> > useful to give such an indication for each line item above. Or, offer
> > alternative proposals. Or, if you have a singular opinion, that's fine
> > too.
> >
> >
> >
> > On Mon, Jun 22, 2020 at 2:09 PM Geoffrey Jacoby wrote:
> >
> > > For most of the proposals (slave -> worker, blacklist -> denylist,
> > > whitelist -> allowlist), I'm +1 (nonbinding). Denylist and acceptlist
> > > even have the advantage of being clearer than the terms they're
> > > replacing.
> > >
> > > However, I'm not convinced about changing "master" to "coordinator",
> > > or something similar. Unlike "slave", which is negative in any
> > > context, "master" has many definitions, including some common ones
> > > which do not appear problematic. See
> > > https://www.merriam-webster.com/dictionary/master for examples. In
> > > particular, the progression of an artisan was from "apprentice" to
> > > "journeyman" to "master". A master smith, carpenter, or artist would
> > > run a shop managing lots of workers and apprentices who would hope to
> > > become masters of their own someday. So "master" and "worker" can
> > > still go together.
> > >
> > > Since it's the least problematic term, and by far the hardest term to
> > > cha

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Andrew Purtell
Regarding "slave", it is a stretch to point to an esoteric technical
definition and ask someone to pretend like all the other pejorative
meanings relating to power relationship are somehow not meaningful. If we
were to be accused of "turning a blind eye", that charge would stick, in my
opinion.



On Mon, Jun 22, 2020 at 2:23 PM Mich Talebzadeh 
wrote:

> Let us look at what *slave* means
>
> According to the merriam-webster
>
> https://www.merriam-webster.com/dictionary/slave
>
> Definition of *slave*
>
>  (Entry 1 of 4)
> 1: a person held in servitude as the chattel of another
> 2: one that is completely subservient to a dominating influence
> 3: a device (such as the printer of a computer) that is directly responsive
> to another
> 4: DRUDGE <https://www.merriam-webster.com/dictionary/drudge>, TOILER
> <https://www.merriam-webster.com/dictionary/toiler>
> so in the context of Hbase, number *3* is valid. In other words, a
> component which is directly responsive to another, another being *master*.
>
>
> <https://www.merriam-webster.com/dictionary/slave>
>
>
>
> LinkedIn *
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> >*
>
>
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Mon, 22 Jun 2020 at 22:09, Geoffrey Jacoby  wrote:
>
> > For most of the proposals (slave -> worker, blacklist -> denylist,
> > whitelist-> allowlist), I'm +1 (nonbinding). Denylist and acceptlist even
> > have the advantage of being clearer than the terms they're replacing.
> >
> > However, I'm not convinced about changing "master" to "coordinator", or
> > something similar. Unlike "slave", which is negative in any context,
> > "master" has many definitions, including some common ones which do not
> > appear problematic. See https://www.merriam-webster.com/dictionary/master
> > for examples. In particular, the progression of an artisan was from
> > "apprentice" to "journeyman" to "master". A master smith, carpenter, or
> > artist would run a shop managing lots of workers and apprentices who
> > would hope to become masters of their own someday. So "master" and
> > "worker" can still go together.
> >
> > Since it's the least problematic term, and by far the hardest term to
> > change (both within HBase and with effects on downstream projects such as
> > Ambari), I'm -0 (nonbinding) on changing "master".
> >
> > Geoffrey
> >
> > On Mon, Jun 22, 2020 at 1:32 PM Rushabh Shah
> >  wrote:
> >
> > > +1 to renaming.
> > >
> > >
> > > Rushabh Shah
> > >
> > >- Software Engineering SMTS | Salesforce
> > >-
> > >   - Mobile: 213 422 9052
> > >
> > >
> > >
> > > On Mon, Jun 22, 2020 at 1:18 PM Josh Elser  wrote:
> > >
> > > > +1
> > > >
> > > > On 6/22/20 4:03 PM, Sean Busbey wrote:
> > > > > > We should change our use of these terms. We can be equally or more
> > > > > > clear in what we are trying to convey where they are present.
> > > > > >
> > > > > > That they have been used historically is only useful if the
> > > > > > advantage we gain from using them through that shared context
> > > > > > outweighs the potential friction they add. They make me personally
> > > > > > less enthusiastic about contributing. That's enough friction for
> > > > > > me to advocate removing them.
> > > > > >
> > > > > > AFAICT reworking our replication stuff in terms of "active" and
> > > > > > "passive" clusters did not result in a big spike of folks asking
> > > > > > new questions about where authority for state was.
> > > > > >
> > > > > > On Mon, Jun 22, 2020, 13:39 Andrew Purtell wrote:
> > > > > >
> > > > > >> In response to renewed attention at the Foundation toward
> > > > > >> addressing

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Andrew Purtell
In observing something like voting happening on this thread to express
alignment or not, it might be helpful to first, come up with a list of
terms to change (if any), and then propose replacements, individually. So
far we might break this apart into four proposals:

1. Replace "master"/"hmaster" with ??? ("coordinator" is one option). This
one has by far the most significant impact, and both opinion and
interpretation on it are mixed.

2. Replace "slave" with "follower", seems to impact the cross cluster
replication subsystem only.

3. Replace "black list" with "deny list".

4. Replace "white list" with "accept list".

Perhaps if you are inclined to respond with a +1/-1/+0/-0, it would be
useful to give such an indication for each line item above. Or, offer
alternative proposals. Or, if you have a singular opinion, that's fine too.



On Mon, Jun 22, 2020 at 2:09 PM Geoffrey Jacoby  wrote:

> For most of the proposals (slave -> worker, blacklist -> denylist,
> whitelist-> allowlist), I'm +1 (nonbinding). Denylist and acceptlist even
> have the advantage of being clearer than the terms they're replacing.
>
> However, I'm not convinced about changing "master" to "coordinator", or
> something similar. Unlike "slave", which is negative in any context,
> "master" has many definitions, including some common ones which do not
> appear problematic. See https://www.merriam-webster.com/dictionary/master
> for examples. In particular, the progression of an artisan was from
> "apprentice" to "journeyman" to "master". A master smith, carpenter, or
> artist would run a shop managing lots of workers and apprentices who would
> hope to become masters of their own someday. So "master" and "worker" can
> still go together.
>
> Since it's the least problematic term, and by far the hardest term to
> change (both within HBase and with effects on downstream projects such as
> Ambari), I'm -0 (nonbinding) on changing "master".
>
> Geoffrey
>
> On Mon, Jun 22, 2020 at 1:32 PM Rushabh Shah
>  wrote:
>
> > +1 to renaming.
> >
> >
> > Rushabh Shah
> >
> >- Software Engineering SMTS | Salesforce
> >-
> >   - Mobile: 213 422 9052
> >
> >
> >
> > On Mon, Jun 22, 2020 at 1:18 PM Josh Elser  wrote:
> >
> > > +1
> > >
> > > On 6/22/20 4:03 PM, Sean Busbey wrote:
> > > > > We should change our use of these terms. We can be equally or more
> > > > > clear in what we are trying to convey where they are present.
> > > > >
> > > > > That they have been used historically is only useful if the advantage
> > > > > we gain from using them through that shared context outweighs the
> > > > > potential friction they add. They make me personally less
> > > > > enthusiastic about contributing. That's enough friction for me to
> > > > > advocate removing them.
> > > > >
> > > > > AFAICT reworking our replication stuff in terms of "active" and
> > > > > "passive" clusters did not result in a big spike of folks asking new
> > > > > questions about where authority for state was.
> > > > >
> > > > > On Mon, Jun 22, 2020, 13:39 Andrew Purtell wrote:
> > > > >
> > > > >> In response to renewed attention at the Foundation toward addressing
> > > > >> culturally problematic language and terms often used in technical
> > > > >> documentation and discussion, several projects have begun
> > > > >> discussions, or made proposals, or started work along these lines.
> > > > >>
> > > > >> The HBase PMC began its own discussion on private@ on June 9, 2020
> > > > >> with an observation of this activity and this suggestion:
> > > > >>
> > > > >> There is a renewed push back against classic technology industry
> > > > >> terms that have negative modern connotations.
> > > > >>
> > > > >> In the case of HBase, the following substitutions might be proposed:
> > > > >>
> > > > >> - Coordinator instead of master
> > > > >>
> > > > >> - Worker instead of slave
> > > > >>
> > > > >> Recommendations for these additional substitutions also come up in
> > > > >> this type of discussion:
> > > > >>
> > > > >> - Accept list ins

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Andrew Purtell
Thank you Mich. 

Hopefully it is clear that there is no community consensus yet, and all voices 
are welcome on the topic. 

> On Jun 22, 2020, at 12:15 PM, Mich Talebzadeh  
> wrote:
> 
> Hi,
> 
> Thank you for the proposals.
> 
> I am afraid I have to agree to differ. The term master and slave (commonly
> used in Big data tools (not confined to HBase only) is BAU and historical)
> and bears no resemblance to anything recent.
> 
> Additionally, both whitelist and blacklist simply refer to a proposal which
> is accepted and a proposal which is struck out (black pencil line).
> 
> So in scientific context these are terminologies used. Terminologies become
> offensive if they are used "in the incorrect context". I don't think anyone
> in HBase or Spark community will have objections if these terminologies are
> used as before. Spark used the term in master/slave in Standalone mode if i
> recall correctly.
> 
> Changing something for the sake of "now being in the limelight" does not
> make it right. So I beg to differ on this. Having said that it is indeed a
> sign of a civilised mind to entertain an idea without accepting it so
> whatever the community wishes.
> 
> HTH
> 
>> On Mon, 22 Jun 2020 at 19:39, Andrew Purtell  wrote:
>> 
>> In response to renewed attention at the Foundation toward addressing
>> culturally problematic language and terms often used in technical
>> documentation and discussion, several projects have begun discussions, or
>> made proposals, or started work along these lines.
>> 
>> The HBase PMC began its own discussion on private@ on June 9, 2020 with an
>> observation of this activity and this suggestion:
>> 
>> There is a renewed push back against classic technology industry terms that
>> have negative modern connotations.
>> 
>> In the case of HBase, the following substitutions might be proposed:
>> 
>> - Coordinator instead of master
>> 
>> - Worker instead of slave
>> 
>> Recommendations for these additional substitutions also come up in this
>> type of discussion:
>> 
>> - Accept list instead of white list
>> 
>> - Deny list instead of black list
>> 
>> Unfortunately we have Master all over our code base, baked into various
>> APIs and configuration variable names, so for us the necessary changes
>> amount to a new major release and deprecation cycle. It could well be worth
>> it in the long run. We exist only as long as we draw a willing and
>> sufficient contributor community. It also wouldn’t be great to have an
>> activist fork appear somewhere, even if unlikely to be successful.
>> 
>> Relevant JIRAs are:
>> 
>>   - HBASE-12677 <https://issues.apache.org/jira/browse/HBASE-12677>:
>>   Update replication docs to clarify terminology
>>   - HBASE-13852 <https://issues.apache.org/jira/browse/HBASE-13852>:
>>   Replace master-slave terminology in book, site, and javadoc with a more
>>   modern vocabulary
>>   - HBASE-24576 <https://issues.apache.org/jira/browse/HBASE-24576>:
>>   Changing "whitelist" and "blacklist" in our docs and project
>> 
>> In response to this proposal, a member of the PMC asked if the term
>> 'master' used by itself would be fine, because we only have use of 'slave'
>> in replication documentation and that is easily addressed. In response to
>> this question, others on the PMC suggested that even if only 'master' is
>> used, in this context it is still a problem.
>> 
>> For folks who are surprised or lacking context on the details of this
>> discussion, one PMC member offered a link to this draft RFC as background:
>> https://tools.ietf.org/id/draft-knodel-terminology-00.html
>> 
>> There was general support for removing the term "master" / "hmaster" from
>> our code base and using the terms "coordinator" or "leader" instead. In the
>> context of replication, "worker" makes less sense and perhaps "destination"
>

[DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Andrew Purtell
In response to renewed attention at the Foundation toward addressing
culturally problematic language and terms often used in technical
documentation and discussion, several projects have begun discussions, or
made proposals, or started work along these lines.

The HBase PMC began its own discussion on private@ on June 9, 2020 with an
observation of this activity and this suggestion:

There is a renewed push back against classic technology industry terms that
have negative modern connotations.

In the case of HBase, the following substitutions might be proposed:

- Coordinator instead of master

- Worker instead of slave

Recommendations for these additional substitutions also come up in this
type of discussion:

- Accept list instead of white list

- Deny list instead of black list

Unfortunately we have Master all over our code base, baked into various
APIs and configuration variable names, so for us the necessary changes
amount to a new major release and deprecation cycle. It could well be worth
it in the long run. We exist only as long as we draw a willing and
sufficient contributor community. It also wouldn’t be great to have an
activist fork appear somewhere, even if unlikely to be successful.

Relevant JIRAs are:

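   - HBASE-12677 <https://issues.apache.org/jira/browse/HBASE-12677>:
   Update replication docs to clarify terminology
   - HBASE-13852 <https://issues.apache.org/jira/browse/HBASE-13852>:
   Replace master-slave terminology in book, site, and javadoc with a more
   modern vocabulary
   - HBASE-24576 <https://issues.apache.org/jira/browse/HBASE-24576>:
   Changing "whitelist" and "blacklist" in our docs and project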

In response to this proposal, a member of the PMC asked if the term
'master' used by itself would be fine, because we only have use of 'slave'
in replication documentation and that is easily addressed. In response to
this question, others on the PMC suggested that even if only 'master' is
used, in this context it is still a problem.

For folks who are surprised or lacking context on the details of this
discussion, one PMC member offered a link to this draft RFC as background:
https://tools.ietf.org/id/draft-knodel-terminology-00.html

There was general support for removing the term "master" / "hmaster" from
our code base and using the terms "coordinator" or "leader" instead. In the
context of replication, "worker" makes less sense and perhaps "destination"
or "follower" would be more appropriate terms.

One PMC member's thoughts on language and non-native English speakers is
worth including in its entirety:

While words like blacklist/whitelist/slave clearly have those negative
references, word master might not have the same impact for non native
English speakers like myself where the literal translation to my mother
tongue does not have this same bad connotation. Replacing all references
for word *master* on our docs/codebase is a huge effort, I guess such a
decision would be more suitable for native English speakers folks, and
maybe we should consider the opinion of contributors from that ethnic
minority as well?

These are good questions for public discussion.

We have a consensus in the PMC, at this time, that is supportive of making
the above discussed terminology changes. However, we also have concerns
about what it would take to accomplish meaningful changes. Several on the
PMC offered support in the form of cycles to review pull requests and
patches, and two PMC members offered personal bandwidth for creating and
releasing new code lines as needed to complete a deprecation cycle.

Unfortunately, the terms "master" and "hmaster" appear throughout our code
base in class names, user facing API subject to our project compatibility
guidelines, and configuration variable names, which are also implicated by
compatibility guidelines given the impact of changes to operators and
operations. The changes being discussed are not backwards compatible
changes and cannot be executed with swiftness while simultaneously
preserving compatibility. There must be a deprecation cycle. First, we must
tag all implicated public API and configuration variables as deprecated,
and release HBase 3 with these deprecations in place. Then, we must
undertake rename and removal as appropriate, and release the result as
HBase 4.
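
The two-release deprecation cycle described above can be illustrated with a small Python sketch. This is only a model of the idea, not HBase code: the key names `example.master.port` and `example.coordinator.port` and the helper `get_setting` are hypothetical, and the `removed` flag stands in for the difference between the release that deprecates the old name and the release that removes it.

```python
import warnings

# Hypothetical key names, used only to illustrate the two-release cycle.
DEPRECATED_KEYS = {"example.master.port": "example.coordinator.port"}

def get_setting(conf: dict, key: str, removed: bool = False):
    """Look up a setting, honoring deprecated aliases until they are removed."""
    new_key = DEPRECATED_KEYS.get(key)
    if new_key is None:
        # Not a deprecated name: plain lookup.
        return conf[key]
    if removed:
        # Second major release: the old name is gone.
        raise KeyError(f"{key} was removed; use {new_key}")
    # First major release: the old name still works but warns.
    warnings.warn(f"{key} is deprecated; use {new_key}", DeprecationWarning)
    return conf.get(new_key, conf.get(key))
```

In this model an operator who still uses the old name keeps working through the first release while seeing a warning, and only hits a hard failure in the second, which is why the rename cannot be done in a single step without breaking compatibility.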

One PMC member raised a question in this context included here in entirety:

Are we willing to commit to rolling through the major versions at a pace
that's necessary to make this transition as swift as
reasonably possible?

This is a question for all of us. For the PMC, who would supervise the
effort, perhaps contribute to it, and certainly vote on the release
candidates. For contributors and potential contributors, who would provide
the necessary patches. For committers, who would be required to review and
commit the relevant changes.

Although there has been some initial discussion, there is no singular
proposal, or plan, or set of decisions made at this time. Wrestling 

Re: HBase 2 slower than HBase 1?

2020-06-11 Thread Andrew Purtell
I used PE to generate 10M row tables with one family with either 1, 10, 20,
50, or 100 values per row (unique column-qualifiers). An increase in wall
clock time was noticeable, for example:

1.6.0

time ./bin/hbase pe --rows=500 --table=TestTable_f1_c20 --columns=20 --nomapred scan 2
real 1m20.179s
user 0m45.470s
sys 0m16.572s

2.2.5

time ./bin/hbase pe --rows=500 --table=TestTable_f1_c20 --columns=20 --nomapred scan 2
real 1m31.016s
user 0m48.658s
sys 0m16.965s

It didn't really make a difference if I used 1 thread or 4 or 10, the delta
was about the same, proportionally. I picked two threads in the end so I'd
have enough time to launch async-profiler twice in another shell, to
capture flamegraph and call tree output, respectively. async-profiler
captured 10 seconds at steady state per test case. Upon first inspection
what jumps out is an increasing proportion of CPU time spent in GC in 2.2.5
vs 1.6.0. The difference increases as the number of columns per row
increases. There is little apparent difference at 1 column, but a 2x
or more difference at 20 columns, and a 10x or more difference at 100
columns, eyeballing the charts, flipping back and forth between browser
windows. This seems more than coincidental but obviously calls for capture
and analysis of GC trace, with JFR. Will do that next.
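
For reference, the wall-clock delta between the two `real` timings quoted above works out to roughly 13.5%. A minimal Python sketch of that arithmetic follows; `to_seconds` is a helper name made up for illustration and is not part of any HBase tooling.

```python
import re

def to_seconds(real: str) -> float:
    """Convert a shell `time` value such as '1m20.179s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", real)
    if match is None:
        raise ValueError(f"unrecognized time format: {real!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

baseline = to_seconds("1m20.179s")   # 1.6.0 run
candidate = to_seconds("1m31.016s")  # 2.2.5 run
slowdown_pct = (candidate / baseline - 1) * 100
print(f"2.2.5 took {slowdown_pct:.1f}% longer than 1.6.0")
```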

JVM: openjdk version "1.8.0_232" OpenJDK Runtime Environment (Zulu
8.42.0.21-CA-macosx) (build 1.8.0_232-b18) OpenJDK 64-Bit Server VM (Zulu
8.42.0.21-CA-macosx) (build 25.232-b18, mixed mode)

Regionserver JVM flags: -Xms10g -Xmx10g -XX:+UseG1GC -XX:+AlwaysPreTouch
-XX:+UseNUMA -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled


On Thu, Jun 11, 2020 at 7:06 AM Jan Van Besien  wrote:

> This is promising, thanks a lot. Testing with hbase 2.2.5 shows an
> improvement, but we're not there yet.
>
> As reported earlier, hbase 2.1.0 was about 60% slower than hbase 1.2.0
> in a test that simply scans all the regions in parallel without any
> filter. A test with hbase 2.2.5 shows it to be about 40% slower than
> 1.2.0. So that is better than 2.1.0, but still substantially slower
> than what hbase 1.2.0 was.
>
> As before, I tested this both on a 3 node cluster as well as with a
> unit test using HBaseTestingUtility. Both tests show very similar
> relative differences.
>
> Jan
>
> On Thu, Jun 11, 2020 at 2:16 PM Anoop John  wrote:
> >
> > In another mail thread Zheng Hu brought up an important Jira fix
> > https://issues.apache.org/jira/browse/HBASE-21657
> > Can u pls check with this once?
> >
> > Anoop
> >
> >
> > On Tue, Jun 9, 2020 at 8:08 PM Jan Van Besien  wrote:
> >
> > > On Sun, Jun 7, 2020 at 7:49 AM Anoop John wrote:
> > > > As per the above configs, it looks like Bucket Cache is not being used.
> > > > Only on heap LRU cache in use.
> > >
> > > True (but it is large enough to hold everything, so I don't think it
> > > matters).
> > >
> > > > @Jan - Is it possible for you to test with off heap Bucket Cache? Config
> > > > bucket cache off heap mode with size ~7.5 GB
> > >
> > > I did a quick test but it seems not to make a measurable difference.
> > > If anything, it is actually slightly slower even. I see 100% hit ratio
> > > in the L1
> > > LruBlockCache and effectively also 100% in the L2 BucketCache (hit
> > > ratio is not yet at 100% but hits increase with every test and misses
> > > do not).
> > >
> > > Given that the LruBlockCache was already large enough to cache all the
> > > data anyway, I did not expect this to help either, to be honest.
> > >
> > > > Do you have any DataBlockEncoding enabled on the CF?
> > >
> > > Yes, FAST_DIFF. But this is of course true in both the tests with
> > > hbase2 and hbase1.
> > >
> > > Jan
> > >
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: HBase 2 slower than HBase 1?

2020-06-04 Thread Andrew Purtell
The same settings, same instances, same jvm for 1.6.0 and 2.2.4.

On Thu, Jun 4, 2020 at 3:35 PM Tak-Lon Wu  wrote:

> hey guys, I got a question on the performance test between 1.6.0 and 2.2.4
> .
>
> To Andrew, did you turn on the performance tuning on 1.6.0 as well ? or
> did you run it without any configuration on 1.6.0 ?
>
> GC: -XX:+UseShenandoahGC -Xms31g -Xmx31g -XX:+AlwaysPreTouch -XX:+UseNUMA
> -XX:-UseBiasedLocking
> Non-default settings:
> hbase.regionserver.handler.count=256
> hbase.ipc.server.callqueue.type=codel
> dfs.client.read.shortcircuit=true
>
> Thanks,
> Stephen
>
>
> On 2020/05/25 11:28:46, Bruno Dumon  wrote:
> > Thanks a lot for doing this test. Its results are encouraging. My
> > non-cluster testing was more focussed on full table scans, which YCSB does
> > not do. The full table scans are only done by batch jobs, so if they are a
> > bit slower it is not much of a problem, but in our case they seemed a lot
> > slower.
> >
> > I agree that testing overall performance on a non-cluster environment is
> > not a good idea, but it doesn't seem unreasonable when focussing on a
> > specific algorithm? I only started testing in this manner after noticing
> > problems in cluster-based tests.
> >
> > I meanwhile tried a variant of my test where I used the same number of
> > cells, but spread over far fewer rows, in total 1500 rows of 10K
> > (small) cells each. In that test, the difference in the scan speed is much
> > lower (HBase 2.2.4 being only about 10% slower). This suggests that the
> > slowdown in HBase 2 might be due to things that happen per row being scanned.
> >
> > Anyway, we'll do some further testing, also with our normal workloads on
> > clusters, and try to further analyse it.
> >
> >
> > On Fri, May 22, 2020 at 1:52 AM Andrew Purtell 
> wrote:
> >
> > > It depends what you are measuring and how. I test every so often with
> YCSB,
> > > which admittedly is not representative of real world workloads but is
> > > widely used for apples to apples testing among datastores, and we can
> apply
> > > the same test tool and test methodology to different versions to get
> > > meaningful results. I also test on real clusters. The single all-in-one
> > > process zk+master+regionserver "minicluster" cannot provide you
> meaningful
> > > performance data. Only distributed clusters can provide meaningful
> results.
> > > Some defaults are also important to change, like the number of RPC
> handlers
> > > you plan to use in production.
> > >
> > > After reading this thread I tested 1.6.0 and 2.2.4 using my standard
> > > methodology, described below. 2.2.4 is better, often significantly
> better,
> > > in most measures in most cases.
> > >
> > > Cluster: AWS Amazon Linux AMI, 1 x master, 5 x regionserver, 1 x
> client,
> > > m5d.4xlarge
> > > Hadoop: 2.10.0, ZK: 3.4.14
> > >
> > >
> > > JVM: 8u252 shenandoah (provided by AMI)
> > >
> > >
> > > GC: -XX:+UseShenandoahGC -Xms31g -Xmx31g -XX:+AlwaysPreTouch
> -XX:+UseNUMA
> > > -XX:-UseBiasedLocking
> > > Non-default settings: hbase.regionserver.handler.count=256
> > > hbase.ipc.server.callqueue.type=codel dfs.client.read.shortcircuit=true
> > > Methodology:
> > >
> > >
> > >   1. Create 100M row base table (ROW_INDEX_V1 encoding, ZSTANDARD
> > > compression)
> > >   2. Snapshot base table
> > >
> > >
> > >   3. Enable balancer
> > >
> > >
> > >   4. Clone test table from base table snapshot
> > >
> > >
> > >   5. Balance, then disable balancer
> > >
> > >
> > >   6. Run YCSB 0.18 workload --operationcount 100 (1M rows)
> -threads 200
> > > -target 10 (100k/ops/sec)
> > >   7. Drop test table
> > >
> > >
> > >   8. Back to step 3 until all workloads complete
> > >
> > >
> > >
> > >
> > >
> > >
> > > Workload A 1.6.0 2.2.4 Difference
> > > [OVERALL], RunTime(ms) 20552 20655 100.50%
> > > [OVERALL], Throughput(ops/sec) 97314 96829 99.50%
> > > [READ], AverageLatency(us) 591 418 70.75%
> > > [READ], MinLatency(us) 191 201 105.24%
> > > [READ], MaxLatency(us) 146047 80895 55.39%
> > > [READ], 95thPercentileLatency(us) 3013 542 17.99%
> > > [READ], 99thPercentileLatency(us) 5427 2559 47.15%
> > > [UPDATE], AverageLatenc

Re: [ANNOUNCE] Please welcome Lijin Bin to the HBase PMC

2020-05-25 Thread Andrew Purtell
Congratulations and welcome, Lijin Bin.

On Mon, May 25, 2020 at 7:40 AM Guanghao Zhang  wrote:

> On behalf of the Apache HBase PMC I am pleased to announce that Lijin Bin
> has accepted our invitation to become a PMC member on the Apache HBase
> project. We appreciate Lijin Bin stepping up to take more responsibility in
> the HBase project.
>
> Please join me in welcoming Lijin Bin to the HBase PMC!
>


-- 
Best regards,
Andrew



Re: HBase 2 slower than HBase 1?

2020-05-22 Thread Andrew Purtell
Thank you Lars. I suppose it is not possible to characterize the problem with 
anonymous detail enough to provide some clues for follow up, or you would have 
done it. 

> On May 22, 2020, at 6:01 AM, Lars Francke  wrote:
> 
> I've refrained from commenting here so far because I cannot share much/any
> data but I can also report that we've seen worse performance with HBase 2
> (similar/same settings and same workload, same hardware). This is on a 40+
> node cluster.
> Unfortunately, I wasn't tasked with debugging. The customer decided to stay
> on 1.x for this reason.
> 
>> On Fri, May 22, 2020 at 1:52 AM Andrew Purtell  wrote:
>> 
>> It depends what you are measuring and how. I test every so often with YCSB,
>> which admittedly is not representative of real world workloads but is
>> widely used for apples to apples testing among datastores, and we can apply
>> the same test tool and test methodology to different versions to get
>> meaningful results. I also test on real clusters. The single all-in-one
>> process zk+master+regionserver "minicluster" cannot provide you meaningful
>> performance data. Only distributed clusters can provide meaningful results.
>> Some defaults are also important to change, like the number of RPC handlers
>> you plan to use in production.
>> 
>> After reading this thread I tested 1.6.0 and 2.2.4 using my standard
>> methodology, described below. 2.2.4 is better, often significantly better,
>> in most measures in most cases.
>> 
>> Cluster: AWS Amazon Linux AMI, 1 x master, 5 x regionserver, 1 x client,
>> m5d.4xlarge
>> Hadoop: 2.10.0, ZK: 3.4.14
>> 
>> 
>> JVM: 8u252 shenandoah (provided by AMI)
>> 
>> 
>> GC: -XX:+UseShenandoahGC -Xms31g -Xmx31g -XX:+AlwaysPreTouch -XX:+UseNUMA
>> -XX:-UseBiasedLocking
>> Non-default settings: hbase.regionserver.handler.count=256
>> hbase.ipc.server.callqueue.type=codel dfs.client.read.shortcircuit=true
>> Methodology:
>> 
>> 
>>  1. Create 100M row base table (ROW_INDEX_V1 encoding, ZSTANDARD
>> compression)
>>  2. Snapshot base table
>> 
>> 
>>  3. Enable balancer
>> 
>> 
>>  4. Clone test table from base table snapshot
>> 
>> 
>>  5. Balance, then disable balancer
>> 
>> 
>>  6. Run YCSB 0.18 workload --operationcount 100 (1M rows) -threads 200
>> -target 10 (100k/ops/sec)
>>  7. Drop test table
>> 
>> 
>>  8. Back to step 3 until all workloads complete
>> 
>> 
>> 
>> 
>> 
>> 
>> Workload A 1.6.0 2.2.4 Difference
>> [OVERALL], RunTime(ms) 20552 20655 100.50%
>> [OVERALL], Throughput(ops/sec) 97314 96829 99.50%
>> [READ], AverageLatency(us) 591 418 70.75%
>> [READ], MinLatency(us) 191 201 105.24%
>> [READ], MaxLatency(us) 146047 80895 55.39%
>> [READ], 95thPercentileLatency(us) 3013 542 17.99%
>> [READ], 99thPercentileLatency(us) 5427 2559 47.15%
>> [UPDATE], AverageLatency(us) 833 460 55.23%
>> [UPDATE], MinLatency(us) 348 230 66.09%
>> [UPDATE], MaxLatency(us) 149887 80959 54.01%
>> [UPDATE], 95thPercentileLatency(us) 3403 607 17.84%
>> [UPDATE], 99thPercentileLatency(us) 5751 3045 52.95%
>> 
>> 
>> 
>> 
>> Workload B 1.6.0 2.2.4 Difference
>> [OVERALL], RunTime(ms) 20555 20679 100.60%
>> [OVERALL], Throughput(ops/sec) 97300 96716 99.40%
>> [READ], AverageLatency(us) 417 427 102.54%
>> [READ], MinLatency(us) 179 194 108.38%
>> [READ], MaxLatency(us) 124095 76799 61.89%
>> [READ], 95thPercentileLatency(us) 498 564 113.25%
>> [READ], 99thPercentileLatency(us) 3679 3785 102.88%
>> [UPDATE], AverageLatency(us) 665 488 73.28%
>> [UPDATE], MinLatency(us) 380 237 62.37%
>> [UPDATE], MaxLatency(us) 95167 76287 80.16%
>> [UPDATE], 95thPercentileLatency(us) 718 629 87.60%
>> [UPDATE], 99thPercentileLatency(us) 4015 4023 100.20%
>> 
>> 
>> 
>> 
>> Workload C 1.6.0 2.2.4 Difference
>> [OVERALL], RunTime(ms) 20525 20648 100.60%
>> [OVERALL], Throughput(ops/sec) 97442 96862 99.40%
>> [READ], AverageLatency(us) 385 382 99.07%
>> [READ], MinLatency(us) 178 198 111.24%
>> [READ], MaxLatency(us) 74943 76415 101.96%
>> [READ], 95thPercentileLatency(us) 437 477 109.15%
>> [READ], 99thPercentileLatency(us) 3349 2219 66.26%
>> 
>> 
>> 
>> 
>> Workload D 1.6.0 2.2.4 Difference
>> [OVERALL], RunTime(ms) 20538 20644 100.52%
>> [OVERALL], Throughput(ops/sec) 97380 96880 99.49%
>> [READ], AverageLatency(us) 372 393 105.49%
>> [READ], MinLatency(us) 116 137 118.10%
>> [R

Re: HBase 2 slower than HBase 1?

2020-05-21 Thread Andrew Purtell
It depends what you are measuring and how. I test every so often with YCSB,
which admittedly is not representative of real world workloads but is
widely used for apples to apples testing among datastores, and we can apply
the same test tool and test methodology to different versions to get
meaningful results. I also test on real clusters. The single all-in-one
process zk+master+regionserver "minicluster" cannot provide you meaningful
performance data. Only distributed clusters can provide meaningful results.
Some defaults are also important to change, like the number of RPC handlers
you plan to use in production.

After reading this thread I tested 1.6.0 and 2.2.4 using my standard
methodology, described below. 2.2.4 is better, often significantly better,
in most measures in most cases.

Cluster: AWS Amazon Linux AMI, 1 x master, 5 x regionserver, 1 x client,
m5d.4xlarge
Hadoop: 2.10.0, ZK: 3.4.14
JVM: 8u252 shenandoah (provided by AMI)
GC: -XX:+UseShenandoahGC -Xms31g -Xmx31g -XX:+AlwaysPreTouch -XX:+UseNUMA
-XX:-UseBiasedLocking
Non-default settings: hbase.regionserver.handler.count=256
hbase.ipc.server.callqueue.type=codel dfs.client.read.shortcircuit=true

Methodology:

  1. Create 100M row base table (ROW_INDEX_V1 encoding, ZSTANDARD
     compression)
  2. Snapshot base table
  3. Enable balancer
  4. Clone test table from base table snapshot
  5. Balance, then disable balancer
  6. Run YCSB 0.18 workload --operationcount 100 (1M rows) -threads 200
     -target 10 (100k/ops/sec)
  7. Drop test table
  8. Back to step 3 until all workloads complete
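
The loop structure of the methodology above (steps 1 and 2 once, then steps 3 to 7 repeated per workload) can be sketched in Python as follows. The sketch only builds the ordered list of steps and does not invoke HBase or YCSB; the function name `plan` and the step strings are illustrative, and the workload names are taken from the result tables (A through F).

```python
WORKLOADS = ["a", "b", "c", "d", "e", "f"]

def plan(workloads):
    """Return the ordered list of steps for the given YCSB workloads."""
    # Steps 1-2 happen once, before any workload runs.
    steps = [
        "create 100M row base table",
        "snapshot base table",
    ]
    # Steps 3-7 repeat for every workload, each on a fresh clone.
    for name in workloads:
        steps += [
            "enable balancer",
            "clone test table from base table snapshot",
            "balance, then disable balancer",
            f"run YCSB workload {name}",
            "drop test table",
        ]
    return steps

for number, step in enumerate(plan(WORKLOADS), start=1):
    print(number, step)
```

Cloning from the snapshot for every workload means each run starts from the same data layout, so one workload's writes do not affect the next.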






Workload A                            1.6.0    2.2.4  Difference
[OVERALL], RunTime(ms)                20552    20655     100.50%
[OVERALL], Throughput(ops/sec)        97314    96829      99.50%
[READ], AverageLatency(us)              591      418      70.75%
[READ], MinLatency(us)                  191      201     105.24%
[READ], MaxLatency(us)               146047    80895      55.39%
[READ], 95thPercentileLatency(us)      3013      542      17.99%
[READ], 99thPercentileLatency(us)      5427     2559      47.15%
[UPDATE], AverageLatency(us)            833      460      55.23%
[UPDATE], MinLatency(us)                348      230      66.09%
[UPDATE], MaxLatency(us)             149887    80959      54.01%
[UPDATE], 95thPercentileLatency(us)    3403      607      17.84%
[UPDATE], 99thPercentileLatency(us)    5751     3045      52.95%
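
The Difference column in these tables appears to be the 2.2.4 value expressed as a percentage of the 1.6.0 value, so for latencies a figure below 100% means lower latency in 2.2.4. A short Python sketch of that calculation, using three rows from Workload A, is shown below. The function name `difference_pct` is illustrative only, and rows derived from rounded averages may differ from the table in the last digit.

```python
def difference_pct(v160: float, v224: float) -> str:
    """Express the 2.2.4 value as a percentage of the 1.6.0 value."""
    return f"{v224 / v160 * 100:.2f}%"

# (1.6.0 value, 2.2.4 value) pairs copied from the Workload A table.
rows = {
    "[OVERALL], RunTime(ms)": (20552, 20655),
    "[READ], MaxLatency(us)": (146047, 80895),
    "[READ], 95thPercentileLatency(us)": (3013, 542),
}
for name, (v160, v224) in rows.items():
    print(name, difference_pct(v160, v224))
```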




Workload B                            1.6.0    2.2.4  Difference
[OVERALL], RunTime(ms)                20555    20679     100.60%
[OVERALL], Throughput(ops/sec)        97300    96716      99.40%
[READ], AverageLatency(us)              417      427     102.54%
[READ], MinLatency(us)                  179      194     108.38%
[READ], MaxLatency(us)               124095    76799      61.89%
[READ], 95thPercentileLatency(us)       498      564     113.25%
[READ], 99thPercentileLatency(us)      3679     3785     102.88%
[UPDATE], AverageLatency(us)            665      488      73.28%
[UPDATE], MinLatency(us)                380      237      62.37%
[UPDATE], MaxLatency(us)              95167    76287      80.16%
[UPDATE], 95thPercentileLatency(us)     718      629      87.60%
[UPDATE], 99thPercentileLatency(us)    4015     4023     100.20%




Workload C                            1.6.0    2.2.4  Difference
[OVERALL], RunTime(ms)                20525    20648     100.60%
[OVERALL], Throughput(ops/sec)        97442    96862      99.40%
[READ], AverageLatency(us)              385      382      99.07%
[READ], MinLatency(us)                  178      198     111.24%
[READ], MaxLatency(us)                74943    76415     101.96%
[READ], 95thPercentileLatency(us)       437      477     109.15%
[READ], 99thPercentileLatency(us)      3349     2219      66.26%




Workload D                            1.6.0    2.2.4  Difference
[OVERALL], RunTime(ms)                20538    20644     100.52%
[OVERALL], Throughput(ops/sec)        97380    96880      99.49%
[READ], AverageLatency(us)              372      393     105.49%
[READ], MinLatency(us)                  116      137     118.10%
[READ], MaxLatency(us)               107391    73215      68.18%
[READ], 95thPercentileLatency(us)       916      983     107.31%
[READ], 99thPercentileLatency(us)      3183     2473      77.69%
[INSERT], AverageLatency(us)            732      526      71.86%
[INSERT], MinLatency(us)                418      289      69.14%
[INSERT], MaxLatency(us)             109183    80255      73.51%
[INSERT], 95thPercentileLatency(us)     823      724      87.97%
[INSERT], 99thPercentileLatency(us)    3961     3003      75.81%




Workload E                            1.6.0    2.2.4  Difference
[OVERALL], RunTime(ms)               120157   119728      99.64%
[OVERALL], Throughput(ops/sec)        16645    16705     100.36%
[INSERT], AverageLatency(us)          11787    11102      94.19%
[INSERT], MinLatency(us)                459      296      64.49%
[INSERT], MaxLatency(us)             172927   131583      76.09%
[INSERT], 95thPercentileLatency(us)   32143    28911      89.94%
[INSERT], 99thPercentileLatency(us)   36063    31423      87.13%
[SCAN], AverageLatency(us)            11891    11875      99.87%
[SCAN], MinLatency(us)                  219      255     116.44%
[SCAN], MaxLatency(us)               179071   188671     105.36%
[SCAN], 95thPercentileLatency(us)     32639    29615      90.74%
[SCAN], 99thPercentileLatency(us)     36671    32175      87.74%




Workload F                                       1.6.0    2.2.4  Difference
[OVERALL], RunTime(ms)                           20766    20655      99.47%
[OVERALL], Throughput(ops/sec)                   96311    96829     100.54%
[READ], AverageLatency(us)                        1242      591      47.61%
[READ], MinLatency(us)                             183      212     115.85%
[READ], MaxLatency(us)                           80959    90111     111.30%
[READ], 95thPercentileLatency(us)                 3397     1511      44.48%
[READ], 99thPercentileLatency(us)                 4515     3063      67.84%
[READ-MODIFY-WRITE], AverageLatency(us)           2768     1193      43.10%
[READ-MODIFY-WRITE], MinLatency(us)                596      496      83.22%
[READ-MODIFY-WRITE], MaxLatency(us)             128639   112191      87.21%
[READ-MODIFY-WRITE], 95thPercentileLatency(us)    7071     3263      46.15%
[READ-MODIFY-WRITE], 99thPercentileLatency(us)    9919     6547      66.00%
[UPDATE], 

Re: [ANNOUNCE] New HBase committer Wei-Chiu Chuang

2020-05-13 Thread Andrew Purtell
Congratulations and welcome Wei-Chiu!

On Wed, May 13, 2020 at 12:10 PM Sean Busbey  wrote:

> Folks,
>
> On behalf of the Apache HBase PMC I am pleased to announce that Wei-Chiu
> Chuang has accepted the PMC's invitation to become a committer on the
> project.
>
> We appreciate all of the great contributions Wei-Chiu has made to the
> community thus far and we look forward to his continued involvement.
>
> Allow me to be the first to congratulate Wei-Chiu on his new role!
>
> thanks,
> busbey
>


-- 
Best regards,
Andrew



Re: DISCUSS: Move hbase-thrift and hbase-rest out of core to hbase-connectors project?

2020-04-24 Thread Andrew Purtell
+1
Let's do it.

On Fri, Apr 24, 2020 at 2:34 PM Stack  wrote:

> Taking a sounding
>
> We've talked of moving the hbase-rest and hbase-thrift modules out of core
> over to hbase-connectors project [2, 3]. The connectors project [1] was
> meant for the likes of REST and thrift. I'm thinking of trying to do the
> move in the next few days BEFORE 2.3.0RC0. Any objections? I'd make a
> release from hbase-connectors as part of this effort and would make sure it
> works w/ 2.3.0.
>
> Thank you,
> S
>
>
> 1. https://github.com/apache/hbase-connectors
> 2. https://issues.apache.org/jira/browse/HBASE-20999
> 3. https://issues.apache.org/jira/browse/HBASE-20998
>


-- 
Best regards,
Andrew



[ANNOUNCE] Apache HBase 1.6.0 is now available for download

2020-03-06 Thread Andrew Purtell
The HBase team is happy to announce the immediate availability of Apache
HBase 1.6.0!

Apache HBase is an open-source, distributed, versioned, non-relational
database. Apache HBase gives you low latency random access to billions of
rows with millions of columns atop non-specialized hardware. To learn more
about HBase, see https://hbase.apache.org/.

Download from http://hbase.apache.org/downloads

HBase 1.6.0 is the latest minor release of HBase version 1, continuing on
the theme of bringing a stable, reliable database to the Apache Big Data
ecosystem and beyond.

For instructions on verifying ASF release downloads, please see

https://www.apache.org/dyn/closer.cgi#verify

Project member signature keys can be found at

https://www.apache.org/dist/hbase/KEYS

Thanks to all the contributors who made this release possible!

A list of the 73 issues resolved in this release can be found at
https://s.apache.org/b7eml .

Questions, comments, and problems are always welcome at:
d...@hbase.apache.org

Best,
The HBase Dev Team


Re: [ANNOUNCE] New HBase committer Bharath Vissapragada

2020-02-06 Thread Andrew Purtell
Congratulations and welcome, Bharath!

On Wed, Feb 5, 2020 at 7:36 PM Nick Dimiduk  wrote:

> On behalf of the Apache HBase PMC I am pleased to announce that Bharath
> Vissapragada has accepted the PMC's invitation to become a committer on the
> project. We appreciate all of Bharath's generous contributions thus far and
> look forward to his continued involvement.
>
> Allow me to be the first to congratulate and welcome Bharath into his new
> role!
>
> Thanks,
> Nick
>


-- 
Best regards,
Andrew



Re: [DISCUSS] Bump hadoop versions

2020-01-03 Thread Andrew Purtell
Releasing HBase 3 snapshots off master would be fine by me. 

> On Jan 3, 2020, at 7:37 PM, Sean Busbey  wrote:
> 
> A while back I offered to start making hbase 3 alpha releases so we had
> something we could all refer to in test environments. That offer still
> stands.
> 
> 
> I personally would still rather those releases be off of the master branch
> because we still have too many branches given our tooling for backports.
> 
>> On Fri, Jan 3, 2020, 21:30 Andrew Purtell  wrote:
>> 
>> Why not start a branch-3 and begin SNAPSHOT releasing of this branch right
>> now?
>> 
>> +1 to dropping Hadoop 2 support in HBase 3. We need the major increment to
>> make this kind of change, let’s take the opportunity.
>> 
>> Regarding Hadoop 2, the discussion I have seen indicates Hadoop thinks it
>> will be releasing 2.x for up to two more years. I don’t know how many
>> releases there will actually be but let’s assume at least one more 2.9, and
>> a few 2.10. My employer is expected to use these versions along a
>> transition path to 3.x for at least the next eighteen months. We are
>> probably typical. We won’t need Hadoop 2 support for a HBase 3 but will
>> need it for HBase 1 and HBase 2 for “a couple years”.
>> 
>>>> On Jan 3, 2020, at 6:52 PM, Sean Busbey  wrote:
>>> 
>>> I personally like having Hadoop 2 support still, but I agree the cadence
>>> out of Hadoop has been problematic.
>>> 
>>> I would prefer we not change the state of Hadoop support in the master
>>> branch until we have a release plan of some kind for HBase 3. I'd rather
>>> that be sooner rather than later.
>>> 
>>>> On Fri, Jan 3, 2020, 18:08 张铎(Duo Zhang)  wrote:
>>>> 
>>>> Supporting Hadoop 2.x in 3.0.0 means we need to carry it over the whole 3.x
>>>> release line, which seems to be a problem since the Hadoop community does
>>>> not want a new 2.x release line any more...
>>>> 
>>>> On Sat, Jan 4, 2020 at 06:55, Nick Dimiduk wrote:
>>>> 
>>>>> On Wed, Dec 25, 2019 at 5:38 PM 张铎(Duo Zhang) 
>>>>> wrote:
>>>>> 
>>>>>> We will only remove the hadoop 2.x support from hbase 3.x, which does not
>>>>>> have a formal release plan yet, for 2.x we will still support hadoop 2.x.
>>>>>> 
>>>>> 
>>>>> Indeed there is no formal release plan for HBase-3.0, but I hope it's
>>>>> sooner than 2+ years away! What's the motivation for dropping Hadoop2
>>>>> support?
>>>>> 
>>>>> On Thu, Dec 26, 2019 at 8:57 AM, Wei-Chiu Chuang wrote:
>>>>>> 
>>>>>>> With my Hadoop hat on, we have not yet officially declared Hadoop 2.8
>>>>>>> EOL. I think the 2.8 download missing from the web page is just a
>>>>>>> mistake.
>>>>>>> 
>>>>>>> That being said, some of the biggest Hadoop users (LinkedIn, Yahoo,
>>>>>>> Microsoft) that I am aware of are moving up from 2.7/2.8 to 2.10, and
>>>>>>> 2.8.5 (the last version in the 2.8 line) was released in Sep 2018, more
>>>>>>> than a year ago. It doesn't look like the community has the desire to
>>>>>>> continue the 2.8 line.
>>>>>>> 
>>>>>>> I think it is a little extreme to remove the hadoop2 profile, given that
>>>>>>> Hadoop 2.9 and 2.10 are still active and I expect Hadoop 2 to stay
>>>>>>> around for at least 2 years out.
>>>>>>> 
>>>>>>> On Thu, Dec 26, 2019 at 8:41 AM 张铎(Duo Zhang)
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hadoop 2.8.x has been removed from the download page of hadoop so I
>>>>>> think
>>>>>>>> it is time to bump the hadoop dependency to 2.9.x, on master and
>>>>>> branch-2.
>>>>>>>> 
>>>>>>>> And the hadoop community is going to make 2.10.x the last minor
>>>>> release
>>>>>>>> line for 2.x
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> https://lists.apache.org/thread.html/cab84265d632b90d66dcd1ad957a7439a2c76a987c7e62feafb4812e%40%3Ccommon-dev.hadoop.apache.org%3E
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I think this is a sign that the community is moving forward to 3.x.
>>>>> So
>>>>>> I
>>>>>>>> propose we make the master branch hadoop3 only, This requires
>>>>> changing
>>>>>>> the
>>>>>>>> pom a bit to activate the hadoop3 profile by default and remove the
>>>> hadoop2
>>>>>>>> profile.
>>>>>>>> 
>>>>>>>> Thoughts? Thanks.
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> 


Re: [DISCUSS] Bump hadoop versions

2020-01-03 Thread Andrew Purtell
Why not start a branch-3 and begin SNAPSHOT releasing of this branch right now? 

+1 to dropping Hadoop 2 support in HBase 3. We need the major increment to make 
this kind of change, let’s take the opportunity. 

Regarding Hadoop 2, the discussion I have seen indicates Hadoop thinks it will 
be releasing 2.x for up to two more years. I don’t know how many releases there 
will actually be but let’s assume at least one more 2.9, and a few 2.10. My 
employer is expected to use these versions along a transition path to 3.x for 
at least the next eighteen months. We are probably typical. We won’t need 
Hadoop 2 support for a HBase 3 but will need it for HBase 1 and HBase 2 for “a 
couple years”.

> On Jan 3, 2020, at 6:52 PM, Sean Busbey  wrote:
> 
> I personally like having Hadoop 2 support still, but I agree the cadence
> out of Hadoop has been problematic.
> 
> I would prefer we not change the state of Hadoop support in the master
> branch until we have a release plan of some kind for HBase 3. I'd rather
> that be sooner rather than later.
> 
>> On Fri, Jan 3, 2020, 18:08 张铎(Duo Zhang)  wrote:
>> 
>> Supporting Hadoop 2.x in 3.0.0 means we need to carry it over the whole
>> 3.x release line, which seems to be a problem since the Hadoop community
>> does not want to start a new 2.x release line any more...
>> 
>> Nick Dimiduk wrote on Sat, Jan 4, 2020 at 06:55:
>> 
>>> On Wed, Dec 25, 2019 at 5:38 PM 张铎(Duo Zhang) 
>>> wrote:
>>> 
>>>> We will only remove the hadoop 2.x support from hbase 3.x, which does
>> not
>>>> have a formal release plan yet, for 2.x we will still support hadoop
>> 2.x.
>>>> 
>>> 
>>> Indeed there is no formal release plan for HBase-3.0, but I hope it's
>>> sooner than 2+ years away! What's the motivation for dropping Hadoop2
>>> support?
>>> 
>>> Wei-Chiu Chuang wrote on Thu, Dec 26, 2019 at 8:57 AM:
>>>> 
>>>>> With my Hadoop hat on, we have not yet officially declared Hadoop
>> 2.8
>>>>> EOL. I think the 2.8 download missing from the web page is just a
>>>> mistake.
>>>>> 
>>>>> That being said, some of the biggest Hadoop users (LinkedIn, Yahoo,
>>>>> Microsoft) that I am aware of are moving up from 2.7/2.8 to 2.10, and
>>>> that
>>>>> 2.8.5 (the last version in the 2.8 line) was released in Sep 2018,
>> more
>>>>> than a year ago. It doesn't look like the community has the desire to
>>>>> continue the 2.8 line.
>>>>> 
>>>>> I think it is a little extreme to remove hadoop2 profile, given that
>>>> Hadoop
>>>>> 2.9 and 2.10 are still active and I expect Hadoop 2 to stay around
>> for
>>> at
>>>>> least 2 years out.
>>>>> 
>>>>> On Thu, Dec 26, 2019 at 8:41 AM 张铎(Duo Zhang)
>>>>> wrote:
>>>>> 
>>>>>> Hadoop 2.8.x has been removed from the download page of hadoop so I
>>>> think
>>>>>> it is time to bump the hadoop dependency to 2.9.x, on master and
>>>> branch-2.
>>>>>> 
>>>>>> And the hadoop community is going to make 2.10.x the last minor
>>> release
>>>>>> line for 2.x
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> https://lists.apache.org/thread.html/cab84265d632b90d66dcd1ad957a7439a2c76a987c7e62feafb4812e%40%3Ccommon-dev.hadoop.apache.org%3E
>>>>>> 
>>>>>> 
>>>>>> I think this is a sign that the community is moving forward to 3.x.
>>> So
>>>> I
>>>>>> propose we make the master branch hadoop3 only, This requires
>>> changing
>>>>> the
>>>>>> pom a bit to activate the hadoop3 profile by default and remove the
>> hadoop2
>>>>>> profile.
>>>>>> 
>>>>>> Thoughts? Thanks.
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 


Re: [ANNOUNCE] New HBase committer Viraj Jasani

2019-12-27 Thread Andrew Purtell
Congratulations and welcome Viraj, thanks for all of your efforts so far. 

> On Dec 27, 2019, at 5:02 AM, Peter Somogyi  wrote:
> 
> On behalf of the Apache HBase PMC I am pleased to announce that
> Viraj Jasani has accepted the PMC's invitation to become a
> committer on the project.
> 
> Thanks so much for the work you've been contributing. We look forward
> to your continued involvement.
> 
> Congratulations and welcome!


Re: [ANNOUNCE] Please welcome Wellington Chevreuil to the Apache HBase PMC

2019-11-01 Thread Andrew Purtell
Congratulations and welcome!

> On Oct 23, 2019, at 1:16 PM, Sean Busbey  wrote:
> 
> On behalf of the Apache HBase PMC I am pleased to announce that
> Wellington Chevreuil has accepted our invitation to become a PMC member on the
> HBase project. We appreciate Wellington stepping up to take more
> responsibility in the HBase project.
> 
> Please join me in welcoming Wellington to the HBase PMC!
> 
> 
> 
> As a reminder, if anyone would like to nominate another person as a
> committer or PMC member, even if you are not currently a committer or
> PMC member, you can always drop a note to priv...@hbase.apache.org to
> let us know.


Re: [ANNOUNCE] Please welcome Sakthi to the Apache HBase PMC

2019-10-28 Thread Andrew Purtell
Congratulations and welcome, Sakthi!

On Wed, Oct 23, 2019 at 1:14 PM Sean Busbey  wrote:

> On behalf of the Apache HBase PMC I am pleased to announce that
> Sakthi has accepted our invitation to become a PMC member on the
> HBase project. We appreciate Sakthi stepping up to take more
> responsibility in the HBase project.
>
> Please join me in welcoming Sakthi to the HBase PMC!
>
>
>
> As a reminder, if anyone would like to nominate another person as a
> committer or PMC member, even if you are not currently a committer or
> PMC member, you can always drop a note to priv...@hbase.apache.org to
> let us know.
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


Re: The note of the round table meeting after HBaseConAsia 2019

2019-08-08 Thread Andrew Purtell
Ok, in that spirit let me say I've always found Apache Trafodion to be
interesting and credible technology and worthy of anyone's consideration.


On Thu, Aug 8, 2019 at 1:34 PM Rohit Jain  wrote:

> Andrew,
>
> I would never dump on Apache Phoenix.  I have worked with James for years
> and have always wanted to see how we could collaborate on various aspects,
> including common data type support and transaction management, to name a
> few.  I think the challenge we faced is the Java vs C++ nature of the two
> projects.  I am just pointing out that Apache Trafodion is an alternate
> option available.  I am also letting people know what is NOT in Apache
> Trafodion, so they understand that before making the time investment.
>
> Yes, I do apologize it sounds a bit like marketing, even though I tried to
> minimize that.  But you will see that we have had no marketing at all
> elsewhere.  One of the reasons why no one seems to know about Apache
> Trafodion.
>
> Rohit
>
> -----Original Message-----
> From: Andrew Purtell 
> Sent: Thursday, August 8, 2019 1:25 PM
> To: Hbase-User 
> Cc: HBase Dev List 
> Subject: Re: The note of the round table meeting after HBaseConAsia 2019
>
> This is great, but in the future please refrain from borderline marketing
> of a commercial product on these lists. This is not the appropriate venue
> for that.
>
> It is especially poor form to dump on a fellow open source project, as you
> claim to be. This I think is the tell behind the commercial motivation.
>
> Also I should point out, being pretty familiar with Phoenix in operation
> where I work, and in my interactions with various Phoenix committers and
> PMC, that the particular group of HBasers in that group appeared to share a
> negative view - which I will not comment on, they are entitled to their
> opinions, and more choice in SQL access to HBase is good! - that should not
> be claimed to be universal or even representative.
>
>
>
> On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain  wrote:
>
> > Hi folks,
> >
> > This is a nice write-up of the round-table meeting at HBaseConAsia.  I
> > would like to address the points I have pulled out from write-up (at
> > the bottom of this message).
> >
> > Many in the HBase community may not be aware that besides Apache
> > Phoenix, there has been a project called Apache Trafodion, contributed
> > by Hewlett-Packard in 2015 that has now been a top-level project for a
> while.
> > Apache Trafodion is essentially technology from Tandem-Compaq-HP that
> > started its OLTP / Operational journey as NonStop SQL effectively in
> > the early 1990s.  Granted it is a C++ project, but it has 170+ patents
> > as part of it that were contributed to Apache.  These are capabilities
> > that still don’t exist in other databases.
> >
> > It is a full-fledged SQL relational database engine with the breadth
> > of ANSI SQL support, including OLAP functions mentioned, and including
> > many de facto standard functions from databases like Oracle.  You can
> > go to the Apache Trafodion wiki to see the documentation as to what
> > all is supported by Trafodion.
> >
> > When we introduced Apache Trafodion, we implemented a completely
> > distributed transaction management capability right into the HBase
> > engine using coprocessors, that is completely scalable with no
> > bottlenecks what-so-ever.  We have made this infrastructure very
> > efficient over time, e.g. reducing two-phase commit overhead for
> > single region transactions.  We have presented this at HBaseCon.
> >
> > The engine also supports secondary indexes.  However, because of our
> > Multi-dimensional Access Method patented technology the need to use a
> > secondary index is substantially reduced.  All DDL and index updates
> > are completely protected by ACID transactions.
> >
> > Probably because of our own inability to create excitement about the
> > project, and potentially other reasons, we could not get community
> > involvement as we were expecting.  That is why you may see that while
> > we are maintaining the code base and introducing enhancements to it,
> > much of our focus has shifted to the commercial product based on
> > Apache Trafodion, namely EsgynDB.  But if the community involvement
> > increases, we can certainly refresh Trafodion with some of the
> > additional functionality we have added on the HBase side of the product.
> >
> > But let me be clear.  We are about 150 employees at Esgyn with 40 or
> > so in the US, mostly in Milpitas, and the rest in Shanghai, Beijing,
> > and Guiyang.  We cannot sustain the company on service rev

Re: The note of the round table meeting after HBaseConAsia 2019

2019-08-08 Thread Andrew Purtell
This is great, but in the future please refrain from borderline marketing
of a commercial product on these lists. This is not the appropriate venue
for that.

It is especially poor form to dump on a fellow open source project, as you
claim to be. This I think is the tell behind the commercial motivation.

Also I should point out, being pretty familiar with Phoenix in operation
where I work, and in my interactions with various Phoenix committers and
PMC, that the particular group of HBasers in that group appeared to share a
negative view - which I will not comment on, they are entitled to their
opinions, and more choice in SQL access to HBase is good! - that should not
be claimed to be universal or even representative.



On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain  wrote:

> Hi folks,
>
> This is a nice write-up of the round-table meeting at HBaseConAsia.  I
> would like to address the points I have pulled out from write-up (at the
> bottom of this message).
>
> Many in the HBase community may not be aware that besides Apache Phoenix,
> there has been a project called Apache Trafodion, contributed by
> Hewlett-Packard in 2015 that has now been a top-level project for a while.
> Apache Trafodion is essentially technology from Tandem-Compaq-HP that
> started its OLTP / Operational journey as NonStop SQL effectively in the
> early 1990s.  Granted it is a C++ project, but it has 170+ patents as part
> of it that were contributed to Apache.  These are capabilities that still
> don’t exist in other databases.
>
> It is a full-fledged SQL relational database engine with the breadth of
> ANSI SQL support, including OLAP functions mentioned, and including many de
> facto standard functions from databases like Oracle.  You can go to the
> Apache Trafodion wiki to see the documentation as to what all is supported
> by Trafodion.
>
> When we introduced Apache Trafodion, we implemented a completely
> distributed transaction management capability right into the HBase engine
> using coprocessors, that is completely scalable with no bottlenecks
> what-so-ever.  We have made this infrastructure very efficient over time,
> e.g. reducing two-phase commit overhead for single region transactions.  We
> have presented this at HBaseCon.
>
> The engine also supports secondary indexes.  However, because of our
> Multi-dimensional Access Method patented technology the need to use a
> secondary index is substantially reduced.  All DDL and index updates are
> completely protected by ACID transactions.
>
> Probably because of our own inability to create excitement about the
> project, and potentially other reasons, we could not get community
> involvement as we were expecting.  That is why you may see that while we
> are maintaining the code base and introducing enhancements to it, much of
> our focus has shifted to the commercial product based on Apache Trafodion,
> namely EsgynDB.  But if the community involvement increases, we can
> certainly refresh Trafodion with some of the additional functionality we
> have added on the HBase side of the product.
>
> But let me be clear.  We are about 150 employees at Esgyn with 40 or so in
> the US, mostly in Milpitas, and the rest in Shanghai, Beijing, and
> Guiyang.  We cannot sustain the company on service revenue alone.  You have
> seen companies that tried to do that have not been successful, unless they
> have a way to leverage the open source project for a different business
> model – enhanced capabilities, Cloud services, etc.
>
> To that end we have added to EsgynDB complete Disaster Recovery,
> Point-in-Time, fuzzy Backup and Restore, Manageability via a Database
> Manager, Multi-tenancy, and a large number of other capabilities for High
> Availability scale-out production deployments.  EsgynDB also provides full
> BI and Analytics capabilities, again because of our heritage products
> supporting up to 250TB EDWs for HP and customers like Walmart competing
> with Teradata, leveraging Apache ORC and Parquet.  So yes, it can integrate
> with other storage engines as needed.
>
> However, in spite of all this, the pricing on EsgynDB is very competitive
> – in other words “cheap” compared to anything else with the same caliber of
> capabilities.
>
> We have demonstrated the capability of the product by running the TPC-C
> and TPC-DS (all 99 queries) benchmarks, especially at high concurrency
> which our product is especially well suited for, based on its architecture
> and patents.  (The TPC-DS benchmarks are run on ORC and Parquet for obvious
> reasons.)
>
> We just closed a couple of very large Core Banking deals in Guiyang where
> we are replacing the entire Core Banking system for these banks from their
> current Oracle implementations – where they were having challenges scaling
> at a reasonable cost.  But we have many customers both in the US and China
> that are using EsgynDB for operational, BI and Analytics needs.  And now
> finally … OLTP.
>
> I know that this is sounding more like 
