
Re: Fixing the 2.1 branch - a brief analysis

2022-12-01 Thread Christopher

Oh, in case you updated your local 2.1 branch after main was merged
onto it, but before I force-pushed to fix it, just reset your 2.1
branch by checking out your local 2.1 branch, doing `git remote
update` and then `git reset --hard upstream/2.1` before you do
anything else. That way, we don't get those commits added back in
accidentally.
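
The reset described above can be sketched as a small shell function. This is only an illustration: it assumes, as the message does, that the Apache repository is configured as a remote named `upstream`, and the helper name `reset_branch_to_remote` is made up for this example.

```shell
# Sketch only: realign a local branch with its force-pushed remote branch.
# Assumption: the remote is named "upstream", as in the message above.
# reset_branch_to_remote is a hypothetical helper, not part of git.
reset_branch_to_remote() {
  remote="$1"   # e.g. upstream
  branch="$2"   # e.g. 2.1
  # The final step discards local commits and uncommitted changes that are
  # not on $remote/$branch, so it only runs if the earlier steps succeed.
  git checkout "$branch" &&
    git remote update &&
    git reset --hard "$remote/$branch"
}

# Usage, from inside your clone:
#   reset_branch_to_remote upstream 2.1
```

Because `git reset --hard` throws away local work on that branch, anything worth keeping should be moved to another branch first.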

On Thu, Dec 1, 2022 at 5:26 PM Christopher  wrote:
>
> Hi Accumulo devs,
>
> I just wanted to give you a heads up about branch maintenance for the
> 2.1 branch. A few mistakes were made, and the main branch accidentally
> got merged backwards into the 2.1 maintenance branch instead of the
> other way around. In order to fix this and not have the history
> completely unintelligible, I force-pushed the 2.1 maintenance branch
> back to the commit just prior. It was just the single merge commit
> that grabbed the main branch that needed to be removed to fix things
> in the 2.1 branch. Only the 2.1 branch was affected. The main branch
> did not need to be corrected so drastically. But, after I fixed the
> 2.1 branch, I did merge it forward into the main branch as per our
> usual procedure, to complete the original task that was being
> attempted.
>
> tl;dr -
> If you're curious, the relevant tickets were #3082, #3101, and #3102.
> #3082 was the original ticket adding a feature to 2.1.1. #3101
> correctly reverted it from the 2.1 branch. This revert was then
> attempted to be merged into the main branch. That was done as a
> separate PR in #3102. For what it's worth, I don't recommend doing
> merges using the GitHub UI that way. GitHub assumes you're combining
> the history of both branches fully into a single resulting branch,
> instead of merely incorporating one branch into the other. This
> assumption caused several problems. The first is the presence of a
> very risky "delete" button suggesting the 2.1 branch can be deleted
> when we don't want it to be deleted. The second, and more troublesome
> problem that caused the issues we needed to fix, was that GitHub's
> conflict resolution UI will merge all of the main branch into the 2.1
> branch during conflict resolution, because it believes you're merely
> bringing the 2.1 branch up-to-date with main in order to merge a
> feature branch into main and remove the feature branch. It assumes a
> feature branch workflow, not a one-way-merge maintenance branch
> workflow.
>
> In order to avoid problems like this in future, I recommend merging
> into the main branch from the maintenance branches by using the
> command-line. `git mergetool` is your friend :)
>
> I hope this explanation helps others understand what happened and how
> to avoid similar issues with GitHub in future.
>
> Regards,
> Christopher Tubbs

Fixing the 2.1 branch - a brief analysis

2022-12-01 Thread Christopher

Hi Accumulo devs,

I just wanted to give you a heads up about branch maintenance for the
2.1 branch. A few mistakes were made, and the main branch accidentally
got merged backwards into the 2.1 maintenance branch instead of the
other way around. In order to fix this and not have the history
completely unintelligible, I force-pushed the 2.1 maintenance branch
back to the commit just prior. It was just the single merge commit
that grabbed the main branch that needed to be removed to fix things
in the 2.1 branch. Only the 2.1 branch was affected. The main branch
did not need to be corrected so drastically. But, after I fixed the
2.1 branch, I did merge it forward into the main branch as per our
usual procedure, to complete the original task that was being
attempted.

tl;dr -
If you're curious, the relevant tickets were #3082, #3101, and #3102.
#3082 was the original ticket adding a feature to 2.1.1. #3101
correctly reverted it from the 2.1 branch. This revert was then
attempted to be merged into the main branch. That was done as a
separate PR in #3102. For what it's worth, I don't recommend doing
merges using the GitHub UI that way. GitHub assumes you're combining
the history of both branches fully into a single resulting branch,
instead of merely incorporating one branch into the other. This
assumption caused several problems. The first is the presence of a
very risky "delete" button suggesting the 2.1 branch can be deleted
when we don't want it to be deleted. The second, and more troublesome
problem that caused the issues we needed to fix, was that GitHub's
conflict resolution UI will merge all of the main branch into the 2.1
branch during conflict resolution, because it believes you're merely
bringing the 2.1 branch up-to-date with main in order to merge a
feature branch into main and remove the feature branch. It assumes a
feature branch workflow, not a one-way-merge maintenance branch
workflow.

In order to avoid problems like this in future, I recommend merging
into the main branch from the maintenance branches by using the
command-line. `git mergetool` is your friend :)
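
A command-line merge in the recommended direction might look like the sketch below. It assumes both local branches are already up to date with the shared repository; the helper name `merge_forward` and the branch names are illustrative only.

```shell
# Sketch only: merge a maintenance branch forward into main locally.
# merge_forward is a hypothetical helper; branch names are examples.
merge_forward() {
  maint="$1"    # e.g. 2.1
  target="$2"   # e.g. main
  git checkout "$target" || return 1
  # Merge the maintenance branch INTO the target, never the reverse.
  if ! git merge "$maint"; then
    # Conflicts: resolve them locally, then conclude the merge commit.
    git mergetool && git commit --no-edit
  fi
}

# Usage: merge_forward 2.1 main
# Review the result (e.g. git log --graph --oneline) before pushing.
```

Checking out the target branch first is what keeps the merge one-way: the maintenance branch itself is never modified.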

I hope this explanation helps others understand what happened and how
to avoid similar issues with GitHub in future.

Regards,
Christopher Tubbs

Re: Pinning a Table to a Specific Tablet Server

2022-11-07 Thread Christopher

Yes. You can do this with the balancer. The default balancer
("manager.tablet.balancer") is a per-table TableLoadBalancer, so each
table can be configured with its own custom balancer, or you may be
able to use the existing HostRegexTableLoadBalancer for your purpose.
I don't have expert knowledge on how to configure this, but hopefully
that's a start.
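
As a rough starting point, the HostRegexTableLoadBalancer route might involve properties like the ones printed by the sketch below. The class name and the property prefix are assumptions drawn from the 2.1 documentation for that balancer as best recalled, and are not confirmed anywhere in this thread, so check them against the javadoc for your version. The table name and host pattern are placeholders.

```shell
# Sketch only: print properties that would pin one table to matching hosts.
# Assumptions (verify against the HostRegexTableLoadBalancer javadoc):
#   - the balancer class lives in org.apache.accumulo.core.spi.balancer (2.1)
#   - per-table host patterns use the table.custom.balancer.host.regex. prefix
# host_regex_props is a hypothetical helper; its arguments are placeholders.
host_regex_props() {
  table="$1"   # table name, e.g. mytable
  regex="$2"   # host name pattern, e.g. 'tserver01.*'
  # printf reuses the format for each pair of arguments: one line per property.
  printf '%s=%s\n' \
    "manager.tablet.balancer" \
    "org.apache.accumulo.core.spi.balancer.HostRegexTableLoadBalancer" \
    "table.custom.balancer.host.regex.$table" \
    "$regex"
}

# Usage: host_regex_props mytable 'tserver01.*'
```

If these assumptions hold, the pattern selects hosts rather than individual tablet server processes, which matches the "a specific node might also be acceptable" case in the question.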

On Mon, Nov 7, 2022 at 12:41 PM Logan Jones  wrote:
>
> Hello:
>
> I would like to know if there's a way to configure Accumulo such that a
> specific table is hosted by a specific tablet server. Ideally I could
> specify a specific tablet server, but a specific node might also be
> acceptable.
>
> Thanks in advance,
>
>  - Logan

Re: Rollback to 1.9.3

2022-11-04 Thread Christopher

I don't know what could go wrong, so it's hard to say that it would be
obvious.

For what it's worth, if we can identify a performance bug, we can release a
fix in a 1.10.3, so you can upgrade instead of downgrade.

On Fri, Nov 4, 2022, 16:26 Logan Jones  wrote:

> Thanks all,
>
> We have a test system that we could try rolling back. If something does
> break, will it be obvious?
>
> Dave, the ingest rates are slightly more spiky, but I think it's mostly
> because tservers are bouncing and the cluster is working to catch up.
> Nothing major jumps out as an increase in throughput (i.e., the ingest rate
> in terms of operations per second seems to be roughly equivalent. The same
> is true for the ingest rate in MB/s.)
>
> On Fri, Nov 4, 2022 at 4:21 PM Christopher  wrote:
>
> > I don't think it has any changes that would prevent rollback, but it's
> > not a scenario that has been tested to my knowledge.
> >
> > On Fri, Nov 4, 2022, 16:15 Dave Marion  wrote:
> >
> > > It's going to take some time to review the changes[1], but I don't see
> > > changes in the default JVM sizes. I was wondering if maybe the issue is
> > > that it's running faster. You are loading the same amount of data, but
> > > is it going faster by chance? If so, you could be creating more garbage
> > > per unit time putting more pressure on the GC. Just a thought.
> > >
> > > [1] https://github.com/apache/accumulo/compare/rel/1.9.3..rel/1.10.2
> > >
> > > On Fri, Nov 4, 2022 at 4:02 PM Logan Jones wrote:
> > >
> > > > Yeah, our memory usage is drastically different since the upgrade.
> > > >
> > > > We are seeing spikes in heap utilization on tablet servers that
> > > > weren't happening before the upgrade despite our ingest load being
> > > > roughly the same. This increase in heap utilization seems to be
> > > > causing long GC times. Those GC times are long enough that the
> > > > tablet servers lose their locks and then die.
> > > >
> > > > Looking into the JVM options, we don't see anything obvious that
> > > > changed around the garbage collector, and looking at the Accumulo
> > > > release notes didn't leave us any indication that something like this
> > > > should have changed, but nevertheless we are seeing crashes of
> > > > tservers. I'm mostly trying to identify whether or not rollback is
> > > > even an option.
> > > >
> > > > - Logan
> > > >
> > > > On Fri, Nov 4, 2022 at 3:49 PM Dave Marion wrote:
> > > >
> > > > > Are you running into an error or some other issue that is making
> > > > > you think that you have to rollback? I don't know that rolling
> > > > > back has been tested.
> > > > >
> > > > > On Fri, Nov 4, 2022 at 3:40 PM Logan Jones wrote:
> > > > >
> > > > > > Hello:
> > > > > >
> > > > > > We recently upgraded from Accumulo 1.9.3 to 1.10.2. Is it safe to
> > > > > > roll back to Accumulo 1.9.3?
> > > > > >
> > > > > > Thanks in advance,
> > > > > >
> > > > > > - Logan
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Rollback to 1.9.3

2022-11-04 Thread Christopher

I don't think it has any changes that would prevent rollback, but it's not
a scenario that has been tested to my knowledge.

On Fri, Nov 4, 2022, 16:15 Dave Marion  wrote:

> It's going to take some time to review the changes[1], but I don't see
> changes in the default JVM sizes. I was wondering if maybe the issue is
> that it's running faster. You are loading the same amount of data, but is
> it going faster by chance? If so, you could be creating more garbage per
> unit time putting more pressure on the GC. Just a thought.
>
> [1] https://github.com/apache/accumulo/compare/rel/1.9.3..rel/1.10.2
>
> On Fri, Nov 4, 2022 at 4:02 PM Logan Jones  wrote:
>
> > Yeah, our memory usage is drastically different since the upgrade.
> >
> > We are seeing spikes in heap utilization on tablet servers that weren't
> > happening before the upgrade despite our ingest load being roughly the
> > same. This increase in heap utilization seems to be causing long GC
> > times. Those GC times are long enough that the tablet servers lose their
> > locks and then die.
> >
> > Looking into the JVM options, we don't see anything obvious that changed
> > around the garbage collector, and looking at the Accumulo release notes
> > didn't leave us any indication that something like this should have
> > changed, but nevertheless we are seeing crashes of tservers. I'm mostly
> > trying to identify whether or not rollback is even an option.
> >
> > - Logan
> >
> > On Fri, Nov 4, 2022 at 3:49 PM Dave Marion wrote:
> >
> > > Are you running into an error or some other issue that is making you
> > > think that you have to rollback? I don't know that rolling back has
> > > been tested.
> > >
> > > On Fri, Nov 4, 2022 at 3:40 PM Logan Jones wrote:
> > >
> > > > Hello:
> > > >
> > > > We recently upgraded from Accumulo 1.9.3 to 1.10.2. Is it safe to
> > > > roll back to Accumulo 1.9.3?
> > > >
> > > > Thanks in advance,
> > > >
> > > > - Logan
> > > >
> > >
> >
>

Re: [DRAFT][ANNOUNCE] Apache Accumulo 2.1.0

2022-11-02 Thread Christopher

1. If there's anything missing from the release notes, we can still add
them. Please do so.

2. For the description, I can make that change.

On Wed, Nov 2, 2022 at 7:11 AM Keith Turner  wrote:

> On Wed, Nov 2, 2022 at 10:59 AM Dave Marion  wrote:
> >
> > I don't see it either. I know that it is documented in the user guide. I
> > can certainly add a one-liner under "Other notable changes" if you like.
>
> That would be appreciated.  You probably know the most about the
> changes.  If the user guide documentation is all in place then the
> release notes could link to it.
>
> >
> > On Wed, Nov 2, 2022 at 6:53 AM Keith Turner  wrote:
> >
> > > In the release notes I do not see mention of the new cluster yaml file
> > > that replaced the tservers, monitor, gc, master files. Am I just
> > > missing it?  That's a really nice user facing change that should be
> > > mentioned.
> > >
> > > On Wed, Nov 2, 2022 at 7:55 AM Christopher wrote:
> > > >
> > > > The following is a draft announcement for the 2.1 release. I've
> > > > prepared it here for a little bit of feedback, before sending it out
> > > > later today.
> > > >
> > > > ***
> > > >
> > > > The Apache Accumulo project is pleased to announce the release
> > > > of Apache Accumulo 2.1.0! Apache Accumulo 2.1.0 contains numerous
> > > > features and improvements, and contains over 1200 contributions from
> > > > over 50 contributors.
> > > >
> > > > This release includes external compactions, separate scanner thread
> > > > pools, separate compaction queues, per-table encryption configuration,
> > > > scan servers, atomic configuration of multiple properties, more
> > > > efficient use of ZooKeeper watches on configuration nodes, a
> > > > convenient JShell launch script, and many, many more changes.
> > > >
> > > > See the release notes linked below for more details.
> > > >
> > > > 2.1 is an LTM (Long-Term Maintenance) release line. Users of 2.0 or
> > > > 1.10 are encouraged to upgrade to this latest version. 2.0 is
> > > > end-of-life immediately, and will not receive any further updates.
> > > > 1.10 will reach end-of-life in one year. Upgrades are supported
> > > > directly from the latest 1.10 or 2.0.1 only, so it is recommended to
> > > > upgrade to one of these first.
> > > >
> > > > ***
> > > >
> > > > Apache Accumulo® is a sorted, distributed key/value store that
> > > > provides robust, scalable data storage and retrieval. With
> > > > Apache Accumulo, users can store and manage large data sets
> > > > across a cluster. Accumulo uses Apache Hadoop's HDFS to store
> > > > its data and Apache ZooKeeper for consensus.
> > > >
> > > > This version is now available in Maven Central, and at:
> > > > https://accumulo.apache.org/downloads/
> > > >
> > > > The full release notes can be viewed at:
> > > > https://accumulo.apache.org/release/accumulo-2.1.0/
> > >
>

[DRAFT][ANNOUNCE] Apache Accumulo 2.1.0

2022-11-02 Thread Christopher

The following is a draft announcement for the 2.1 release. I've prepared it
here for a little bit of feedback, before sending it out later today.

***

The Apache Accumulo project is pleased to announce the release
of Apache Accumulo 2.1.0! Apache Accumulo 2.1.0 brings numerous
features and improvements, with over 1200 contributions from
over 50 contributors.

This release includes external compactions, separate scanner thread
pools, separate compaction queues, per-table encryption configuration,
scan servers, atomic configuration of multiple properties, more efficient
use of ZooKeeper watches on configuration nodes, a convenient JShell
launch script, and many, many more changes.

See the release notes linked below for more details.

2.1 is an LTM (Long-Term Maintenance) release line. Users of 2.0 or
1.10 are encouraged to upgrade to this latest version. 2.0 is end-of-life
immediately, and will not receive any further updates. 1.10 will reach
end-of-life in one year. Upgrades are supported directly from the
latest 1.10 or 2.0.1 only, so it is recommended to upgrade to one of
these first.

***

Apache Accumulo® is a sorted, distributed key/value store that
provides robust, scalable data storage and retrieval. With
Apache Accumulo, users can store and manage large data sets
across a cluster. Accumulo uses Apache Hadoop's HDFS to store
its data and Apache ZooKeeper for consensus.

This version is now available in Maven Central, and at:
https://accumulo.apache.org/downloads/

The full release notes can be viewed at:
https://accumulo.apache.org/release/accumulo-2.1.0/

[RESULT][VOTE] Apache Accumulo 2.1.0-rc4

2022-11-01 Thread Christopher

This vote passes with 6 +1s and no other votes.

Post-vote tasks are being tracked in
https://github.com/apache/accumulo/issues/3060 and I'll be working through
the list today.

On Tue, Nov 1, 2022 at 10:00 AM Christopher  wrote:

> +1 (binding)
>
> Verified:
> * Checksums and signatures
> * All jars have corresponding source and javadoc jar
> * All jars are sealed
> * Jar manifests reference correct git reference
> * All jars in the repo match the jars in the lib/ dir in the binary tarball
> * Source tarball contents match git checkout
> * All ITs pass
>
>
> On Tue, Nov 1, 2022 at 9:09 AM Dominic Garguilo wrote:
>
>> +1 (binding)
>>
>>  - Validated checksums
>>  - Ran ITs
>>
>> Created an 8 node cluster using:
>>   CentOS 7.9
>>   ZooKeeper 3.8.0
>>   Hadoop 3.3.4
>>   Accumulo 2.1.0-RC4
>>
>> Ran the following tests on the cluster:
>>
>> Created ci table with per-table encryption and internal compactions
>> enabled.
>>
>>  - Ran continuous ingest for 22 hours then stopped and flushed the table
>>  - Ran verify with eventual scans which completed successfully
>>  - Ran another ~5 hours of continuous ingest with agitation then stopped
>> and flushed the table
>>  - Ran verify with eventual scans which completed successfully
>>
>>
>>
>>
>> On Mon, Oct 31, 2022 at 3:14 PM Keith Turner  wrote:
>>
>> > +1  after reading about the testing Dave did.
>> >
>> > On Thu, Oct 27, 2022 at 7:02 AM Christopher wrote:
>> > >
>> > > Accumulo Developers,
>> > >
>> > > Please consider the following candidate for Apache Accumulo 2.1.0.
>> > > The only change since 2.1.0-rc3 was
>> > > https://github.com/apache/accumulo/pull/3051
>> > >
>> > > Git Commit:
>> > > 706612f859d6e68891d487d624eda9ecf3fea7f9
>> > > Branch:
>> > > 2.1.0-rc4
>> > >
>> > > If this vote passes, a gpg-signed tag will be created using:
>> > > git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
>> > > 706612f859d6e68891d487d624eda9ecf3fea7f9
>> > >
>> > > Staging repo:
>> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1097
>> > > Source (official release artifact):
>> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
>> > > Binary:
>> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz
>> > >
>> > > Append ".asc" to download the cryptographic signature for a given artifact.
>> > > (You can also append ".sha1" or ".md5" instead in order to verify the
>> > > checksums generated by Maven to verify the integrity of the Nexus
>> > > repository staging area.)
>> > >
>> > > Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
>> > > (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
>> > >
>> > > In addition to the tarballs and their signatures, the following checksum
>> > > files will be added to the dist/release SVN area after release:
>> > > accumulo-2.1.0-src.tar.gz.sha512 will contain:
>> > > SHA512 (accumulo-2.1.0-src.tar.gz) =
>> > > 048a5520870ef7417570da9f084ba197b9982caafaca43558e63d58e0407b495d94acbb21acb478c307c25c4c3f6be03510d278acc55aa0cf500a55e9d923d8c
>> > > accumulo-2.1.0-bin.tar.gz.sha512 will contain:
>> > > SHA512 (accumulo-2.1.0-bin.tar.gz) =
>> > > 9bbc4defc114013f145e9e9fdd08683c842a29faaa01ea6e8049a6aecef86ee6657cce23183b411a64c31123db682a6944e2825eef83c76fb5a91620235f
>> > >
>> > > Release notes (in progress) can be found at:
>> > > https://accumulo.staged.apache.org/release/accumulo-2.1.0
>> > >
>> > > Release testing instructions:
>> > > https://accumulo.apache.org/contributor/verifying-release
>> > >
>> > > Please vote one of:
>> > > [ ] +1 - I have verified and accept...
>> > > [ ] +0 - I have reservations, but not strong enough to vote against...
>> > > [ ] -1 - Because..., I do not accept...
>> > > ... these artifacts as the 2.1.0 release of Apache Accumulo.
>> > >
>> > > This vote will remain open until at least Sun Oct 30 06:00:00 AM UTC 2022.
>> > > (Sun Oct 30 02:00:00 AM EDT 2022 / Sat Oct 29 11:00:00 PM PDT 2022)
>> > > Voting can continue after this deadline until the release manager
>> > > sends an email ending the vote.
>> > >
>> > > Thanks!
>> > >
>> > > P.S. Hint: download the whole staging repo with
>> > > wget -erobots=off -r -l inf -np -nH \
>> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/
>> > > # note the trailing slash is needed
>> >
>>
>

Re: [VOTE] Apache Accumulo 2.1.0-rc4

2022-11-01 Thread Christopher

+1 (binding)

Verified:
* Checksums and signatures
* All jars have corresponding source and javadoc jar
* All jars are sealed
* Jar manifests reference correct git reference
* All jars in the repo match the jars in the lib/ dir in the binary tarball
* Source tarball contents match git checkout
* All ITs pass


On Tue, Nov 1, 2022 at 9:09 AM Dominic Garguilo wrote:

> +1 (binding)
>
>  - Validated checksums
>  - Ran ITs
>
> Created an 8 node cluster using:
>   CentOS 7.9
>   ZooKeeper 3.8.0
>   Hadoop 3.3.4
>   Accumulo 2.1.0-RC4
>
> Ran the following tests on the cluster:
>
> Created ci table with per-table encryption and internal compactions
> enabled.
>
>  - Ran continuous ingest for 22 hours then stopped and flushed the table
>  - Ran verify with eventual scans which completed successfully
>  - Ran another ~5 hours of continuous ingest with agitation then stopped
> and flushed the table
>  - Ran verify with eventual scans which completed successfully
>
>
>
>
> On Mon, Oct 31, 2022 at 3:14 PM Keith Turner  wrote:
>
> > +1  after reading about the testing Dave did.
> >
> > On Thu, Oct 27, 2022 at 7:02 AM Christopher  wrote:
> > >
> > > Accumulo Developers,
> > >
> > > Please consider the following candidate for Apache Accumulo 2.1.0.
> > > The only change since 2.1.0-rc3 was
> > > https://github.com/apache/accumulo/pull/3051
> > >
> > > Git Commit:
> > > 706612f859d6e68891d487d624eda9ecf3fea7f9
> > > Branch:
> > > 2.1.0-rc4
> > >
> > > If this vote passes, a gpg-signed tag will be created using:
> > > git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
> > > 706612f859d6e68891d487d624eda9ecf3fea7f9
> > >
> > > Staging repo:
> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1097
> > > Source (official release artifact):
> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
> > > Binary:
> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz
> > >
> > > Append ".asc" to download the cryptographic signature for a given artifact.
> > > (You can also append ".sha1" or ".md5" instead in order to verify the
> > > checksums generated by Maven to verify the integrity of the Nexus
> > > repository staging area.)
> > >
> > > Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
> > > (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
> > >
> > > In addition to the tarballs and their signatures, the following checksum
> > > files will be added to the dist/release SVN area after release:
> > > accumulo-2.1.0-src.tar.gz.sha512 will contain:
> > > SHA512 (accumulo-2.1.0-src.tar.gz) =
> > > 048a5520870ef7417570da9f084ba197b9982caafaca43558e63d58e0407b495d94acbb21acb478c307c25c4c3f6be03510d278acc55aa0cf500a55e9d923d8c
> > > accumulo-2.1.0-bin.tar.gz.sha512 will contain:
> > > SHA512 (accumulo-2.1.0-bin.tar.gz) =
> > > 9bbc4defc114013f145e9e9fdd08683c842a29faaa01ea6e8049a6aecef86ee6657cce23183b411a64c31123db682a6944e2825eef83c76fb5a91620235f
> > >
> > > Release notes (in progress) can be found at:
> > > https://accumulo.staged.apache.org/release/accumulo-2.1.0
> > >
> > > Release testing instructions:
> > > https://accumulo.apache.org/contributor/verifying-release
> > >
> > > Please vote one of:
> > > [ ] +1 - I have verified and accept...
> > > [ ] +0 - I have reservations, but not strong enough to vote against...
> > > [ ] -1 - Because..., I do not accept...
> > > ... these artifacts as the 2.1.0 release of Apache Accumulo.
> > >
> > > This vote will remain open until at least Sun Oct 30 06:00:00 AM UTC 2022.
> > > (Sun Oct 30 02:00:00 AM EDT 2022 / Sat Oct 29 11:00:00 PM PDT 2022)
> > > Voting can continue after this deadline until the release manager
> > > sends an email ending the vote.
> > >
> > > Thanks!
> > >
> > > P.S. Hint: download the whole staging repo with
> > > wget -erobots=off -r -l inf -np -nH \
> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/
> > > # note the trailing slash is needed
> >
>

Re: [VOTE] Apache Accumulo 2.1.0-rc4

2022-10-29 Thread Christopher Shannon

+1 (binding), looks good to me

Some of the things I did for verification/testing locally:

* Validated signatures and checksums
* Verified license and notice files in archives
* Verified source license headers with 'mvn apache-rat:check'
* Built and ran all the sunny integration tests
* Ran several tests using Uno for the new features in 2.1.0 such as
per-table encryption, shell command changes, etc.
* Ran tests using accumulo-testing against Uno including ingest and verify

On Thu, Oct 27, 2022 at 2:02 AM Christopher  wrote:

> Accumulo Developers,
>
> Please consider the following candidate for Apache Accumulo 2.1.0.
> The only change since 2.1.0-rc3 was
> https://github.com/apache/accumulo/pull/3051
>
> Git Commit:
> 706612f859d6e68891d487d624eda9ecf3fea7f9
> Branch:
> 2.1.0-rc4
>
> If this vote passes, a gpg-signed tag will be created using:
> git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
> 706612f859d6e68891d487d624eda9ecf3fea7f9
>
> Staging repo:
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1097
> Source (official release artifact):
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
> Binary:
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz
>
> Append ".asc" to download the cryptographic signature for a given artifact.
> (You can also append ".sha1" or ".md5" instead in order to verify the
> checksums generated by Maven to verify the integrity of the Nexus
> repository staging area.)
>
> Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
> (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
>
> In addition to the tarballs and their signatures, the following checksum
> files will be added to the dist/release SVN area after release:
> accumulo-2.1.0-src.tar.gz.sha512 will contain:
> SHA512 (accumulo-2.1.0-src.tar.gz) =
> 048a5520870ef7417570da9f084ba197b9982caafaca43558e63d58e0407b495d94acbb21acb478c307c25c4c3f6be03510d278acc55aa0cf500a55e9d923d8c
> accumulo-2.1.0-bin.tar.gz.sha512 will contain:
> SHA512 (accumulo-2.1.0-bin.tar.gz) =
> 9bbc4defc114013f145e9e9fdd08683c842a29faaa01ea6e8049a6aecef86ee6657cce23183b411a64c31123db682a6944e2825eef83c76fb5a91620235f
>
> Release notes (in progress) can be found at:
> https://accumulo.staged.apache.org/release/accumulo-2.1.0
>
> Release testing instructions:
> https://accumulo.apache.org/contributor/verifying-release
>
> Please vote one of:
> [ ] +1 - I have verified and accept...
> [ ] +0 - I have reservations, but not strong enough to vote against...
> [ ] -1 - Because..., I do not accept...
> ... these artifacts as the 2.1.0 release of Apache Accumulo.
>
> This vote will remain open until at least Sun Oct 30 06:00:00 AM UTC 2022.
> (Sun Oct 30 02:00:00 AM EDT 2022 / Sat Oct 29 11:00:00 PM PDT 2022)
> Voting can continue after this deadline until the release manager
> sends an email ending the vote.
>
> Thanks!
>
> P.S. Hint: download the whole staging repo with
> wget -erobots=off -r -l inf -np -nH \
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/
> # note the trailing slash is needed
>

[VOTE] Apache Accumulo 2.1.0-rc4

2022-10-27 Thread Christopher

Accumulo Developers,

Please consider the following candidate for Apache Accumulo 2.1.0.
The only change since 2.1.0-rc3 was
https://github.com/apache/accumulo/pull/3051

Git Commit:
706612f859d6e68891d487d624eda9ecf3fea7f9
Branch:
2.1.0-rc4

If this vote passes, a gpg-signed tag will be created using:
git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
706612f859d6e68891d487d624eda9ecf3fea7f9

Staging repo:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1097
Source (official release artifact):
https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
Binary:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz

Append ".asc" to download the cryptographic signature for a given artifact.
(You can also append ".sha1" or ".md5" instead in order to verify the
checksums generated by Maven to verify the integrity of the Nexus
repository staging area.)

Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
(Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
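
For reviewers who want to script the signature check, one possible approach is sketched below. It assumes the KEYS file has already been downloaded and imported with `gpg --import KEYS`, and that each artifact sits next to its `.asc` file; the helper name `verify_signature` is invented for this example.

```shell
# Sketch only: check a detached signature and the signer's fingerprint.
# verify_signature is a hypothetical helper. It assumes the KEYS file has
# already been imported (gpg --import KEYS).
verify_signature() {
  artifact="$1"       # e.g. accumulo-2.1.0-src.tar.gz (with .asc beside it)
  expected_fpr="$2"   # fingerprint announced in the vote email
  # --status-fd 1 makes gpg emit machine-readable status lines such as
  # "[GNUPG:] VALIDSIG <fingerprint> ..." on stdout for a good signature.
  gpg --status-fd 1 --verify "$artifact.asc" "$artifact" 2>/dev/null |
    grep -F "[GNUPG:] VALIDSIG" |
    grep -q -F "$expected_fpr"
}

# Usage:
#   verify_signature accumulo-2.1.0-src.tar.gz \
#     8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D && echo "signature OK"
```

Matching on the status line rather than on gpg's human-readable output keeps the check independent of locale and wording changes.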

In addition to the tarballs and their signatures, the following checksum
files will be added to the dist/release SVN area after release:
accumulo-2.1.0-src.tar.gz.sha512 will contain:
SHA512 (accumulo-2.1.0-src.tar.gz) =
048a5520870ef7417570da9f084ba197b9982caafaca43558e63d58e0407b495d94acbb21acb478c307c25c4c3f6be03510d278acc55aa0cf500a55e9d923d8c
accumulo-2.1.0-bin.tar.gz.sha512 will contain:
SHA512 (accumulo-2.1.0-bin.tar.gz) =
9bbc4defc114013f145e9e9fdd08683c842a29faaa01ea6e8049a6aecef86ee6657cce23183b411a64c31123db682a6944e2825eef83c76fb5a91620235f
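
A checksum comparison against the values above could be scripted along these lines. This is a sketch that assumes GNU coreutils' `sha512sum` is available; the helper name `check_sha512` is hypothetical.

```shell
# Sketch only: compare a file's SHA-512 digest with an announced value.
# check_sha512 is a hypothetical helper; it assumes GNU coreutils' sha512sum
# is installed.
check_sha512() {
  file="$1"
  expected="$2"   # lowercase hex digest, as printed in the vote email
  # sha512sum prints "<digest>  <file>"; keep only the digest field.
  actual=$(sha512sum "$file" | cut -d ' ' -f 1)
  # An empty digest (e.g. missing file) never counts as a match.
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage:
#   check_sha512 accumulo-2.1.0-src.tar.gz "<digest from the vote email>" \
#     && echo "checksum OK"
```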

Release notes (in progress) can be found at:
https://accumulo.staged.apache.org/release/accumulo-2.1.0

Release testing instructions:
https://accumulo.apache.org/contributor/verifying-release

Please vote one of:
[ ] +1 - I have verified and accept...
[ ] +0 - I have reservations, but not strong enough to vote against...
[ ] -1 - Because..., I do not accept...
... these artifacts as the 2.1.0 release of Apache Accumulo.

This vote will remain open until at least Sun Oct 30 06:00:00 AM UTC 2022.
(Sun Oct 30 02:00:00 AM EDT 2022 / Sat Oct 29 11:00:00 PM PDT 2022)
Voting can continue after this deadline until the release manager
sends an email ending the vote.

Thanks!

P.S. Hint: download the whole staging repo with
wget -erobots=off -r -l inf -np -nH \
https://repository.apache.org/content/repositories/orgapacheaccumulo-1097/
# note the trailing slash is needed

[WITHDRAWN][VOTE] Apache Accumulo 2.1.0-rc3

2022-10-26 Thread Christopher

Okay, hopefully this is the last time. I'm once again withdrawing this
release candidate due to the issue identified in #3050, and the resulting
failing ITs.

On Wed, Oct 26, 2022 at 3:38 PM dev1  wrote:

> -1 - There seems to be an issue with the testCompactions test in
> ShellServerIT - not sure if it's related to the other metadata issues, but
> looks odd - see https://github.com/apache/accumulo/issues/3050
>
> -----Original Message-----
> From: Christopher 
> Sent: Wednesday, October 26, 2022 3:03 PM
> To: accumulo-dev 
> Subject: [VOTE] Apache Accumulo 2.1.0-rc3
>
> Accumulo Developers,
>
> Please consider the following candidate for Apache Accumulo 2.1.0.
> Notably, the changes since 2.1.0-rc1 were:
>
> * https://github.com/apache/accumulo/pull/3047
> * https://github.com/apache/accumulo/pull/3049
>
> Git Commit:
> d174737cb132027b9e74b5d30ddfbf949d3d9848
> Branch:
> 2.1.0-rc3
>
> If this vote passes, a gpg-signed tag will be created using:
> git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
> d174737cb132027b9e74b5d30ddfbf949d3d9848
>
> Staging repo:
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1096
> Source (official release artifact):
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1096/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
> Binary:
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1096/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz
>
> Append ".asc" to download the cryptographic signature for a given artifact.
> (You can also append ".sha1" or ".md5" instead in order to verify the
> checksums generated by Maven to verify the integrity of the Nexus
> repository staging area.)
>
> Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
> (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
>
> In addition to the tarballs and their signatures, the following checksum
> files will be added to the dist/release SVN area after release:
> accumulo-2.1.0-src.tar.gz.sha512 will contain:
> SHA512 (accumulo-2.1.0-src.tar.gz) =
> b9e61247eea3674fac6ee7a146420d6d0c132bade82e04769951b2fb8e601982a9c0c2ba8c3aa1adf24e04dcee633a50fef36e36d3a08eb2d00e1c4a88d3176e
> accumulo-2.1.0-bin.tar.gz.sha512 will contain:
> SHA512 (accumulo-2.1.0-bin.tar.gz) =
> 8a6960c91ae49f21ffc9a7c432c585aa03bc5ffd2cfa95b7d4424c861008490322e0463772be28210333b41ede7cfd7f5c8a79af12ce400f57c6d592e3210b5f
>
> Release notes (in progress) can be found at:
> https://accumulo.staged.apache.org/release/accumulo-2.1.0
>
> Release testing instructions:
> https://accumulo.apache.org/contributor/verifying-release
>
> Please vote one of:
> [ ] +1 - I have verified and accept...
> [ ] +0 - I have reservations, but not strong enough to vote against...
> [ ] -1 - Because..., I do not accept...
> ... these artifacts as the 2.1.0 release of Apache Accumulo.
>
> This vote will remain open until at least Sat Oct 29 07:30:00 PM UTC 2022.
> (Sat Oct 29 03:30:00 PM EDT 2022 / Sat Oct 29 12:30:00 PM PDT 2022) Voting
> can continue after this deadline until the release manager sends an email
> ending the vote.
>
> Thanks!
>
> P.S. Hint: download the whole staging repo with
> wget -erobots=off -r -l inf -np -nH \
>   https://repository.apache.org/content/repositories/orgapacheaccumulo-1096/
> # note the trailing slash is needed
>

Re: [VOTE] Apache Accumulo 2.1.0-rc3

2022-10-26 Thread Christopher

That should have said the changes since rc2, not since rc1. They are
aggregate.

On Wed, Oct 26, 2022 at 3:02 PM Christopher  wrote:

> Accumulo Developers,
>
> Please consider the following candidate for Apache Accumulo 2.1.0.
> Notably, the changes since 2.1.0-rc1 were:
>
> * https://github.com/apache/accumulo/pull/3047
> * https://github.com/apache/accumulo/pull/3049
>
> Git Commit:
> d174737cb132027b9e74b5d30ddfbf949d3d9848
> Branch:
> 2.1.0-rc3
>
> If this vote passes, a gpg-signed tag will be created using:
> git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
> d174737cb132027b9e74b5d30ddfbf949d3d9848
>
> Staging repo:
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1096
> Source (official release artifact):
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1096/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
> Binary:
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1096/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz
>
> Append ".asc" to download the cryptographic signature for a given artifact.
> (You can also append ".sha1" or ".md5" instead in order to verify the
> checksums
> generated by Maven to verify the integrity of the Nexus repository staging
> area.)
>
> Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
> (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
>
> In addition to the tarballs and their signatures, the following checksum
> files will be added to the dist/release SVN area after release:
> accumulo-2.1.0-src.tar.gz.sha512 will contain:
> SHA512 (accumulo-2.1.0-src.tar.gz) =
> b9e61247eea3674fac6ee7a146420d6d0c132bade82e04769951b2fb8e601982a9c0c2ba8c3aa1adf24e04dcee633a50fef36e36d3a08eb2d00e1c4a88d3176e
> accumulo-2.1.0-bin.tar.gz.sha512 will contain:
> SHA512 (accumulo-2.1.0-bin.tar.gz) =
> 8a6960c91ae49f21ffc9a7c432c585aa03bc5ffd2cfa95b7d4424c861008490322e0463772be28210333b41ede7cfd7f5c8a79af12ce400f57c6d592e3210b5f
>
> Release notes (in progress) can be found at:
> https://accumulo.staged.apache.org/release/accumulo-2.1.0
>
> Release testing instructions:
> https://accumulo.apache.org/contributor/verifying-release
>
> Please vote one of:
> [ ] +1 - I have verified and accept...
> [ ] +0 - I have reservations, but not strong enough to vote against...
> [ ] -1 - Because..., I do not accept...
> ... these artifacts as the 2.1.0 release of Apache Accumulo.
>
> This vote will remain open until at least Sat Oct 29 07:30:00 PM UTC 2022.
> (Sat Oct 29 03:30:00 PM EDT 2022 / Sat Oct 29 12:30:00 PM PDT 2022)
> Voting can continue after this deadline until the release manager
> sends an email ending the vote.
>
> Thanks!
>
> P.S. Hint: download the whole staging repo with
> wget -erobots=off -r -l inf -np -nH \
>   https://repository.apache.org/content/repositories/orgapacheaccumulo-1096/
> # note the trailing slash is needed
>

[VOTE] Apache Accumulo 2.1.0-rc3

2022-10-26 Thread Christopher

Accumulo Developers,

Please consider the following candidate for Apache Accumulo 2.1.0.
Notably, the changes since 2.1.0-rc1 were:

* https://github.com/apache/accumulo/pull/3047
* https://github.com/apache/accumulo/pull/3049

Git Commit:
d174737cb132027b9e74b5d30ddfbf949d3d9848
Branch:
2.1.0-rc3

If this vote passes, a gpg-signed tag will be created using:
git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
d174737cb132027b9e74b5d30ddfbf949d3d9848

Staging repo:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1096
Source (official release artifact):
https://repository.apache.org/content/repositories/orgapacheaccumulo-1096/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
Binary:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1096/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz

Append ".asc" to download the cryptographic signature for a given artifact.
(You can also append ".sha1" or ".md5" instead in order to verify the
checksums
generated by Maven to verify the integrity of the Nexus repository staging
area.)
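
For anyone scripting the downloads, the URLs above follow a regular pattern. The following is a minimal Python sketch that rebuilds them from the staging repo number and version. The function name `staging_urls` and its return layout are illustrative assumptions, not part of any Accumulo or Nexus tooling; only the URL pattern and the ".asc", ".sha1", and ".md5" suffixes come from this email.

```python
# Base of the Nexus staging area, copied from the URLs in the vote email.
BASE = "https://repository.apache.org/content/repositories"


def staging_urls(repo_id, version, kinds=("src", "bin")):
    """Build artifact, signature, and checksum URLs for a staging repo.

    Hypothetical helper: repo_id is the number after 'orgapacheaccumulo-'.
    """
    repo = f"{BASE}/orgapacheaccumulo-{repo_id}"
    prefix = f"{repo}/org/apache/accumulo/accumulo/{version}"
    urls = {}
    for kind in kinds:
        artifact = f"{prefix}/accumulo-{version}-{kind}.tar.gz"
        urls[kind] = {
            "artifact": artifact,
            "signature": artifact + ".asc",  # detached GPG signature
            "sha1": artifact + ".sha1",      # Maven-generated checksum
            "md5": artifact + ".md5",        # Maven-generated checksum
        }
    return urls


if __name__ == "__main__":
    for kind, entries in staging_urls(1096, "2.1.0").items():
        print(kind, entries["artifact"])
```

The ".sha1" and ".md5" files only check the integrity of the staging area; the ".asc" signature is what ties an artifact to the signing key.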

Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
(Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)

In addition to the tarballs and their signatures, the following checksum
files will be added to the dist/release SVN area after release:
accumulo-2.1.0-src.tar.gz.sha512 will contain:
SHA512 (accumulo-2.1.0-src.tar.gz) =
b9e61247eea3674fac6ee7a146420d6d0c132bade82e04769951b2fb8e601982a9c0c2ba8c3aa1adf24e04dcee633a50fef36e36d3a08eb2d00e1c4a88d3176e
accumulo-2.1.0-bin.tar.gz.sha512 will contain:
SHA512 (accumulo-2.1.0-bin.tar.gz) =
8a6960c91ae49f21ffc9a7c432c585aa03bc5ffd2cfa95b7d4424c861008490322e0463772be28210333b41ede7cfd7f5c8a79af12ce400f57c6d592e3210b5f

Release notes (in progress) can be found at:
https://accumulo.staged.apache.org/release/accumulo-2.1.0

Release testing instructions:
https://accumulo.apache.org/contributor/verifying-release

Please vote one of:
[ ] +1 - I have verified and accept...
[ ] +0 - I have reservations, but not strong enough to vote against...
[ ] -1 - Because..., I do not accept...
... these artifacts as the 2.1.0 release of Apache Accumulo.

This vote will remain open until at least Sat Oct 29 07:30:00 PM UTC 2022.
(Sat Oct 29 03:30:00 PM EDT 2022 / Sat Oct 29 12:30:00 PM PDT 2022)
Voting can continue after this deadline until the release manager
sends an email ending the vote.

Thanks!

P.S. Hint: download the whole staging repo with
wget -erobots=off -r -l inf -np -nH \
  https://repository.apache.org/content/repositories/orgapacheaccumulo-1096/
# note the trailing slash is needed

[WITHDRAWN][VOTE] Apache Accumulo 2.1.0-rc2

2022-10-26 Thread Christopher

I agree. I'm withdrawing this RC and will create RC3 shortly.

On Wed, Oct 26, 2022 at 12:35 PM Keith Turner  wrote:

> -1
>
> We probably want the changes in
> https://github.com/apache/accumulo/pull/3049
>
> The changes in #3044 broke an important part of Ample.  The problem
> does not manifest until there are multiple tables, which is why the
> bulk ITs were happy.
>
> On Wed, Oct 26, 2022 at 4:21 AM Christopher  wrote:
> >
> > Accumulo Developers,
> >
> > Please consider the following candidate for Apache Accumulo 2.1.0.
> > Notably, the changes since 2.1.0-rc1 were:
> >
> >   * https://github.com/apache/accumulo/pull/3043
> >   * https://github.com/apache/accumulo/pull/3044
> >
> > There are also 3 issues that were identified that could yet serve as
> > blockers for this release, but need further investigation:
> >
> >   * https://github.com/apache/accumulo/issues/3046
> >   * https://github.com/apache/accumulo/issues/3045
> >   * https://github.com/apache/accumulo/issues/3042
> >
> >
> > Git Commit:
> > 224a2baaab04b2ec5da92daae7cc74cbd85eed2d
> > Branch:
> > 2.1.0-rc2
> >
> > If this vote passes, a gpg-signed tag will be created using:
> > git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
> > 224a2baaab04b2ec5da92daae7cc74cbd85eed2d
> >
> > Staging repo:
> > https://repository.apache.org/content/repositories/orgapacheaccumulo-1095
> > Source (official release artifact):
> > https://repository.apache.org/content/repositories/orgapacheaccumulo-1095/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
> > Binary:
> > https://repository.apache.org/content/repositories/orgapacheaccumulo-1095/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz
> >
> > Append ".asc" to download the cryptographic signature for a given artifact.
> > (You can also append ".sha1" or ".md5" instead in order to verify the
> > checksums generated by Maven to verify the integrity of the Nexus
> > repository staging area.)
> >
> > Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
> > (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
> >
> > In addition to the tarballs and their signatures, the following checksum
> > files will be added to the dist/release SVN area after release:
> > accumulo-2.1.0-src.tar.gz.sha512 will contain:
> > SHA512 (accumulo-2.1.0-src.tar.gz) =
> > 99806fad6cc0cd23999ff0f3404d0a25fe8ddbbe6fb1cc783dd153580a3aed98d773981dbe26c89b04a644205bdd04019d98208e681c52fb9ccaa4f0fa1d8254
> > accumulo-2.1.0-bin.tar.gz.sha512 will contain:
> > SHA512 (accumulo-2.1.0-bin.tar.gz) =
> > 40a6befb7dd223af2fccb8eb67a28022f293275b193297de91b209ed7c8103a28fd3a4811216ccc4ce1bf953dc89674a46abbb4851cbca62795cf1211ba9022b
> >
> > Release notes (in progress) can be found at:
> > https://accumulo.staged.apache.org/release/accumulo-2.1.0
> >
> > Release testing instructions:
> > https://accumulo.apache.org/contributor/verifying-release
> >
> > Please vote one of:
> > [ ] +1 - I have verified and accept...
> > [ ] +0 - I have reservations, but not strong enough to vote against...
> > [ ] -1 - Because..., I do not accept...
> > ... these artifacts as the 2.1.0 release of Apache Accumulo.
> >
> > This vote will remain open until at least Sat Oct 29 03:30:00 AM UTC 2022.
> > (Fri Oct 28 11:30:00 PM EDT 2022 / Fri Oct 28 08:30:00 PM PDT 2022)
> > Voting can continue after this deadline until the release manager
> > sends an email ending the vote.
> >
> > Thanks!
> >
> > P.S. Hint: download the whole staging repo with
> > wget -erobots=off -r -l inf -np -nH \
> >   https://repository.apache.org/content/repositories/orgapacheaccumulo-1095/
> > # note the trailing slash is needed
>

[VOTE] Apache Accumulo 2.1.0-rc2

2022-10-25 Thread Christopher

Accumulo Developers,

Please consider the following candidate for Apache Accumulo 2.1.0.
Notably, the changes since 2.1.0-rc1 were:

  * https://github.com/apache/accumulo/pull/3043
  * https://github.com/apache/accumulo/pull/3044

There are also 3 issues that were identified that could yet serve as
blockers for this release, but need further investigation:

  * https://github.com/apache/accumulo/issues/3046
  * https://github.com/apache/accumulo/issues/3045
  * https://github.com/apache/accumulo/issues/3042


Git Commit:
224a2baaab04b2ec5da92daae7cc74cbd85eed2d
Branch:
2.1.0-rc2

If this vote passes, a gpg-signed tag will be created using:
git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
224a2baaab04b2ec5da92daae7cc74cbd85eed2d

Staging repo:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1095
Source (official release artifact):
https://repository.apache.org/content/repositories/orgapacheaccumulo-1095/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
Binary:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1095/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz

Append ".asc" to download the cryptographic signature for a given artifact.
(You can also append ".sha1" or ".md5" instead in order to verify the
checksums
generated by Maven to verify the integrity of the Nexus repository staging
area.)

Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
(Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)

In addition to the tarballs and their signatures, the following checksum
files will be added to the dist/release SVN area after release:
accumulo-2.1.0-src.tar.gz.sha512 will contain:
SHA512 (accumulo-2.1.0-src.tar.gz) =
99806fad6cc0cd23999ff0f3404d0a25fe8ddbbe6fb1cc783dd153580a3aed98d773981dbe26c89b04a644205bdd04019d98208e681c52fb9ccaa4f0fa1d8254
accumulo-2.1.0-bin.tar.gz.sha512 will contain:
SHA512 (accumulo-2.1.0-bin.tar.gz) =
40a6befb7dd223af2fccb8eb67a28022f293275b193297de91b209ed7c8103a28fd3a4811216ccc4ce1bf953dc89674a46abbb4851cbca62795cf1211ba9022b
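
One way to check a downloaded tarball against the checksum text above is sketched below in Python. It assumes the ".sha512" file holds a single BSD-style line of the form `SHA512 (name) = hex`, as shown in this email; the helper names `parse_sha512_line`, `sha512_of`, and `verify` are illustrative and not part of the release tooling. A matching checksum does not replace verifying the ".asc" signature.

```python
import hashlib
import re
from pathlib import Path

# BSD-style checksum line, e.g. "SHA512 (accumulo-2.1.0-src.tar.gz) = <128 hex chars>"
_LINE = re.compile(
    r"^SHA512 \((?P<name>[^)]+)\)\s*=\s*(?P<digest>[0-9a-fA-F]{128})$"
)


def parse_sha512_line(text):
    """Return (file name, lowercase hex digest) from a BSD-style SHA512 line."""
    # Collapse whitespace so a digest wrapped onto the next line still parses.
    match = _LINE.match(" ".join(text.split()))
    if match is None:
        raise ValueError("not a BSD-style SHA512 line")
    return match.group("name"), match.group("digest").lower()


def sha512_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs are not read into memory at once."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, checksum_text):
    """True if the file name and SHA-512 digest both match the checksum line."""
    name, expected = parse_sha512_line(checksum_text)
    return Path(path).name == name and sha512_of(path) == expected
```

A typical call would be `verify("accumulo-2.1.0-src.tar.gz", Path("accumulo-2.1.0-src.tar.gz.sha512").read_text())`, with both files in the current directory.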

Release notes (in progress) can be found at:
https://accumulo.staged.apache.org/release/accumulo-2.1.0

Release testing instructions:
https://accumulo.apache.org/contributor/verifying-release

Please vote one of:
[ ] +1 - I have verified and accept...
[ ] +0 - I have reservations, but not strong enough to vote against...
[ ] -1 - Because..., I do not accept...
... these artifacts as the 2.1.0 release of Apache Accumulo.

This vote will remain open until at least Sat Oct 29 03:30:00 AM UTC 2022.
(Fri Oct 28 11:30:00 PM EDT 2022 / Fri Oct 28 08:30:00 PM PDT 2022)
Voting can continue after this deadline until the release manager
sends an email ending the vote.

Thanks!

P.S. Hint: download the whole staging repo with
wget -erobots=off -r -l inf -np -nH \
  https://repository.apache.org/content/repositories/orgapacheaccumulo-1095/
# note the trailing slash is needed

[WITHDRAWN][VOTE] Apache Accumulo 2.1.0-rc1

2022-10-25 Thread Christopher

I'm withdrawing RC1 from consideration, due to the issues that have been
identified so far, and will prepare an RC2 when these issues are resolved.

Please continue whatever testing you are doing, in case there are any other
issues to fix.

Note: please submit PRs against the newly created 2.1 maintenance branch. I
will do subsequent release candidates from that branch. Those changes can
be merged into the main branch after being applied to the 2.1 maintenance
branch first.

Thanks,
Christopher

On Tue, Oct 25, 2022 at 11:35 AM dev1  wrote:

> -1
>
> discovered an issue with the upgrade code.  If the stand-alone upgrade
> tool is used the cluster will not start (issue #3041)
>
> Ed Coleman
>
> -Original Message-
> From: Christopher 
> Sent: Monday, October 24, 2022 5:13 PM
> To: accumulo-dev 
> Subject: [VOTE] Apache Accumulo 2.1.0-rc1
>
> Accumulo Developers,
>
> Please consider the following candidate for Apache Accumulo 2.1.0.
>
> Git Commit:
> 92b07213f5e3e7f77be56f0866316b2f0eebe191
> Branch:
> 2.1.0-rc1
>
> If this vote passes, a gpg-signed tag will be created using:
> git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
> 92b07213f5e3e7f77be56f0866316b2f0eebe191
>
> Staging repo:
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1094
> Source (official release artifact):
>
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1094/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
> Binary:
>
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1094/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz
>
> Append ".asc" to download the cryptographic signature for a given artifact.
> (You can also append ".sha1" or ".md5" instead in order to verify the
> checksums generated by Maven to verify the integrity of the Nexus
> repository staging
> area.)
>
> Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
> (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
>
> In addition to the tarballs and their signatures, the following checksum
> files will be added to the dist/release SVN area after release:
> accumulo-2.1.0-src.tar.gz.sha512 will contain:
> SHA512 (accumulo-2.1.0-src.tar.gz) =
>
> 4ce9f2cccd1f126eaa46c1c504b56255c18add04ab655821e1dc64ea74a1954f2124a88e75d3223792184eaa0b49c13ac6f00f563bf94069c79b45fc6a0fd5c6
> accumulo-2.1.0-bin.tar.gz.sha512 will contain:
> SHA512 (accumulo-2.1.0-bin.tar.gz) =
>
> 9b619fa56f5d3532c226aa0fa0d88a0ee53692f890c047f7e0436f6e6814397562201302cfbfb2531f5705750f3576480bc522a8ba662c55c4242def801351dd
>
> Release notes (in progress) can be found at:
> https://accumulo.staged.apache.org/release/accumulo-2.1.0
>
> Release testing instructions:
> https://accumulo.apache.org/contributor/verifying-release
>
> Please vote one of:
> [ ] +1 - I have verified and accept...
> [ ] +0 - I have reservations, but not strong enough to vote against...
> [ ] -1 - Because..., I do not accept...
> ... these artifacts as the 2.1.0 release of Apache Accumulo.
>
> This vote will remain open until at least Thu Oct 27 09:30:00 PM UTC 2022.
> (Thu Oct 27 05:30:00 PM EDT 2022 / Thu Oct 27 02:30:00 PM PDT 2022) Voting
> can continue after this deadline until the release manager sends an email
> ending the vote.
>
> Thanks!
>
> P.S. Hint: download the whole staging repo with
> wget -erobots=off -r -l inf -np -nH \
>   https://repository.apache.org/content/repositories/orgapacheaccumulo-1094/
> # note the trailing slash is needed
>

Creating a 2.1 maintenance branch and next version thoughts

2022-10-25 Thread Christopher

I've created a 2.1 maintenance branch off of main. If more release
candidates are needed, we can create them from there. That allows us to
move the main branch forward for the next version without impacting 2.1. I
expect the next release to be a 3.0 non-LTM, because we have a lot of cruft
to remove, and removing it means breaking API changes. It'd be good to do
that prior to adding new 3.x features.

Christopher

[VOTE] Apache Accumulo 2.1.0-rc1

2022-10-24 Thread Christopher

Accumulo Developers,

Please consider the following candidate for Apache Accumulo 2.1.0.

Git Commit:
92b07213f5e3e7f77be56f0866316b2f0eebe191
Branch:
2.1.0-rc1

If this vote passes, a gpg-signed tag will be created using:
git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
92b07213f5e3e7f77be56f0866316b2f0eebe191

Staging repo:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1094
Source (official release artifact):
https://repository.apache.org/content/repositories/orgapacheaccumulo-1094/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
Binary:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1094/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz

Append ".asc" to download the cryptographic signature for a given artifact.
(You can also append ".sha1" or ".md5" instead in order to verify the
checksums
generated by Maven to verify the integrity of the Nexus repository staging
area.)

Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
(Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)

In addition to the tarballs and their signatures, the following checksum
files will be added to the dist/release SVN area after release:
accumulo-2.1.0-src.tar.gz.sha512 will contain:
SHA512 (accumulo-2.1.0-src.tar.gz) =
4ce9f2cccd1f126eaa46c1c504b56255c18add04ab655821e1dc64ea74a1954f2124a88e75d3223792184eaa0b49c13ac6f00f563bf94069c79b45fc6a0fd5c6
accumulo-2.1.0-bin.tar.gz.sha512 will contain:
SHA512 (accumulo-2.1.0-bin.tar.gz) =
9b619fa56f5d3532c226aa0fa0d88a0ee53692f890c047f7e0436f6e6814397562201302cfbfb2531f5705750f3576480bc522a8ba662c55c4242def801351dd

Release notes (in progress) can be found at:
https://accumulo.staged.apache.org/release/accumulo-2.1.0

Release testing instructions:
https://accumulo.apache.org/contributor/verifying-release

Please vote one of:
[ ] +1 - I have verified and accept...
[ ] +0 - I have reservations, but not strong enough to vote against...
[ ] -1 - Because..., I do not accept...
... these artifacts as the 2.1.0 release of Apache Accumulo.

This vote will remain open until at least Thu Oct 27 09:30:00 PM UTC 2022.
(Thu Oct 27 05:30:00 PM EDT 2022 / Thu Oct 27 02:30:00 PM PDT 2022)
Voting can continue after this deadline until the release manager
sends an email ending the vote.
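
The local times in parentheses are just the UTC deadline shifted by fixed offsets. A small Python sketch of that arithmetic follows. It assumes daylight saving time is in effect (EDT = UTC-4, PDT = UTC-7), which holds for these October 2022 dates but not year-round; the function name `deadline_in_zones` is illustrative.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Fixed offsets: an assumption that is only valid while daylight saving applies.
EDT = timezone(timedelta(hours=-4), "EDT")
PDT = timezone(timedelta(hours=-7), "PDT")


def deadline_in_zones(deadline_utc, zones=(EDT, PDT)):
    """Convert a timezone-aware UTC deadline into each of the given zones."""
    return [deadline_utc.astimezone(zone) for zone in zones]


if __name__ == "__main__":
    # Deadline from this email: Thu Oct 27 09:30:00 PM UTC 2022
    deadline = datetime(2022, 10, 27, 21, 30, tzinfo=UTC)
    for local in deadline_in_zones(deadline):
        print(local.strftime("%Y-%m-%d %H:%M"), local.tzname())
```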

Thanks!

P.S. Hint: download the whole staging repo with
wget -erobots=off -r -l inf -np -nH \
  https://repository.apache.org/content/repositories/orgapacheaccumulo-1094/
# note the trailing slash is needed

[TEST][VOTE] Apache Accumulo 2.1.0-rc0

2022-10-19 Thread Christopher

Accumulo Developers,

NOTE: this is only a test vote. You do *NOT* need to vote. This test
release candidate can serve as a baseline for testing for 2.1.0 releases,
and to test the release process automation. Again, this is not a real vote.
A subsequent release candidate will be made when an actual vote is called.
However, feel free to reply to this thread with any testing results, or to
discuss any remaining tasks needed to release 2.1.

Please consider the following candidate for Apache Accumulo 2.1.0.

Git Commit:
5a005dd6a7752330e9984ff4ee7d8a122b5b116d
Branch:
2.1.0-rc0

If this vote passes, a gpg-signed tag will be created using:
git tag -f -s -m 'Apache Accumulo 2.1.0' rel/2.1.0 \
5a005dd6a7752330e9984ff4ee7d8a122b5b116d

Staging repo:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1093
Source (official release artifact):
https://repository.apache.org/content/repositories/orgapacheaccumulo-1093/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-src.tar.gz
Binary:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1093/org/apache/accumulo/accumulo/2.1.0/accumulo-2.1.0-bin.tar.gz

Append ".asc" to download the cryptographic signature for a given artifact.
(You can also append ".sha1" or ".md5" instead in order to verify the
checksums
generated by Maven to verify the integrity of the Nexus repository staging
area.)

Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
(Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)

In addition to the tarballs and their signatures, the following checksum
files will be added to the dist/release SVN area after release:
accumulo-2.1.0-src.tar.gz.sha512 will contain:
SHA512 (accumulo-2.1.0-src.tar.gz) =
d4bb6142482212990455205d16ea68ce401b7c750a9a8ed2ca75475bf6fedf4620526122ab861d4faea13808f5c19a2587d1d15b648f027bba25bb524e670119
accumulo-2.1.0-bin.tar.gz.sha512 will contain:
SHA512 (accumulo-2.1.0-bin.tar.gz) =
9f238ef8af12bda3cdca7b73b10156654c682082492111a5c88174fc97312e08b0a8862567e85e78675d3764adfc442fbe7d6dd049f982260e60b2929cabffdf

Release notes (in progress) can be found at:
https://accumulo.staged.apache.org/release/accumulo-2.1.0

Release testing instructions:
https://accumulo.apache.org/contributor/verifying-release

Please vote one of:
[ ] +1 - I have verified and accept...
[ ] +0 - I have reservations, but not strong enough to vote against...
[ ] -1 - Because..., I do not accept...
... these artifacts as the 2.1.0 release of Apache Accumulo.

This vote will remain open until at least Sun Oct 23 05:00:00 AM UTC 2022.
(Sun Oct 23 01:00:00 AM EDT 2022 / Sat Oct 22 10:00:00 PM PDT 2022)
Voting can continue after this deadline until the release manager
sends an email ending the vote.

Thanks!

P.S. Hint: download the whole staging repo with
wget -erobots=off -r -l inf -np -nH \
  https://repository.apache.org/content/repositories/orgapacheaccumulo-1093/
# note the trailing slash is needed

Re: Hadoop Metrics2 and JMX

2022-10-12 Thread Christopher

I don't think we're doing anything special to publish to JMX. I think this
is something that is a feature of Hadoop Metrics2 that we're simply
enabling. So, this might be a question for the Hadoop general mailing list
if nobody knows the answer here.

On Wed, Oct 12, 2022 at 1:06 PM Logan Jones  wrote:

> Hello:
>
> I'm trying to figure out more about the metrics coming out of Accumulo
> 1.9.3 and 1.10.2. I'm currently configuring the hadoop metrics 2 system and
> sending that to influxDB. In theory, I could also look at the JMX metrics.
>
> Are the JMX metrics a superset of what comes out of Hadoop Metrics2?
>
> Thanks in advance,
>
> - Logan
>

Re: [EXTERNAL] Re: New committer / PMC member: Chris Shannon

2022-09-29 Thread Christopher Shannon

Thanks everyone, I am excited to join the PMC and looking forward to
continuing my contributions in the future in my new role.

On Thu, Sep 29, 2022 at 10:22 AM Arvind Shyamsundar
 wrote:

> Congratulations, Chris!
>
> Arvind Shyamsundar (HE / HIM)
>
> -Original Message-
> From: Christopher 
> Sent: Thursday, September 29, 2022 7:19 AM
> To: dev@accumulo.apache.org
> Subject: [EXTERNAL] Re: New committer / PMC member: Chris Shannon
>
> Congrats, and welcome
>
> On Thu, Sep 29, 2022 at 9:51 AM Jeffrey Manno 
> wrote:
>
> > Congrats and welcome!
> >
> > On Thu, Sep 29, 2022 at 9:43 AM Dave Marion  wrote:
> >
> > > Congrats Chris!
> > >
> > > On Thu, Sep 29, 2022 at 9:26 AM Dominic Garguilo wrote:
> > >
> > > > Congrats and welcome, Chris!
> > > >
> > > > On Thu, Sep 29, 2022 at 9:15 AM dev1  wrote:
> > > >
> > > > > The Project Management Committee (PMC) for Apache Accumulo has
> > > > > invited Chris to become a committer and PMC member and we are
> > > > > pleased to announce that they have accepted.  Please join me in
> > > > > welcoming Chris Shannon to the Accumulo Community.
> > > > >
> > > > > Chris has contributed several quality improvements and new
> > > > > functionality to Accumulo, has written documentation for our
> > > > > website, has been very helpful in engaging with questions on
> > > > > various issues, and has helped in testing. We are happy to have
> > > > > them contributing to the Accumulo community!
> > > > >
> > > > > Being a committer enables easier contribution to the project
> > > > > since there is no need to go via the patch submission process.
> > > > > This should enable better productivity. Being a PMC member enables
> > > > > assistance with the management and to guide the direction of the
> > > > > project.
> > > > >
> > > > > Welcome Chris!
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: New committer / PMC member: Chris Shannon

2022-09-29 Thread Christopher

Congrats, and welcome

On Thu, Sep 29, 2022 at 9:51 AM Jeffrey Manno 
wrote:

> Congrats and welcome!
>
> On Thu, Sep 29, 2022 at 9:43 AM Dave Marion  wrote:
>
> > Congrats Chris!
> >
> > On Thu, Sep 29, 2022 at 9:26 AM Dominic Garguilo wrote:
> >
> > > Congrats and welcome, Chris!
> > >
> > > On Thu, Sep 29, 2022 at 9:15 AM dev1  wrote:
> > >
> > > > The Project Management Committee (PMC) for Apache Accumulo has
> > > > invited Chris to become a committer and PMC member and we are
> > > > pleased to announce that they have accepted.  Please join me in
> > > > welcoming Chris Shannon to the Accumulo Community.
> > > >
> > > > Chris has contributed several quality improvements and new
> > > > functionality to Accumulo, has written documentation for our
> > > > website, has been very helpful in engaging with questions on
> > > > various issues, and has helped in testing. We are happy to have
> > > > them contributing to the Accumulo community!
> > > >
> > > > Being a committer enables easier contribution to the project since
> > > > there is no need to go via the patch submission process. This should
> > > > enable better productivity. Being a PMC member enables assistance
> > > > with the management and to guide the direction of the project.
> > > >
> > > > Welcome Chris!
> > > >
> > > >
> > > >
> > >
> >
>

Re: Re: Re: Re: Question about Accumulo Tracer

2022-07-29 Thread Christopher

If the version being used is 2.0.1, then that version is still affected by
the bug you mention. However, since TraceServer never used AbstractServer,
it was not affected and should still work fine.

On Fri, Jul 29, 2022 at 3:05 PM Dave Marion  wrote:

> Ok, I wasn't sure what version was being used here. I saw a reference to
> 2.0.1 in an earlier email.
>
> On Fri, Jul 29, 2022 at 3:03 PM Christopher  wrote:
>
> > That was a bug in the new AbstractServer class in 2.x. I don't think it
> > ever affected 1.x.
> > I checked the 1.10 code and it wouldn't affect the tracer server. `-a`
> > should still work fine there.
> >
> > On Fri, Jul 29, 2022 at 3:00 PM Dave Marion  wrote:
> >
> > > https://github.com/apache/accumulo/pull/2119 fixed a bug in 2.1 where
> > > the -a argument was not correctly setting the hostname. It looks like
> > > 2.0.1 is affected by this too.
> > >
> > > On Fri, Jul 29, 2022 at 12:57 PM Christopher wrote:
> > >
> > > > The tracer should be advertising its own address in ZK. By default, the
> > > > server listens on `0.0.0.0`, unless `-a` or `--address` is specified on
> > > > the command-line when it is started. Most server types use a utility
> > > > class that will use `InetAddress.getLocalHost().getCanonicalHostName()`
> > > > for the advertisement address if it sees that it is listening on
> > > > `0.0.0.0`. However, it looks like the tracer doesn't do this.
> > > >
> > > > I would try to verify that `0.0.0.0` actually appears as the advertised
> > > > address in ZK for the tracer service, just to be sure it's not a
> > > > ZooTraceClient bug on the tablet server side (but I'm pretty sure it's
> > > > not, after looking at the code).
> > > >
> > > > As a workaround, I would start up the tracer service using a script
> > > > that sets `--address accumulo_tracer` on the command-line, where
> > > > "accumulo_tracer" is the hostname or IP address that you want the
> > > > tracer listening on and advertising to other servers. This address
> > > > should be bindable where the tracer is running, and reachable from
> > > > wherever the tservers are (so make sure it's not a private address only
> > > > available inside the container running the tracer). The reason for this
> > > > is that the hostname will resolve to an IP address inside the tracer
> > > > process, and then the tracer will advertise the IP address. It won't
> > > > advertise the original hostname, if you specified a hostname.
> > > > I hope that helps.
> > > >
> > > >
> > > > On Fri, Jul 29, 2022 at 12:18 PM kma  wrote:
> > > >
> > > > > Nice,
> > > > >
> > > > > Turning on debug really helped.
> > > > >
> > > > > 2022-07-29 12:12:14,235 [tracer.ZooTraceClient] DEBUG: Scanning trace hosts in zookeeper: /tracers
> > > > > 2022-07-29 12:12:14,240 [tracer.ZooTraceClient] DEBUG: Trace hosts: [0.0.0.0:12234, 0.0.0.0:12234]
> > > > > 2022-07-29 12:12:14,240 [tracer.ZooTraceClient] DEBUG: Successfully initialized tracer hosts from ZooKeeper
> > > > >
> > > > > [root@accumulo_gc /]# curl 0.0.0.0:12234
> > > > > curl: (7) Failed connect to 0.0.0.0:12234; Connection refused
> > > > >
> > > > > [root@accumulo_gc /]# curl accumulo_tracer:12234
> > > > > curl: (52) Empty reply from server
> > > > >
> > > > > Is there a way to set Trace hosts to accumulo_tracer:12234?
> > > > >
> > > > > Cheers!...
> > > > > ...Keith
> > > > >
> > > > >
> > > > > On 2022/07/28 23:40:18 Christopher wrote:
> > > > >  > Do all servers have the same configuration?
> > > > >  >
> > > > >  > I would investigate the tablet server debug logs to determine if
> > > it's
> > > > >  > having trouble setting up tracing. It should be able to locate
> the
> > > > > tracer
> > > > >  > service by talking to ZooKeeper and observing its service
> address
> > > > >  > advertisement, similar to how other servers register themselves
> in

Re: Re: Re: Re: Question about Accumulo Tracer

2022-07-29 Thread Christopher

That was a bug in the new AbstractServer class in 2.x. I don't think it
ever affected 1.x. I checked the 1.10 code, and it wouldn't affect the
tracer server. `-a` should still work fine there.

On Fri, Jul 29, 2022 at 3:00 PM Dave Marion  wrote:

> https://github.com/apache/accumulo/pull/2119 fixed a bug in 2.1 where the
> -a argument was not correctly setting the hostname. It looks like 2.0.1 is
> affected by this too.
>
> On Fri, Jul 29, 2022 at 12:57 PM Christopher  wrote:
>
> > The tracer should be advertising its own address in ZK. By default, the
> > server listens on `0.0.0.0`, unless `-a` or `--address` is specified on
> the
> > command-line when it is started. Most server types use a utility class
> that
> > will use `InetAddress.getLocalHost().getCanonicalHostName()` for the
> > advertisement address if it sees that it is listening on `0.0.0.0`.
> > However, it looks like the tracer doesn't do this.
> >
> > I would try to verify that `0.0.0.0` actually appears as the advertised
> > address in ZK for the tracer service, just be sure it's not a
> > ZooTraceClient bug on the tablet server side (but I'm pretty sure it's
> not,
> > after looking at the code).
> >
> > As a workaround, I would start up the tracer service using a script that
> > sets `--address accumulo_tracer` on the command-line, where
> > "accumulo_tracer" is the hostname or IP address that you want the tracer
> > listening on and advertising to other servers. This address should be
> > bindable where the tracer is running, and reachable from wherever the
> > tservers are (so make sure it's not a private address only available
> inside
> > the container running the tracer). The reason for this is that the
> hostname
> > will resolve to an IP address inside the tracer process, and then the
> > tracer will advertise the IP address. It won't advertise the original
> > hostname, if you specified a hostname.
> >
> > I hope that helps.
> >
> >
> > On Fri, Jul 29, 2022 at 12:18 PM kma  wrote:
> >
> > > Nice,
> > >
> > > Turn on debug really help.
> > >
> > > 2022-07-29 12:12:14,235 [tracer.ZooTraceClient] DEBUG: Scanning trace
> > > hosts in zookeeper: /tracers
> > > 2022-07-29 12:12:14,240 [tracer.ZooTraceClient] DEBUG: Trace hosts:
> > > [0.0.0.0:12234, 0.0.0.0:12234]
> > > 2022-07-29 12:12:14,240 [tracer.ZooTraceClient] DEBUG: Successfully
> > > initialized tracer hosts from ZooKeeper
> > >
> > > [root@accumulo_gc /]# curl 0.0.0.0:12234
> > > curl: (7) Failed connect to 0.0.0.0:12234; Connection refused
> > >
> > > [root@accumulo_gc /]# curl accumulo_tracer:12234
> > > curl: (52) Empty reply from server
> > >
> > > Is there a way to set Trace hosts to accumulo_tracer:12234
> > >
> > > Cheers!...
> > > ...Keith
> > >
> > >
> > > On 2022/07/28 23:40:18 Christopher wrote:
> > >  > Do all servers have the same configuration?
> > >  >
> > >  > I would investigate the tablet server debug logs to determine if
> it's
> > >  > having trouble setting up tracing. It should be able to locate the
> > > tracer
> > >  > service by talking to ZooKeeper and observing its service address
> > >  > advertisement, similar to how other servers register themselves in
> > >  > ZooKeeper. I'm not a docker network expert, but whatever service
> > address
> > >  > the tracer service is advertising there should be routable from the
> > > tablet
> > >  > servers.
> > >  >
> > >  > On Thu, Jul 28, 2022 at 7:31 PM kma  wrote:
> > >  >
> > >  > > Thanks again Christopher,
> > >  > >
> > >  > > Our environment is a little different from usual.
> > >  > >
> > >  > > We have accumulo tracer running in it's own docker container and
> the
> > >  > > other accumulo services (e.g. gc, master, tserver, etc.) are also
> > >  > > running in different docker containers. We also have kerberos
> > enabled.
> > >  > >
> > >  > > The following is our trace related configurations
> > >  > >
> > >  > > default | trace.port.client . | 12234
> > >  > > default | trace.span.receivers .. |
> > >  > > org.apache.accumulo.tracer.ZooTraceClient
> > >  > > default | trace.table . | trace
> > >  > > default | trace.token.type ...

Re: Re: Re: Re: Question about Accumulo Tracer

2022-07-29 Thread Christopher

The tracer should be advertising its own address in ZK. By default, the
server listens on `0.0.0.0`, unless `-a` or `--address` is specified on the
command-line when it is started. Most server types use a utility class that
will use `InetAddress.getLocalHost().getCanonicalHostName()` for the
advertisement address if it sees that it is listening on `0.0.0.0`.
However, it looks like the tracer doesn't do this.
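
To illustrate the fallback most server types apply, here is a minimal Python sketch. It is not Accumulo's code; `advertised_address` and its parameters are hypothetical names, and `socket.getfqdn` only stands in for the Java call above.

```python
import socket
from typing import Callable

WILDCARD = "0.0.0.0"


def advertised_address(bind_address: str, port: int,
                       canonical_hostname: Callable[[], str] = socket.getfqdn) -> str:
    """Return the host:port string a server would publish for others to use."""
    # A wildcard bind address is fine for listening, but other hosts cannot
    # connect to it, so substitute a name that they can resolve.
    host = canonical_hostname() if bind_address == WILDCARD else bind_address
    return f"{host}:{port}"
```

Without a substitution like this, a server listening on `0.0.0.0` would publish `0.0.0.0:12234`, which no other host can use.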

I would try to verify that `0.0.0.0` actually appears as the advertised
address in ZK for the tracer service, just to be sure it's not a
ZooTraceClient bug on the tablet server side (but I'm pretty sure it's not,
after looking at the code).

As a workaround, I would start up the tracer service using a script that
sets `--address accumulo_tracer` on the command-line, where
"accumulo_tracer" is the hostname or IP address that you want the tracer
listening on and advertising to other servers. This address should be
bindable where the tracer is running, and reachable from wherever the
tservers are (so make sure it's not a private address only available inside
the container running the tracer). The reason for this is that the hostname
will resolve to an IP address inside the tracer process, and then the
tracer will advertise the IP address. It won't advertise the original
hostname, if you specified a hostname.
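
A rough Python sketch of that resolve-then-advertise behavior, assuming the address is resolved once at startup. The function name and the injectable `resolve` parameter are hypothetical and exist only for illustration.

```python
import socket
from typing import Callable


def address_to_advertise(address_arg: str, port: int,
                         resolve: Callable[[str], str] = socket.gethostbyname) -> str:
    """Resolve the configured address once and advertise the resulting IP."""
    # A hostname becomes an IPv4 address here; an IPv4 literal comes back unchanged.
    ip = resolve(address_arg)
    return f"{ip}:{port}"
```

Because the published value is the resolved IP, that IP (not just the hostname) has to be reachable from the tservers.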

I hope that helps.

On Fri, Jul 29, 2022 at 12:18 PM kma  wrote:

> Nice,
>
> Turn on debug really help.
>
> 2022-07-29 12:12:14,235 [tracer.ZooTraceClient] DEBUG: Scanning trace
> hosts in zookeeper: /tracers
> 2022-07-29 12:12:14,240 [tracer.ZooTraceClient] DEBUG: Trace hosts:
> [0.0.0.0:12234, 0.0.0.0:12234]
> 2022-07-29 12:12:14,240 [tracer.ZooTraceClient] DEBUG: Successfully
> initialized tracer hosts from ZooKeeper
>
> [root@accumulo_gc /]# curl 0.0.0.0:12234
> curl: (7) Failed connect to 0.0.0.0:12234; Connection refused
>
> [root@accumulo_gc /]# curl accumulo_tracer:12234
> curl: (52) Empty reply from server
>
> Is there a way to set Trace hosts to accumulo_tracer:12234
>
> Cheers!...
> ...Keith
>
>
> On 2022/07/28 23:40:18 Christopher wrote:
>  > Do all servers have the same configuration?
>  >
>  > I would investigate the tablet server debug logs to determine if it's
>  > having trouble setting up tracing. It should be able to locate the
> tracer
>  > service by talking to ZooKeeper and observing its service address
>  > advertisement, similar to how other servers register themselves in
>  > ZooKeeper. I'm not a docker network expert, but whatever service address
>  > the tracer service is advertising there should be routable from the
> tablet
>  > servers.
>  >
>  > On Thu, Jul 28, 2022 at 7:31 PM kma  wrote:
>  >
>  > > Thanks again Christopher,
>  > >
>  > > Our environment is a little different from usual.
>  > >
>  > > We have accumulo tracer running in it's own docker container and the
>  > > other accumulo services (e.g. gc, master, tserver, etc.) are also
>  > > running in different docker containers. We also have kerberos enabled.
>  > >
>  > > The following is our trace related configurations
>  > >
>  > > default | trace.port.client . | 12234
>  > > default | trace.span.receivers .. |
>  > > org.apache.accumulo.tracer.ZooTraceClient
>  > > default | trace.table . | trace
>  > > default | trace.token.type  |
>  > > org.apache.accumulo.core.client.security.tokens.PasswordToken
>  > > site | @override .. |
>  > > org.apache.accumulo.core.client.security.tokens.KerberosToken
>  > > default | trace.user .. | root
>  > > site | @override .. |
>  > > accumulo-tra...@dev.phemi.com
>  > > default | trace.zookeeper.path  | /tracers
>  > >
>  > > And we can confirm that `trace on` and `trace off` works well in the
>  > > accumulo tracer container where the tracer process is running.
>  > >
>  > > However, `trace on` and `trace off` does not work in any other
>  > > containers. This is probably why we don't see compaction trace
> messages
>  > > in the trace table. And it's likely because these containers don't
> know
>  > > where the tracer service is running ?
>  > >
>  > > Question: How does the other accumulo services know where tracer
> service
>  > > is running ?
>  > > Question: Is there a way to configure the tracer host where the tracer
>  > > service is running ?
>  > >
>  > > Cheers!...
>  > > ...Keith
>  > >
>  > >

Re: Re: Re: Question about Accumulo Tracer

2022-07-28 Thread Christopher

Do all servers have the same configuration?

I would investigate the tablet server debug logs to determine if it's
having trouble setting up tracing. It should be able to locate the tracer
service by talking to ZooKeeper and observing its service address
advertisement, similar to how other servers register themselves in
ZooKeeper. I'm not a docker network expert, but whatever service address
the tracer service is advertising there should be routable from the tablet
servers.
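
One way to check routability from inside a tablet server container is a plain TCP connection test against whatever address is advertised. This is a generic Python sketch, not an Accumulo tool; `is_reachable` is a hypothetical helper.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

A successful connection only shows that the port is routable and something is listening there; it does not prove the listener is the tracer.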

On Thu, Jul 28, 2022 at 7:31 PM kma  wrote:

> Thanks again Christopher,
>
> Our environment is a little different from usual.
>
> We have accumulo tracer running in it's own docker container and the
> other accumulo services (e.g. gc, master, tserver, etc.) are also
> running in different docker containers. We also have kerberos enabled.
>
> The following is our trace related configurations
>
> default| trace.port.client . | 12234
> default| trace.span.receivers .. |
> org.apache.accumulo.tracer.ZooTraceClient
> default| trace.table . | trace
> default| trace.token.type  |
> org.apache.accumulo.core.client.security.tokens.PasswordToken
> site  |@override .. |
> org.apache.accumulo.core.client.security.tokens.KerberosToken
> default| trace.user .. | root
> site  |@override .. |
> accumulo-tra...@dev.phemi.com
> default| trace.zookeeper.path  | /tracers
>
> And we can confirm that `trace on` and `trace off` works well in the
> accumulo tracer container where the tracer process is running.
>
> However, `trace on` and `trace off` does not work in any other
> containers. This is probably why we don't see compaction trace messages
> in the trace table. And it's likely because these containers don't know
> where the tracer service is running ?
>
> Question: How does the other accumulo services know where tracer service
> is running ?
> Question: Is there a way to configure the tracer host where the tracer
> service is running ?
>
> Cheers!...
> ...Keith
>
>
> On 2022/07/22 14:58:07 Christopher wrote:
>  > I would double check your trace credentials in your Accumulo
> configuration
>  > file. And double check that the user credentials used to write to the
> trace
>  > table have permission to write to that table, and that you're running
> the
>  > accumulo-tracer service.
>  >
>  > On Fri, Jul 22, 2022 at 10:45 AM kma  wrote:
>  >
>  > > Thank you Christopher,
>  > >
>  > > We can confirm that trace on and trace off are working fine in
> accumulo
>  > > shell. We can see those trace messages in the trace table and tracer
>  > > service is running.
>  > >
>  > > However, we are not seeing trace messages for MajC / MinC
>  > >
>  > > We do see these warnings in Monitor
>  > >
>  > > "Tracing spans are being dropped because there are already 5000 spans
>  > > queued for delivery. This does not affect performance, security or
> data
>  > > integrity, but distributed tracing information is being lost."
>  > >
>  > > Any suggestions ?
>  > >
>  > > Cheers!...
>  > > ...Keith
>  > >
>  > >
>  > > On 2022/07/12 19:42:07 Christopher wrote:
>  > > > By default, the tracer service that collects traces and writes
> them to
>  > > > the trace table does not run. Generally, it is also responsible for
>  > > > creating the trace table. If you have an empty trace table, was it
>  > > > created by the tracer service, or did you create it manually?
>  > > >
>  > > > Assuming tracing is working as intended and there is a tracer
> service
>  > > > running, there's only a few operations that trace automatically, and
>  > > > even they do so at a small probability by default (like
> compactions).
>  > > > If your system is just started and you're using default settings, it
>  > > > may not have had a chance to create any traces yet.
>  > > >
>  > > > Some operations can be traced by explicitly turning on tracing once
>  > > > everything is set up. You can test this by trying to turn on tracing
>  > > > in the shell, using the "trace on" command, and then doing a scan,
>  > > > then doing "trace off". I believe this still works as expected in
>  > > > 2.0.1. This should trace client side operations through to the
> server,
>  > > > initiated from the shell. This is also

Re: Re: Question about Accumulo Tracer

2022-07-22 Thread Christopher

I would double check your trace credentials in your Accumulo configuration
file. And double check that the user credentials used to write to the trace
table have permission to write to that table, and that you're running the
accumulo-tracer service.

On Fri, Jul 22, 2022 at 10:45 AM kma  wrote:

> Thank you Christopher,
>
> We can confirm that trace on and trace off are working fine in accumulo
> shell. We can see those trace messages in the trace table and tracer
> service is running.
>
> However, we are not seeing trace messages for MajC / MinC
>
> We do see these warnings in Monitor
>
> "Tracing spans are being dropped because there are already 5000 spans
> queued for delivery. This does not affect performance, security or data
> integrity, but distributed tracing information is being lost."
>
> Any suggestions ?
>
> Cheers!...
> ...Keith
>
>
> On 2022/07/12 19:42:07 Christopher wrote:
>  > By default, the tracer service that collects traces and writes them to
>  > the trace table does not run. Generally, it is also responsible for
>  > creating the trace table. If you have an empty trace table, was it
>  > created by the tracer service, or did you create it manually?
>  >
>  > Assuming tracing is working as intended and there is a tracer service
>  > running, there's only a few operations that trace automatically, and
>  > even they do so at a small probability by default (like compactions).
>  > If your system is just started and you're using default settings, it
>  > may not have had a chance to create any traces yet.
>  >
>  > Some operations can be traced by explicitly turning on tracing once
>  > everything is set up. You can test this by trying to turn on tracing
>  > in the shell, using the "trace on" command, and then doing a scan,
>  > then doing "trace off". I believe this still works as expected in
>  > 2.0.1. This should trace client side operations through to the server,
>  > initiated from the shell. This is also possible to be done in
>  > user-code, but there is no public API for it.
>  >
>  > A lot of the behavior of tracing has had to change in the last few
>  > versions because of the dependencies we previously used for tracing
>  > are no longer supported. Everything should still work like it did in
>  > previous versions, but it's possible 2.0.1 does have issues. For 2.1,
>  > we're expecting to use OpenTelemetry, which should make tracing much
>  > easier and more reliable to configure. However, we do not provide a
>  > built-in OpenTelemetry sink in 2.1 to write tracing information to a
>  > table in Accumulo, like we did in previous versions for HTrace. Such a
>  > thing could be added by the user or a 3rd party, though.
>  >
>  > On Tue, Jul 12, 2022 at 3:25 PM kma  wrote:
>  > >
>  > > Hi Team,
>  > >
>  > > In one of out latest deployment of Accumulo 2.0.1, the trace table is
>  > > empty.
>  > >
>  > > Question: is tracing automatically enabled by default ? If not, how to
>  > > enable it ?
>  > >
>  > > --
>  > > __
>  > > Keith Ma
>  > > Software Developer
>  > > *PHEMI Systems*
>  > > Suite 600 – 777 Hornby Street
>  > > Vancouver, BC V6Z 1S4
>  > > 604-336-1119
>  > >
>  > > website <http://www.phemi.com/> twitter
>  > > <https://twitter.com/PHEMISystems>linkedin
>  > >
> <
> http://www.linkedin.com/company/3561810?trk=tyah=tarId:1403279580554,tas:phemi%20hea,idx:1-1-1
> >
>  > >
>  >
> --
> __
> Keith Ma
> Software Developer
> *PHEMI Systems*
> Suite 600 – 777 Hornby Street
> Vancouver, BC V6Z 1S4
> 604-336-1119
>
> website <http://www.phemi.com/> twitter
> <https://twitter.com/PHEMISystems>linkedin
> <
> http://www.linkedin.com/company/3561810?trk=tyah=tarId:1403279580554,tas:phemi%20hea,idx:1-1-1
> >
>
>

Re: Question about Accumulo Tracer

2022-07-12 Thread Christopher

By default, the tracer service that collects traces and writes them to
the trace table does not run. Generally, it is also responsible for
creating the trace table. If you have an empty trace table, was it
created by the tracer service, or did you create it manually?

Assuming tracing is working as intended and there is a tracer service
running, there are only a few operations that trace automatically, and
even they do so at a small probability by default (like compactions).
If your system is just started and you're using default settings, it
may not have had a chance to create any traces yet.
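
To show why a freshly started system may have no traces yet, here is a generic sketch of probability-based sampling in Python. It is an illustration only, not Accumulo's implementation; `should_trace` is a hypothetical name and the probability used in the note below is an assumed value, not Accumulo's default.

```python
import random


def should_trace(probability: float, rng: random.Random) -> bool:
    """Decide whether a single operation gets traced."""
    # rng.random() is uniform on [0.0, 1.0), so this is True with the given probability.
    return rng.random() < probability
```

With a probability of, say, 0.1, roughly one operation in ten is traced, so a handful of compactions can easily produce no traces at all.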

Some operations can be traced by explicitly turning on tracing once
everything is set up. You can test this by trying to turn on tracing
in the shell, using the "trace on" command, and then doing a scan,
then doing "trace off". I believe this still works as expected in
2.0.1. This should trace client side operations through to the server,
initiated from the shell. This can also be done in user code, but there
is no public API for it.

A lot of the behavior of tracing has had to change in the last few
versions because the dependencies we previously used for tracing
are no longer supported. Everything should still work like it did in
previous versions, but it's possible 2.0.1 does have issues. For 2.1,
we're expecting to use OpenTelemetry, which should make tracing much
easier and more reliable to configure. However, we do not provide a
built-in OpenTelemetry sink in 2.1 to write tracing information to a
table in Accumulo, like we did in previous versions for HTrace. Such a
thing could be added by the user or a 3rd party, though.

On Tue, Jul 12, 2022 at 3:25 PM kma  wrote:
>
> Hi Team,
>
> In one of out latest deployment of Accumulo 2.0.1, the trace table is
> empty.
>
> Question: is tracing automatically enabled by default ? If not, how to
> enable it ?
>
> --
> __
> Keith Ma
> Software Developer
> *PHEMI Systems*
> Suite 600 – 777 Hornby Street
> Vancouver, BC V6Z 1S4
> 604-336-1119
>
> website  twitter
> linkedin
> 
>

Re: Behavior of Fates on Failed Compactions

2022-07-06 Thread Christopher

The behavior in case of error is likely undefined, so I'm not entirely
surprised it's behaving this way. There may be things we can do to try to
handle errors more gracefully for user-initiated compactions when an
iterator throws an exception, but it's definitely a good idea to write
custom iterators in a way that handles their own errors as much as
possible.
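
As a language-neutral illustration of an iterator handling its own errors, here is a small Python sketch. It does not use Accumulo's iterator API; `safe_transform` and the `(key, value)` tuple type are hypothetical stand-ins.

```python
import logging
from typing import Callable, Iterable, Iterator, Tuple

log = logging.getLogger("safe_transform")

Entry = Tuple[str, str]  # (key, value); a stand-in for a real key/value pair


def safe_transform(entries: Iterable[Entry],
                   transform: Callable[[str], str]) -> Iterator[Entry]:
    """Apply transform to each value, passing an entry through unchanged on failure."""
    for key, value in entries:
        try:
            new_value = transform(value)
        except Exception:
            # Log and keep the original value instead of letting the error escape.
            log.exception("transform failed for key %r; keeping original value", key)
            new_value = value
        yield key, new_value
```

Whether passing the entry through is the right fallback depends on the iterator; for some use cases, failing loudly is the safer choice.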

On Wed, Jul 6, 2022, 20:42 Logan Jones  wrote:

> Thanks Chris for the quick reply. I'll explain the behavior I'm seeing, and
> then maybe you all could either confirm this is the intended behavior, or
> decide it's maybe not that great.
>
> My understanding of the happy case for running a user-initiated compaction
> is that a fate/transaction gets created in zookeeper, and the Accumulo
> master node ends up farming off the compactions to the correct tablet
> servers, once the tablets have been completed, somehow the
> fates/transactions in zookeeper get cleaned up.
>
> I experienced a problem, however, in the unhappy case for compactions which
> I have since reproduced. We had a custom iterator configured for a table,
> and that custom iterator was in a bad state (i.e. it was always throwing an
> exception during initialization). What we noticed is that the fates are
> indefinitely stuck IN_PROGRESS and never go away in this case. Effectively
> we have a poison pill, and if you issue too many compactions against that
> table, you can cause other bad problems.
>
> I created a repo to demonstrate the problem as succinctly as I could
> manage:
>
> https://github.com/loganasherjones/accumulo-iterator-failures
>
> I thought initially that maybe it was due to the fact that our iterator was
> throwing an error during initialization, but this appears to be happening
> for any error on next, seek, or init calls.
>
> So my questions are
>
> 1. Is it expected that a failure in a seek, next, or init in an iterator
> during a user-initiated compaction would cause accumulo to non-stop retry
> the compaction
> 2. If so, could you help me understand why?
>
> Thanks in advance,
>
> - Logan
>
>
>
> On Wed, Jul 6, 2022 at 6:31 PM Christopher  wrote:
>
> > Yes, either here (especially if it's related to a bug or proposed code
> > change) or at user@ would work, if it's more of a user question. Here is
> > fine if you're not sure.
> >
> > On Wed, Jul 6, 2022, 16:35 Logan Jones  wrote:
> >
> > > Hello:
> > >
> > > I would like to discuss what happens when iterators cause
> user-initiated
> > > compactions to fail, specifically in relation to the fate transactions.
> > Is
> > > this the right list for this discussion?
> > >
> > > Thanks,
> > >
> > > - Logan
> > >
> >
>

Re: Behavior of Fates on Failed Compactions

2022-07-06 Thread Christopher

Yes, either here (especially if it's related to a bug or proposed code
change) or at user@ would work, if it's more of a user question. Here is
fine if you're not sure.

On Wed, Jul 6, 2022, 16:35 Logan Jones  wrote:

> Hello:
>
> I would like to discuss what happens when iterators cause user-initiated
> compactions to fail, specifically in relation to the fate transactions. Is
> this the right list for this discussion?
>
> Thanks,
>
> - Logan
>

Re: [VOTE] 2.1.0-beta-1 or 2.1.0-alpha-1 Release

2022-06-08 Thread Christopher

Yeah, maybe reconsider in about a week to let some outstanding work settle
a bit?

On Mon, Jun 6, 2022, 19:16 dev1  wrote:

> There seems to be a few PRs that have mostly been resolved and need to be
> merged - the fate print command issues for one. I think we would be better
> off closing off open PRs before creating release candidates.
>
> Ed Coleman
>
> -----Original Message-----
> From: Christopher 
> Sent: Monday, June 6, 2022 6:49 PM
> To: accumulo-dev 
> Subject: Re: [VOTE] 2.1.0-beta-1 or 2.1.0-alpha-1 Release
>
> Unfortunately, I won't be creating one tonight, though. Maybe tomorrow?
>
> On Mon, Jun 6, 2022 at 4:08 PM Christopher  wrote:
> >
> > Yes, I can create one.
> >
> > On Mon, Jun 6, 2022 at 8:35 AM Mike Miller  wrote:
> > >
> > > Christopher, are you going to create a RC? Let me know if you can't
> > > and I can try to do something to get the release process moving.
> > >
> > > On Wed, Jun 1, 2022 at 3:41 PM Christopher 
> wrote:
> > >
> > > > Yes. I expect we'll do that when we prepare the RC for a beta.
> > > >
> > > > On Tue, May 31, 2022 at 11:20 PM Keith Turner 
> wrote:
> > > > >
> > > > > Should we also create a new 2.1 branch along with doing an alpha
> > > > > or beta release?
> > > > >
> > > > > On Tue, May 31, 2022 at 7:24 AM Christopher 
> wrote:
> > > > > >
> > > > > > I'm not sure it makes sense to vote on whether to do a vote.
> > > > > > I'm okay with creating a beta release later this week. When
> > > > > > we're ready to do that, we can just create that release
> > > > > > candidate and vote on that, though.
> > > > > >
> > > > > > On Tue, May 31, 2022 at 7:12 AM Mike Miller
> > > > > > 
> > > > wrote:
> > > > > > >
> > > > > > > I propose a vote to do a 2.1.0-beta-1 release ASAP. I think
> > > > > > > it would
> > > > be
> > > > > > > advantageous to do a release before folks start taking time
> > > > > > > off for
> > > > summer.
> > > > > > > This would allow better testing and free up the main branch
> > > > > > > for new features. This would be similar to what we did with
> > > > > > > the 2.0.0-alpha releases [1]
> > > > > > >
> > > > > > > My vote is +1 to do a release this week (or next week). Main
> > > > > > > branch released as 2.1.0-beta-1 or 2.1.0-alpha-1
> > > > > > >
> > > > > > > [1]:
> > > > > > > https://accumulo.apache.org/release/accumulo-2.0.0-alpha-1/
> > > >
>

Re: [VOTE] 2.1.0-beta-1 or 2.1.0-alpha-1 Release

2022-06-06 Thread Christopher

Unfortunately, I won't be creating one tonight, though. Maybe tomorrow?

On Mon, Jun 6, 2022 at 4:08 PM Christopher  wrote:
>
> Yes, I can create one.
>
> On Mon, Jun 6, 2022 at 8:35 AM Mike Miller  wrote:
> >
> > Christopher, are you going to create a RC? Let me know if you can't and I
> > can try to do something to get the release process moving.
> >
> > On Wed, Jun 1, 2022 at 3:41 PM Christopher  wrote:
> >
> > > Yes. I expect we'll do that when we prepare the RC for a beta.
> > >
> > > On Tue, May 31, 2022 at 11:20 PM Keith Turner  wrote:
> > > >
> > > > Should we also create a new 2.1 branch along with doing an alpha or
> > > > beta release?
> > > >
> > > > On Tue, May 31, 2022 at 7:24 AM Christopher  wrote:
> > > > >
> > > > > I'm not sure it makes sense to vote on whether to do a vote. I'm okay
> > > > > with creating a beta release later this week. When we're ready to do
> > > > > that, we can just create that release candidate and vote on that,
> > > > > though.
> > > > >
> > > > > On Tue, May 31, 2022 at 7:12 AM Mike Miller 
> > > wrote:
> > > > > >
> > > > > > I propose a vote to do a 2.1.0-beta-1 release ASAP. I think it would
> > > be
> > > > > > advantageous to do a release before folks start taking time off for
> > > summer.
> > > > > > This would allow better testing and free up the main branch for new
> > > > > > features. This would be similar to what we did with the 2.0.0-alpha
> > > > > > releases [1]
> > > > > >
> > > > > > My vote is +1 to do a release this week (or next week). Main branch
> > > > > > released as 2.1.0-beta-1 or 2.1.0-alpha-1
> > > > > >
> > > > > > [1]: https://accumulo.apache.org/release/accumulo-2.0.0-alpha-1/
> > >

Re: [VOTE] 2.1.0-beta-1 or 2.1.0-alpha-1 Release

2022-06-06 Thread Christopher

Yes, I can create one.

On Mon, Jun 6, 2022 at 8:35 AM Mike Miller  wrote:
>
> Christopher, are you going to create a RC? Let me know if you can't and I
> can try to do something to get the release process moving.
>
> On Wed, Jun 1, 2022 at 3:41 PM Christopher  wrote:
>
> > Yes. I expect we'll do that when we prepare the RC for a beta.
> >
> > On Tue, May 31, 2022 at 11:20 PM Keith Turner  wrote:
> > >
> > > Should we also create a new 2.1 branch along with doing an alpha or
> > > beta release?
> > >
> > > On Tue, May 31, 2022 at 7:24 AM Christopher  wrote:
> > > >
> > > > I'm not sure it makes sense to vote on whether to do a vote. I'm okay
> > > > with creating a beta release later this week. When we're ready to do
> > > > that, we can just create that release candidate and vote on that,
> > > > though.
> > > >
> > > > On Tue, May 31, 2022 at 7:12 AM Mike Miller 
> > wrote:
> > > > >
> > > > > I propose a vote to do a 2.1.0-beta-1 release ASAP. I think it would
> > be
> > > > > advantageous to do a release before folks start taking time off for
> > summer.
> > > > > This would allow better testing and free up the main branch for new
> > > > > features. This would be similar to what we did with the 2.0.0-alpha
> > > > > releases [1]
> > > > >
> > > > > My vote is +1 to do a release this week (or next week). Main branch
> > > > > released as 2.1.0-beta-1 or 2.1.0-alpha-1
> > > > >
> > > > > [1]: https://accumulo.apache.org/release/accumulo-2.0.0-alpha-1/
> >

Re: [VOTE] 2.1.0-beta-1 or 2.1.0-alpha-1 Release

2022-06-01 Thread Christopher

Yes. I expect we'll do that when we prepare the RC for a beta.

On Tue, May 31, 2022 at 11:20 PM Keith Turner  wrote:
>
> Should we also create a new 2.1 branch along with doing an alpha or
> beta release?
>
> On Tue, May 31, 2022 at 7:24 AM Christopher  wrote:
> >
> > I'm not sure it makes sense to vote on whether to do a vote. I'm okay
> > with creating a beta release later this week. When we're ready to do
> > that, we can just create that release candidate and vote on that,
> > though.
> >
> > On Tue, May 31, 2022 at 7:12 AM Mike Miller  wrote:
> > >
> > > I propose a vote to do a 2.1.0-beta-1 release ASAP. I think it would be
> > > advantageous to do a release before folks start taking time off for 
> > > summer.
> > > This would allow better testing and free up the main branch for new
> > > features. This would be similar to what we did with the 2.0.0-alpha
> > > releases [1]
> > >
> > > My vote is +1 to do a release this week (or next week). Main branch
> > > released as 2.1.0-beta-1 or 2.1.0-alpha-1
> > >
> > > [1]: https://accumulo.apache.org/release/accumulo-2.0.0-alpha-1/

Re: [VOTE] 2.1.0-beta-1 or 2.1.0-alpha-1 Release

2022-05-31 Thread Christopher

I'm not sure it makes sense to vote on whether to do a vote. I'm okay
with creating a beta release later this week. When we're ready to do
that, we can just create that release candidate and vote on that,
though.

On Tue, May 31, 2022 at 7:12 AM Mike Miller  wrote:
>
> I propose a vote to do a 2.1.0-beta-1 release ASAP. I think it would be
> advantageous to do a release before folks start taking time off for summer.
> This would allow better testing and free up the main branch for new
> features. This would be similar to what we did with the 2.0.0-alpha
> releases [1]
>
> My vote is +1 to do a release this week (or next week). Main branch
> released as 2.1.0-beta-1 or 2.1.0-alpha-1
>
> [1]: https://accumulo.apache.org/release/accumulo-2.0.0-alpha-1/

Re: Reconsider Hadoop 3.3.1 for 2.1.0 release

2022-05-19 Thread Christopher

Since Accumulo doesn't bundle Hadoop into the release, the only
difference this makes is whether or not it breaks our builds during
testing, which could indicate a bug in Hadoop, or an incompatibility
with that version of Hadoop. The version of Accumulo built with 3.3.0
should work perfectly fine with 3.3.1. If you want to actually build
with 3.3.1 on the classpath at compile time, though, it's
trivial to add `-Dhadoop.version=3.3.1` to the Maven command line.

I'm looking at the commit you referenced, and I don't see how that has
anything to do with addressing the test that was flaky.

If you are correct in your assessment that the test was overwhelming
MiniDFS, then that would still be a bug... perhaps a test bug... but
either way, the failure is not expected and should be addressed. If we
bump the version, we'd have to make sure it doesn't break the build...
and that means fixing whatever test issue was causing it. If you're
willing to contribute a pull request to do that work, we can consider
including it. However, as I said initially, it doesn't really
matter. The version specified in the POM is only the version we
build and test with by default; it's not bundled into Accumulo in
any way. It's just a fixed version whose API we expect Accumulo to
be compatible with.

If we do bump it, it should be to 3.3.3, which is the latest 3.3 version.

On Thu, May 19, 2022 at 5:26 PM Chris Bevard  wrote:
>
> Hi,
>
> I was wondering if the dev team would reconsider using the Hadoop 3.3.1
> version for the next release version of Accumulo. I noticed that the hadoop
> dependency version was updated to 3.3.1 briefly by
> commit 3c3a91f7a4b6ea290a383a77844cabae34eaeb1f, but it was dropped back to
> 3.3.0 in commit 48679fef73e246de52fbeecad03f974f2116b97a shortly after.
> The explanation for undoing the change was that hadoop 3.3.1 was causing
> intermittent IT failures, most frequently in the CountNameNodeOpsBulkIT.
>
> I checked out that commit myself and also noticed that the
> CountNameNodeOpsBulkIT was failing often with an IOException "Unable to
> close file because the last block...does not have enough number of
> replicas", which I don't believe is indicative of a bug in hadoop or the
> accumulo code. I think what's more likely is that the multithreaded test
> was overwhelming the minidfs cluster with requests. I'm not sure which
> default value/behavior was updated in hadoop 3.3.1 that would cause the
> minicluster to blow up where it wasn't previously in 3.3.0, but I noticed
> in later commits the issue was resolved. It looks like the ClientContext
> changes in the very next code change (commit
> 4b66b96b8f6c65c390fc26c11acf8c51cb78d858) resolve the IT failures that were
> the reason for moving the hadoop version back to 3.3.0. If you check out
> that or any later commit, and update the hadoop dependency version in the
> parent pom to 3.3.1, then the IT failures are resolved.
>
> The reason I'd like the hadoop 3.3.1 version to make it into the next
> release version of Accumulo is because I've been experimenting with
> Accumulo using S3 as the underlying file system. This change (
> https://issues.apache.org/jira/browse/HADOOP-17597) that was added in
> hadoop 3.3.1 makes it possible to use the S3AFileSystem defined in
> hadoop-aws with Accumulo to replace HDFS with S3. The only change needed is
> to update the manager.walog.closer.implementation property and supply an
> S3LogCloser implementation on the classpath.
>
> Thanks,
> Chris

Re: Intro

2022-05-03 Thread Christopher

The slack invite process changed. We have to explicitly invite people now.
I can do it.

On Tue, May 3, 2022, 12:33 Dave Marion  wrote:

> See https://accumulo.apache.org/contact-us/#slack, there is an invite
> link.
>
> On Tue, May 3, 2022 at 12:32 PM Nikita S  wrote:
>
> > Tried to log in to the slack channel and got "nik...@thesirohis.com
> > doesn't
> > have an account on this workspace." Do I need an invite or something?
> >
> > On Mon, May 2, 2022 at 4:57 AM Mike Miller  wrote:
> >
> > > Welcome! Don't hesitate to ask questions on this dev list or on our
> Slack
> > > channel.
> > > https://the-asf.slack.com/messages/CERNB8NDC
> > >
> > >
> > > On Sat, Apr 30, 2022 at 6:41 PM Nikita S 
> wrote:
> > >
> > > > Thanks for the warm welcome and the link!
> > > >
> > > > We are still fairly early in our Accumulo journey; still coming up to
> > > speed
> > > > in many areas and tailoring to our use cases. Probably premature to
> > > present
> > > > in depth. But we're confident that given what Accumulo is capable of,
> > > there
> > > > will be good content for us to share with the community in the
> future.
> > We
> > > > will continue to keep our eyes out for how we can engage, and excited
> > to
> > > do
> > > > so!
> > > >
> > > > On Wed, Apr 27, 2022 at 1:04 PM Dave Marion 
> > wrote:
> > > >
> > > > > Welcome! And +1 for hearing more about how you are using Accumulo.
> I
> > > > > visited the Ghost's website earlier this morning - awesome stuff.
> > > > >
> > > > > On Wed, Apr 27, 2022 at 3:59 PM Christopher 
> > > wrote:
> > > > >
> > > > > > Hi Nikita!
> > > > > >
> > > > > > Welcome to our community. I'm curious to hear more about how
> Ghost
> > is
> > > > > > using Accumulo. Have you considered giving a presentation at the
> > > > > > upcoming ApacheCon this year? I think they are still accepting
> > > > > > submissions (https://apachecon.com/acna2022/cfp.html) and that
> > > sounds
> > > > > > like it would make an interesting presentation.
> > > > > >
> > > > > > On Wed, Apr 27, 2022 at 3:51 PM Nikita S 
> > > > wrote:
> > > > > > >
> > > > > > > Hi Accumulo Devs,
> > > > > > >
> > > > > > > Was suggested I drop a note introducing myself after my last
> PR.
> > > :-)
> > > > > > >
> > > > > > > I'm Nikita; I work for Ghost <https://www.driveghost.com/>; we
> > are
> > > > > using
> > > > > > > Accumulo to store driving data and train models for driving
> cars.
> > > We
> > > > > hope
> > > > > > > to contribute to the community with bug fixes and extensions as
> > > > > > > appropriate.
> > > > > > >
> > > > > > > It was a great experience getting our first patch in; thanks
> for
> > > the
> > > > > help
> > > > > > > and looking forward to future collaboration!
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Intro

2022-04-27 Thread Christopher

Hi Nikita!

Welcome to our community. I'm curious to hear more about how Ghost is
using Accumulo. Have you considered giving a presentation at the
upcoming ApacheCon this year? I think they are still accepting
submissions (https://apachecon.com/acna2022/cfp.html) and that sounds
like it would make an interesting presentation.

On Wed, Apr 27, 2022 at 3:51 PM Nikita S  wrote:
>
> Hi Accumulo Devs,
>
> Was suggested I drop a note introducing myself after my last PR. :-)
>
> I'm Nikita; I work for Ghost <https://www.driveghost.com/>; we are using
> Accumulo to store driving data and train models for driving cars. We hope
> to contribute to the community with bug fixes and extensions as
> appropriate.
>
> It was a great experience getting our first patch in; thanks for the help
> and looking forward to future collaboration!

Re: Major compactions during map reduce

2022-04-19 Thread Christopher

Isolation should only give you consistency within a row, to ensure you're
not scanning over partial changes from a mutation that is currently being
written to a row. It shouldn't have anything to do with compactions or
missing data that has already been written before the MapReduce scan has
started.

Splits shouldn't cause you to miss data either. It's been a while since I
looked, but I believe the MapReduce APIs simply break up a table into
separate ranges to scan based on current tablet boundaries. If there are
splits, then all that means is that some of the ranges will span across
more than one tablet, but that's fine... a scan is a scan... scans don't
need to be limited to a single tablet.
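
To make that concrete, here is a minimal sketch in plain Java. It is not Accumulo code: `RangeSketch`, `RowRange`, `rangesFor`, and `coverage` are made-up names, rows are simplified to strings, and I'm assuming the usual tablet convention of an exclusive start row and an inclusive end row. It plans one range per tablet from the split points known up front, then checks that every row is covered by exactly one planned range. Since the ranges are defined over the row space, a split that happens afterwards only changes how many tablets a range touches, not which rows it covers.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not Accumulo code. Rows are plain strings, and a range is
// (startExclusive, endInclusive], with null meaning "unbounded" on that side.
class RangeSketch {
  static final class RowRange {
    final String startExclusive;
    final String endInclusive;

    RowRange(String startExclusive, String endInclusive) {
      this.startExclusive = startExclusive;
      this.endInclusive = endInclusive;
    }

    boolean contains(String row) {
      boolean afterStart = startExclusive == null || row.compareTo(startExclusive) > 0;
      boolean beforeEnd = endInclusive == null || row.compareTo(endInclusive) <= 0;
      return afterStart && beforeEnd;
    }
  }

  // One range per tablet, using the sorted split points known at planning time.
  static List<RowRange> rangesFor(List<String> sortedSplits) {
    List<RowRange> ranges = new ArrayList<>();
    String prev = null;
    for (String split : sortedSplits) {
      ranges.add(new RowRange(prev, split));
      prev = split;
    }
    ranges.add(new RowRange(prev, null)); // last tablet has no end row
    return ranges;
  }

  // Number of planned ranges that cover a row; 1 means "scanned exactly once".
  static long coverage(List<RowRange> ranges, String row) {
    return ranges.stream().filter(r -> r.contains(row)).count();
  }

  public static void main(String[] args) {
    List<RowRange> planned = rangesFor(List.of("g", "p"));
    // Suppose the tablet (g, p] later splits at "k": the planned range (g, p]
    // then spans two tablets, but the rows it covers are unchanged.
    for (String row : List.of("a", "g", "j", "m", "p", "z")) {
      System.out.println(row + " covered by " + coverage(planned, row) + " range(s)");
    }
  }
}
```

The only point of the sketch is that coverage depends on the planned ranges, not on the tablet layout at scan time.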

Compactions could cause missed data if they transform the data in some way,
but otherwise, I wouldn't expect them to.

Are you seeing any error messages anywhere?

On Mon, Apr 18, 2022, 15:23 Vincent Russell 
wrote:

> Hi Dave,
>
> Yes we are using the new MapReduce API, but we are not setting any
> settings for isolated scan so we are using whatever the default is.
>
> Thanks,
> Vincent
>
> On Mon, Apr 18, 2022 at 3:12 PM Dave Marion  wrote:
>
> > Major compactions should not move rows to new tablets, but a tablet split
> > could. Are you using the new MapReduce API introduced in 2.0? Are you
> > setting it to use an isolated scan?
> >
> > On Mon, Apr 18, 2022 at 3:01 PM Vincent Russell <
> vincent.russ...@gmail.com
> > >
> > wrote:
> >
> > > Hello All,
> > >
> > > Could major compactions that occur while a map reduce job is running
> > cause
> > > the map reduce job to miss records because rows have been moved to a
> > > different tablet?
> > >
> > > How does this work?
> > >
> > > I'm using accumulo 2.0.1
> > >
> > > Thank you,
> > > Vincent
> > >
> >
>

Second opinion on PR #2622

2022-04-14 Thread Christopher

Hi Accumulo Devs,

I would appreciate more opinions on the right way to handle
interrupted clients in blocking IO operations, where an IOException
(ClosedByInterrupt) is thrown instead of InterruptedException. I've
discussed some alternatives with the contributor, but am unsure on
which is the best path forward. Additional feedback would be helpful
in deciding on one of the proposed options, or a different path
entirely.
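
For anyone who hasn't run into this behavior, here is a small standalone sketch of the situation, using only the JDK. It is not taken from the PR and doesn't represent any of the proposed options; `InterruptSketch` and `interruptBlockedReader` are made-up names. A thread blocks reading from an interruptible channel (a `Pipe` that nobody writes to), gets interrupted, and sees a `ClosedByInterruptException`, which is an `IOException`, rather than an `InterruptedException`. Per the JDK documentation, the channel is closed and the thread's interrupt status is set by the time the exception is thrown.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: shows what an interrupted thread observes in blocking channel IO.
class InterruptSketch {
  // Blocks a reader on a pipe with no writer, interrupts it, and reports what it saw.
  static String interruptBlockedReader() {
    AtomicReference<String> observed = new AtomicReference<>("nothing");
    try {
      Pipe pipe = Pipe.open();
      Thread reader = new Thread(() -> {
        try {
          pipe.source().read(ByteBuffer.allocate(16)); // blocks: nothing is written
          observed.set("read returned");
        } catch (ClosedByInterruptException e) {
          // The channel has been closed; the interrupt flag is still set here,
          // so callers further up can still detect the interruption.
          observed.set("ClosedByInterruptException, interrupted="
              + Thread.currentThread().isInterrupted());
        } catch (IOException e) {
          observed.set("other IOException");
        }
      });
      reader.start();
      Thread.sleep(200); // crude pause so the reader is likely blocked in read()
      reader.interrupt();
      reader.join();
      pipe.sink().close();
    } catch (IOException | InterruptedException e) {
      throw new IllegalStateException(e);
    }
    return observed.get();
  }

  public static void main(String[] args) {
    System.out.println(interruptBlockedReader());
  }
}
```

Whether client code should rethrow this as-is, wrap it, or translate it into something else is exactly the open question.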

https://github.com/apache/accumulo/pull/2622

Thanks,
Christopher

Re: [DISCUSS] Draft Accumulo quarterly report - due Wednesday 4/13.

2022-04-08 Thread Christopher

On Fri, Apr 8, 2022 at 4:12 PM dev1  wrote:
>
> There does not seem to be a path forward for us or Aaron C. - without the 
> access to the account that created the domain, the owner cannot get anyone at 
> Amazon to respond. Maybe they would respond to a legal request from Apache - 
> but that's resources likely best devoted to other, more pressing issues.
>
> The current domain registration info:
>
> Registry Expiration: 2022-06-28 05:12:45 UTC
> Updated: 2021-05-24 23:22:33 UTC
> Created: 2017-06-28 05:12:45 UTC
>
> At some point, maybe the account / card that was used will expire and any
> auto-renewal will fail. We can revisit how to report this and discuss it before
> the next quarterly report is due in July - hopefully it will have failed to
> renew and it will be a non-issue.

I'd be surprised if Aaron hasn't cancelled the card yet... or maybe
they're renewing without charging him? (or maybe they are queuing up a
larger invoice to drop on him later?)

In any case, I'm fine with including it for now. It's not like it's
currently taking much attention away from another issue to report
on... but it could in future if we have something more important to
communicate and it's still taking up as much space in the report.
Hopefully it's resolved by July.

Re: [DISCUSS] Draft Accumulo quarterly report - due Wednesday 4/13.

2022-04-08 Thread Christopher

Overall, it looks good to me.

Regarding the trademark issue with that domain, I think it is either a
problem we should pursue further (reach out to Amazon, the
registrar?), or it's not a problem worth our time, and we should drop
it. I'm not sure which is the case, but I don't think it's worth tracking
indefinitely.

On Fri, Apr 8, 2022 at 3:11 PM Dominic Garguilo
 wrote:
>
> LGTM
>
> On Fri, Apr 8, 2022 at 2:08 PM Dave Marion  wrote:
>
> > LGTM.
> >
> > On Fri, Apr 8, 2022 at 12:57 PM dev1  wrote:
> >
> > > The Accumulo quarterly report is due Wednesday, April 13, 2022. The
> > > community decided to publicly prepare the report on the dev mailing list.
> > > Below is the current draft.
> > >
> > > (note: This is a cut-n-paste from the report wizard, so there may be
> > > formatting differences that will not appear when the report is submitted
> > > via the apache reporting tool.)
> > > --- draft report ---
> > >
> > > ## Description:
> > > Apache Accumulo is a robust, scalable, distributed key/value store
> > with
> > > cell-based access control and customizable server-side processing.
> > >
> > > ## Issues:
> > > There are no new issues requiring board attention.
> > >
> > > The trademark issue with http://www.accumulodata.com is still open.
> > > Although the
> > > domain owner does not have access to the domain registration, the domain
> > > appears to have automatically renewed, and the expiration is now
> > > 2022-06-28.
> > > Emails from the private list discussing this are at [1], [2] and [3]. No
> > > action
> > > has been required and allowing the domain to expire was deemed a viable
> > > option
> > > by Brand Management VP in Jan-2021 (private)[4] to minimize volunteer
> > > efforts.
> > >
> > >
> > > ## Membership Data:
> > > Apache Accumulo was founded 2012-03-20 (10 years ago)
> > > There are currently 40 committers and 40 PMC members in this project.
> > > The Committer-to-PMC ratio is 1:1.
> > >
> > > Community changes, past quarter:
> > > - No new PMC members. Last addition was Dominic Garguilo on 2021-07-28.
> > > - No new committers. Last addition was Dominic Garguilo on 2021-07-29.
> > >
> > > ## Project Activity:
> > > Project activity on the next release remains active with significant
> > > improvements to the current baseline. The remaining issues are being
> > > actively
> > > worked. Currently, Accumulo is targeting a June release of version 2.1.
> > > Current
> > > 2.1 progress is discussed in this thread [5] and includes:
> > >
> > >   - 15 pull requests that are currently in progress.
> > >   - 32 pull requests that are open as TODO. But a lot of these will get
> > > bumped
> > > to the next version.
> > >   - 1,025 pull requests have been merged.
> > >
> > > ## Community Health:
> > > Overall community health is good and GitHub activity remains consistent.
> > >
> > > - Community participation remains healthy with discussions on the mailing
> > > lists
> > >   and GitHub issues and pull-requests.
> > > - Accumulo continues to transition from Jira to GitHub issues. Jira
> > > activity
> > >   reflects transition to using GitHub issues as obsolete issues are
> > closed
> > > and
> > >   open issues are transitioned to GitHub issues.
> > >
> > > ## Links
> > > (private) [1]:
> > >
> > https://lists.apache.org/thread.html/r8c8ef5575b14accb6fc00d670764a313b91d76033f761c6e5c7eb29d%40%3Cprivate.accumulo.apache.org%3E
> > > (private) [2]:
> > >
> > https://lists.apache.org/thread.html/514d3cf9162e72f4aa13be1db5d6685999fc83755695308a529de4d6@%3Cprivate.accumulo.apache.org%3E
> > > (private) [3]:
> > >
> > https://lists.apache.org/thread.html/rcc8c07db43222e08b9992fd739b8f24d18569ba9af3decfdb52c4a3e%40%3Cprivate.accumulo.apache.org%3E
> > > (private) [4]:
> > >
> > https://lists.apache.org/thread.html/r408e3eed907e3ad24a7c84b5247f51973a4c965c891b01215e45ee17%40%3Cprivate.accumulo.apache.org%3E
> > > [5]: https://lists.apache.org/thread/0nx7ml312v13chdk6xgcwn0vryr5v0xc
> > >
> > >
> >
>
>
> --
> Dominic Garguilo

Re: 2.1 Release TODO

2022-04-06 Thread Christopher

After some additional consideration, and getting a better understanding of
how the code is expected to work from discussing it with Dave... I'm a
little more inclined to support #2422 in 2.1, provided:

1. There's time for me to review it,
2. It is sufficiently decoupled from the existing code and marked
experimental, so that we have the flexibility to alter its design, if it
seems appropriate after it gets some exposure after the release,
3. Unit tests and integration tests are reliably passing (as stable as, or
more stable than, they are currently),
4. No serious issues are discovered during review, and
5. It doesn't delay a release past early June, as I think this is a
reasonable target date.

This is my wishlist before I can get behind it with a +1 for 2.1. If these
aren't met, I do not intend to veto, but I'd be a -0 on its inclusion to
2.1. Of course, once I review it, my thoughts may change a bit.

On Mon, Apr 4, 2022 at 7:07 PM Mike Miller  wrote:

> I think I can finish the FATE refactor PR [1] for 2.1. I had been keeping
> it up to date with the latest in main but stopped because it was too much
> work. I was waiting until the ZK property changes are completed before
> resolving the latest conflicts. I don't think it is much of a risk. It is
> mostly cleanup and refactoring to remove generics from the serialization
> code. It will be some work to revisit but I think the risk is pretty low.
> It would allow changing the serialization, which we may be able to get into
> 2.1 as well.
>
> [1] https://github.com/apache/accumulo/pull/2475
>
> On Mon, Apr 4, 2022 at 11:50 AM Keith Turner  wrote:
>
> > On Mon, Apr 4, 2022 at 11:17 AM Christopher  wrote:
> > >
> > > I haven't seen the metrics test fail very often lately. If it's
> stable, I
> > > don't mind removing the blocker on that issue, but I'd be reluctant to
> > > close it entirely just yet, until we can verify it doesn't happen
> > anymore.
> > >
> > > As for the original list of potential issues to include, I'm in favor
> of
> > > trying to get #2197 in. It was started awhile ago, is relatively simple
> > and
> > > well understood by several of us already... it just needs a bit of
> > > attention to finalize reviews so it can be merged.
> > >
> > > However, I'm reluctant to include #2422, because I don't think it's
> near
> > > ready enough, and by the time it is, it will be very last minute, and I
> > > don't want to delay 2.1 further for it. Even if it's included as an
> > > experimental feature, I think it has huge potential to be disruptive,
> or
> > to
> > > have a lot of churn by the time people actually have a chance to review
> > it
> > > thoroughly. Furthermore, I think there are possible alternatives (like
> a
> > > fully client-side implementation, based on offline scanners) that would
> > > avoid the tight coupling of a new service to Accumulo's core code. This
> >
> > There are some advantages to scan servers over direct file access to
> > consider.  One is scalability of computation, if a web server is
> > serving N client queries with scan servers those can potentially go to
> > different scan servers.  With direct file access, all N queries and
> > their iterator stacks would have to run in the web server.  Another is
> > scalability of caching/memory.  When web servers send queries to scan
> > servers using a sticky algorithm for assigning tablets to groups of
> > scan servers, it could lead to good cache utilization and sharing that
> > may not be possible when running scans directly in the web server. So
> > scan servers allow scaling cache and computations for queries
> > independently of web servers in a way that may not be possible with
> > direct file access.
> >
> > Another advantage to consider is isolation.  With direct file access
> > and queries running directly in a web server, a bad query could bring
> > down a web server and lots of unrelated queries.  Having a bad query
> > bring down a scan server may be less disruptive.
> >
> > > thread isn't for discussing this in depth, so we can have that
> discussion
> > > in a separate thread, but I'm generally opposed to including it this
> late
> > > in 2.1's development, given the timing, size and scope, tight coupling,
> > and
> > > current state.
> > >
> > > I don't know enough about #2475 to have a strong opinion, but it looks
> > big,
> > > and possibly high-risk, given the critical code it touches. It
> currently
> > > has a substantial number of conflicts with the main branch. However, I
> > was
> > > thinking

Re: Scan Server discussion [WAS: Re: 2.1 Release TODO]

2022-04-04 Thread Christopher

On Mon, Apr 4, 2022 at 1:11 PM Dave Marion  wrote:
>
> I understand the desire to see less coupling for the optional features, but
> getting to that point for ScanServers (and less so for ExternalCompactions)
> would be a ton of work I think.

The likelihood of it being a lot of work doesn't mean it shouldn't be
done. It's much more likely you could get help doing the work, if
needed, after the 2.1 release, when some of the time that people are
putting into 2.1 quality control and testing is freed up to work on
other tasks.

> The concern that I brought up in the "2.1
> Release TODOs" thread regarding planning has not been addressed. If there
> was a defined path forward, then that might make it easier to see how this
> feature gets added in the near-future in whatever form it takes.

LOL, I forked the mailing list conversation because Keith started
discussing the feature merits in the release planning thread, and it
seemed like a separate topic. Now, you're talking about release
planning in this thread. I'm not sure where to reply! I guess the
lines are blurred. I'll reply here.

If we decide not to merge it into 2.1, what would be your preferred
path forward? What's your contingency plan for this feature if the
community doesn't include it in 2.1? Here's some possibilities:

* Could release in non-LTM 2.2 (release right away)
* Could release in non-LTM 3.0 (to include with other 3.0 changes)
* Could do some work to decouple, and create a separate optional
add-on jar for 2.1

By default, I would expect it to go into the next release (probably
named 3.0), unless there's another path proposed that we can get
consensus around. The more decoupled it is from Accumulo's internals,
though, the more likely it is to be usable with more versions, even
2.1. If completely decoupled, it could be released at any time on its
own schedule and be made to work with any version you like. It
wouldn't need to be released coupled to an Accumulo release.

>
> Regarding the concern about the readiness of the feature branch, Keith is
> doing a last pass review on the draft and then I believe we are ready to
> take it out of draft state. I think it will be before the end of this week.
> We have added six new integration tests and we have done some local and
> cluster testing.

Taking it out of draft state means that some people will begin to look
at it. But, others, whose focus is on polishing existing features and
testing for 2.1 won't have time, unless they prioritize your PR over
the existing prep for 2.1. I'm not sure how long that would take.

> Regarding the concern mentioned above, "availability of time to review/test
> such a big feature without delaying 2.1," I didn't realize that we had a
> schedule.

We don't have a strict schedule. We never have. It's informal. It's
just that 2.1 has been under development for so long already. I've
been feeling the pressure build to get it released. I think a lot of
our community has already started wanting a 2.x LTM release once we
started discussing the LTM concept, and they've been stuck with 2.0 as
the only 2.x version to work with. We only reluctantly did a CVE patch
for 2.0, when it was non-LTM, only because we were not able to get 2.1
out at that time. Mike also suggested in Slack months ago (January
27th), that we should consider branching a 2.2 for new features, so
2.1 doesn't get slowed down. So, I know I'm not the only one who has
felt 2.1 is getting a little fat with feature changes and overdue for
a release. You commented on that thread, so I know you had at least a
peripheral awareness at some point, that there was interest in
wrapping up 2.1 before you even started the ScanServer work, way back
when we were still cleaning up the tests from the breakages caused by
the big external compactions and thread pool changes.

>  Does it matter if it takes 2/4/6/8 weeks to test the 1000+
> completed issues in this release?

Yes. I think so.

First, most of those changes are trivial, and have already been
thoroughly tested and proven during development. While we do overall
testing near the end of development near a release, most of the
testing is done continuously throughout development, even for the
non-trivial changes. We often test features thoroughly as they are
added before we move on to the next feature to work on. So, it's
misleading to suggest that there are 1000+ untested features queued up
to be tested and that this is merely one more.

Second, while the specific duration 2 weeks vs. 8 weeks doesn't
necessarily matter, it does matter whether we're testing a moving
target or not. At this point, our testing target is still moving, but
it should be moving much more slowly than it was at the beginning of
the development cycle, and it should be slowing down, not speeding up
with new features. Many of us have already been thinking about a 2.1
release for several months and have been working on fixing tests,
making test improvements, wrapping up existing features, and doing
code

Scan Server discussion [WAS: Re: 2.1 Release TODO]

2022-04-04 Thread Christopher

On Mon, Apr 4, 2022 at 11:50 AM Keith Turner  wrote:
>
> On Mon, Apr 4, 2022 at 11:17 AM Christopher  wrote:
> >
> > However, I'm reluctant to include #2422, because I don't think it's near
> > ready enough, and by the time it is, it will be very last minute, and I
> > don't want to delay 2.1 further for it. Even if it's included as an
> > experimental feature, I think it has huge potential to be disruptive, or to
> > have a lot of churn by the time people actually have a chance to review it
> > thoroughly. Furthermore, I think there are possible alternatives (like a
> > fully client-side implementation, based on offline scanners) that would
> > avoid the tight coupling of a new service to Accumulo's core code. This
>
> There are some advantages to scan servers over direct file access to
> consider.  One is scalability of computation, if a web server is
> serving N client queries with scan servers those can potentially go to
> different scan servers.  With direct file access, all N queries and
> their iterator stacks would have to run in the web server.  Another is
> scalability of caching/memory.  When web servers send queries to scan
> servers using a sticky algorithm for assigning tablets to groups of
> scan servers, it could lead to good cache utilization and sharing that
> may not be possible when running scans directly in the web server. So
> scan servers allow scaling cache and computations for queries
> independently of web servers in a way that may not be possible with
> direct file access.
>
> Another advantage to consider is isolation.  With direct file access
> and queries running directly in a web server, a bad query could bring
> down a web server and lots of unrelated queries.  Having a bad query
> bring down a scan server may be less disruptive.
>
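
As a rough illustration of the "sticky" assignment idea in the quoted text above, here is a minimal sketch in plain Java. It is not the ScanServer implementation from the PR, and `StickySketch` and `serversFor` are made-up names; hashing the tablet id is just one assumed way to get stickiness. The same tablet always maps to the same small group of servers as long as the server list is unchanged, which is what would let those servers keep that tablet's data warm in their caches.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hash-based "sticky" mapping from a tablet to a group of servers.
class StickySketch {
  // Picks up to groupSize consecutive servers, starting at an index derived from
  // the tablet id's hash, so repeated calls for one tablet give the same group.
  static List<String> serversFor(String tabletId, List<String> servers, int groupSize) {
    List<String> group = new ArrayList<>();
    int n = servers.size();
    if (n == 0) {
      return group; // no servers available
    }
    int start = Math.floorMod(tabletId.hashCode(), n);
    for (int i = 0; i < Math.min(groupSize, n); i++) {
      group.add(servers.get((start + i) % n));
    }
    return group;
  }

  public static void main(String[] args) {
    List<String> servers = List.of("ss1", "ss2", "ss3", "ss4");
    // Hypothetical tablet ids, for illustration.
    for (String tablet : List.of("tabletA", "tabletB", "tabletC")) {
      System.out.println(tablet + " -> " + serversFor(tablet, servers, 2));
    }
  }
}
```

A simple modulo scheme like this reshuffles many tablets when the server list changes; something like consistent hashing would reduce that, at the cost of more code.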

I've forked this thread into its own discussion with a new subject
line, because, as I suggested in my original reply, my intent was not
to hijack the 2.1 planning thread with a discussion of the ScanServer
implementation details.

I'm fine with all those benefits (even if all the "could" and "may"
were turned into concrete "will"). My objection is not an objection to
the feature. It's an objection to including the feature in 2.1, based
on:

* readiness of the feature branch,
* availability of time to review/test such a big feature without delaying 2.1,
* its tight coupling to the core code in the implementation, and
* the possibility, not yet explored, that less tightly coupled solutions
with the above benefits may exist.

I would be more okay with including it if:

* it is ready,
* it has been tested and reviewed by the wider community,
* its coupling to the core Accumulo code is loosened, ideally if it's
designed to use only API/SPI, and could be released as a separate,
optional add-on. This might require improvements to API/SPI to expose
the features needed to help it function. This could also be done by
sub-classing the AccumuloClient. My concern here is the risk of
technical debt and the extra maintenance costs of increased complexity
for optional features that go unmaintained.

We've been hurt by premature inclusion of optional/experimental
features before that were rushed to release. No matter how awesome the
feature is... if it's niche and optional, we should consider these
risks and work to mitigate them. Otherwise, we'll be stuck with the
technical debt for years to come. With a little bit of caution, we can
make the feature available, without rushing, to satisfy the use case
while reducing the risks.

Also, one point of clarification: when I say "fully client side", I
only mean relative to Accumulo, not necessarily in the client process.
I'm lacking vocabulary to describe what I mean. As I understand it,
the current client code has been modified to connect to ScanServers
sitting off to the side of TabletServers, and the ScanServers are
basically modified TabletServers with less functionality. What I mean
is that instead of coupling the ScanServer to the TabletServer
implementation, and coupling the ScanServer client to the
AccumuloClient, there could be less coupling. The ScanServer itself
could behave like a client to Accumulo and/or HDFS (and maybe even
share some library code that we make public API, like RFile readers)
and it could have its own client (this is just one very rough outline
of an idea that could be explored). That way, the entire thing could
be removed without any change in Accumulo's code, to make it truly
optional (as in, optional to even have on the class path).

Re: 2.1 Release TODO

2022-04-04 Thread Christopher

I haven't seen the metrics test fail very often lately. If it's stable, I
don't mind removing the blocker on that issue, but I'd be reluctant to
close it entirely just yet, until we can verify it doesn't happen anymore.

As for the original list of potential issues to include, I'm in favor of
trying to get #2197 in. It was started a while ago, is relatively simple and
well understood by several of us already... it just needs a bit of
attention to finalize reviews so it can be merged.

However, I'm reluctant to include #2422, because I don't think it's near
ready enough, and by the time it is, it will be very last minute, and I
don't want to delay 2.1 further for it. Even if it's included as an
experimental feature, I think it has huge potential to be disruptive, or to
have a lot of churn by the time people actually have a chance to review it
thoroughly. Furthermore, I think there are possible alternatives (like a
fully client-side implementation, based on offline scanners) that would
avoid the tight coupling of a new service to Accumulo's core code. This
thread isn't for discussing this in depth, so we can have that discussion
in a separate thread, but I'm generally opposed to including it this late
in 2.1's development, given the timing, size and scope, tight coupling, and
current state.

I don't know enough about #2475 to have a strong opinion, but it looks big,
and possibly high-risk, given the critical code it touches. It currently
has a substantial number of conflicts with the main branch. However, I was
thinking that *some* minimal refactoring (like low-risk automatic
refactoring, like moving packages) could be done. So, if that's all this
does, it might be okay. Otherwise, maybe it can be simplified? At the very
least, I was thinking it would be a good opportunity to move the
`org.apache.accumulo.fate` packages into an appropriate
`org.apache.accumulo.core` parent package (some would go to o.a.a.core.fate
and others might go to o.a.a.core.util or similar) to keep the package
namespaces standardized, which is helpful to avoid naming collisions and
jar sealing issues, as well as for less complicated jigsaw module
definitions in future. Since 2.1 FaTE is already incompatible with prior
versions, a rename at this time would be less disruptive.

There's another task I had wanted done for 2.1, before I got distracted
fixing test failures during and after Christmas and trying to work through
the singleton manager ZooKeeper stuff to see what we could simplify: I
wanted to standardize the way we pass table identifiers (names,
IDs) across the RPC layer, since we currently do that inconsistently. I
don't remember if there's an existing ticket open for it, but I have a
working branch I had started working out of for it before Christmas. It's
relatively simple work, and would set us up for some much better APIs going
forward, as well as help with logging information about table actions. If
necessary, it could be bumped to a future version, but then we'd have more
churn in the thrift layer. So, I'd prefer to get it for 2.1 to avoid that.

As for planning, I was thinking early May for a code freeze (except bug
fixes and small improvements found during testing), so we can try to
release towards the end of May/early June. If we go with that timeline,
that's not a lot of time to wrap up features and have time for
review/testing, so we may need to be selective about what we hold off until
the next version, unless we want to further delay 2.1.

On Mon, Apr 4, 2022 at 9:13 AM Dave Marion  wrote:

> I think [3] is OBE and can be closed.
>
> On Mon, Apr 4, 2022 at 9:11 AM Mike Miller  wrote:
>
> > Yes I agree, that was the goal of this email thread. I found a few more
> > tickets that should be addressed for the next release.
> >
> > Ivan - There was some work done on this PR but it has been some time. Do
> > you want to take a look at it? Implement a Thread limit. [1]
> > Keith T - I think we should get this one merged to fix that consistency
> > check bug I found. It looks like it is finished. [2]
> > Dave & Dom - Were you guys able to figure out a fix for the new external
> > compaction metrics test? [3]
> >
> > FYI we have 6 blockers for 2.1:
> > https://github.com/apache/accumulo/labels/blocker
> >
> > This is almost definitely going into 2.1 [4]. Thanks Jeff!
> >
> > [1] https://github.com/apache/accumulo/pull/1487
> > [2] https://github.com/apache/accumulo/pull/2574
> > [3] https://github.com/apache/accumulo/issues/2406
> > [4] https://github.com/apache/accumulo/pull/2215
> >
> > On Fri, Apr 1, 2022 at 2:21 PM Dave Marion  wrote:
> >
> > > I think it would be useful to do some release planning so that we know
> > > what features we are working towards and in which release they will be
> > > in. This would be helpful for determining what existing PRs need to make
> > > it into 2.1.0. 2.1.0 is the LTM release, so patches for existing features
> > > will be backported (2.1.1, 2.1.2, 2.1.3,

Re: Query regarding migration accumulo from 1.9.3 to 1.10.2

2022-02-25 Thread Christopher

It's not clear what you're asking. 1.10.2 is backwards compatible with
1.9.3 (source compatible, not necessarily binary compatible across
minor versions). It also requires Java 8.

On Fri, Feb 25, 2022 at 12:14 PM Dushyant Tankariya
 wrote:
>
> > Hi team,
> >
> > We are using Accumulo Database version 1.9.3 and are now planning a
> > migration to Accumulo version 1.10.2.
> >
> > Query: Will it impact consumer APIs/jobs developed with JDK 8 (Spring Boot
> > framework 2.2.6)?
> >
> > Regards,
> > Dushyant
>
>

Re: [DRAFT][ANNOUNCE] Apache Accumulo 1.10.2

2022-02-14 Thread Christopher

I sent the announcement, but had to send it a second time because the
announce mailing list rejected the first one for not being in plain-text
mode.


On Mon, Feb 14, 2022 at 9:47 AM Adam Lerman  wrote:

> +1
>
> On Mon, Feb 14, 2022 at 9:42 AM Mike Miller  wrote:
>
> > +1
> >
> > On Mon, Feb 14, 2022 at 9:21 AM Christopher  wrote:
> >
> > > > This is also the tweet I have prepared and scheduled:
> > > >
> > > > Happy #ValentinesDay! Because we love you so much, we have prepared
> > > > this new release, #Apache Accumulo 1.10.2 (LTM)!
> > > > https://accumulo.apache.org/release/accumulo-1.10.2/ #opensource
> > > > #BigData
> > >
> > > > On Mon, Feb 14, 2022 at 9:19 AM Mike Miller wrote:
> > >
> > > > LGTM. The Release notes look good as well.
> > > >
> > > > > On Sun, Feb 13, 2022 at 1:44 PM Christopher wrote:
> > > >
> > > > > The Apache Accumulo project is pleased to announce the release
> > > > > of Apache Accumulo 1.10.2! Apache Accumulo 1.10.2 is a bug fix
> > > > > release of the 1.10 LTM release line. Among other things, it
> > > > > removes the dependency on log4j 1.2 (using reload4j instead).
> > > > > See the release notes linked below for details.
> > > > >
> > > > > Users of 1.10.1 or earlier are encouraged to upgrade to take
> > > > > advantage of the included bug fixes and improvements. Users are
> > > > > also encouraged to consider migrating to a 2.x version when one
> > > > > that is suitable for their needs becomes available.
> > > > >
> > > > > ***
> > > > >
> > > > > Apache Accumulo® is a sorted, distributed key/value store that
> > > > > provides robust, scalable data storage and retrieval. With
> > > > > Apache Accumulo, users can store and manage large data sets
> > > > > across a cluster. Accumulo uses Apache Hadoop's HDFS to store
> > > > > its data and Apache ZooKeeper for consensus.
> > > > >
> > > > > This version is now available in Maven Central, and at:
> > > > > https://accumulo.apache.org/downloads/
> > > > >
> > > > > The full release notes can be viewed at:
> > > > > https://accumulo.apache.org/release/accumulo-1.10.2/
> > > > >
> > > >
> > >
> >
>

Re: [DRAFT][ANNOUNCE] Apache Accumulo 1.10.2

2022-02-14 Thread Christopher

This is also the tweet I have prepared and scheduled:

Happy #ValentinesDay! Because we love you so much, we have prepared
this new release, #Apache Accumulo 1.10.2 (LTM)!
https://accumulo.apache.org/release/accumulo-1.10.2/ #opensource
#BigData

On Mon, Feb 14, 2022 at 9:19 AM Mike Miller  wrote:

> LGTM. The Release notes look good as well.
>
> On Sun, Feb 13, 2022 at 1:44 PM Christopher  wrote:
>
> > The Apache Accumulo project is pleased to announce the release
> > of Apache Accumulo 1.10.2! Apache Accumulo 1.10.2 is a bug fix
> > release of the 1.10 LTM release line. Among other things, it
> > removes the dependency on log4j 1.2 (using reload4j instead).
> > See the release notes linked below for details.
> >
> > Users of 1.10.1 or earlier are encouraged to upgrade to take
> > advantage of the included bug fixes and improvements. Users are
> > also encouraged to consider migrating to a 2.x version when one
> > that is suitable for their needs becomes available.
> >
> > ***
> >
> > Apache Accumulo® is a sorted, distributed key/value store that
> > provides robust, scalable data storage and retrieval. With
> > Apache Accumulo, users can store and manage large data sets
> > across a cluster. Accumulo uses Apache Hadoop's HDFS to store
> > its data and Apache ZooKeeper for consensus.
> >
> > This version is now available in Maven Central, and at:
> > https://accumulo.apache.org/downloads/
> >
> > The full release notes can be viewed at:
> > https://accumulo.apache.org/release/accumulo-1.10.2/
> >
>

[DRAFT][ANNOUNCE] Apache Accumulo 1.10.2

2022-02-13 Thread Christopher

The Apache Accumulo project is pleased to announce the release
of Apache Accumulo 1.10.2! Apache Accumulo 1.10.2 is a bug fix
release of the 1.10 LTM release line. Among other things, it
removes the dependency on log4j 1.2 (using reload4j instead).
See the release notes linked below for details.

Users of 1.10.1 or earlier are encouraged to upgrade to take
advantage of the included bug fixes and improvements. Users are
also encouraged to consider migrating to a 2.x version when one
that is suitable for their needs becomes available.

***

Apache Accumulo® is a sorted, distributed key/value store that
provides robust, scalable data storage and retrieval. With
Apache Accumulo, users can store and manage large data sets
across a cluster. Accumulo uses Apache Hadoop's HDFS to store
its data and Apache ZooKeeper for consensus.

This version is now available in Maven Central, and at:
https://accumulo.apache.org/downloads/

The full release notes can be viewed at:
https://accumulo.apache.org/release/accumulo-1.10.2/

[RESULT][VOTE] Apache Accumulo 1.10.2-rc1

2022-02-12 Thread Christopher

This vote passes. I count:

* 11 binding +1s
* No other votes

I will continue the process to release this tomorrow and Monday (if not
done tomorrow). The release checklist is being tracked at:
https://github.com/apache/accumulo/issues/2468

On Sat, Feb 12, 2022 at 5:07 PM Christopher  wrote:

> +1 from me.
>
> I checked:
> * matching source and javadoc jar for each regular jar
> * source tarball content matches git commit
> * binary tarball contains expected jars
> * jars are sealed
> * jar manifests point to the correct git commit
> * manually reviewed diff from 1.10.1
> * built and ran all ITs with hadoop 2 profile and hadoop 3 profile
> * verified Nexus md5/sha1 checksums, tarball SHA-512s, and GPG signatures
> * manually ran in fluo-uno
>
>
> On Fri, Feb 11, 2022 at 9:57 AM Marc  wrote:
>
>> +1, checked hashes, built, and also ran sunny tests. Ran some automated
>> tests last night that I typically use to validate Accumulo thrift RPC
>> calls and encountered no issues there either.
>>
>> thanks,
>> marc
>>
>> On Fri, Feb 11, 2022 at 9:44 AM Keith Turner  wrote:
>>
>> > +1
>> >
>> > Looked through the diffs from 1.10.1 to 1.10.2 using the following.
>> >
>> > https://github.com/apache/accumulo/compare/rel/1.10.1...1.10.2-rc1
>> >
>> > On Tue, Feb 8, 2022 at 10:08 AM Christopher wrote:
>> > >
>> > > Accumulo Developers,
>> > >
>> > > Please consider the following candidate for Apache Accumulo 1.10.2.
>> > >
>> > > Git Commit:
>> > > db2baf1706c0721e25438d5329ef1bba5159c24d
>> > > Branch:
>> > > 1.10.2-rc1
>> > >
>> > > If this vote passes, a gpg-signed tag will be created using:
>> > > git tag -f -s -m 'Apache Accumulo 1.10.2' rel/1.10.2 \
>> > > db2baf1706c0721e25438d5329ef1bba5159c24d
>> > >
>> > > Staging repo:
>> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1092
>> > > Source (official release artifact):
>> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1092/org/apache/accumulo/accumulo/1.10.2/accumulo-1.10.2-src.tar.gz
>> > > Binary:
>> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1092/org/apache/accumulo/accumulo/1.10.2/accumulo-1.10.2-bin.tar.gz
>> > >
>> > > Append ".asc" to download the cryptographic signature for a given
>> > > artifact. (You can also append ".sha1" or ".md5" instead in order to
>> > > verify the checksums generated by Maven to verify the integrity of the
>> > > Nexus repository staging area.)
>> > >
>> > > Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
>> > > (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
>> > >
>> > > In addition to the tarballs and their signatures, the following checksum
>> > > files will be added to the dist/release SVN area after release:
>> > > accumulo-1.10.2-src.tar.gz.sha512 will contain:
>> > > SHA512 (accumulo-1.10.2-src.tar.gz) =
>> > > d9963856a49f43e37d9b64fce440497c7b3621ba8cf2c6f56e4ce1e061edde11882905e99a21555c83f5c9c82d14e4323dd1246a344a8b9af55da863dfc55c1d
>> > > accumulo-1.10.2-bin.tar.gz.sha512 will contain:
>> > > SHA512 (accumulo-1.10.2-bin.tar.gz) =
>> > > 9b6e9286133120588f4c682e3fadee33704cf5983ef7e3c0be84c1678bd9bcf283a2c6efa6d4074c73f30032f5c0aa1073c15ec39f90d215505e028a1cf0a739
>> > >
>> > > Release notes (in progress) can be found at:
>> > > https://accumulo.staged.apache.org/release/accumulo-1.10.2
>> > >
>> > > Release testing instructions:
>> > > https://accumulo.apache.org/contributor/verifying-release
>> > >
>> > > Please vote one of:
>> > > [ ] +1 - I have verified and accept...
>> > > [ ] +0 - I have reservations, but not strong enough to vote against...
>> > > [ ] -1 - Because..., I do not accept...
>> > > ... these artifacts as the 1.10.2 release of Apache Accumulo.
>> > >
>> > > This vote will remain open until at least Fri Feb 11 03:30:00 PM UTC 2022.
>> > > (Fri Feb 11 10:30:00 AM EST 2022 / Fri Feb 11 07:30:00 AM PST 2022)
>> > > Voting can continue after this deadline until the release manager
>> > > sends an email ending the vote.
>> > >
>> > > Thanks!
>> > >
>> > > P.S. Hint: download the whole staging repo with
>> > > wget -erobots=off -r -l inf -np -nH \
>> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1092/
>> > > # note the trailing slash is needed
>> >
>>
>

Re: [VOTE] Apache Accumulo 1.10.2-rc1

2022-02-12 Thread Christopher

+1 from me.

I checked:
* matching source and javadoc jar for each regular jar
* source tarball content matches git commit
* binary tarball contains expected jars
* jars are sealed
* jar manifests point to the correct git commit
* manually reviewed diff from 1.10.1
* built and ran all ITs with hadoop 2 profile and hadoop 3 profile
* verified Nexus md5/sha1 checksums, tarball SHA-512s, and GPG signatures
* manually ran in fluo-uno


On Fri, Feb 11, 2022 at 9:57 AM Marc  wrote:

> +1, checked hashes, built, and also ran sunny tests. Ran some automated
> tests last night that I typically use to validate Accumulo thrift RPC calls
> and encountered no issues there either.
>
> thanks,
> marc
>
> On Fri, Feb 11, 2022 at 9:44 AM Keith Turner  wrote:
>
> > +1
> >
> > Looked through the diffs from 1.10.1 to 1.10.2 using the following.
> >
> > https://github.com/apache/accumulo/compare/rel/1.10.1...1.10.2-rc1
> >
> > On Tue, Feb 8, 2022 at 10:08 AM Christopher  wrote:
> > >
> > > Accumulo Developers,
> > >
> > > Please consider the following candidate for Apache Accumulo 1.10.2.
> > >
> > > Git Commit:
> > > db2baf1706c0721e25438d5329ef1bba5159c24d
> > > Branch:
> > > 1.10.2-rc1
> > >
> > > If this vote passes, a gpg-signed tag will be created using:
> > > git tag -f -s -m 'Apache Accumulo 1.10.2' rel/1.10.2 \
> > > db2baf1706c0721e25438d5329ef1bba5159c24d
> > >
> > > Staging repo:
> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1092
> > > Source (official release artifact):
> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1092/org/apache/accumulo/accumulo/1.10.2/accumulo-1.10.2-src.tar.gz
> > > Binary:
> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1092/org/apache/accumulo/accumulo/1.10.2/accumulo-1.10.2-bin.tar.gz
> > >
> > > Append ".asc" to download the cryptographic signature for a given
> > > artifact. (You can also append ".sha1" or ".md5" instead in order to
> > > verify the checksums generated by Maven to verify the integrity of the
> > > Nexus repository staging area.)
> > >
> > > Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
> > > (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
> > >
> > > In addition to the tarballs and their signatures, the following checksum
> > > files will be added to the dist/release SVN area after release:
> > > accumulo-1.10.2-src.tar.gz.sha512 will contain:
> > > SHA512 (accumulo-1.10.2-src.tar.gz) =
> > > d9963856a49f43e37d9b64fce440497c7b3621ba8cf2c6f56e4ce1e061edde11882905e99a21555c83f5c9c82d14e4323dd1246a344a8b9af55da863dfc55c1d
> > > accumulo-1.10.2-bin.tar.gz.sha512 will contain:
> > > SHA512 (accumulo-1.10.2-bin.tar.gz) =
> > > 9b6e9286133120588f4c682e3fadee33704cf5983ef7e3c0be84c1678bd9bcf283a2c6efa6d4074c73f30032f5c0aa1073c15ec39f90d215505e028a1cf0a739
> > >
> > > Release notes (in progress) can be found at:
> > > https://accumulo.staged.apache.org/release/accumulo-1.10.2
> > >
> > > Release testing instructions:
> > > https://accumulo.apache.org/contributor/verifying-release
> > >
> > > Please vote one of:
> > > [ ] +1 - I have verified and accept...
> > > [ ] +0 - I have reservations, but not strong enough to vote against...
> > > [ ] -1 - Because..., I do not accept...
> > > ... these artifacts as the 1.10.2 release of Apache Accumulo.
> > >
> > > This vote will remain open until at least Fri Feb 11 03:30:00 PM UTC 2022.
> > > (Fri Feb 11 10:30:00 AM EST 2022 / Fri Feb 11 07:30:00 AM PST 2022)
> > > Voting can continue after this deadline until the release manager
> > > sends an email ending the vote.
> > >
> > > Thanks!
> > >
> > > P.S. Hint: download the whole staging repo with
> > > wget -erobots=off -r -l inf -np -nH \
> > > https://repository.apache.org/content/repositories/orgapacheaccumulo-1092/
> > > # note the trailing slash is needed
> >
>

[VOTE] Apache Accumulo 1.10.2-rc1

2022-02-08 Thread Christopher

Accumulo Developers,

Please consider the following candidate for Apache Accumulo 1.10.2.

Git Commit:
db2baf1706c0721e25438d5329ef1bba5159c24d
Branch:
1.10.2-rc1

If this vote passes, a gpg-signed tag will be created using:
git tag -f -s -m 'Apache Accumulo 1.10.2' rel/1.10.2 \
db2baf1706c0721e25438d5329ef1bba5159c24d

Staging repo:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1092
Source (official release artifact):
https://repository.apache.org/content/repositories/orgapacheaccumulo-1092/org/apache/accumulo/accumulo/1.10.2/accumulo-1.10.2-src.tar.gz
Binary:
https://repository.apache.org/content/repositories/orgapacheaccumulo-1092/org/apache/accumulo/accumulo/1.10.2/accumulo-1.10.2-bin.tar.gz

Append ".asc" to download the cryptographic signature for a given artifact.
(You can also append ".sha1" or ".md5" instead in order to verify the
checksums generated by Maven to verify the integrity of the Nexus repository
staging area.)

Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
(Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)

In addition to the tarballs and their signatures, the following checksum
files will be added to the dist/release SVN area after release:
accumulo-1.10.2-src.tar.gz.sha512 will contain:
SHA512 (accumulo-1.10.2-src.tar.gz) =
d9963856a49f43e37d9b64fce440497c7b3621ba8cf2c6f56e4ce1e061edde11882905e99a21555c83f5c9c82d14e4323dd1246a344a8b9af55da863dfc55c1d
accumulo-1.10.2-bin.tar.gz.sha512 will contain:
SHA512 (accumulo-1.10.2-bin.tar.gz) =
9b6e9286133120588f4c682e3fadee33704cf5983ef7e3c0be84c1678bd9bcf283a2c6efa6d4074c73f30032f5c0aa1073c15ec39f90d215505e028a1cf0a739

Release notes (in progress) can be found at:
https://accumulo.staged.apache.org/release/accumulo-1.10.2

Release testing instructions:
https://accumulo.apache.org/contributor/verifying-release

Please vote one of:
[ ] +1 - I have verified and accept...
[ ] +0 - I have reservations, but not strong enough to vote against...
[ ] -1 - Because..., I do not accept...
... these artifacts as the 1.10.2 release of Apache Accumulo.

This vote will remain open until at least Fri Feb 11 03:30:00 PM UTC 2022.
(Fri Feb 11 10:30:00 AM EST 2022 / Fri Feb 11 07:30:00 AM PST 2022)
Voting can continue after this deadline until the release manager
sends an email ending the vote.

Thanks!

P.S. Hint: download the whole staging repo with
wget -erobots=off -r -l inf -np -nH \
https://repository.apache.org/content/repositories/orgapacheaccumulo-1092/
# note the trailing slash is needed
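
The SHA-512 comparison described in the vote email above can be sketched as a
small shell helper. This is only an illustration, assuming GNU coreutils'
`sha512sum` is installed; the function name `verify_sha512` is hypothetical
and not part of any Accumulo tooling. It compares a downloaded tarball's
digest against the hex string published in the vote email.

```shell
# Hypothetical helper (not part of Accumulo): compare a file's SHA-512
# digest against an expected hex string, ignoring letter case.
verify_sha512() {
  # $1 = file to check, $2 = expected hex digest
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  # sha512sum prints "<hex digest>  <file name>"; keep only the digest.
  actual=$(sha512sum "$1" | cut -d' ' -f1)
  if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
    echo "OK: $1"
  else
    echo "MISMATCH: $1" >&2
    return 1
  fi
}
```

It would be called as `verify_sha512 accumulo-1.10.2-src.tar.gz <digest>`,
with the digest copied from the vote email. The GPG signature is a separate
check, typically `gpg --verify <file>.asc <file>` after importing the KEYS
file.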

Re: [DISCUSS] 1.10.2 release with reload4j

2022-02-04 Thread Christopher

If those changes are tested and ready to go for 1.10, I don't have a
problem including them. However, from what I understand, they need some
additional testing/polishing. I wouldn't want that to hold up this release.
We could easily include them in the next (1.10.3) if they are ready after
this though.

On Fri, Feb 4, 2022 at 12:36 PM Michael Wall  wrote:

> I have some GC fixes I am working on that I would like to get into the
> 1.10.2 release.
>
> https://github.com/apache/accumulo/issues/1377
> https://github.com/apache/accumulo/issues/2322
>
> On Thu, Feb 3, 2022 at 4:25 PM Christopher  wrote:
> >
> > That PR adds a new feature that is currently forward-incompatible in
> > behavior with 1.10.0 and 1.10.1. New features are supposed to go into the
> > next release, not patched into a bugfix on a release line that is intended
> > to be stable long term.
> >
> > While our semver and LTM guidelines are just guidelines, and we can break
> > them when we want/need to, I think the project is better served if that
> > were rare. Every time we stretch/break those guidelines, we normalize
> > violating them, and the resulting reduced confidence in our software's
> > stability can create a feedback loop where the instability creates upgrade
> > aversion, and the upgrade aversion increases the demand for backporting
> > features. That's not sustainable, and it creates an unnecessary burden on
> > the development side of things. I think having boundaries that resist
> > backporting features to stable branches creates a healthier
> > relationship between the devs and the users.
> >
> > At this point, I'm a "soft" (not yet a veto) -1 to including that in 1.10.
> > I could be convinced if it were A) 100% forward compatible with
> > 1.10.0/1.10.1 *and* either B) there was greater consensus for it among the
> > PMC or C) a good argument was made to justify adding the feature to a patch
> > release [A&(B|C)].
> >
> > As for the schedule, I was thinking about creating a release candidate on
> > Monday if there weren't any issues.
> >
> >
> > On Thu, Feb 3, 2022 at 2:56 PM Dave Marion  wrote:
> >
> > > I'd like to try and include https://github.com/apache/accumulo/pull/2221.
> > > A little more testing needs to be done, do you have a schedule for the
> > > 1.10.2 release?
> > >
> > > On Thu, Feb 3, 2022 at 1:55 PM Christopher wrote:
> > >
> > > > I'm interested in putting together a 1.10.2 release with the changes in
> > > > https://github.com/apache/accumulo/pull/2458 so that the 1.10 line no
> > > > longer requires log4j1, which has several vulnerabilities. Reload4j was
> > > > created by log4j1's original author as a fork of Apache log4j1, in order
> > > > to provide a transition away from the CVE-riddled log4j1 jars.
> > > >
> > > > I'm sure we have a couple of other small bugfixes and improvements in
> > > > 1.10 that could benefit from being released as well.
> > > >
> > > > If there are any objections or last-minute tweaks that should be
> > > > included in 1.10.2, please discuss here.
> > > >
> > > > Thanks,
> > > > Christopher
> > > >
> > >
>

Re: [DISCUSS] 1.10.2 release with reload4j

2022-02-03 Thread Christopher

That PR adds a new feature that is currently forward-incompatible in
behavior with 1.10.0 and 1.10.1. New features are supposed to go into the
next release, not patched into a bugfix on a release line that is intended
to be stable long term.

While our semver and LTM guidelines are just guidelines, and we can break
them when we want/need to, I think the project is better served if that
were rare. Every time we stretch/break those guidelines, we normalize
violating them, and the resulting reduced confidence in our software's
stability can create a feedback loop where the instability creates upgrade
aversion, and the upgrade aversion increases the demand for backporting
features. That's not sustainable, and it creates an unnecessary burden on
the development side of things. I think having boundaries that resist
backporting features to stable branches creates a healthier
relationship between the devs and the users.

At this point, I'm a "soft" (not yet a veto) -1 to including that in 1.10.
I could be convinced if it were A) 100% forward compatible with
1.10.0/1.10.1 *and* either B) there was greater consensus for it among the
PMC or C) a good argument was made to justify adding the feature to a patch
release [A&(B|C)].

As for the schedule, I was thinking about creating a release candidate on
Monday if there weren't any issues.

On Thu, Feb 3, 2022 at 2:56 PM Dave Marion  wrote:

> I'd like to try and include https://github.com/apache/accumulo/pull/2221.
> A little more testing needs to be done, do you have a schedule for the
> 1.10.2 release?
>
> On Thu, Feb 3, 2022 at 1:55 PM Christopher  wrote:
>
> > I'm interested in putting together a 1.10.2 release with the changes in
> > https://github.com/apache/accumulo/pull/2458 so that the 1.10 line no
> > longer requires log4j1, which has several vulnerabilities. Reload4j was
> > created by log4j1's original author as a fork of Apache log4j1, in order
> > to provide a transition away from the CVE-riddled log4j1 jars.
> >
> > I'm sure we have a couple of other small bugfixes and improvements in
> > 1.10 that could benefit from being released as well.
> >
> > If there are any objections or last-minute tweaks that should be included
> > in 1.10.2, please discuss here.
> >
> > Thanks,
> > Christopher
> >
>

[DISCUSS] 1.10.2 release with reload4j

2022-02-03 Thread Christopher

I'm interested in putting together a 1.10.2 release with the changes in
https://github.com/apache/accumulo/pull/2458 so that the 1.10 line no
longer requires log4j1, which has several vulnerabilities. Reload4j was
created by log4j1's original author as a fork of Apache log4j1, in order
to provide a transition away from the CVE-riddled log4j1 jars.

I'm sure we have a couple of other small bugfixes and improvements in 1.10
that could benefit from being released as well.

If there are any objections or last-minute tweaks that should be included
in 1.10.2, please discuss here.

Thanks,
Christopher

Re: specify hostname to bind to for accumulo servers

2022-01-25 Thread Christopher

You can specify the bind address on the command line when starting Accumulo
services:

bin/accumulo tserver -a 127.0.0.1
bin/accumulo tserver --address 127.0.0.1

I believe this will also work if you specify a hostname, but the IP address
that it will bind to will be the IP address of whatever the hostname
resolves to locally. There is not a way to specify the local bind IP
address and the advertised address in the cluster separately, however. So,
you should ensure that whatever name service you use (typically DNS/rDNS)
is configured so that any address resolved locally is consistent with how
that server's address resolves elsewhere on the same cluster.
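
One way to see what a hostname would resolve to locally, before passing it to
`-a`/`--address`, is sketched below. It assumes a Linux system with glibc's
`getent` available; the helper name `resolve_local` is made up for
illustration and is not an Accumulo command.

```shell
# Hypothetical helper: print the first address the local name service
# (as configured in /etc/nsswitch.conf) returns for a hostname.
# Prints nothing if the name does not resolve.
resolve_local() {
  # getent prints "<address> <name> [aliases...]"; keep the address column.
  getent hosts "$1" | awk 'NR == 1 { print $1 }'
}
```

Running `resolve_local "$(hostname -f)"` on each node and comparing the
result with how other nodes resolve that same name is a quick consistency
check. Note that `getent hosts` may return an IPv6 address on dual-stack
systems.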

On Mon, Jan 24, 2022 at 10:09 AM Vincent Russell wrote:

> Hello,
>
> Is there any way to specify what hostname an accumulo server should bind to
> when it starts up?
>
> For instance with  hadoop you can specify: dfs.namenode.rpc-address
> or dfs.datanode.http.address?
>
> We have some servers with multiple interfaces and this is causing issues.
>
> Thank you,
> Vincent
>

JIRA roles/permissions

2021-12-30 Thread Christopher

Hi Accumulo Devs,

I spent a few minutes today updating the roles in our JIRA project.

Although we don't use JIRA anymore for new issues, there are still a
few old tickets that remain, that are slowly being triaged and closed
as work continues. Some of our new committers didn't have access to
close some of those JIRA issues, because we had been manually
maintaining a list of users with access as we voted people in and
asked for access. However, now that JIRA uses LDAP, it's much easier
to maintain access to JIRA without manual effort. I set this up by
assigning JIRA roles based on their corresponding LDAP group
memberships. So, as long as you sign in to JIRA using your Apache user
ID, you should have the correct accesses.

The LDAP groups and their corresponding roles in the ACCUMULO JIRA
project are now:

accumulo -> Committers
accumulo-pmc -> PMC

Since all our PMC members are also committers, we all have the same
roles/permissions.
In addition, the ASF-maintained bots have the Contributors role.

I removed all other individual entries that had been added manually,
as they aren't needed anymore.

Regards,
Christopher

Re: Accumulo 2.0.1 init with hdfs running with SSL

2021-12-14 Thread Christopher

I have not personally tested HDFS configured for SSL/TLS, but `new
Configuration()` will load the core-default.xml and core-site.xml
files it finds on the class path. So, it looks like it should work.
Have you tried it? Did you get an error?
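
To check which `core-site.xml` would be found first, one can scan the class
path entries in order. The sketch below assumes the class path is available
as a colon-separated string (how you obtain it depends on your install) and
only inspects plain directory entries, not jars or wildcard entries;
`find_on_classpath` is a hypothetical helper name.

```shell
# Hypothetical helper: print the first directory entry of a colon-separated
# class path that contains the named resource (jar entries are not opened).
find_on_classpath() {
  # $1 = resource name (e.g. core-site.xml), $2 = class path string
  resource="$1"
  # Split the class path on ':' into one entry per line, preserving order.
  printf '%s\n' "$2" | tr ':' '\n' | while IFS= read -r entry; do
    if [ -d "$entry" ] && [ -f "$entry/$resource" ]; then
      printf '%s\n' "$entry/$resource"
      break
    fi
  done
}
```

For example, `find_on_classpath core-site.xml "$CLASSPATH"` prints the path
of the first match, or nothing if no directory entry contains the file.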


On Tue, Dec 14, 2021 at 1:54 PM Vincent Russell
 wrote:
>
> Thank you Mike,
>
> but it appears that accumulo uses those settings to connect to accumulo, but
> not to connect to hdfs.
>
> For instance the VolumeManagerImpl just does this:
>
> VolumeConfiguration.create(new Path(volumeUriOrDir), hadoopConf));
>
> where the hadoopConf is just instantiated in the Initialize class:
>
> Configuration hadoopConfig = new Configuration();
> VolumeManager fs = VolumeManagerImpl.get(siteConfig, hadoopConfig);
>
> Thanks,
> Vincent
>
> On Tue, Dec 14, 2021 at 12:18 PM Mike Miller  wrote:
>
> > Checkout the accumulo client properties that start with the "ssl" prefix.
> > https://accumulo.apache.org/docs/2.x/configuration/client-properties
> > This blog post from a few years ago may help:
> >
> > https://accumulo.apache.org/blog/2014/09/02/generating-keystores-for-configuring-accumulo-with-ssl.html
> >
> > On Tue, Dec 14, 2021 at 9:58 AM Vincent Russell wrote:
> >
> > > Hello,
> > >
> > > I am trying to init a test accumulo instance with an hdfs running with
> > > SSL. Is this possible? I am looking at the code and it doesn't look
> > > like this is possible.
> > >
> > > The Initialize class just instantiates a Hadoop config and passes that
> > > into the VolumeManager without sending over any hadoop configs from the
> > > core.xml file.
> > >
> > > Am I missing something?
> > >
> > > Thanks in advance for your help,
> > > Vincent
> > >
> >

Re: Consistent IT tests failures on Linux ARM64

2021-12-02 Thread Christopher

org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >>   at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >>   at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >>   at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >>   at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >>   at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >>   at org.apache.accumulo.server.ServerInfo.getPrincipal(ServerInfo.java:148)
> >>   at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:169)
> >>   at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> >>   at org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> >>   at org.apache.accumulo.server.metadata.ServerAmpleImpl.getGcCandidates(ServerAmpleImpl.java:180)
> >>   at org.apache.accumulo.gc.SimpleGarbageCollector$GCEnv.getCandidates(SimpleGarbageCollector.java:199)
> >>   at org.apache.accumulo.gc.GarbageCollectionAlgorithm.collect(GarbageCollectionAlgorithm.java:302)
> >>   at org.apache.accumulo.gc.SimpleGarbageCollector.run(SimpleGarbageCollector.java:502)
> >>   at io.opentelemetry.context.Context.lambda$wrap$1(Context.java:207)
> >>   at io.opentelemetry.context.Context$$Lambda$209/0x000100357840.run(Unknown Source)
> >>   at java.lang.Thread.run(java.base@11.0.11/Thread.java:829)
> >>
> >>
> >> "gc" #31 prio=5 os_prio=0 cpu=15982.95ms elapsed=218.59s tid=0x28295800 nid=0x32dac5 runnable  [0x3a5fb000]
> >>    java.lang.Thread.State: RUNNABLE
> >>   at java.util.Arrays.hashCode(java.base@11.0.11/Arrays.java:4685)
> >>   at java.util.Objects.hash(java.base@11.0.11/Objects.java:146)
> >>   at java.security.Provider$ServiceKey.hashCode(java.base@11.0.11/Provider.java:1107)
> >>   at java.util.concurrent.ConcurrentHashMap.get(java.base@11.0.11/ConcurrentHashMap.java:936)
> >>   at java.security.Provider.getService(java.base@11.0.11/Provider.java:1282)
> >>   at sun.security.jca.ProviderList.getService(java.base@11.0.11/ProviderList.java:380)
> >>   at sun.security.jca.GetInstance.getInstance(java.base@11.0.11/GetInstance.java:157)
> >>   at java.security.Security.getImpl(java.base@11.0.11/Security.java:700)
> >>   at java.security.MessageDigest.getInstance(java.base@11.0.11/MessageDigest.java:178)
> >>   at org.apache.commons.codec.digest.DigestUtils.getDigest(DigestUtils.java:170)
> >>   at org.apache.commons.codec.digest.Sha2Crypt.sha2Crypt(Sha2Crypt.java:395)
> >>   at org.apache.commons.codec.digest.Sha2Crypt.sha512Crypt(Sha2Crypt.java:585)
> >>   at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:78)
> >>   at org.apache.commons.codec.digest.Crypt.crypt(Crypt.java:167)
> >>   at org.apache.accumulo.server.security.SystemCredentials$SystemToken.hashInstanceConfigs(SystemCredentials.java:120)
> >>   at org.apache.accumulo.server.security.SystemCredentials$SystemToken.generate(SystemCredentials.java:125)
> >>   at org.apache.accumulo.server.security.SystemCredentials.get(SystemCredentials.java:66)
> >>   at org.apache.accumulo.server.ServerInfo.getCredentials(ServerInfo.java:179)
> >>   at org.apache.accumulo.server.ServerInfo.getAuthenticationToken(ServerInfo.java:153)
> >>   at org.apache.accumulo.server.ServerInfo.getProperties(ServerInfo.java:168)
> >>   at org.apache.accumulo.core.clientImpl.ClientContext.getProperties(ClientContext.java:236)
> >>   at org.apache.accumulo.core.clientImpl.ClientContext.createScanner(ClientContext.java:635)
> >>   at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.buildNonRoot(TabletsMetadata.java:177)
> >>   at org.apache.accumulo.core.metadata.schema.TabletsMetadata$Builder.build(TabletsMetadata.java:125)
> >>   at org.apache.accumulo.gc.

Re: Consistent IT tests failures on Linux ARM64

2021-11-30 Thread Christopher

It looks like the tests are timing out. This happens frequently when
running on resource-constrained systems. You can give the test more
time by increasing the timeout factor: `mvn clean verify
-Dcheckstyle.skip -Dspotbugs.skip -Dit.test=ConcurrentDeleteTableIT
-Dtimeout.factor=3`
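
To illustrate the effect of the timeout factor, here is a minimal Java
sketch. It assumes (this is not confirmed in this thread) that the factor
simply multiplies a test's base timeout. The class and method names are
hypothetical and are not Accumulo's actual test code; only the 420-second
figure and the `timeout.factor` property name come from this thread.

```java
import java.time.Duration;

// Hypothetical helper for illustration; not part of Accumulo's test harness.
final class TimeoutFactorSketch {

  // Assumption: the effective timeout is the base timeout times the factor.
  static Duration scaledTimeout(Duration baseTimeout, int timeoutFactor) {
    if (timeoutFactor < 1) {
      throw new IllegalArgumentException("timeout factor must be >= 1");
    }
    return baseTimeout.multipliedBy(timeoutFactor);
  }

  public static void main(String[] args) {
    // 420 seconds is the timeout reported in the failure output in this thread.
    Duration base = Duration.ofSeconds(420);
    // Reads -Dtimeout.factor=N if it was passed to the JVM, defaulting to 1.
    int factor = Integer.getInteger("timeout.factor", 1);
    System.out.println(
        "Effective timeout: " + scaledTimeout(base, factor).getSeconds() + " s");
  }
}
```

Under that assumption, a factor of 3 would give the failing test 1260
seconds instead of 420 before it is reported as timed out.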

There's nothing we know of that would change the way our tests work
due to ARM64, but you may have issues because of limited RAM, slow CPU
speeds, slow disk I/O, busy background processes, or other
resource-related issues. I don't think most of the currently active
developers use ARM64, or have access to a test machine to reproduce or
experiment with Accumulo there, so you may have to do some of your own
troubleshooting. If you can rule out resource-constraint issues, and
it isn't already a known flaky test (ConcurrentDeleteTableIT is known
flaky and sometimes times out on x86_64 as well), you could create a
bug ticket with more details at
https://github.com/apache/accumulo/issues ; there is an issue template
specifically for broken and/or flaky tests that you can select when
creating a new ticket.

On Tue, Nov 30, 2021 at 9:34 AM Mark Jens  wrote:
>
> Hi dev1,
>
> On Tue, 30 Nov 2021 at 16:21, dev1  wrote:
>
> > Some of those tests are trying to stress conditions that require a lot of
> > resources to replicate specific conditions. Have you tried to run those
> > individual tests in isolation so that you are not competing for resources?
> > Do they always fail, or are the failures transient?
> >
>
> Q: Have you tried to run those individual tests in isolation so that you
> are not competing for resources?
> A: This is what I meant by the following:
> -
> The tests fail even when executed separately, e.g.:
> mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
> -
>
> Q: Do they always fail, or are the failures transient?
> A: I also tried to explain that with "These tests fail consistently at
> every build attempt!"
>
> Mark
>
> >
> > -Original Message-
> > From: Mark Jens 
> > Sent: Tuesday, November 30, 2021 4:05 AM
> > To: dev@accumulo.apache.org
> > Subject: Consistent IT tests failures on Linux ARM64
> >
> > Hello Accumulo community,
> >
> > At my job we are considering Linux ARM64 servers, and I've been tasked
> > with testing Accumulo.
> >
> > I face some timeout-related issues with several IT tests:
> >
> >
> > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete  Time elapsed: 420.122 s  <<< ERROR!
> > org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
> >     at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
> >     at java.base@11.0.11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> >     at java.base@11.0.11/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
> >     at java.base@11.0.11/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> >     at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
> >     at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >     at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >     at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
> >     at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> >     at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >     at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> >     at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >     at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> >     at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> >     at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> >     at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> >     at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >     at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)
> >
> > [ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete  Time elapsed: 420.122 s  <<< ERROR!
> > java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:44251)
> >     at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
> >     at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
> >     at java.base@11.0.11
> >

Re: Accumulo 2.0.1 and zookeeper 3.5+

2021-10-29 Thread Christopher

I'm not familiar with how to set up SSL/TLS for ZooKeeper, but if you
want to use ZooKeeper 3.5 with Accumulo 2.0.1, it should still be
possible. You may need to modify some scripts and config files to
ensure your classpath is set up correctly, since the ZK directory
names may have changed from what our scripts assume. You may also need
to change some ZK configuration to ensure it works the same as it did
in 3.4; I would check with the ZK project for a migration guide.

On Fri, Oct 29, 2021 at 3:41 PM Vincent Russell
 wrote:
>
> Hello,
>
> I'm interested in setting up an accumulo cluster with ssl set up with
> zookeeper, hadoop and accumulo.
>
> It looks like accumulo 2.0.1 is set up to work with zookeeper 3.4.14, but
> it looks like I would need zookeeper 3.5+ in order to get this to work with
> zookeeper at least.  Is that so or am I missing something?
>
> Thanks,
> Vincent

Re: 1.10 <-> 2.0 shim

2021-10-25 Thread Christopher

Thanks Dave. It'd be really great if that plugin allowed us to filter
out removals of things previously deprecated. Its options seem limited
in that way.

On Fri, Oct 22, 2021 at 8:03 AM Dave Marion  wrote:
>
> I created a simple maven pom file that when run with *mvn clean package* will
> generate a report that shows the differences in the public API between
> 2.0.0 and 1.10.0.
>
> https://gist.github.com/dlmarion/b1063c334d519f637cc78d81ba9e15ef
>
> On Wed, Oct 20, 2021 at 6:20 PM Jeremy Kepner  wrote:
>
> > Seems like there should be a document that is kept whereby every time a
> > breaking change is made it gets documented at the time it is committed.
> >
> > On Wed, Oct 20, 2021 at 04:55:42PM -0400, Christopher wrote:
> > > If somebody were to volunteer to create such a document, they could do so
> > > from some of the many 3rd party java API comparison tools. I'm not sure
> > > which tool would work best for this purpose, though.
> > >
> > > If anybody does this, let us know which one worked best for you. We could
> > > also amend the release notes with whatever you find. That could be
> > useful.
> > >
> > > On Wed, Oct 20, 2021, 15:14 Jeremy Kepner  wrote:
> > >
> > > > There should be a document that clearly states 1.10 functions will not
> > > > work in 2.0
> > > > so folks can grep their code to check.  Otherwise you have to install
> > 2.0
> > > > and then just
> > > > work through the errors one-by-one.
> > > >
> > > > On Tue, Oct 19, 2021 at 11:17:09AM -0400, Christopher wrote:
> > > > > The best reference is the release notes:
> > > > > https://accumulo.apache.org/release/accumulo-2.0.0/
> > > > >
> > > > > On Tue, Oct 19, 2021, 09:15 Jeremy Kepner  wrote:
> > > > >
> > > > > > > Is there a list of things in 1.10 that will no longer work in 2.0?
> > > > > >
> > > > > > On Tue, Oct 19, 2021 at 08:59:58AM -0400, Christopher wrote:
> > > > > > > Hi Vincent,
> > > > > > >
> > > > > > > To supplement what Mike said, it's possible some stuff that was
> > > > > > > deprecated in 1.10 was dropped in 2.0. I don't have a
> > comprehensive
> > > > > > > list of what that might include, but anything marked as
> > deprecated in
> > > > > > > 1.10 is subject to removal in 2.0. If I recall, we did try to
> > limit
> > > > it
> > > > > > > somewhat. It wouldn't really make sense to create a shim to
> > restore
> > > > > > > those APIs, though, because that would just reintroduce code we
> > > > > > > explicitly dropped, which defeats the purpose of a major version
> > > > bump.
> > > > > > > In semantic versioning, the entire point of a major version bump
> > is
> > > > to
> > > > > > > declare a break in the backwards compatibility of the public API.
> > > > > > >
> > > > > > > If you need the code that was dropped, you probably aren't ready
> > to
> > > > > > > move to 2.x. 1.10 is an LTM release, so that means we intend to
> > keep
> > > > > > > patching important bugs until a year after our next LTM (which
> > hasn't
> > > > > > > yet been released). So, if you need to stay on 1.10, you have
> > plenty
> > > > > > > of time to update your code to stop using deprecated APIs and
> > avoid
> > > > > > > non-public APIs.
> > > > > > >
> > > > > > > On Tue, Oct 19, 2021 at 8:10 AM Mike Miller 
> > > > wrote:
> > > > > > > >
> > > > > > > > If the library was written using only the public API then it
> > > > shouldn't
> > > > > > be a
> > > > > > > > problem. See https://accumulo.apache.org/api/
> > > > > > > > Accumulo follows SemVer to maintain compatibility of the
> > public API
> > > > > > between
> > > > > > > > versions. There are a lot of changes between 1.10 and 2.0 but
> > > > anything
> > > > > > in
> > > > > > > > the public API in 1.10 should still exist in 2.0, even if
> > > > deprecated.
> > > > > > > > If the library is calling internal methods or extending
> > internal
> > > > > > classes,
> > > > > > > > then that is a different story. If it uses internals then I
> > > > recommend
> > > > > > > > refactoring to use the public API if possible.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Oct 18, 2021 at 3:38 PM Vincent Russell <
> > > > > > vincent.russ...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello,
> > > > > > > > >
> > > > > > > > > I am interested in using an accumulo query and storage
> > library
> > > > that
> > > > > > was
> > > > > > > > > written against accumulo version 1.10 and I am interested in
> > > > using
> > > > > > it with
> > > > > > > > > accumulo 2.0.
> > > > > > > > >
> > > > > > > > > Is there a shim that exists that would allow the library to
> > be
> > > > used
> > > > > > for
> > > > > > > > > both versions that could be activated at compile time via a
> > maven
> > > > > > profile
> > > > > > > > > or something?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Vincent
> > > > > > > > >
> > > > > >
> > > >
> >

Re: 1.10 <-> 2.0 shim

2021-10-21 Thread Christopher

To be clear, we don't guarantee any kind of binary compatibility across
major versions (I'm not sure it's reasonable for us to try to offer binary
compatibility guarantees across minor versions either). It is expected,
therefore, that users will recompile their code to move across major
versions, if not also minor versions (bugfix/patch releases should
definitely be drop-in compatible, though).
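
The expectations above can be summarized per release type. The following is
a minimal Java sketch, not anything shipped with Accumulo: the class name,
method names, and the wording of each expectation are hypothetical, and it
only handles plain MAJOR.MINOR.PATCH version strings.

```java
// Hypothetical helper for illustration; not part of Accumulo.
final class ReleaseBumpSketch {

  enum Bump { MAJOR, MINOR, PATCH, NONE }

  // Parses a plain MAJOR.MINOR.PATCH string such as "1.10.0".
  static int[] parse(String version) {
    String[] parts = version.split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("expected MAJOR.MINOR.PATCH: " + version);
    }
    int[] numbers = new int[3];
    for (int i = 0; i < 3; i++) {
      numbers[i] = Integer.parseInt(parts[i]);
    }
    return numbers;
  }

  // Returns the most significant version component that differs.
  static Bump classify(String from, String to) {
    int[] a = parse(from);
    int[] b = parse(to);
    if (a[0] != b[0]) {
      return Bump.MAJOR;
    }
    if (a[1] != b[1]) {
      return Bump.MINOR;
    }
    if (a[2] != b[2]) {
      return Bump.PATCH;
    }
    return Bump.NONE;
  }

  // Paraphrases the expectations described in this message.
  static String expectation(Bump bump) {
    switch (bump) {
      case MAJOR:
        return "recompile and expect some migration effort";
      case MINOR:
        return "recompiling is advisable; binary compatibility is not promised";
      case PATCH:
        return "should be a drop-in replacement";
      default:
        return "same version";
    }
  }

  public static void main(String[] args) {
    Bump bump = classify("1.10.0", "2.0.0");
    System.out.println(bump + ": " + expectation(bump));
  }
}
```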

I mention the lack of a binary compatibility guarantee for major versions
mainly to point out that the process you described, of working through
errors one by one, is something downstream developers should do in their
IDE or CI environment, rather than being surprised by those errors at
runtime. Dealing with errors during CI isn't a fun experience, but it's not
as bad as being surprised by them at runtime. Migration across major
versions is not expected to be effortless. The point of bumping the major
version is to signal that some effort might fall on downstream users.

That said, I agree there's room for improvement in documenting the details
of compatibility-related changes to the API. I don't think documenting them
as we go, as you suggest, is sufficient, though. We more-or-less tried to
do that with the 2.0 release notes, and did, in fact, mention breaking
changes there. However, I think you're suggesting a more dedicated section
in the notes, perhaps in a tabular format, rather than mixed in with the
rest of the notes. To ensure that we don't miss things that broke
unintentionally during the development process, and to catch changes where
the docs weren't updated as we went, we'd need to run some sort of
comparison tool, as previously suggested, even if we had been tracking
things during development.

There are two ways a comparison tool could be useful if somebody were to
find a good one and leverage it to contribute to the project:

1. As a code quality measure to catch breakages that would block a release
(or trigger a major version bump for the release). A good time to do this
would be during the release candidate voting period, and we encourage
anybody interested in specific measures of code quality to participate in
that voting process. This is one of many kinds of checks somebody could do
to get involved to ensure the project meets their expectations.
Contributors with non-binding votes can still provide useful feedback here
to affect binding votes and improve the quality of the release.

2. As a communication aid, to generate reports to include in the release
notes to inform users of changes. After every release vote, we have a
period of several days where we are finalizing the release notes
collaboratively on the website, before we announce the release. These
release notes can also be updated after a release. This is a collaborative
process, and another way somebody could get involved in the project. If a
contributor feels the release notes can be better, by including such
things, contributing in this area would be a great way for them to get
involved.

On Wed, Oct 20, 2021, 18:20 Jeremy Kepner  wrote:

> Seems like there should be a document that is kept whereby every time a
> breaking change is made it gets documented at the time it is committed.
>
> On Wed, Oct 20, 2021 at 04:55:42PM -0400, Christopher wrote:
> > If somebody were to volunteer to create such a document, they could do so
> > from some of the many 3rd party java API comparison tools. I'm not sure
> > which tool would work best for this purpose, though.
> >
> > If anybody does this, let us know which one worked best for you. We could
> > also amend the release notes with whatever you find. That could be
> useful.
> >
> > On Wed, Oct 20, 2021, 15:14 Jeremy Kepner  wrote:
> >
> > > There should be a document that clearly states 1.10 functions will not
> > > work in 2.0
> > > so folks can grep their code to check.  Otherwise you have to install
> 2.0
> > > and then just
> > > work through the errors one-by-one.
> > >
> > > On Tue, Oct 19, 2021 at 11:17:09AM -0400, Christopher wrote:
> > > > The best reference is the release notes:
> > > > https://accumulo.apache.org/release/accumulo-2.0.0/
> > > >
> > > > On Tue, Oct 19, 2021, 09:15 Jeremy Kepner  wrote:
> > > >
> > > > > Is there a list of things in 1.10 that will no longer work in 2.0?
> > > > >
> > > > > On Tue, Oct 19, 2021 at 08:59:58AM -0400, Christopher wrote:
> > > > > > Hi Vincent,
> > > > > >
> > > > > > To supplement what Mike said, it's possible some stuff that was
> > > > > > deprecated in 1.10 was dropped in 2.0. I don't have a
> comprehensive
> > > > > > list of what that might include, but anything marked as
> deprecated in
> > > > >

Re: 1.10 <-> 2.0 shim

2021-10-20 Thread Christopher

If somebody were to volunteer to create such a document, they could do so
from some of the many 3rd party java API comparison tools. I'm not sure
which tool would work best for this purpose, though.

If anybody does this, let us know which one worked best for you. We could
also amend the release notes with whatever you find. That could be useful.

On Wed, Oct 20, 2021, 15:14 Jeremy Kepner  wrote:

> There should be a document that clearly states 1.10 functions will not
> work in 2.0
> so folks can grep their code to check.  Otherwise you have to install 2.0
> and then just
> work through the errors one-by-one.
>
> On Tue, Oct 19, 2021 at 11:17:09AM -0400, Christopher wrote:
> > The best reference is the release notes:
> > https://accumulo.apache.org/release/accumulo-2.0.0/
> >
> > On Tue, Oct 19, 2021, 09:15 Jeremy Kepner  wrote:
> >
> > > Is there a list of things in 1.10 that will no longer work in 2.0?
> > >
> > > On Tue, Oct 19, 2021 at 08:59:58AM -0400, Christopher wrote:
> > > > Hi Vincent,
> > > >
> > > > To supplement what Mike said, it's possible some stuff that was
> > > > deprecated in 1.10 was dropped in 2.0. I don't have a comprehensive
> > > > list of what that might include, but anything marked as deprecated in
> > > > 1.10 is subject to removal in 2.0. If I recall, we did try to limit
> it
> > > > somewhat. It wouldn't really make sense to create a shim to restore
> > > > those APIs, though, because that would just reintroduce code we
> > > > explicitly dropped, which defeats the purpose of a major version
> bump.
> > > > In semantic versioning, the entire point of a major version bump is
> to
> > > > declare a break in the backwards compatibility of the public API.
> > > >
> > > > If you need the code that was dropped, you probably aren't ready to
> > > > move to 2.x. 1.10 is an LTM release, so that means we intend to keep
> > > > patching important bugs until a year after our next LTM (which hasn't
> > > > yet been released). So, if you need to stay on 1.10, you have plenty
> > > > of time to update your code to stop using deprecated APIs and avoid
> > > > non-public APIs.
> > > >
> > > > On Tue, Oct 19, 2021 at 8:10 AM Mike Miller 
> wrote:
> > > > >
> > > > > If the library was written using only the public API then it
> shouldn't
> > > be a
> > > > > problem. See https://accumulo.apache.org/api/
> > > > > Accumulo follows SemVer to maintain compatibility of the public API
> > > between
> > > > > versions. There are a lot of changes between 1.10 and 2.0 but
> anything
> > > in
> > > > > the public API in 1.10 should still exist in 2.0, even if
> deprecated.
> > > > > If the library is calling internal methods or extending internal
> > > classes,
> > > > > then that is a different story. If it uses internals then I
> recommend
> > > > > refactoring to use the public API if possible.
> > > > >
> > > > >
> > > > > On Mon, Oct 18, 2021 at 3:38 PM Vincent Russell <
> > > vincent.russ...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > I am interested in using an accumulo query and storage library
> that
> > > was
> > > > > > written against accumulo version 1.10 and I am interested in
> using
> > > it with
> > > > > > accumulo 2.0.
> > > > > >
> > > > > > Is there a shim that exists that would allow the library to be
> used
> > > for
> > > > > > both versions that could be activated at compile time via a maven
> > > profile
> > > > > > or something?
> > > > > >
> > > > > > Thanks,
> > > > > > Vincent
> > > > > >
> > >
>

Re: [DISCUSS] Version number of next release?

2021-10-19 Thread Christopher

We wouldn't *have* to remove additional deprecations if we did name it
3.0, but it might be a good opportunity to clean up some stuff that was
deprecated prior to 2.0 but left in to ease the transition to 2.0. Then
again, removing anything else might make the transition from 1.10 LTM
to 3.0 LTM more challenging.

Unless we find a clear compatibility issue in our public API that
forces us to bump to 3.0 because of semver, I'd be okay with either
version, so long as we make a decision. I do think the substantial
metrics/property name/tracing changes are compelling reasons to go to
3.0, because even if they don't cause problems with our public API,
the changes may still cause headaches for sysadmins.

On Tue, Oct 19, 2021 at 8:16 PM Ed Coleman  wrote:
>
> I started a general thread concerning topics for the next release. One major
> topic raised was what the next version number should be.  I started this
> thread so that version discussions can occur in a single thread for
> continuity.  From the general email thread:
>
> Version number:  There have been substantial changes since 2.0 was released.  
>  The next version was expected to be 2.1, but with the number and the scope 
> of changes that have been made and some that are in the pipeline, maybe we 
> should signal this with a major version bump to 3.0?
>
> - With semver, we might be able to go either way, depending on
>   interpretation.
> - With the adoption of LTM releases, whatever the next version is
>   numbered, it will be an LTM release candidate.
> - There have been over 800 changes committed.
> - Notable major changes:
>    o Name changes to inclusive language (Manager instead of Master,…)
>    o Enabling external compactions.
>    o Changes in the storage of properties in ZooKeeper to reduce watchers
>      (in progress, issues #1225, #1809)
>    o Change tracing to use OpenTracing instead of HTrace (PR #2259)
>    o Change metrics to use micrometer.io instead of Hadoop-metrics2 (PR #2305)
>    o Changes to enable per-table encryption and other improvements (PR #2197)
>    o ???
>
>
>

Re: [DISCUSS] The current state of replication and the way forward?

2021-10-19 Thread Christopher

For reference, our last conversation about the state of replication
was 
https://lists.apache.org/thread.html/ra65ecbfcdb26af2672b7a064d313c0db0285b7d9f228c09559a14842%40%3Cdev.accumulo.apache.org%3E
; in that, I tried to make the community aware of the issues involving
the long-running and frequently broken ITs that were becoming a burden
and interfering with progress in other areas of our code. After that
discussion, we disabled the consistently failing tests, with a call
for somebody to volunteer to pick up the maintenance burden. Since
that discussion, nobody has volunteered.

I do think we need to:
1. Communicate to users the current state, so they don't have high
expectations for its reliability when we know differently, and
2. Make a plan to deprecate and remove the feature (as it currently
exists, anyway), from Accumulo, in order to prevent the technical debt
and tight coupling to critical WAL code from inhibiting other
development work in Accumulo.

We can do #1 by updating the properties for the feature to
Experimental and/or Deprecated. Both states are reversible if the
status quo changes, but I think it's important users aren't misled
into thinking the feature is more stable and well-maintained than we
know it to be.

For #2, I think it would be okay to deprecate it in the next minor
release, and remove it in the next major release after that. Again,
the deprecated state can be reversed if the status quo substantially
changes.

On Tue, Oct 19, 2021 at 8:19 PM Ed Coleman  wrote:
>
> I started a general thread concerning topics for the next release. One major
> topic raised was the state of replication and trying to determine if there is
> consensus for a way forward.  I started this thread so that replication
> discussions can occur in a single thread for continuity.  From the general
> email thread:
>
> It is hard to know what the state of replication is and maybe we need to mark 
> it as either experimental or deprecated to convey that to users. The 
> replication tests have been unstable and failing with transient errors and 
> have been removed from the regular build process – this reduced the automated 
> build time by over 2 hours.   A recent example is accumulo-testing issue #164 
> (https://github.com/apache/accumulo-testing/issues/164) Without the test 
> running regularly, it is hard to state with any confidence that replication 
> works reliably in a production environment.   This should not be interpreted 
> as advocating that we remove replication at this point, but we need a way 
> forward. Maybe someone volunteers to examine the tests and fixes them so that 
> they run reliably and in a reasonable time, or maybe we begin to explore 
> other approaches – for example, maybe some kind of NiFi connector or
> something else entirely.  I really don’t know, but it seems we need to
> clearly communicate the current state to any users who may be using or
> considering replication in the next release, and to signal possible
> future intentions.

Re: 1.10 <-> 2.0 shim

2021-10-19 Thread Christopher

The best reference is the release notes:
https://accumulo.apache.org/release/accumulo-2.0.0/

On Tue, Oct 19, 2021, 09:15 Jeremy Kepner  wrote:

> Is there a list of things in 1.10 that will no longer work in 2.0?
>
> On Tue, Oct 19, 2021 at 08:59:58AM -0400, Christopher wrote:
> > Hi Vincent,
> >
> > To supplement what Mike said, it's possible some stuff that was
> > deprecated in 1.10 was dropped in 2.0. I don't have a comprehensive
> > list of what that might include, but anything marked as deprecated in
> > 1.10 is subject to removal in 2.0. If I recall, we did try to limit it
> > somewhat. It wouldn't really make sense to create a shim to restore
> > those APIs, though, because that would just reintroduce code we
> > explicitly dropped, which defeats the purpose of a major version bump.
> > In semantic versioning, the entire point of a major version bump is to
> > declare a break in the backwards compatibility of the public API.
> >
> > If you need the code that was dropped, you probably aren't ready to
> > move to 2.x. 1.10 is an LTM release, so that means we intend to keep
> > patching important bugs until a year after our next LTM (which hasn't
> > yet been released). So, if you need to stay on 1.10, you have plenty
> > of time to update your code to stop using deprecated APIs and avoid
> > non-public APIs.
> >
> > On Tue, Oct 19, 2021 at 8:10 AM Mike Miller  wrote:
> > >
> > > If the library was written using only the public API then it shouldn't
> be a
> > > problem. See https://accumulo.apache.org/api/
> > > Accumulo follows SemVer to maintain compatibility of the public API
> between
> > > versions. There are a lot of changes between 1.10 and 2.0 but anything
> in
> > > the public API in 1.10 should still exist in 2.0, even if deprecated.
> > > If the library is calling internal methods or extending internal
> classes,
> > > then that is a different story. If it uses internals then I recommend
> > > refactoring to use the public API if possible.
> > >
> > >
> > > On Mon, Oct 18, 2021 at 3:38 PM Vincent Russell <
> vincent.russ...@gmail.com>
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > I am interested in using an accumulo query and storage library that
> was
> > > > written against accumulo version 1.10 and I am interested in using
> it with
> > > > accumulo 2.0.
> > > >
> > > > Is there a shim that exists that would allow the library to be used
> for
> > > > both versions that could be activated at compile time via a maven
> profile
> > > > or something?
> > > >
> > > > Thanks,
> > > > Vincent
> > > >
>

Re: 1.10 <-> 2.0 shim

2021-10-19 Thread Christopher

Hi Vincent,

To supplement what Mike said, it's possible some stuff that was
deprecated in 1.10 was dropped in 2.0. I don't have a comprehensive
list of what that might include, but anything marked as deprecated in
1.10 is subject to removal in 2.0. If I recall, we did try to limit it
somewhat. It wouldn't really make sense to create a shim to restore
those APIs, though, because that would just reintroduce code we
explicitly dropped, which defeats the purpose of a major version bump.
In semantic versioning, the entire point of a major version bump is to
declare a break in the backwards compatibility of the public API.

If you need the code that was dropped, you probably aren't ready to
move to 2.x. 1.10 is an LTM release, so that means we intend to keep
patching important bugs until a year after our next LTM (which hasn't
yet been released). So, if you need to stay on 1.10, you have plenty
of time to update your code to stop using deprecated APIs and avoid
non-public APIs.
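
As one way to find what to stop using, the Java compiler reports uses of
deprecated APIs when run with `-Xlint:deprecation`. The same information is
available through reflection, as in the minimal sketch below. `ExampleClient`
and its methods are hypothetical stand-ins defined only for this example;
they are not real Accumulo types.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of Accumulo.
final class DeprecatedApiScan {

  // Stand-in for a library class with one current and one deprecated method.
  static class ExampleClient {
    public void createScanner() {}

    @Deprecated
    public void oldCreateScanner() {}
  }

  // Lists the names of methods declared on the type that carry @Deprecated.
  static List<String> deprecatedMethods(Class<?> type) {
    List<String> names = new ArrayList<>();
    for (Method method : type.getDeclaredMethods()) {
      if (method.isAnnotationPresent(Deprecated.class)) {
        names.add(method.getName());
      }
    }
    Collections.sort(names);
    return names;
  }

  public static void main(String[] args) {
    System.out.println(deprecatedMethods(ExampleClient.class));
  }
}
```

Anything such a scan reports for a class your code depends on is a
candidate for removal in the next major version.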

On Tue, Oct 19, 2021 at 8:10 AM Mike Miller  wrote:
>
> If the library was written using only the public API then it shouldn't be a
> problem. See https://accumulo.apache.org/api/
> Accumulo follows SemVer to maintain compatibility of the public API between
> versions. There are a lot of changes between 1.10 and 2.0 but anything in
> the public API in 1.10 should still exist in 2.0, even if deprecated.
> If the library is calling internal methods or extending internal classes,
> then that is a different story. If it uses internals then I recommend
> refactoring to use the public API if possible.
>
>
> On Mon, Oct 18, 2021 at 3:38 PM Vincent Russell 
> wrote:
>
> > Hello,
> >
> > I am interested in using an accumulo query and storage library that was
> > written against accumulo version 1.10 and I am interested in using it with
> > accumulo 2.0.
> >
> > Is there a shim that exists that would allow the library to be used for
> > both versions that could be activated at compile time via a maven profile
> > or something?
> >
> > Thanks,
> > Vincent
> >

Re: accumulo LTM

2021-10-19 Thread Christopher

LTM stands for "long-term maintenance". Its purpose is to communicate
what we are focusing our long-term maintenance efforts on, such as
backporting bug fixes and testing upgrade paths from older releases,
so we can balance the need to support previous releases against our
need to move forward and develop new features and enhancements. LTM is
our way of communicating our intentions for how we are going to
balance that. It will help us manage our time and resources as
developers, and will help manage certain user expectations for
patching older versions and for how long.

LTM does not communicate whether a particular version is
production-ready, or stable, or anything like that. We try to ensure
every version is of high quality and suitable for use in production.
It also doesn't determine compatibility (the version number does that
through semantic versioning).

What LTM does is help optimize our contributor efficacy by helping
developers avoid wasting their time patching, merging, and testing
countless previous versions concurrently, avoid exponential growth in
the number of possible upgrade paths to test, and avoid wasting time
troubleshooting bugs in older versions that were already fixed. It
also helps provide predictable upgrade paths for users and encourage
them to upgrade with greater confidence by following well-tested
upgrade paths.

By communicating a release as LTM, and linking to what that means
(https://accumulo.apache.org/contributor/versioning.html#LTM), we hope
users will be able to make informed decisions about when to upgrade
and to which version. For example, if a user deploys 2.0.1 (non-LTM),
they will know that there is not expected to be any further 2.0.2 or
later patches. Instead, they can expect to either backport bug fixes
themselves to 2.0, or they can upgrade to a minor release to get the
bug fixes rolled up alongside newer features. However, if they want to
use 1.10.1 (LTM), they can expect that important bug fixes will be
patched in a 1.10.2, etc., without any additional risks that newer
features might bring.

In summary, the theme here is effective use of contributor resources,
managed user expectations, and informed user choices. It is not a
statement about production-readiness, just a data point to help users
make decisions for themselves by understanding what the upstream
project's intentions are for a given release.

Apologies for my lack of brevity, but I hope this clears up our
intentions around LTM. We anticipate the next LTM will be the branch
currently under development, and our next release. It may be called
2.1.0, or 3.0.0, depending on decisions yet to be finalized. I had
hoped we would have released it earlier this year, but we're not done
with some things we want it to include. It's close, I think (within a
few months), but I don't want to make predictions right now.

On Tue, Oct 19, 2021 at 7:13 AM Ligade, Shailesh [USA]
 wrote:
>
> Hello,
>
> On the Accumulo download page, the only LTM version is 1.10.1. Does that mean
> Accumulo 2.x should not be used for production yet? Is there any timeline for
> when 2.x will be LTM?
>
> Thanks
>
> -S

Re: Accumulo quarterly report. Due 10/13/2021

2021-09-28 Thread Christopher

Our repos should still be labeled for Hacktoberfest, so we should
still be ready for that. I didn't see any substantial rules changes
that would require additional steps from us.

On Tue, Sep 28, 2021 at 2:59 PM Michael Wall  wrote:
>
> I think links 3 and 4 have a space that is breaking up the url.
>
> Are we doing Hacktoberfest again this year?  Could mention that.
>
> Thanks Ed
>
> On Tue, Sep 28, 2021 at 2:50 PM Mike Miller  wrote:
>
> > Other than the typo Dave found, LGTM.
> >
> > On Tue, Sep 28, 2021 at 9:01 AM Dave Marion  wrote:
> >
> > > Typo in last line, "Jira to Gibhub"
> > >
> > > On Tue, Sep 28, 2021 at 8:10 AM dev1  wrote:
> > >
> > > > > The Accumulo community quarterly report for October is due Wednesday
> > > > > 10/13/2021.  The community decided to publicly prepare the report on the
> > > > > dev mailing list.  Below is the current draft.
> > > >
> > > > Ed Coleman
> > > >
> > > > --- Draft report ---
> > > >
> > > > ## Description:
> > > > The Apache Accumulo is a robust, scalable, distributed key/value store
> > > > with cell-based access control and customizable server-side processing.
> > > >
> > > > ## Issues:
> > > > There are no new issues requiring board attention.
> > > >
> > > > > The trademark issue with http:www.accumulodata.com is still open.
> > > > > Although the domain owner does not have access to the domain registration,
> > > > > the domain appears to have automatically renewed, and the expiration is now
> > > > > 2022-06-28.  Emails from the private list discussing this are at [1], [2]
> > > > > and [3]. No action has been required and allowing the domain to expire was
> > > > > deemed a viable option by Brand Management VP in Jan-2021 (private)[4] to
> > > > > minimize volunteer efforts.
> > > >
> > > > ## Membership Data:
> > > > Apache Accumulo was founded 2012-03-21 (10 years ago)
> > > > There are currently 40 committers and 40 PMC members in this project.
> > > > The Committer-to-PMC ratio is 1:1.
> > > >
> > > > Community changes, past quarter:
> > > > - Dominic Garguilo was added to the PMC on 2021-07-29
> > > > - Dominic Garguilo was added as committer on 2021-07-29
> > > >
> > > > ## Project Activity:
> > > > No new releases this reporting period. Last release dates:
> > > > - accumulo-2.0.1 was released on 2020-12-24.
> > > > - accumulo-1.10.1 was released on 2020-12-22.
> > > >
> > > > Project activity on the next release remains active with significant
> > > > improvements to the current baseline. The remaining issues are being
> > > > actively worked.
> > > >
> > > > ## Community Health:
> > > > > Overall community health is good and GitHub activity remains consistent.
> > > > >
> > > > > - Community participation remains healthy with discussions on the mailing
> > > > > lists and GitHub issues and pull-requests.
> > > > > - Accumulo continues to transition from Jira to GibHub issues. Jira
> > > > > activity reflects transition to using GitHub issues as obsolete issues are
> > > > > closed and open issues are transitioned to GitHub issues.
> > > >
> > > >
> > > > ## Links
> > > > > (private) [1]: https://lists.apache.org/thread.html/r8c8ef5575b14accb6fc00d670764a313b91d76033f761c6e5c7eb29d%40%3Cprivate.accumulo.apache.org%3E
> > > > > (private) [2]: https://lists.apache.org/thread.html/514d3cf9162e72f4aa13be1db5d6685999fc83755695308a529de4d6@%3Cprivate.accumulo.apache.org%3E
> > > > (private) [3]:https://lists.apache.org/thread.html/rcc8c07db43222e0
> > > > 8b9992fd739b8f24d18569ba9af3decfdb52c4a3e%40%
> > > 3Cprivate.accumulo.apache.org
> > > > %3E
> > > > (private) [4]:https://lists.apache.org/thread.html/r408e3eed907e3ad
> > > > 24a7c84b5247f51973a4c965c891b01215e45ee17%40%
> > > 3Cprivate.accumulo.apache.org
> > > > %3E
> > > > ~
> > > >
> > >
> >

Re: Metrics Replacement

2021-09-23 Thread Christopher

+1 to everything Ed wrote. :)

On Wed, Sep 22, 2021 at 10:03 AM  wrote:
>
> The information provided by micrometer instrumentation should be consistent
> with the values produced by Hadoop metrics.  Things like gauges and counters
> are straightforward and should match 1:1.  Things that collect / calculate
> statistics may be slightly different due to implementation details - say the
> way binning for histograms is performed - they will still be mathematically
> correct and the values they report should still be consistent, but they might
> be "different".
>
> An issue with metrics is that each collection system seems to have slight 
> variations in the way they want things collected and reported. Micrometer 
> supports various monitoring systems and a way to implement others if a 
> particular system is not currently supported.  In micrometer, each registry 
> provides for converting / supporting a specific monitoring system.  This 
> includes things like name conversions, rate aggregation (client vs. server) 
> and push vs. pull. Our current metrics were named with a specific metrics 
> system and a naming convention - rather than trying to match our current 
> names exactly we could follow the micrometer naming convention and then rely 
> on the micrometer registry conversion to match the user's defined collection 
> system.
>
> Adopting and following the micrometer conventions should increase our
> compatibility with other collection systems and ease user implementations.
> In places where this might result in a name change, I think we should
> prioritize consistency and normalizing names with conventions. That would seem
> to provide the least surprise to end users and increase their flexibility to
> meet their needs. We should also look to take advantage of tagging to allow
> for aggregation and dimensional drill down to increase utility to end users.
> To the extent that this changes a reported metric name, the increased utility
> and flexibility provided would benefit end-users.  While any name change
> would increase friction for current metric consumers, the degree of friction
> seems independent of the amount of change - any change might be disruptive.
> I am not advocating that we should change names just to change them - rather
> we should seek to provide uniform names and consistent naming conventions
> across our codebase as the primary consideration and allow the reported names
> to fall out from there.
>
> The configuration of each monitoring system will depend on the system chosen
> by the user.  We should provide a select set of examples (I advocate
> Prometheus, some flavor of statsd and logging) to guide users if one of those
> does not fit their requirements and they elect to use a different micrometer
> module / collection system.
>
> I agree that we should supply documentation mapping current names to their
> micrometer equivalents - the specific name reported will be dependent on the
> conversions performed by the target system - but those should be documented
> in each module and are not within our scope.
>
> -Original Message-
> From: Keith Turner 
> Sent: Tuesday, September 21, 2021 5:07 PM
> To: Accumulo Dev List 
> Subject: Re: Metrics Replacement
>
> On Tue, Sep 21, 2021 at 3:45 PM Dave Marion  wrote:
> >
> > There is a WIP pull request against 2.1.0-SNAPSHOT for replacing the
> > Hadoop
> > Metrics2 framework with Micrometer[1]. Micrometer suggests using a
> > naming pattern[2] for the metrics internally where words are all
> > lowercase separated by a period. Micrometer output formats then
> > rewrite the metric names to the destination specific format. It's
> > possible that we may not be able to produce metrics in the same exact
> > way as the Hadoop Metrics2
>
> Is it only the naming pattern that will cause incompatibility, or is it more
> than that?  Like would a timer, gauge, etc. in micrometer produce different
> information/metrics than a timer, gauge, etc. in hadoop metrics?  I suspect
> these would differ and that would also impact compat.  Will the way in which
> accumulo is configured to report metrics also change?  I can't imagine it
> would be the same, but I have not looked at the PR.
>
> Can you provide an example of a naming incompat where it has to change?
>
> > framework. Metrics are not part of the public API, but we do want to
> > try and retain as much backwards compatibility as possible. In the
> > event that we cannot get that compatibility it has been suggested that
> > we document how things are different. As I have limited knowledge of
> > how the metrics are
>
> Is there a reasonable path to achieving compatibility?  If not, it seems like 
> documenting what has changed is a good way to go.  Could possibly explain it 
> in detail in the 2.1.0 release notes and have a link to that in the user 
> manual.
>
> > being used today, I'm looking for some feedback from the community as
> > to how painful it would be if metric
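
To make the point above about tagging more concrete, here is a small
sketch in plain Python. It does not use Micrometer's API; the metric
name, tag keys, and values are all invented for illustration. It shows
how one metric name plus tags supports both an overall total and a
per-dimension drill down.

```python
from collections import defaultdict

# Each sample is (metric name, tags, value). All names and values are made up.
samples = [
    ("accumulo.tserver.entries", {"host": "h1", "table": "t1"}, 10),
    ("accumulo.tserver.entries", {"host": "h1", "table": "t2"}, 5),
    ("accumulo.tserver.entries", {"host": "h2", "table": "t1"}, 7),
]


def total(samples, name):
    """Aggregate across every tag combination for one metric name."""
    return sum(value for n, _tags, value in samples if n == name)


def by_tag(samples, name, key):
    """Drill down: sum one metric's values grouped by a single tag key."""
    grouped = defaultdict(int)
    for n, tags, value in samples:
        if n == name:
            grouped[tags[key]] += value
    return dict(grouped)


print(total(samples, "accumulo.tserver.entries"))            # 22
print(by_tag(samples, "accumulo.tserver.entries", "host"))   # {'h1': 15, 'h2': 7}
print(by_tag(samples, "accumulo.tserver.entries", "table"))  # {'t1': 17, 't2': 5}
```

Without tags, the same information would need a separate metric name per
host and table, which is the kind of naming sprawl the thread is trying
to avoid.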

Re: Metrics Replacement

2021-09-21 Thread Christopher

My impression is that the metrics names are a big pain point already,
and that they appear differently, depending on which Hadoop Metrics2
sink the user has configured, and what happens to them after that
(InfluxDB naming conventions seem to be a problem sometimes while
using fluo-uno). Since the names are not public API, we should be able
to change them easily, but the changes may still cause users
headaches.

My thoughts are that it's probably worth causing the disruption, if we
can get to a better place to keep the names stable, intuitive, and
useful with better naming conventions, but that it would be good to
try to document the new names very clearly, so it's easy for people to
use the new metrics in place of whatever is removed. I would prefer we
work to establish a good naming convention from scratch, rather than
try to preserve previous naming compatibility.

On Tue, Sep 21, 2021 at 3:45 PM Dave Marion  wrote:
>
> There is a WIP pull request against 2.1.0-SNAPSHOT for replacing the Hadoop
> Metrics2 framework with Micrometer[1]. Micrometer suggests using a naming
> pattern[2] for the metrics internally where words are all lowercase
> separated by a period. Micrometer output formats then rewrite the metric
> names to the destination specific format. It's possible that we may not be
> able to produce metrics in the same exact way as the Hadoop Metrics2
> framework. Metrics are not part of the public API, but we do want to try
> and retain as much backwards compatibility as possible. In the event that
> we cannot get that compatibility it has been suggested that we document how
> things are different. As I have limited knowledge of how the metrics are
> being used today, I'm looking for some feedback from the community as to
> how painful it would be if metric names changed in a minor release.
>
> [1] https://micrometer.io/
> [2] https://micrometer.io/docs/concepts#_naming_meters
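
As a rough sketch of the idea described above (lowercase, dot-separated
names internally, rewritten per destination), the plain-Python functions
below mimic two common target styles. They are illustrative only and are
not Micrometer's actual conversion code; the function names and the
metric name are made up.

```python
def to_snake_case(name: str) -> str:
    """Rewrite a dot-separated name in an underscore-separated style."""
    return name.replace(".", "_")


def to_camel_case(name: str) -> str:
    """Rewrite a dot-separated name in a camelCase style."""
    first, *rest = name.split(".")
    return first + "".join(word.capitalize() for word in rest)


internal = "tserver.ingest.rate"  # hypothetical internal metric name
print(to_snake_case(internal))  # tserver_ingest_rate
print(to_camel_case(internal))  # tserverIngestRate
```

The internal name stays the same in both cases; only the reported form
changes with the destination, which is why the reported names may not
match what the Hadoop Metrics2 sinks produced.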

Re: Accumulo metrics using haddop_mertisc2

2021-09-09 Thread Christopher

I could be wrong, but I don't think Accumulo has changed anything about the
way it is emitting metrics in 1.10 that would be substantially different
from 1.8. It's possible that the GraphiteSink has changed how it works in
your version of Hadoop. Or maybe newer versions of InfluxDB store things
differently or interact with the Grafana dashboard differently.

It's very hard to say, since none of those projects are owned, developed,
or directly supported by the Accumulo PMC. We just emit metrics using the
Hadoop APIs, and rely on that dependency to do the rest. As Accumulo
developers, we're not necessarily experts on configuring the possible sinks
or what happens to the metrics data after.

I've actually found configuring metrics to be very confusing and
error-prone when I've done it for testing (the conversation you referenced was
one such occasion where I found a sort of half solution, just to get it to
work well enough for the testing I was doing at the time).

My suggestion would be to reach out to our user list instead of the
developer list, in case other users have experience with those components
in their Accumulo deployments. Or, you could seek assistance directly from
the developers for the relevant metrics component, whether it's from the
Hadoop team for the GraphiteSink, InfluxDB's developers, or another
component's developers. Sorry, I don't mean to pass the buck here, just
trying to manage expectations based on the kind of expertise you can expect
on this list, and suggest alternative resources in case you don't find an
answer here.

If you do find a solution, feel free to propose an updated blog post or
documentation update for the website.

On Thu, Sep 9, 2021, 12:59 Ligade, Shailesh [USA]
 wrote:

> Thanks for reply,
>
>
>   1.  I am using influxdb 1.8.1 and Grafana 7.6 the blog post is using
> older versions. - InfluxDB v0.9.4.2 and Grafana v2.5.0.
> The hadoop-metrics2-accumulo.properties has the same content as from the
> blog
>
> *.period=30
> accumulo.sink.graphite.class=org.apache.hadoop.metrics2.sink.GraphiteSink
> accumulo.sink.graphite.server_host=
> accumulo.sink.graphite.server_port=2003
> accumulo.sink.graphite.metrics_prefix=accumulo
> If I set up a file sink, it works as well.
>
> As far as measurements in InfluxDB are concerned, the names are very different.
>
> e.g. the dashboard is looking for measurement
> 'accumulo_tserver_general_IngestRate' and the measurement I can see is
> accumulo.tserver.general.Context=tserver.ProcessName=TabletServer.Hostname=.ingestRate
> and that measurement has no data...not sure how to fix that.
>
> -S
>
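
To show where the two names quoted above diverge, here is a small sketch
in plain Python. It is not a fix for the dashboard and does not use any
Hadoop, InfluxDB, or Grafana API; the helper name is hypothetical. It
splits the emitted name on ".", treats the "key=value" segments as tags,
and compares what is left with the name the dashboard expects. The
Hostname value is left empty, exactly as it appears in the message.

```python
emitted = (
    "accumulo.tserver.general.Context=tserver."
    "ProcessName=TabletServer.Hostname=.ingestRate"
)
expected = "accumulo_tserver_general_IngestRate"


def split_name(path):
    """Split a dot-separated name into plain segments and key=value tags."""
    parts = path.split(".")
    tags = dict(p.split("=", 1) for p in parts if "=" in p)
    name_parts = [p for p in parts if "=" not in p]
    return name_parts, tags


name_parts, tags = split_name(emitted)
flat = "_".join(name_parts)

print(flat)                              # accumulo_tserver_general_ingestRate
print(tags)                              # {'Context': 'tserver', 'ProcessName': 'TabletServer', 'Hostname': ''}
print(flat == expected)                  # False
print(flat.lower() == expected.lower())  # True
```

Under these assumptions the two names differ in the embedded tag
segments and in the case of the final segment. A real hostname
containing dots would defeat this naive split, so treat it only as a way
to inspect the names.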

Re: Accumulo Feathercast interview request

2021-09-07 Thread Christopher

We haven't had anybody express an interest yet. I suggest moving on down
the list, and we can try to reach back to you if somebody is able to step
up and do it.

On Tue, Sep 7, 2021, 09:41 Rich Bowen  wrote:

> FWIW, I need to hear from you today if you want this week's Friday
> interview spot. I'm about to go down the list to the next project. Are
> you still a possibility for this week, or is it further out than that?
>
> --Rich, for Feathercast.
>
> On 9/1/21 11:50 AM, Christopher wrote:
> > Hi Rich,
> >
> > I started a conversation on our private list to see if anybody is
> > interested in volunteering for this. Some of our committers may have
> > restrictions on interview requests imposed by their employers that may
> > limit who is willing to volunteer, so we'll try to discuss it, and see
> > if anybody wants to step up. I have seen all the previous Feathercast
> > interviews on the YouTube channel, and think this definitely has
> > value, so hopefully somebody will be willing and able to do this with
> > you.
> >
> > Thanks for reaching out,
> > Christopher
> >
> > On Fri, Aug 27, 2021 at 12:54 PM Rich Bowen  wrote:
> >> Hi, Accumulo,
> >>
> >> For those that don't know me, I'm Rich Bowen, the voice behind many of
> >> the Feathercast recordings. For those not familiar with Feathercast, it's a
> >> not-very-regular podcast/videocast about Apache topics. You can see some of
> >> my past episodes here -
> >> https://www.youtube.com/playlist?list=PLU2OcwpQkYCzs8261KxC4BoB2ptHJyMQt
> >> - and, yes, I've been inactive for a year.
> >>
> >> I'm trying really hard to get through all of the Apache projects (yes,
> >> this is going to take years) and do a "What is it" kind of a podcast. (If
> >> you want to talk about something else, that's fair game too.)
> >>
> >> Is there someone from this project who would be available and willing
> >> to talk with me about Accumulo? The basic script is here -
> >> https://docs.google.com/document/d/1FZzloEiCf2qxm9Q4Ipq6i-2mkykdKfvuNaqOpRf901w/edit#
> >> - but it's very flexible, and can be whatever you need it to be at this
> >> point in your project's life.
> >>
> >> I'll do a video interview on Google Meet. This should take anywhere
> >> from a few minutes to a half hour, depending on what you want to talk
> >> about. I'm looking for about 10-15 minutes of actual content after edits.
> >> The final cut gets posted on YouTube and on feathercast.org, and then
> >> probably makes appearances elsewhere, like Twitter and the Apache weekly
> >> newsletter.
> >>
> >> If you're interested, please discuss amongst yourselves who the right
> >> spokesperson (or more than one?) is, and let me know, at rbo...@apache.org,
> >> and we'll set up a time. I *try* to do interviews on Fridays, and then post
> >> them the following Tuesday, schedule permitting.
> >>
> >> Thanks!
> >>
> >> --Rich
>
> --
> Rich Bowen, VP Conferences
> The Apache Software Foundation
> https://apachecon.com/
> @apachecon
>
>

Re: Accumulo Feathercast interview request

2021-09-01 Thread Christopher

Hi Rich,

I started a conversation on our private list to see if anybody is
interested in volunteering for this. Some of our committers may have
restrictions on interview requests imposed by their employers that may
limit who is willing to volunteer, so we'll try to discuss it, and see
if anybody wants to step up. I have seen all the previous Feathercast
interviews on the YouTube channel, and think this definitely has
value, so hopefully somebody will be willing and able to do this with
you.

Thanks for reaching out,
Christopher

On Fri, Aug 27, 2021 at 12:54 PM Rich Bowen  wrote:
>
> Hi, Accumulo,
>
> For those that don't know me, I'm Rich Bowen, the voice behind many of the 
> Feathercast recordings. For those not familiar with Feathercast, it's a 
> not-very-regular podcast/videocast about Apache topics. You can see some of 
> my past episodes here - 
> https://www.youtube.com/playlist?list=PLU2OcwpQkYCzs8261KxC4BoB2ptHJyMQt - 
> and, yes, I've been inactive for a year.
>
> I'm trying really hard to get through all of the Apache projects (yes, this 
> is going to take years) and do a "What is it" kind of a podcast. (If you want 
> to talk about something else, that's fair game too.)
>
> Is there someone from this project who would be available and willing to talk 
> with me about Accumulo? The basic script is here - 
> https://docs.google.com/document/d/1FZzloEiCf2qxm9Q4Ipq6i-2mkykdKfvuNaqOpRf901w/edit#
>  - but it's very flexible, and can be whatever you need it to be at this 
> point in your project's life.
>
> I'll do a video interview on Google Meet. This should take anywhere from a 
> few minutes to a half hour, depending on what you want to talk about. I'm 
> looking for about 10-15 minutes of actual content after edits. The final cut 
> gets posted on YouTube and on feathercast.org, and then probably makes 
> appearances elsewhere, like Twitter and the Apache weekly newsletter.
>
> If you're interested, please discuss amongst yourselves who the right 
> spokesperson (or more than one?) is, and let me know, at rbo...@apache.org, 
> and we'll set up a time. I *try* to do interviews on Fridays, and then post 
> them the following Tuesday, schedule permitting.
>
> Thanks!
>
> --Rich

Re: [accumulo] branch 1.10 updated: removes extraneous code from TabletIteratorTest

2021-08-06 Thread Christopher

Seems reasonable. Thanks for the explanation and satisfying my curiosity.

On Fri, Aug 6, 2021, 13:31 Keith Turner  wrote:

> Was looking at this test with Mike Wall and we were getting very
> confused by it.  We determined the confusion was caused by the test
> having extra code that was not needed.  Since we had done the work to
> decipher this I thought it would be worthwhile to push it.  Usually I
> would do a PR, but I thought it was only 3 lines so why bother?  After
> pushing the commit I realized I had accidentally pushed a few files
> that Eclipse had changed that I did not want to push.  So I did
> another commit to revert those unintended changes.
>
> As for the `-s ours` merge, the test does not exist in 2.x.  I did check
> that my two commits were the only unmerged commits before doing the
> `-s ours` merge.
>
> On Thu, Aug 5, 2021 at 7:30 PM Christopher  wrote:
> >
> > Hey Keith,
> >
> > Just curious because of all the activity around this change in 1.10
> > (the subsequent partial revert and the merge commits to main, which
> > seem to be `-s ours` merges), what motivated the change to
> > TabletIteratorTest in the older branch?
> >
> > On Thu, Aug 5, 2021 at 6:40 PM  wrote:
> > >
> > > This is an automated email from the ASF dual-hosted git repository.
> > >
> > > kturner pushed a commit to branch 1.10
> > > in repository https://gitbox.apache.org/repos/asf/accumulo.git
> > >
> > >
> > > The following commit(s) were added to refs/heads/1.10 by this push:
> > >  new 5d475b0  removes extraneous code from TabletIteratorTest
> > > 5d475b0 is described below
> > >
> > > commit 5d475b00eabf9aa419dbc49d5a49465633a61815
> > > Author: Keith Turner 
> > > AuthorDate: Thu Aug 5 18:37:12 2021 -0400
> > >
> > > removes extraneous code from TabletIteratorTest
> > > ---
> > > >  server/base/.gitignore                                           | 1 +
> > > >  .../java/org/apache/accumulo/server/util/TabletIteratorTest.java | 6 +-
> > > >  server/tserver/.gitignore                                        | 1 +
> > > >  test/.gitignore                                                  | 1 +
> > > >  4 files changed, 4 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/server/base/.gitignore b/server/base/.gitignore
> > > index e77a822..320dd27 100644
> > > --- a/server/base/.gitignore
> > > +++ b/server/base/.gitignore
> > > @@ -26,3 +26,4 @@
> > >  /nbproject/
> > >  /nbactions.xml
> > >  /nb-configuration.xml
> > > +/bin/
> > > > diff --git a/server/base/src/test/java/org/apache/accumulo/server/util/TabletIteratorTest.java b/server/base/src/test/java/org/apache/accumulo/server/util/TabletIteratorTest.java
> > > > index 43888af..b60630d 100644
> > > > --- a/server/base/src/test/java/org/apache/accumulo/server/util/TabletIteratorTest.java
> > > > +++ b/server/base/src/test/java/org/apache/accumulo/server/util/TabletIteratorTest.java
> > > @@ -121,12 +121,8 @@ public class TabletIteratorTest {
> > >  createTabletData(data1, "3", "c", null);
> > >  createTabletData(data1, "3", "n", "c");
> > >
> > > -TreeMap data2 = new TreeMap<>(data1);
> > > -
> > > -createTabletData(data2, "3", null, "n");
> > > -
> > >  assertThrows(IllegalStateException.class,
> > > > -() -> runTest(Arrays.asList(data1, data2), Arrays.asList("3;c", "3;n", "3<")));
> > > > +() -> runTest(Arrays.asList(data1), Arrays.asList("3;c", "3;n")));
> > >}
> > >
> > >@Test
> > > diff --git a/server/tserver/.gitignore b/server/tserver/.gitignore
> > > index e77a822..320dd27 100644
> > > --- a/server/tserver/.gitignore
> > > +++ b/server/tserver/.gitignore
> > > @@ -26,3 +26,4 @@
> > >  /nbproject/
> > >  /nbactions.xml
> > >  /nb-configuration.xml
> > > +/bin/
> > > diff --git a/test/.gitignore b/test/.gitignore
> > > index 87da2f9..c92e5e7 100644
> > > --- a/test/.gitignore
> > > +++ b/test/.gitignore
> > > @@ -30,3 +30,4 @@
> > >  # python ignores
> > >  *.pyc
> > >
> > > +/bin/
>

Re: [accumulo] branch 1.10 updated: removes extraneous code from TabletIteratorTest

2021-08-05 Thread Christopher

Hey Keith,

Just curious because of all the activity around this change in 1.10
(the subsequent partial revert and the merge commits to main, which
seem to be `-s ours` merges), what motivated the change to
TabletIteratorTest in the older branch?

On Thu, Aug 5, 2021 at 6:40 PM  wrote:
>
> This is an automated email from the ASF dual-hosted git repository.
>
> kturner pushed a commit to branch 1.10
> in repository https://gitbox.apache.org/repos/asf/accumulo.git
>
>
> The following commit(s) were added to refs/heads/1.10 by this push:
>  new 5d475b0  removes extraneous code from TabletIteratorTest
> 5d475b0 is described below
>
> commit 5d475b00eabf9aa419dbc49d5a49465633a61815
> Author: Keith Turner 
> AuthorDate: Thu Aug 5 18:37:12 2021 -0400
>
> removes extraneous code from TabletIteratorTest
> ---
>  server/base/.gitignore                                           | 1 +
>  .../java/org/apache/accumulo/server/util/TabletIteratorTest.java | 6 +-
>  server/tserver/.gitignore                                        | 1 +
>  test/.gitignore                                                  | 1 +
>  4 files changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/server/base/.gitignore b/server/base/.gitignore
> index e77a822..320dd27 100644
> --- a/server/base/.gitignore
> +++ b/server/base/.gitignore
> @@ -26,3 +26,4 @@
>  /nbproject/
>  /nbactions.xml
>  /nb-configuration.xml
> +/bin/
> diff --git a/server/base/src/test/java/org/apache/accumulo/server/util/TabletIteratorTest.java b/server/base/src/test/java/org/apache/accumulo/server/util/TabletIteratorTest.java
> index 43888af..b60630d 100644
> --- a/server/base/src/test/java/org/apache/accumulo/server/util/TabletIteratorTest.java
> +++ b/server/base/src/test/java/org/apache/accumulo/server/util/TabletIteratorTest.java
> @@ -121,12 +121,8 @@ public class TabletIteratorTest {
>  createTabletData(data1, "3", "c", null);
>  createTabletData(data1, "3", "n", "c");
>
> -TreeMap data2 = new TreeMap<>(data1);
> -
> -createTabletData(data2, "3", null, "n");
> -
>  assertThrows(IllegalStateException.class,
> -() -> runTest(Arrays.asList(data1, data2), Arrays.asList("3;c", "3;n", "3<")));
> +() -> runTest(Arrays.asList(data1), Arrays.asList("3;c", "3;n")));
>}
>
>@Test
> diff --git a/server/tserver/.gitignore b/server/tserver/.gitignore
> index e77a822..320dd27 100644
> --- a/server/tserver/.gitignore
> +++ b/server/tserver/.gitignore
> @@ -26,3 +26,4 @@
>  /nbproject/
>  /nbactions.xml
>  /nb-configuration.xml
> +/bin/
> diff --git a/test/.gitignore b/test/.gitignore
> index 87da2f9..c92e5e7 100644
> --- a/test/.gitignore
> +++ b/test/.gitignore
> @@ -30,3 +30,4 @@
>  # python ignores
>  *.pyc
>
> +/bin/

Re: [EXTERNAL] Accumulo with Native S3 Support

2021-08-04 Thread Christopher

ites for a tablet, when to 
> the rest of the system its lock is gone.

I would be interested in seeing those changes rebased onto the current
main branch, and submitted as a separate PR to be considered on their
own, since they do modify existing Accumulo code. If we can
incorporate these changes so that they not only help support the S3
FileSystem implementations, but also enhance Accumulo more generally
without being tightly coupled to those implementations, I think that's
probably the best way forward for the ZooLease stuff.

>
> On 2021/07/28 17:41:10, Christopher  wrote:
> > From what I saw from looking at the changes in Chris Milbert's fork,
> > the fork contains a couple S3 implementations of Hadoop's FileSystem
> > interface in a separate module (similar to s3a:// and abfss://
> > implementations). It seems to add accS3mo:// and accS3nf://
> > implementations, which, in spite of their names, do not appear to be
> > Accumulo-specific (that's a good thing... as these could be reused by
> > other projects as well!).
> >
> > In addition, these FileSystem implementations seem to be accompanied
> > by a few changes to Accumulo code itself, but I couldn't tell if these
> > were necessary to improve compatibility with these new FileSystems or
> > if they were unrelated additional enhancements to Accumulo. They also
> > appeared to be based on an older 2.0 branch, rather than the latest
> > 2.1 / main branch, and conflict with some of the changes in 2.1
> > branch. So those changes will need to be rebased.
> >
> > So, I suggest isolating the FileSystem implementations from the
> > changes to Accumulo. The FileSystem implementations don't need to be
> > merged into Accumulo's code base, or built as part of Accumulo at all.
> > They are completely independent from Accumulo and can exist in their
> > own repo, for use by any other user, just like s3a:// or abfss:// .
> > The Accumulo PMC could decide to accept responsibility for these
> > FileSystem implementations, but I don't think the Accumulo project at
> > the ASF is the best home for them, as they are not Accumulo-specific.
> > It might make more sense as a subproject of Hadoop instead of
> > Accumulo, since they are Hadoop FileSystem implementations, or remain
> > as a 3rd party repository on GitHub as part of the larger Hadoop
> > ecosystem. Finding the best home for these may take some additional
> > research on the part of its developers.
> >
> > The changes to Accumulo itself, separate from the S3 FileSystem
> > implementations, will be easiest to incorporate into the 2.1 / main
> > branch if they are rebased first, and submitted from a fork on GitHub
> > (Chris Milbert's repo does not appear to be a "fork", but a
> > disconnected clone, so creating a PR using GitHub's UI won't be
> > possible without first recreating the repo using the "fork" feature on
> > GitHub). If there are multiple, discrete changes, serving independent
> > purposes, the changes should be teased apart and submitted as separate
> > PRs against the main branch, so they can be evaluated on their own
> > merits through the code review process. It is hard to consider their
> > merits without a pull request for those changes.
> >
> > I think the discussion of abstracting the storage layer in Accumulo is
> > a worthy one, but I think it can be set aside for now. Abstracting the
> > storage layer from Hadoop would involve creating Accumulo-specific
> > storage APIs, and corralling Hadoop FileSystem API calls behind an
> > implementation of that Accumulo storage API. However, that's not
> > necessary for this. We currently use Hadoop's FileSystem APIs
> > throughout our own code, and Hadoop's FileSystem already provides
> > sufficient abstraction for the purposes of adding S3 support to
> > Accumulo, and that's what appears to have been done by Chris Milbert.
> > So, there's no need to complicate the discussion with additional
> > potential future work to further abstract Hadoop FileSystem API calls.
> > That abstraction doesn't appear to be a necessary prerequisite to
> > considering the work done by Chris in his repo.
> >
> > To me, the main questions are:
> >
> > 1. Can the new FileSystem implementations be used as easily as other
> > drop-in implementations, like s3a:// and abfss:// ?
> > 2. Where is the best home for these FileSystem implementations?
> > 3. What benefits do

new committer: Dominic Garguilo

2021-07-29 Thread Christopher

The Project Management Committee (PMC) for Apache Accumulo has invited
Dominic Garguilo to become a committer and PMC member and we are
pleased to announce that they have accepted.

Dominic has been contributing various fixes and improvements to
Accumulo since Fall 2020.

Being a committer enables easier contribution to the project since
there is no need to go via the patch submission process. This should
enable better productivity. A PMC member helps manage and guide the
direction of the project.

Welcome, Dominic!

Re: [EXTERNAL] Accumulo with Native S3 Support

2021-07-28 Thread Christopher

From what I saw from looking at the changes in Chris Milbert's fork,
the fork contains a couple S3 implementations of Hadoop's FileSystem
interface in a separate module (similar to s3a:// and abfss://
implementations). It seems to add accS3mo:// and accS3nf://
implementations, which, in spite of their names, do not appear to be
Accumulo-specific (that's a good thing... as these could be reused by
other projects as well!).

In addition, these FileSystem implementations seem to be accompanied
by a few changes to Accumulo code itself, but I couldn't tell if these
were necessary to improve compatibility with these new FileSystems or
if they were unrelated additional enhancements to Accumulo. They also
appeared to be based on an older 2.0 branch, rather than the latest
2.1 / main branch, and conflict with some of the changes in 2.1
branch. So those changes will need to be rebased.

So, I suggest isolating the FileSystem implementations from the
changes to Accumulo. The FileSystem implementations don't need to be
merged into Accumulo's code base, or built as part of Accumulo at all.
They are completely independent from Accumulo and can exist in their
own repo, for use by any other user, just like s3a:// or abfss:// .
The Accumulo PMC could decide to accept responsibility for these
FileSystem implementations, but I don't think the Accumulo project at
the ASF is the best home for them, as they are not Accumulo-specific.
It might make more sense as a subproject of Hadoop instead of
Accumulo, since they are Hadoop FileSystem implementations, or remain
as a 3rd party repository on GitHub as part of the larger Hadoop
ecosystem. Finding the best home for these may take some additional
research on the part of its developers.

The changes to Accumulo itself, separate from the S3 FileSystem
implementations, will be easiest to incorporate into the 2.1 / main
branch if they are rebased first, and submitted from a fork on GitHub
(Chris Milbert's repo does not appear to be a "fork", but a
disconnected clone, so creating a PR using GitHub's UI won't be
possible without first recreating the repo using the "fork" feature on
GitHub). If there are multiple, discrete changes, serving independent
purposes, the changes should be teased apart and submitted as separate
PRs against the main branch, so they can be evaluated on their own
merits through the code review process. It is hard to consider their
merits without a pull request for those changes.

I think the discussion of abstracting the storage layer in Accumulo is
a worthy one, but I think it can be set aside for now. Abstracting the
storage layer from Hadoop would involve creating Accumulo-specific
storage APIs, and corralling Hadoop FileSystem API calls behind an
implementation of that Accumulo storage API. However, that's not
necessary for this. We currently use Hadoop's FileSystem APIs
throughout our own code, and Hadoop's FileSystem already provides
sufficient abstraction for the purposes of adding S3 support to
Accumulo, and that's what appears to have been done by Chris Milbert.
So, there's no need to complicate the discussion with additional
potential future work to further abstract Hadoop FileSystem API calls.
That abstraction doesn't appear to be a necessary prerequisite to
considering the work done by Chris in his repo.

To me, the main questions are:

1. Can the new FileSystem implementations be used as easily as other
drop-in implementations, like s3a:// and abfss:// ?
2. Where is the best home for these FileSystem implementations?
3. What benefits do the other changes to Accumulo serve, and can they
be rebased and submitted as separate PRs against Accumulo's main
branch?
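
For readers unfamiliar with what "drop-in" means in question 1: Hadoop picks a FileSystem implementation from the URI scheme, using configuration entries named `fs.<scheme>.impl` (it can also discover implementations through Java's service loader). The sketch below is only a toy model of that lookup in plain Java, not Hadoop's API. The class name registered for `accS3nf` is a hypothetical placeholder, since the thread does not name the real classes, and `SchemeLookupDemo` and its methods are made up for illustration.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

final class SchemeLookupDemo {

  // Toy stand-in for configuration entries of the form fs.<scheme>.impl.
  // The class name for the accS3nf scheme is a hypothetical placeholder.
  static Map<String, String> exampleConfig() {
    Map<String, String> conf = new HashMap<>();
    conf.put("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
    conf.put("fs.accS3nf.impl", "example.hypothetical.AccS3nfFileSystem");
    return conf;
  }

  // Resolve the implementation class name for a path purely from its URI scheme.
  static String implementationFor(String path, Map<String, String> conf) {
    String scheme = URI.create(path).getScheme();
    String impl = conf.get("fs." + scheme + ".impl");
    if (impl == null) {
      throw new IllegalArgumentException("No FileSystem configured for scheme: " + scheme);
    }
    return impl;
  }

  public static void main(String[] args) {
    Map<String, String> conf = exampleConfig();
    System.out.println(implementationFor("s3a://bucket/accumulo/tables", conf));
    System.out.println(implementationFor("accS3nf://bucket/accumulo/tables", conf));
  }
}
```

If the new implementations only need an entry like this plus a jar on the classpath, they would be as easy to adopt as s3a:// or abfss://. Whether that is actually the case is exactly what question 1 asks.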

On Tue, Jul 27, 2021 at 2:00 PM Arvind Shyamsundar
 wrote:
>
> Hi Jeff, what would be the difference between this path, and what can be 
> accomplished by using a Hadoop FileSystem interface based connector to talk 
> to S3? Is it because of the consistency limitations with s3a:// 
> (https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html)?
>
> As you probably know for Azure, we went with the abfss:// connector provided 
> as part of hadoop-azure 
> (https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html) with minimal 
> effort. Just wondering what the key difference here is for S3.
>
> Thanks!
>
> Arvind.
>
> -Original Message-
> From: Jeff Kubina 
> Sent: Tuesday, July 27, 2021 10:16 AM
> To: dev@accumulo.apache.org
> Subject: [EXTERNAL] Accumulo with Native S3 Support
>
> All,
>
> Some of AWS's back end services use a version of Accumulo modified to use 
> Amazon's S3 as its storage system. Amazon engineers forked Accumulo 2.0 and 
> merged that S3 support into it 
>

Re: Re: Cannot run program "make"

2021-07-01 Thread Christopher

Mike,

`make`, `gcc`, `javac`, `mvn`, and `bash` are bare-bones build tools that
I think it is more than reasonable to expect developers to have
installed.

`make` and `gcc` are required to build and test critical
functionality of Accumulo: native maps. You can't run `mvn clean
verify -Psunny` without testing native maps. Furthermore, we actually
ship a small tarball and Makefile for users to build their own native
maps locally according to their own CPU architecture, as part of our
distribution tarball. In order to test that artifact that we ship, we
need `make`. Running `make` on our shipped Makefile is effectively
what we're doing for the integration tests anyway.

So, I don't think this is a problem to address.

However, if we were to address it, we'd have to categorize all our ITs
to determine which ones require native maps and which don't. We could
also separate out the native map library to its own repo on its own
release schedule. However, I don't think it's worth it. It's trivial
to install `make` on any modern operating system that we can expect
contributors to be developing on. It's far more effort to cater to the
edge case of a developer not having these installed than it is to
simply provide instructions for installing them.
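
As a rough illustration of how cheap such a check is, here is a minimal sketch that looks for the tools named above on the PATH. It is not part of Accumulo's build. The class and method names are made up for this example, and it assumes a Unix-like system where the executables have no file extension.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

final class BuildToolCheck {

  // Return true if an executable file with this name exists in any PATH entry.
  static boolean onPath(String tool, String pathValue) {
    if (pathValue == null || pathValue.isEmpty()) {
      return false;
    }
    for (String dir : pathValue.split(File.pathSeparator)) {
      File candidate = new File(dir, tool);
      if (candidate.isFile() && candidate.canExecute()) {
        return true;
      }
    }
    return false;
  }

  // Return the tools from the list that could not be found.
  static List<String> missing(List<String> tools, String pathValue) {
    List<String> result = new ArrayList<>();
    for (String tool : tools) {
      if (!onPath(tool, pathValue)) {
        result.add(tool);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> tools = List.of("make", "gcc", "javac", "mvn", "bash");
    System.out.println("Missing: " + missing(tools, System.getenv("PATH")));
  }
}
```

Anything it reports as missing can then be installed with the platform's package manager, which is the kind of instruction suggested above.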

On Thu, Jul 1, 2021 at 11:59 AM Michael Wall  wrote:
>
> As much as I love make, I had the thought that maybe we shouldn't require
> devs to have it installed to build and contribute to Accumulo.  Didn't see
> an existing ticket, can write one tonight.
>
> On Thu, Jul 1, 2021 at 11:10 AM Christine Buss 
> wrote:
>
> >
> >
> > Thanks so much! Yes that was the problem. And thanks to everyone else that
> > took their time to help me.
> >
> >
> > Sent: Thursday, July 1, 2021 at 4:38 PM
> > From: "Christopher"
> > To: "accumulo-dev"
> > Subject: Re: Cannot run program "make"
> > The error message looks like it's saying you don't have the `make`
> > command installed on your machine. Based on the word "ubuntu" in your
> > OpenJDK build version, I think you're on an Ubuntu-based machine. I
> > found an Ask Ubuntu answer (https://askubuntu.com/a/272020) that
> > said you can do:
> >
> > sudo apt-get install build-essential
> >
> > On Thu, Jul 1, 2021 at 10:24 AM Christine Buss 
> > wrote:
> > >
> > > the directory does exist:
> > >
> > >
> > >
> > > accumulo/server/native/target/accumulo-native-2.1.0-SNAPSHOT/accumulo-native-2.1.0-SNAPSHOT$ ls
> > > javah LICENSE Makefile nativeMap NOTICE testNativeMap
> > >
> > > So why can the program "make" not be run?
> > > What file or directory is missing?
> > >
> > >
> > > Hello,
> > > I am trying to learn how to contribute.
> > > However I cloned and forked the accumulo repository.
> > > When I run the command :
> > > mvn clean verify -DskipITs
> > > I am getting this Error:
> > >
> > > [ERROR] Failed to execute goal
> > > org.codehaus.mojo:exec-maven-plugin:3.0.0:exec (test-native-libs) on
> > > project accumulo-native: Command execution failed.: Cannot run program
> > > "make" (in directory
> > > "/home/christine/accumulo/server/native/target/accumulo-native-2.1.0-SNAPSHOT/accumulo-native-2.1.0-SNAPSHOT"):
> > > error=2, No such file or directory -> [Help 1]
> > >
> > > I can't find the reason. How can I solve this?
> > > Thanks so much in advance for any and all help.
> > >
> > >
> > > I am using Java 11
> > > >java -version
> > > openjdk version "11.0.11" 2021-04-20
> > > OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.20.04)
> > > OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2.20.04, mixed
> > > mode, sharing)
> > >
> >

Re: Cannot run program "make"

2021-07-01 Thread Christopher

The error message looks like it's saying you don't have the `make`
command installed on your machine. Based on the word "ubuntu" in your
OpenJDK build version, I think you're on an Ubuntu-based machine. I
found an Ask Ubuntu answer (https://askubuntu.com/a/272020) that
said you can do:

sudo apt-get install build-essential
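
On the question of which "file or directory" is missing: the wording matches the exception text the JVM produces when it cannot find the program itself, and the directory in the message is only the working directory. A minimal sketch that reproduces the same kind of message, assuming a made-up command name (`no-such-build-tool-xyz`) that is not installed; the class and method names are illustrative only:

```java
import java.io.File;
import java.io.IOException;

final class MissingProgramDemo {

  // Try to start a program in an existing directory and return the error text,
  // or null if the program started successfully.
  static String tryRun(String program, File workingDir) {
    try {
      Process p = new ProcessBuilder(program).directory(workingDir).start();
      p.destroy();
      return null;
    } catch (IOException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    File existingDir = new File(System.getProperty("user.dir"));
    // Hypothetical command name, assumed not to exist on this machine.
    System.out.println(tryRun("no-such-build-tool-xyz", existingDir));
  }
}
```

The printed text starts with `Cannot run program "no-such-build-tool-xyz" (in directory ...` even though the directory exists, which is the same shape as the Maven error quoted below.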

On Thu, Jul 1, 2021 at 10:24 AM Christine Buss  wrote:
>
> the directory does exist:
>
>
> accumulo/server/native/target/accumulo-native-2.1.0-SNAPSHOT/accumulo-native-2.1.0-SNAPSHOT$ ls
> javah  LICENSE  Makefile  nativeMap  NOTICE  testNativeMap
>
> So why can the program "make" not be run?
> What file or directory is missing?
>
>
> Hello,
> I am trying to learn how to contribute.
> However I cloned and forked the accumulo repository.
> When I run the command:
>   mvn clean verify -DskipITs
> I am getting this Error:
>
> [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:3.0.0:exec 
> (test-native-libs) on project accumulo-native: Command execution failed.: 
> Cannot run program "make" (in directory 
> "/home/christine/accumulo/server/native/target/accumulo-native-2.1.0-SNAPSHOT/accumulo-native-2.1.0-SNAPSHOT"):
>  error=2, No such file or directory -> [Help 1]
>
> I can't find the reason. How can I solve this?
> Thanks so much in advance for any and all help.
>
>
> I am using Java 11
> >java -version
> openjdk version "11.0.11" 2021-04-20
> OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.20.04)
> OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2.20.04, mixed mode, 
> sharing)
>

Broken Eclipse 2021-06

2021-06-23 Thread Christopher

Just a heads-up, if you use Eclipse to do Accumulo development, you'll
probably want to skip the June release, Eclipse 2021-06.
It doesn't work with Accumulo for some reason. I filed a bug at:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=574425

Re: thrift incompatibility

2021-05-04 Thread Christopher

Hi Junwen,

It looks like the issues that your checker found were only problems in some
old alpha releases, and were already fixed by the time we created a final
release.


On Tue, May 4, 2021, 02:15 junwen yang  wrote:

> Dear all,
>
>
> Regarding the issue caused by the incompatibility of thrift message such
> as IMPALA-8243, we
> have created a static checker which keeps track of the thrift file change,
> and detects potential incompatibility:
>
>1. Add/delete required field.  The thrift guidelines suggest *Any new
>fields that you add should be optional*.
>2. The tag number of a field has been changed. Also, the thrift
>guidelines suggest *Don’t change the numeric tags for any existing
>fields*.
>3. A  required field has been changed to optional, or an optional
>field has been changed to required. According to the guidelines, *Required
>Is Forever** You should be very careful about marking fields as
>required. If at some point you wish to stop writing or sending a required
>field, it will be problematic to change the field to an optional field —
>old readers will consider messages without this field to be incomplete and
>may reject or drop them unintentionally. You should consider writing
>application-specific custom validation routines for your buffers instead.
>Some have come to the conclusion that using required does more harm than
>good; they prefer to use only optional. However, this view is not
>universal.*
>
> We have applied our checker on the frequently maintained Accumulo
> versions: 1.7.0, rel/1.10.0, rel/2.0.0, rel/2.0.0-alpha-1,
> rel/2.0.0-alpha-2, rel/2.0.1, main, we found more than 10 problems as
> attached.
>
> The results reported by our checker got confirmed by developers of HBASE
> and our checker is requested by them, which can be found at HBASE-25340.
>
>
> Best,
>
> Junwen
>
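
To make the three rules in the quoted message concrete, here is a minimal sketch of that kind of comparison between two versions of a struct. It is not Junwen's checker and does not parse real .thrift files. The `Field` class, the field names, and the choice to match fields by name are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ThriftFieldCompat {

  // Minimal description of one struct field: numeric tag, name, and requiredness.
  static final class Field {
    final int tag;
    final String name;
    final boolean required;

    Field(int tag, String name, boolean required) {
      this.tag = tag;
      this.name = name;
      this.required = required;
    }
  }

  private static Map<String, Field> byName(List<Field> fields) {
    Map<String, Field> map = new TreeMap<>();
    for (Field f : fields) {
      map.put(f.name, f);
    }
    return map;
  }

  // Compare two versions of a struct and report the three kinds of change
  // described in the quoted message.
  static List<String> check(List<Field> oldFields, List<Field> newFields) {
    Map<String, Field> before = byName(oldFields);
    Map<String, Field> after = byName(newFields);
    List<String> problems = new ArrayList<>();

    for (Field o : before.values()) {
      Field n = after.get(o.name);
      if (n == null) {
        if (o.required) {
          problems.add("required field deleted: " + o.name);
        }
        continue;
      }
      if (n.tag != o.tag) {
        problems.add("tag changed: " + o.name);
      }
      if (n.required != o.required) {
        problems.add("requiredness changed: " + o.name);
      }
    }
    for (Field n : after.values()) {
      if (n.required && !before.containsKey(n.name)) {
        problems.add("required field added: " + n.name);
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    // Hypothetical struct versions, for illustration only.
    List<Field> v1 = List.of(new Field(1, "row", true), new Field(2, "ttl", false));
    List<Field> v2 = List.of(new Field(1, "row", false), new Field(3, "ttl", false),
        new Field(4, "owner", true));
    System.out.println(check(v1, v2));
  }
}
```

Matching by name is a simplification: a renamed field would be reported as a deletion plus an addition. As noted in the reply above, the findings for Accumulo only concerned old alpha releases.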

Re: Quarterly Community Report due 4/16.

2021-04-14 Thread Christopher

Note: removed user@ list and added dev@ list. Preparing the board
report is a PMC responsibility, and we decided to prepare it publicly
on the dev mailing list, but it's probably not something most users
care about, and we should avoid spamming the user list with PMC
business, so that way users subscribed to that list for announcements,
bugs, and peer help won't feel the need to unsubscribe due to
unrelated activity.

The report looks fine overall to me. Two notes:

* If we have a link to the hackathon, that might be good to include,
in case the board wants more details on that.

* The phrase "The mission of Apache X is the creation and maintenance
of software related to" that prefaces the project description is
probably a bit redundant and unnecessary, since it applies to every
ASF project, and delays the point. Have we had such a prefacing phrase
before? It might be good to try to get directly to the point in the
description, since the board has to look at dozens of these each
month.

On Wed, Apr 14, 2021 at 5:10 PM Ed Coleman  wrote:
>
> The Accumulo community has agreed to draft the quarterly reports using the 
> mailing list.
>
> Sorry for the late notice on this - I had a vague notion that this was 
> upcoming, but didn't realize until the first email today from Apache that it 
> was due so soon.  Please, if you have any suggestions on Project Activity or 
> Community Health - I find writing those sections particularly difficult. I go 
> from things are fine to writing release notes and have trouble striking the 
> right balance or even if I've included the relevant info.
>
> I'll incorporate comments as received and will submit on Friday to meet the 
> deadline.
>
> Thanks.
>
> Ed Coleman
>
>  begin report 
>
> ## Description:
> The mission of Apache Accumulo is the creation and maintenance of software
> related to a robust, scalable, distributed key/value store with cell-based
> access control and customizable server-side processing.
>
> ## Issues:
> There are no new issues requiring board attention.
>
> The trademark issue with http://www.accumulodata.com is still open until the
> domain expires on 2021-06-28.  No action has been required and allowing the
> domain to expire was deemed a viable option by Brand Management VP to minimize
> volunteer efforts.
>
> ## Membership Data:
> Apache Accumulo was founded 2012-03-21 (9 years ago)
> There are currently 39 committers and 39 PMC members in this project.
> The Committer-to-PMC ratio is 1:1.
>
> Community changes, past quarter:
> - Karthick Narendran was added to the PMC on 2021-01-22
> - Karthick Narendran was added as committer on 2021-01-22
>
> ## Project Activity:
> No new releases this reporting period. Last release dates:
> - accumulo-2.0.1 was released on 2020-12-24.
> - accumulo-1.10.1 was released on 2020-12-22.
>
> Project activity on the next release remains active with significant
> improvements to the current baseline. The remaining issues are being actively
> worked.  Highlights of changes this reporting period:
> - Replacement of problematic process names with more inclusive terms.
> - Formalization of public API. Previously, certain internal classes were
>   necessary for users for some functions (iterators) - the interface has been
>   extracted and formally declared as part of the Accumulo public API.
> - Internal improvements handling threads and exceptions to improve reliability
> - Accumulo community agreed to participate in an upcoming OSS virtual
>   hackathon (April 29-30). The goal of the hackathon is to kick-start
>   involvement in the OSS community, foster an environment for contributions,
>   and increase the diversity of the OSS communities.
>
> ## Community Health:
> Overall community health is good and activity remains consistent.  Decreases
> in Jira (-65%) and dev mailing list activity (-53%) reflect the community
> transition to using GitHub as a focal point for development.  This is
> reflected in the increased activity for PRs and GitHub issues.
> - contributions from 14 individuals reflecting continued community involvement
>   and consistent participation.
>
>

Re: Accumulo Slack: I'd like to join

2021-03-01 Thread Christopher

An invitation was sent to your email address.

On Mon, Mar 1, 2021 at 2:52 PM wschultz  wrote:
>
> Hi,
>
> I would like to join ASF slack.
>
> - Walter

Git log formatting recommendations/tips

2021-02-25 Thread Christopher

Hi Accumulo Devs!

Every once in a while, I share this link with people (I think I have
shared it on this list before as well), with helpful tips on writing
good git commit log messages. I'm sharing it again today because we
have had new contributors in the last year, and some may find this useful:

https://chris.beams.io/posts/git-commit/

There are a few key points in this:

1. Separate subject from body with a blank line
2. Limit the subject line to 50 characters
3. Capitalize the subject line
4. Do not end the subject line with a period
5. Use the imperative mood in the subject line
6. Wrap the body at 72 characters
7. Use the body to explain what and why vs. how

There's a complete example in the link for reference.
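
For anyone who wants to check a message mechanically, rules 1, 2, 3, 4, and 6 are easy to test in code (rules 5 and 7 need human judgment). A minimal sketch, not an existing tool. The class name, method name, and sample messages are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class CommitMessageCheck {

  // Return a list of violations of the mechanical rules (1, 2, 3, 4, and 6).
  static List<String> check(String message) {
    List<String> problems = new ArrayList<>();
    String[] lines = message.split("\n", -1);
    String subject = lines[0];

    if (lines.length > 1 && !lines[1].isEmpty()) {
      problems.add("separate subject from body with a blank line");
    }
    if (subject.length() > 50) {
      problems.add("limit the subject line to 50 characters");
    }
    if (!subject.isEmpty() && !Character.isUpperCase(subject.charAt(0))) {
      problems.add("capitalize the subject line");
    }
    if (subject.endsWith(".")) {
      problems.add("do not end the subject line with a period");
    }
    for (int i = 1; i < lines.length; i++) {
      if (lines[i].length() > 72) {
        problems.add("wrap the body at 72 characters (line " + (i + 1) + ")");
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    // Hypothetical messages, used only to exercise the checks.
    String good = "Fix flaky native map test\n\nExplain what changed and why here.";
    String bad = "fixes the test.\nno blank line before this body";
    System.out.println(check(good));
    System.out.println(check(bad));
  }
}
```

The first message passes all five checks. The second fails three of them: no blank line after the subject, a lowercase subject, and a trailing period.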

*

In addition, I've made some observations of my own regarding Accumulo
practices that I think we should strive to improve upon (many of which
I'm guilty of):

* We should avoid "Closes #N" or "Fixes #N" or "Close #N" or "Fix #N"
or other closing keywords in the subject line of the commit. This
saves valuable space that could be used to summarize the change in the
subject line. Instead, place those in the body of the message
somewhere. The example in the article is to place them at the end of
the body, but I think they can often be helpful inline inside the body
instead.

* Use "Fix" in the subject line, not "Fixes" (see imperative mood
recommendation above).

* GitHub has a convention of appending the PR number at the end of the
subject when you merge from the web interface, in parentheses, as in:
`Fix the Test (#)`. This is a nice convention. If you have to
merge a PR manually from the CLI (like to resolve merge conflicts or
to cherry-pick to another branch), it's good to adopt the same
convention if you're closing a PR.

* Use markdown syntax in the body where appropriate, including bullet
points, paragraphs, and backticks for code (especially use backticks
if you reference something with an `@` in it, to avoid tagging a user
by that name on GitHub. Like, `@Test` or `@Deprecated`)

* Use the "Squash and merge" option in GitHub's interface under most
circumstances. This keeps the git history nice and clear. If we need
to reference individual changes that occurred on a PR during the
review process, we can go back to the PR itself to look at those, but
this is almost never necessary. Sometimes, "Create a merge commit" is
needed, to preserve separate commits for independent changes that
happened in the same PR, but these are rare. (Two examples I can think
of are if we update the formatter config or thrift config, and then
want to have a separate commit for the resulting formatting/generated
code changes, but want to put them in a single PR.)

* When using GitHub to merge (and really, even when using the CLI),
always review the commit log messages and make any final changes
before submitting. This is your last chance to edit and/or polish, and
it's an opportunity to remove intermediate messages from the log like
"fix typo" or "fix test" or "address code review comments" that don't
add value to the overall message, and to include changes you may have
forgotten, like co-author acknowledgements, or to correct any deviation
between the commit messages as first authored and the final
result. When you merge, the commit message should reflect what the
change actually does in its final form, after all code reviews and
updates to the PR, not your first draft and the steps it took you to
get there.

Anyway, I hope these tips help somebody.

Thanks,
Christopher

Re: Which String deduplication option?

2021-02-10 Thread Christopher

Yeah, I saw that, and replied to that point in my PR at
https://github.com/apache/accumulo/pull/1920#issuecomment-776795695

On Wed, Feb 10, 2021 at 12:16 PM David  wrote:
>
> Hey,
>
> For what it's worth, the Guava team added that bit about "String.intern()
> has some well-known performance limitations, and should generally be
> avoided" on Oct 13, 2020.
>
> Thanks.
>
> On Mon, Feb 8, 2021 at 5:06 PM Christopher  wrote:
>
> > Guava's argument in the linked comment appears to be based on
> > pre-Java 8 behavior, when the PermGen space still had a fixed size and
> > had not yet been consolidated with the main heap.
> >
> > In response to the other observations: a stress test here seems
> > particularly difficult to compare between String.intern and G1GC
> > deduplication, since the latter will deduplicate across the JVM, and
> > not just the TabletLocator stuff. So, we wouldn't be able to get a
> > good direct comparison between the overall impact between the two.
> >
> > As for running a stress test between WeakHashMap and String.intern,
> > just for TabletLocator, I'm not going to bother because others have
> > already done that work and determined them to be similar in
> > performance, given an adequate string table size (one is at
> > http://java-performance.info/string-intern-in-java-6-7-8/)
> >
> > So, what I will do is create a PR to replace the non-tunable
> > WeakHashMap with the tunable String.intern for TabletLocator, on the
> > basis that they have comparable performance, but the latter is
> > user-tunable if they need to, uses less memory, and involves less
> > code. I will not do anything with the G1GC settings, leaving that up
> > to users to experiment with and tune on their own, if they wish.
> >
> > On Mon, Feb 8, 2021 at 4:31 PM David  wrote:
> > >
> > > Guava argues for the use of a weak hashmap.
> > >
> > >
> > > https://github.com/google/guava/blob/master/guava/src/com/google/common/collect/Interner.java#L28-L30
> > >
> > > On Mon, Feb 8, 2021 at 3:57 PM Brian Loss  wrote:
> > >
> > > > It might make sense to do both approaches.
> > > >
> > > > It seems there are limits to when -XX:+UseStringDeduplication takes
> > > > effect. By default, it only interns objects that have survived 3 GC
> > > > cycles, although that number can be changed. If the objects in
> > > > question are short-lived, then it wouldn’t make sense to call
> > > > String.intern on them either. However, if we know based on the usage
> > > > pattern that we’d get a lot of deduplication on long-lived strings,
> > > > then String.intern is better because it will happen right away and
> > > > will also save more memory since the String object itself is de-duped
> > > > (vs just the internal char array for the automatic de-duplication).
> > > > It wasn’t completely clear from my reading, but if I understood
> > > > correctly the other potential downside to UseStringDeduplication is
> > > > that it happens after GC if there’s time. On a heavily loaded system
> > > > that doesn’t have time left in the pause time goal window after
> > > > completing GC, the string de-duplication might not happen at all.
> > > >
> > > > Adding -XX:+UseStringDeduplication wouldn’t hurt and could potentially
> > > > provide some benefit, so I’d be in favor of adding it. For
> > > > TabletLocator specifically, if we know that’s an area where string
> > > > de-duplication will help, then we should probably use String.intern
> > > > there. As Keith suggested, a stress test might help determine whether
> > > > it makes sense. In the absence of that, if we assume the previous
> > > > WeakHashMap was there to solve a specific problem (vs an uninformed
> > > > attempt to save memory) then String.intern sounds to me like the way
> > > > to go as well.
> > > >
> > > > > On Feb 8, 2021, at 3:34 PM, Dave Marion  wrote:
> > > > >
> > > > > > String.intern() would seem to provide better coverage considering
> > > > > > that some users may not use the G1 collector.
> > > > > >
> > > > > > On Mon, Feb 8, 2021 at 3:19 PM Keith Turner  wrote:
> > > > >
> > > > >> Recently while running some large map reduce jobs I learned that
> > > > >> Hadoop uses String.intern() in

Re: Which String deduplication option?

2021-02-08 Thread Christopher

Guava's argument in the linked comment appears to be based on
pre-Java 8 behavior, when the PermGen space still had a fixed size and
had not yet been consolidated with the main heap.

In response to the other observations: a stress test here seems
particularly difficult to compare between String.intern and G1GC
deduplication, since the latter will deduplicate across the JVM, and
not just the TabletLocator stuff. So, we wouldn't be able to get a
good direct comparison between the overall impact between the two.

As for running a stress test between WeakHashMap and String.intern,
just for TabletLocator, I'm not going to bother because others have
already done that work and determined them to be similar in
performance, given an adequate string table size (one is at
http://java-performance.info/string-intern-in-java-6-7-8/)

So, what I will do is create a PR to replace the non-tunable
WeakHashMap with the tunable String.intern for TabletLocator, on the
basis that they have comparable performance, but the latter is
user-tunable if they need to, uses less memory, and involves less
code. I will not do anything with the G1GC settings, leaving that up
to users to experiment with and tune on their own, if they wish.

On Mon, Feb 8, 2021 at 4:31 PM David  wrote:
>
> Guava argues for the use of a weak hashmap.
>
> https://github.com/google/guava/blob/master/guava/src/com/google/common/collect/Interner.java#L28-L30
>
> On Mon, Feb 8, 2021 at 3:57 PM Brian Loss  wrote:
>
> > It might make sense to do both approaches.
> >
> > It seems there are limits to when -XX:+UseStringDeduplication takes
> > effect. By default, it only interns objects that have survived 3 GC cycles,
> > although that number can be changed. If the objects in question are
> > short-lived, then it wouldn’t make sense to call String.intern on them
> > either. However, if we know based on the usage pattern that we’d get a lot
> > of deduplication on long-lived strings, then String.intern is better
> > because it will happen right away and will also save more memory since the
> > String object itself is de-duped (vs just the internal char array for the
> > automatic de-duplication). It wasn’t completely clear from my reading, but
> > if I understood correctly the other potential downside to
> > UseStringDeduplication is that it happens after GC if there’s time. On a
> > heavily loaded system that doesn’t have time left in the pause time goal window
> > after completing GC, the string de-duplication might not happen at all.
> >
> > Adding -XX:+UseStringDeduplication wouldn’t hurt and could potentially
> > provide some benefit, so I’d be in favor of adding it. For TabletLocator
> > specifically, if we know that’s an area where string de-duplication will
> > help, then we should probably use String.intern there. As Keith suggested,
> > a stress test might help determine whether it makes sense. In the absence
> > of that, if we assume the previous WeakHashMap was there to solve a
> > specific problem (vs an uninformed attempt to save memory) then
> > String.intern sounds to me like the way to go as well.
> >
> > > On Feb 8, 2021, at 3:34 PM, Dave Marion  wrote:
> > >
> > > > String.intern() would seem to provide better coverage considering that
> > > > some users may not use the G1 collector.
> > >
> > > On Mon, Feb 8, 2021 at 3:19 PM Keith Turner  wrote:
> > >
> > >> Recently while running some large map reduce jobs I learned that
> > >> Hadoop uses String.intern() in its RPC code (below is a link to an
> > >> example on one place where Hadoop does this).  I learned this because
> > >> when I ran jstack on NN, RM, and/or AM that were under distress
> > >> sometimes I kept seeing RPC server threads that were in
> > >> String.intern().  I never was quite sure if it was a problem though.
> > >> Not saying String.intern() is bad or good, just sharing something I
> > >> observed that I was uncertain about.
> > >>
> > >> May make sense to create some sort of stress test that could simulate
> > >> the usage pattern of the TabletLocator and try the different options
> > >> and see what happens.  If any long pauses or problems happen in the
> > >> simulation, they may happen in the real environment.
> > >>
> > >>
> > >>
> > > >> https://github.com/apache/hadoop/blob/ba631c436b806728f8ec2f54ab1e289526c90579/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/TaskStatus.java#L481
> > >>
> > >>
> > https://github.com/apache/hadoop/blob/ba631c436b806728f8ec2f54ab1e2895

Which String deduplication option?

2021-02-01 Thread Christopher

While code reviewing, I saw that
core/src/main/java/org/apache/accumulo/core/clientImpl/TabletLocator.java
was using a WeakHashMap to deduplicate some strings.

This code can probably be removed in favor of one of the following two options:

1. Just explicitly use String.intern() - As of Java 7, interned strings
are no longer stored in the separate, fixed-size PermGen space (which
was removed entirely in Java 8), so intern'd strings will be in the
main heap, no longer constrained to a limited size pool. These strings
are still subject to garbage collection. The string table is
implemented as a hash table internally (native implementation), with a
default bucket count of more than 60K, plenty big enough for the
interning that TabletLocator is doing... but this is configurable by
the user with JVM flags if it's not. Interning will use less memory
than WeakHashMap and have similar performance, as long as the bucket
count is big enough.

2. Just use -XX:+UseStringDeduplication JVM flag - as of Java 9, G1 is
the new default Java garbage collector. This garbage collector has the
option to automatically attempt to deduplicate all strings behind the
scenes, by swapping out their underlying char arrays (so, it likely
won't affect == equality because the String object references
themselves won't change, unlike option 1). This is more passive than
option 1, but would apply to the entire JVM. G1GC also implements some
heuristics to prevent too much overhead.

With both options, it's possible to output statistics.

If I remove the WeakHashMap for the string deduplication in
TabletLocator, does anybody have an opinion on which option I should
replace it with? I'm leaning towards option 2 (adding it to
assemble/conf/accumulo-env.sh as one of the default flags).

New committer/PMC member: Karthick Narendran

2021-01-22 Thread Christopher

The Project Management Committee (PMC) for Apache Accumulo
has invited Karthick Narendran to become a committer and PMC
member and we are pleased to announce that they have accepted.

Karthick has contributed several bug fixes and quality improvements
to Accumulo, has written blog posts for our website, has been very
helpful in engaging users with questions on various issues, and
has helped in testing and providing feedback on release candidates.
We are happy to have them contributing to the Accumulo
community!

Being a committer enables easier contribution to the
project since there is no need to go via the patch
submission process. This should enable better productivity.
Being a PMC member enables them to help manage
and guide the direction of the project.

Welcome Karthick!

Re: [DRAFT][REPORT] Apache Accumulo - Jan 2021

2021-01-07 Thread Christopher

The report looks good to me, but I don't think we should use the GitHub PR
mechanism to draft these. I provided some of my reasoning on the PR, but
basically:

1. I don't think it adds value to be archived to the website, since the
reports are already canonically archived elsewhere in ASF's infrastructure,
and these have little to no value to website visitors. It's strange to use
the GitHub PR mechanism to draft something that isn't targeted to be
published on the website, and I don't think these should be published to
the website, as that would be redundant, confusing, and potentially more
work to maintain (keeping sync'd to mailing list).

2. Drafting them only on the PR excludes people who follow the mailing list
but not GitHub issues, and while that's probably okay for day-to-day
activity, it's probably not okay for discussion points that stand apart
from bugs/code issues. The alternative is to do what you did here and post
in both places, but that makes it harder to follow, because now everybody
needs to follow a discussion in two places instead of one. It's also hard
to have an interactive conversation about anything we might be discussing,
because you have to merge the chronology of the activity of both places in
order to follow the discussion.

On Thu, Jan 7, 2021 at 5:21 AM Ed Coleman  wrote:

> The Apache Accumulo PMC decided to draft its quarterly board reports on
> the dev list. Here is a draft of our report which is due Wednesday, Jan 13,
> 1 week before the board meeting on Wednesday, Jan 20.
>
> To facilitate collaboration, this report is also a draft PR at
> https://github.com/apache/accumulo-website/pull/256.
>
> Please let me know if you have any feedback and your thoughts on using PRs
> as an approach for drafting these reports.
>
> Some more detailed metrics are at
> https://reporter.apache.org/wizard/statistics?accumulo, which appears to
> require a committer login.
>
> Ed Coleman
>
> 
> [REPORT] Apache Accumulo - January 2021
>
> ## Description:
>
> The Apache Accumulo sorted, distributed key/value store is a robust,
> scalable, high performance data storage system that features cell-based
> access control and customizable server-side processing. It is based on
> Google's BigTable design and is built on top of Apache Hadoop, ZooKeeper,
> and Thrift.
>
> ## Issues:
>
> No change since last report, Oct 2020. Still waiting for the owner of
> http://www.accumulodata.com to work with Amazon to reactivate the account
> used for hosting, so it can be pointed to https://accumulo.apache.org or
> shut down.
> Initial emails are at [1].
>
> ## Membership Data:
>
> Apache Accumulo was founded 2012-03-20 (9 years ago)
> There are currently 38 committers and 38 PMC members in this project.
> The Committer-to-PMC ratio is 1:1.
>
> Community changes, past quarter:
> - Jeffrey Manno was added to the PMC on 2020-11-01
> - Jeffrey Manno was added as committer on 2020-11-02
>
> ## Project Release Activity
>
> - accumulo-1.10.1 was released on 2020-12-22 [2]
> - accumulo-2.0.1 was released on 2020-12-24. [3]
>
> ## Project Activity:
>
> - [CVE-2020-17533](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-17533),
>   in which authenticated users could perform certain administrative
>   operations without having the appropriate permissions, was reported on
>   2020-12-09 and resolved with the accumulo-2.0.1 (2020-12-24) and
>   accumulo-1.10.1 (2020-12-22) releases.
> - GitHub activity summary over the past quarter (as of 2021-01-07)
>   - 48 GitHub issues created / 44 issues closed.
>   - 120 GitHub PRs opened / 97 PRs closed.
>   - 145 commits from 18 committers.
>
> ## Community Health:
>
> - Mailing list participation and GitHub issues are consistent.
> - 3 new contributors:
>   - Szabolcs Bukros [Cloudera](https://www.cloudera.com/)
>   - Seth Falco [Elypia](https://elypia.org/en-US/)
>   - Dominic Garguilo [Arctic Slope Regional Corp](https://www.asrc.com/)
>
> [1]:
> https://lists.apache.org/thread.html/514d3cf9162e72f4aa13be1db5d6685999fc83755695308a529de4d6@%3Cprivate.accumulo.apache.org%3E
> [2]:
> https://lists.apache.org/thread.html/r947a56c98d0a8e009fa93df3b19e93761bfea8b236f30fb0c21b1992%40%3Cuser.accumulo.apache.org%3E
> [3]:
> https://lists.apache.org/thread.html/r38b0920499c9c88de282ca783debb9fbb8dc8ed88f5fc0ad9981bf97%40%3Cuser.accumulo.apache.org%3E
>
>

Re: [DRAFT][ANNOUNCE] Apache Accumulo 2.0.1

2020-12-28 Thread Christopher

We avoided a specific recommendation on 2.0.0's release notes (LTM hadn't
been ironed out by then). In general, it's probably not a good idea to
provide recommendations based on potential future actions (because the
future is uncertain), but if we do add something like that for this, I
think it'd probably be better to include it on the linked release notes
(which we can update as the future becomes clearer), rather than in the
announcement (where it is immutable in the mailing list).

On Mon, Dec 28, 2020 at 10:36 AM Mike Miller  wrote:

> Looks good as is.  Was just wondering if we should mention that users of
> 1.10 may want to wait until the 2.x LTM release line?  Say that 2.1 is
> likely to be LTM?
>
> On Sun, Dec 27, 2020 at 11:48 PM Christopher  wrote:
>
> > Karthick, tweeting is part of the checklist. Somebody typically does it
> > after these announcements go out via email first.
> >
> > On Sun, Dec 27, 2020, 17:19 karthick rn 
> > wrote:
> >
> > > Hi Billie,
> > > The draft looks good.
> > > Do we tweet for these releases?
> > >
> > > Thanks,
> > > Karthick
> > >
> > >
> > > On Sat, 26 Dec 2020 at 18:49, Billie Rinaldi 
> wrote:
> > >
> > > > The following is a DRAFT announcement for the 2.0.1 release. Please
> > > review
> > > > and provide feedback. I intend to publish this announcement on Dec
> 28.
> > > > *
> > > >
> > > > The Apache Accumulo project is pleased to announce the release
> > > > of Apache Accumulo 2.0.1! Apache Accumulo 2.0.1 contains critical
> > > > bug fixes for 2.0.0. (See the release notes linked below for
> details.)
> > > >
> > > > Since 2.0 is a non-LTM release line, and since an LTM release line
> > > > has not yet been made available for 2.x, this patch backports
> important
> > > > bug fixes to 2.0 that could affect any existing 2.0.0 users. Users
> that
> > > > have already migrated to 2.0.0 are urged to upgrade to 2.0.1 as soon
> > > > as possible, and users of 1.10 who wish to upgrade to 2.0 should
> > > > upgrade directly to 2.0.1, bypassing 2.0.0.
> > > >
> > > > ***
> > > >
> > > > Apache Accumulo® is a sorted, distributed key/value store that
> > > > provides robust, scalable data storage and retrieval. With
> > > > Apache Accumulo, users can store and manage large data sets
> > > > across a cluster. Accumulo uses Apache Hadoop's HDFS to store
> > > > its data and Apache ZooKeeper for consensus.
> > > >
> > > > This version is now available in Maven Central, and at:
> > > > https://accumulo.apache.org/downloads/
> > > >
> > > > The full release notes can be viewed at:
> > > > https://accumulo.apache.org/release/accumulo-2.0.1/
> > > >
> > > > --
> > > > The Apache Accumulo Team
> > > >
> > >
> >
>

Re: [DRAFT][ANNOUNCE] Apache Accumulo 2.0.1

2020-12-27 Thread Christopher

Karthick, tweeting is part of the checklist. Somebody typically does it
after these announcements go out via email first.

On Sun, Dec 27, 2020, 17:19 karthick rn 
wrote:

> Hi Billie,
> The draft looks good.
> Do we tweet for these releases?
>
> Thanks,
> Karthick
>
>
> On Sat, 26 Dec 2020 at 18:49, Billie Rinaldi  wrote:
>
> > The following is a DRAFT announcement for the 2.0.1 release. Please
> review
> > and provide feedback. I intend to publish this announcement on Dec 28.
> > *
> >
> > The Apache Accumulo project is pleased to announce the release
> > of Apache Accumulo 2.0.1! Apache Accumulo 2.0.1 contains critical
> > bug fixes for 2.0.0. (See the release notes linked below for details.)
> >
> > Since 2.0 is a non-LTM release line, and since an LTM release line
> > has not yet been made available for 2.x, this patch backports important
> > bug fixes to 2.0 that could affect any existing 2.0.0 users. Users that
> > have already migrated to 2.0.0 are urged to upgrade to 2.0.1 as soon
> > as possible, and users of 1.10 who wish to upgrade to 2.0 should
> > upgrade directly to 2.0.1, bypassing 2.0.0.
> >
> > ***
> >
> > Apache Accumulo® is a sorted, distributed key/value store that
> > provides robust, scalable data storage and retrieval. With
> > Apache Accumulo, users can store and manage large data sets
> > across a cluster. Accumulo uses Apache Hadoop's HDFS to store
> > its data and Apache ZooKeeper for consensus.
> >
> > This version is now available in Maven Central, and at:
> > https://accumulo.apache.org/downloads/
> >
> > The full release notes can be viewed at:
> > https://accumulo.apache.org/release/accumulo-2.0.1/
> >
> > --
> > The Apache Accumulo Team
> >
>

Re: [DRAFT][ANNOUNCE] Apache Accumulo 2.0.1

2020-12-26 Thread Christopher

LGTM

On Sat, Dec 26, 2020, 13:49 Billie Rinaldi  wrote:

> The following is a DRAFT announcement for the 2.0.1 release. Please review
> and provide feedback. I intend to publish this announcement on Dec 28.
> *
>
> The Apache Accumulo project is pleased to announce the release
> of Apache Accumulo 2.0.1! Apache Accumulo 2.0.1 contains critical
> bug fixes for 2.0.0. (See the release notes linked below for details.)
>
> Since 2.0 is a non-LTM release line, and since an LTM release line
> has not yet been made available for 2.x, this patch backports important
> bug fixes to 2.0 that could affect any existing 2.0.0 users. Users that
> have already migrated to 2.0.0 are urged to upgrade to 2.0.1 as soon
> as possible, and users of 1.10 who wish to upgrade to 2.0 should
> upgrade directly to 2.0.1, bypassing 2.0.0.
>
> ***
>
> Apache Accumulo® is a sorted, distributed key/value store that
> provides robust, scalable data storage and retrieval. With
> Apache Accumulo, users can store and manage large data sets
> across a cluster. Accumulo uses Apache Hadoop's HDFS to store
> its data and Apache ZooKeeper for consensus.
>
> This version is now available in Maven Central, and at:
> https://accumulo.apache.org/downloads/
>
> The full release notes can be viewed at:
> https://accumulo.apache.org/release/accumulo-2.0.1/
>
> --
> The Apache Accumulo Team
>

Re: [DRAFT][ANNOUNCE] Apache Accumulo 1.10.1

2020-12-26 Thread Christopher

LGTM

On Sat, Dec 26, 2020, 13:43 Billie Rinaldi  wrote:

> The following is a DRAFT announcement for the 1.10.1 release. Please review
> and provide feedback. I intend to publish this announcement on Dec 28.
> *
>
> The Apache Accumulo project is pleased to announce the release
> of Apache Accumulo 1.10.1! Apache Accumulo 1.10.1 is a bug fix
> release of the 1.10 LTM release line. (See the release notes linked
> below for details.)
>
> Users of 1.10.0 or earlier are urged to upgrade to 1.10.1 as soon as
> possible, as this is a continuation of the 1.10 LTM release line with
> important bug fixes. Users are also encouraged to consider migrating
> to a 2.x version when one that is suitable for their needs becomes
> available.
>
> ***
>
> Apache Accumulo® is a sorted, distributed key/value store that
> provides robust, scalable data storage and retrieval. With
> Apache Accumulo, users can store and manage large data sets
> across a cluster. Accumulo uses Apache Hadoop's HDFS to store
> its data and Apache ZooKeeper for consensus.
>
> This version is now available in Maven Central, and at:
> https://accumulo.apache.org/downloads/
>
> The full release notes can be viewed at:
> https://accumulo.apache.org/release/accumulo-1.10.1/
>
> --
> The Apache Accumulo Team
>

[RESULT][VOTE] Apache Accumulo 2.0.1-rc1

2020-12-24 Thread Christopher

This vote passes with:
5 +1s (binding): Jeffrey, Billie, Christopher, Mike Miller, Ed
1 +1 (non-binding): Karthick
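
For anyone who wants to double-check a tally like the one above, here is a
minimal sketch in Python. It assumes the usual ASF release-vote rule as I
understand it (at least three binding +1s, and more binding +1s than binding
-1s). The `Vote` record and the `tally` function are hypothetical names made
up for this illustration, not part of any Accumulo or ASF tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One vote cast on a release candidate (hypothetical record type)."""
    voter: str
    value: int      # +1, 0, or -1
    binding: bool   # True for PMC members


def tally(votes):
    """Count votes and apply the assumed release-vote rule."""
    binding_plus = sum(1 for v in votes if v.binding and v.value > 0)
    binding_minus = sum(1 for v in votes if v.binding and v.value < 0)
    non_binding_plus = sum(1 for v in votes if not v.binding and v.value > 0)
    # Assumed rule: >= 3 binding +1s and more binding +1s than binding -1s.
    passes = binding_plus >= 3 and binding_plus > binding_minus
    return {
        "binding_plus": binding_plus,
        "binding_minus": binding_minus,
        "non_binding_plus": non_binding_plus,
        "passes": passes,
    }


# The voters named in this result email.
votes = [
    Vote("Jeffrey", +1, True),
    Vote("Billie", +1, True),
    Vote("Christopher", +1, True),
    Vote("Mike Miller", +1, True),
    Vote("Ed", +1, True),
    Vote("Karthick", +1, False),
]

if __name__ == "__main__":
    print(tally(votes))
```

Counting from the list of names rather than typing the totals by hand avoids
a mismatch between the number and the names that follow it.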

The post-vote release checklist is being tracked at
https://github.com/apache/accumulo/issues/1846

On Thu, Dec 24, 2020 at 5:12 PM  wrote:

> +1
>
> * Verified commit hashes
> * Verified signatures
> * Passed sunny day ITs.
>
> -----Original Message-----
> From: Christopher 
> Sent: Monday, December 21, 2020 1:39 PM
> To: accumulo-dev 
> Subject: [VOTE] Apache Accumulo 2.0.1-rc1
>
> Accumulo Developers,
>
> Please consider the following candidate for Apache Accumulo 2.0.1.
>
> Git Commit:
> 76247b1739dd3042cb2d959a7a99f0cf1bcb1324
> Branch:
> 2.0.1-rc1
>
> If this vote passes, a gpg-signed tag will be created using:
> git tag -f -s -m 'Apache Accumulo 2.0.1' rel/2.0.1 \
> 76247b1739dd3042cb2d959a7a99f0cf1bcb1324
>
> Staging repo:
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1091
> Source (official release artifact):
>
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1091/org/apache/accumulo/accumulo/2.0.1/accumulo-2.0.1-src.tar.gz
> Binary:
>
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1091/org/apache/accumulo/accumulo/2.0.1/accumulo-2.0.1-bin.tar.gz
>
> Append ".asc" to download the cryptographic signature for a given artifact.
> (You can also append ".sha1" or ".md5" instead in order to verify the
> checksums generated by Maven to verify the integrity of the Nexus
> repository staging
> area.)
>
> Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
> (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
>
> In addition to the tarballs and their signatures, the following checksum
> files will be added to the dist/release SVN area after release:
> accumulo-2.0.1-src.tar.gz.sha512 will contain:
> SHA512 (accumulo-2.0.1-src.tar.gz) =
>
> 4dbd765a234b87c46be6f92100c928e8402b2b03e707e666ec83531ab096de3871140b17e9ff1b1bce79c864771c9e908a79637c3be88f210f6f09a5e48b75fd
> accumulo-2.0.1-bin.tar.gz.sha512 will contain:
> SHA512 (accumulo-2.0.1-bin.tar.gz) =
>
> b443839443a9f5098b55bc5c54be10c11fedbaea554ee6cd42eaa9311068c70bd611d7fc67698c91ec73da0e85b9907aa72b98d5eb4d49ea3a5d51b0c6c5785f
>
> Release notes (in progress) can be found at:
> https://accumulo.staged.apache.org/release/accumulo-2.0.1
>
> Release testing instructions:
> https://accumulo.apache.org/contributor/verifying-release
>
> Please vote one of:
> [ ] +1 - I have verified and accept...
> [ ] +0 - I have reservations, but not strong enough to vote against...
> [ ] -1 - Because..., I do not accept...
> ... these artifacts as the 2.0.1 release of Apache Accumulo.
>
> This vote will remain open until at least Thu 24 Dec 2020 07:00:00 PM UTC.
> (Thu 24 Dec 2020 02:00:00 PM EST / Thu 24 Dec 2020 11:00:00 AM PST) Voting
> can continue after this deadline until the release manager sends an email
> ending the vote.
>
> Thanks!
>
> P.S. Hint: download the whole staging repo with
> wget -erobots=off -r -l inf -np -nH \
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1091/
> # note the trailing slash is needed
>
>
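
The vote email quoted above lists the expected contents of the .sha512 files
in the "SHA512 (file) = digest" form. As a rough sketch of how a downloaded
tarball could be checked against such a line, something like the following
Python should work. The function names are hypothetical, and the sketch
assumes the checksum text is passed in without the "> " quote prefixes. It
tolerates the digest being wrapped onto its own line, as it is in the email.

```python
import hashlib
import re
from pathlib import Path

# Matches lines such as: SHA512 (accumulo-2.0.1-src.tar.gz) = <128 hex chars>
CHECKSUM_PATTERN = re.compile(
    r"SHA512 \((?P<name>[^)]+)\) = (?P<digest>[0-9a-fA-F]{128})"
)


def parse_checksum(text):
    """Return (file name, lowercase hex digest) from a checksum entry."""
    # Collapse line wraps and repeated spaces into single spaces first.
    normalized = " ".join(text.split())
    match = CHECKSUM_PATTERN.fullmatch(normalized)
    if match is None:
        raise ValueError(f"not a SHA512 checksum entry: {text!r}")
    return match.group("name"), match.group("digest").lower()


def sha512_hex(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs are not read into memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, checksum_text):
    """True if the file name and SHA-512 digest both match the entry."""
    name, expected = parse_checksum(checksum_text)
    return Path(path).name == name and sha512_hex(path) == expected
```

This only checks integrity against the published digest. It does not replace
verifying the ".asc" signature against the KEYS file.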

Re: [VOTE] Apache Accumulo 2.0.1-rc1

2020-12-23 Thread Christopher

+1 from me

* Tested all ITs
* Verified sigs/hashes
* Every jar has source jar and javadoc jar
* Jars are sealed
* Jar manifests match the git commit
* Jars in the binary tarball's lib dir match the staged jars
* Verified basic functionality with fluo-uno, Hadoop 3.3.0, and ZK 3.4.14

Two notes that I'll make sure get added to the release notes:
1. Accumulo 2.0.x still assumes the CMS garbage collector in its default
config, and hard-coded in minicluster, which causes a problem trying to run
with JDK 15+. In 1.10, a constraint was added to the build to at least fail
fast if JDK 15+ was used to try to build, but that was not ported to 2.0.1.
2.1.0-SNAPSHOT has better support for JDK 15+, so it's a non-issue in
future, but something to note in the release notes.
2. Accumulo 1.10 included many classpath improvements to work better with
ZK 3.5+, but Accumulo 2.0.x does not include those changes, and still works
best with ZK 3.4.14. I'm sure it's possible to manually tweak the scripts
and make other changes to better support ZK 3.5+, but it just hasn't been
done for 2.0.x... only for 2.1.0-SNAPSHOT and 1.10.x. I think this is
acceptable, since 2.0.1 is a targeted backport, and not an LTM branch.
2.1.0 will have better support for these when we can release that
(hopefully early 2021).
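
In case it helps anyone repeating the "every jar has source jar and javadoc
jar" check, here is a minimal sketch of one way to do it in Python. It is not
the script I actually used. It assumes the standard Maven "-sources.jar" and
"-javadoc.jar" classifier naming, the function name is hypothetical, and the
file names in the usage example are purely illustrative.

```python
def missing_companions(filenames):
    """Map each main jar to the companion classifiers it lacks.

    Simplification: any jar that is not itself a -sources or -javadoc jar
    is treated as a main jar (so other classifier jars would be flagged too).
    """
    names = set(filenames)
    missing = {}
    for name in sorted(names):
        if not name.endswith(".jar"):
            continue  # skip tarballs, poms, signatures, checksums
        stem = name[: -len(".jar")]
        if stem.endswith(("-sources", "-javadoc")):
            continue  # companion jars are not checked themselves
        absent = [
            classifier
            for classifier in ("sources", "javadoc")
            if f"{stem}-{classifier}.jar" not in names
        ]
        if absent:
            missing[name] = absent
    return missing


if __name__ == "__main__":
    # Illustrative listing only, not the real staging repo contents.
    listing = [
        "accumulo-core-2.0.1.jar",
        "accumulo-core-2.0.1-sources.jar",
        "accumulo-core-2.0.1-javadoc.jar",
        "accumulo-start-2.0.1.jar",
        "accumulo-start-2.0.1-sources.jar",
    ]
    print(missing_companions(listing))
```

An empty result means every main jar in the listing has both companions.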



On Wed, Dec 23, 2020 at 1:08 PM Billie Rinaldi  wrote:

> +1
> - checksums and signatures are good
> - diff between 2.0.0 and 2.0.1-rc1 looks good
> - unit tests and sunny ITs pass
>
> I noticed we need to do a license and notice review of the main branch, but
> I am not inclined to delay this release due to that.
>
> On Mon, Dec 21, 2020 at 1:39 PM Christopher  wrote:
>
> > Accumulo Developers,
> >
> > Please consider the following candidate for Apache Accumulo 2.0.1.
> >
> > Git Commit:
> > 76247b1739dd3042cb2d959a7a99f0cf1bcb1324
> > Branch:
> > 2.0.1-rc1
> >
> > If this vote passes, a gpg-signed tag will be created using:
> > git tag -f -s -m 'Apache Accumulo 2.0.1' rel/2.0.1 \
> > 76247b1739dd3042cb2d959a7a99f0cf1bcb1324
> >
> > Staging repo:
> >
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1091
> > Source (official release artifact):
> >
> >
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1091/org/apache/accumulo/accumulo/2.0.1/accumulo-2.0.1-src.tar.gz
> > Binary:
> >
> >
> https://repository.apache.org/content/repositories/orgapacheaccumulo-1091/org/apache/accumulo/accumulo/2.0.1/accumulo-2.0.1-bin.tar.gz
> >
> > Append ".asc" to download the cryptographic signature for a given
> artifact.
> > (You can also append ".sha1" or ".md5" instead in order to verify the
> > checksums
> > generated by Maven to verify the integrity of the Nexus repository
> staging
> > area.)
> >
> > Signing keys are available at https://www.apache.org/dist/accumulo/KEYS
> > (Expected fingerprint: 8CC4F8A2B29C2B040F2B835D6F0CDAE700B6899D)
> >
> > In addition to the tarballs and their signatures, the following checksum
> > files will be added to the dist/release SVN area after release:
> > accumulo-2.0.1-src.tar.gz.sha512 will contain:
> > SHA512 (accumulo-2.0.1-src.tar.gz) =
> >
> >
> 4dbd765a234b87c46be6f92100c928e8402b2b03e707e666ec83531ab096de3871140b17e9ff1b1bce79c864771c9e908a79637c3be88f210f6f09a5e48b75fd
> > accumulo-2.0.1-bin.tar.gz.sha512 will contain:
> > SHA512 (accumulo-2.0.1-bin.tar.gz) =
> >
> >
> b443839443a9f5098b55bc5c54be10c11fedbaea554ee6cd42eaa9311068c70bd611d7fc67698c91ec73da0e85b9907aa72b98d5eb4d49ea3a5d51b0c6c5785f
> >
> > Release notes (in progress) can be found at:
> > https://accumulo.staged.apache.org/release/accumulo-2.0.1
> >
> > Release testing instructions:
> > https://accumulo.apache.org/contributor/verifying-release
> >
> > Please vote one of:
> > [ ] +1 - I have verified and accept...
> > [ ] +0 - I have reservations, but not strong enough to vote against...
> > [ ] -1 - Because..., I do not accept...
> > ... these artifacts as the 2.0.1 release of Apache Accumulo.
> >
> > This vote will remain open until at least Thu 24 Dec 2020 07:00:00 PM
> UTC.
> > (Thu 24 Dec 2020 02:00:00 PM EST / Thu 24 Dec 2020 11:00:00 AM PST)
> > Voting can continue after this deadline until the release manager
> > sends an email ending the vote.
> >
> > Thanks!
> >
> > P.S. Hint: download the whole staging repo with
> > wget -erobots=off -r -l inf -np -nH \
> > https://repository.apache.org/content/repositories/orgapacheaccumulo-1091/
> > # note the trailing slash is needed
> >
>
