RE: [External] RE: accumulo 1.10 replication issue

2021-09-24 Thread Ligade, Shailesh [USA]
Thank you all,

I think replication itself is working, but I am still not sure what triggers it. 
If I restart both clusters (master and tserver on both), I can see data 
getting replicated. This indicates that replication works when properly 
triggered: the needed plumbing is there and the configs are correct.

I understand that the other triggers are WAL closing and WAL aging. Since those 
are set pretty high, I can't reliably state that replication worked ☹ Are there 
any other triggers? I did flush and/or compact the source table, but that 
didn't work.

Also, I noticed that I had some bad configuration at one point (not anymore), 
e.g., an incorrect peer name (that peer name does not exist under ZooKeeper), 
but even after a tserver/master restart, that incorrect config still shows up 
under In-Progress Replication. I am not sure how to clear all that. Maybe it 
will fix itself over time.

-S

-Original Message-
From: Christopher  
Sent: Thursday, September 23, 2021 2:23 PM
To: accumulo-user 
Subject: Re: [External] RE: accumulo 1.10 replication issue

The design of replication is intentionally passive and "eventually consistent" 
for efficiency. Batching efficiency is one reason why the feature is tightly 
coupled to WALs. If you need immediate replication, or want greater control 
over the batching process, you can create a layer on top of two Accumulo 
clients that will coordinate sending each mutation to both instances, rather 
than rely on the built-in replication features tied to WALs. I've even seen 
some developers try using Kafka to deliver updates, and having two different 
Accumulo clusters subscribe to the desired Kafka topics and do ingest into 
Accumulo that way.

Also, please be aware that the existing replication features have not been 
maintained in some time, and many of their test cases are known to be buggy. 
There are many known bugs and potential bugs with the replication as currently 
implemented, but there has not been active development maintaining that feature 
in many years. It is unclear what the future of the current implementation is. 
Like all open source software, you use it at your own risk. Please ensure that 
you've tested the feature to ensure it is suitable for your use case, before 
using it in production. And, if you find any fixes to problems as you go, we 
very much would welcome pull requests to fix them. If you find you were able to 
get it to work for your use case, that's great, and we welcome success stories 
as well! :)


On Thu, Sep 23, 2021 at 2:05 PM Ligade, Shailesh [USA] 
 wrote:
>
> Thanks, Appreciate your help.
>
>
>
> It can be confusing as I am waiting here but it is not replicating I will 
> reduce those values temporarily and see what happens. Interesting part is 
> Files Needing replication is stuck at 3 so it is possible that problem is 
> some where else.
>
>
>
> Thanks again
>
> -S
>
>
>
> From: Adam J. Shook 
> Sent: Thursday, September 23, 2021 1:50 PM
> To: user@accumulo.apache.org
> Subject: Re: [External] RE: accumulo 1.10 replication issue
>
>
>
> Yes, if it is not heavily used then you may see a significant delay. You can 
> change the defaults using tserver.walog.max.age [1] and 
> tserver.walog.max.size [2]. If I recall you can change these via the shell 
> and a restart is not required.
>
>
>
> If you aren't seeing much ingestion, then the max age would be what you want 
> to set to ensure data is replicated within the window you want it to be 
> replicated.  Please keep in mind that setting either of these values to very 
> low thresholds will cause the WALs to roll over frequently and could 
> negatively impact the system, particularly for large Accumulo clusters.
>
>
>
> In my experience, using Accumulo replication is a good fit when you have 
> longer SLAs on replication.  If you are looking for anything in the 
> near-real-time realm (milliseconds to seconds to maybe even a few minutes), 
> you'd be better off double writing to multiple Accumulo instances.
>
>
>
> --Adam
>
>
>
> [1] 
> https://accumulo.apache.org/1.10/accumulo_user_manual.html#_tserver_walog_max_age
>
> [2] 
> https://accumulo.apache.org/1.10/accumulo_user_manual.html#_tserver_walog_max_size
>
>
>
> On Thu, Sep 23, 2021 at 1:31 PM Ligade, Shailesh [USA] 
>  wrote:
>
> Thanks Adam,
>
>
>
> System is not heavily used, Does that mean it will wait for 1G data in wal 
> file, (or 24 hours) before it will replicate?
>
>
>
> I don’t see any error In any log source master, tserver or target 
> master,tserver
>

Re: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Christopher
The design of replication is intentionally passive and "eventually
consistent" for efficiency. Batching efficiency is one reason why the
feature is tightly coupled to WALs. If you need immediate replication,
or want greater control over the batching process, you can create a
layer on top of two Accumulo clients that will coordinate sending each
mutation to both instances, rather than rely on the built-in
replication features tied to WALs. I've even seen some developers try
using Kafka to deliver updates, and having two different Accumulo
clusters subscribe to the desired Kafka topics and do ingest into
Accumulo that way.
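
The "layer on top of two Accumulo clients" idea can be sketched roughly as
follows. This is a minimal illustration in Python and not Accumulo API:
DualWriter, the Mutation tuple, and the sink callables are hypothetical
stand-ins for whatever client code actually writes to each instance. The
sketch assumes a write that fails on one instance should be remembered and
retried later, so the two instances can converge.

```python
from typing import Callable, Dict, List, Tuple

# A mutation is modeled as (row, column, value). This is a stand-in for
# illustration only, not the Accumulo Mutation class.
Mutation = Tuple[str, str, str]
Sink = Callable[[Mutation], None]


class DualWriter:
    """Send every mutation to two sinks and remember failures for a retry."""

    def __init__(self, primary: Sink, secondary: Sink) -> None:
        self.sinks: Dict[str, Sink] = {"primary": primary, "secondary": secondary}
        self.failed: List[Tuple[str, Mutation]] = []

    def write(self, mutation: Mutation) -> None:
        for name, sink in self.sinks.items():
            try:
                sink(mutation)
            except Exception:
                # Keep the mutation so the lagging instance can catch up later.
                self.failed.append((name, mutation))

    def retry_failed(self) -> None:
        pending, self.failed = self.failed, []
        for name, mutation in pending:
            try:
                self.sinks[name](mutation)
            except Exception:
                self.failed.append((name, mutation))


# Usage with in-memory lists standing in for the two instances.
inst1_rows: List[Mutation] = []
inst2_rows: List[Mutation] = []
writer = DualWriter(inst1_rows.append, inst2_rows.append)
writer.write(("row1", "cf:cq", "value1"))
```

A real layer would also need to decide how to order retries against newer
writes and how long to buffer failures; the sketch leaves those choices open.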

Also, please be aware that the existing replication features have not
been maintained in some time, and many of their test cases are known
to be buggy. There are many known bugs and potential bugs with the
replication as currently implemented, but there has not been active
development maintaining that feature in many years. It is unclear what
the future of the current implementation is. Like all open source
software, you use it at your own risk. Please ensure that you've
tested the feature to ensure it is suitable for your use case, before
using it in production. And, if you find any fixes to problems as you
go, we very much would welcome pull requests to fix them. If you find
you were able to get it to work for your use case, that's great, and
we welcome success stories as well! :)


On Thu, Sep 23, 2021 at 2:05 PM Ligade, Shailesh [USA]
 wrote:
>
> Thanks, Appreciate your help.
>
>
>
> It can be confusing as I am waiting here but it is not replicating I will 
> reduce those values temporarily and see what happens. Interesting part is 
> Files Needing replication is stuck at 3 so it is possible that problem is 
> some where else.
>
>
>
> Thanks again
>
> -S
>
>
>
> From: Adam J. Shook 
> Sent: Thursday, September 23, 2021 1:50 PM
> To: user@accumulo.apache.org
> Subject: Re: [External] RE: accumulo 1.10 replication issue
>
>
>
> Yes, if it is not heavily used then you may see a significant delay. You can 
> change the defaults using tserver.walog.max.age [1] and 
> tserver.walog.max.size [2]. If I recall you can change these via the shell 
> and a restart is not required.
>
>
>
> If you aren't seeing much ingestion, then the max age would be what you want 
> to set to ensure data is replicated within the window you want it to be 
> replicated.  Please keep in mind that setting either of these values to very 
> low thresholds will cause the WALs to roll over frequently and could 
> negatively impact the system, particularly for large Accumulo clusters.
>
>
>
> In my experience, using Accumulo replication is a good fit when you have 
> longer SLAs on replication.  If you are looking for anything in the 
> near-real-time realm (milliseconds to seconds to maybe even a few minutes), 
> you'd be better off double writing to multiple Accumulo instances.
>
>
>
> --Adam
>
>
>
> [1] 
> https://accumulo.apache.org/1.10/accumulo_user_manual.html#_tserver_walog_max_age
>
> [2] 
> https://accumulo.apache.org/1.10/accumulo_user_manual.html#_tserver_walog_max_size
>
>
>
> On Thu, Sep 23, 2021 at 1:31 PM Ligade, Shailesh [USA] 
>  wrote:
>
> Thanks Adam,
>
>
>
> System is not heavily used, Does that mean it will wait for 1G data in wal 
> file, (or 24 hours) before it will replicate?
>
>
>
> I don’t see any error In any log source master, tserver or target 
> master,tserver
>
>
>
> Monitor replication page has correct status, and once in while I see 
> In-Progress Replication section flashing by. But don’t see any new data in 
> the target table. ☹
>
>
>
> -S
>
>
>
> From: Adam J. Shook 
> Sent: Thursday, September 23, 2021 12:10 PM
> To: user@accumulo.apache.org
> Subject: Re: [External] RE: accumulo 1.10 replication issue
>
>
>
> Yes, inserting via the shell will be enough to test it.
>
>
>
> Note that the replication system uses the write-ahead logs (WAL) to replicate 
> the data.  These logs must be closed before any replication can occur, so 
> there will be a delay before it shows up in the peer table.  How long of a 
> delay depends on how much data is actively being written to the TabletServers 
> (and therefore the WAL) and/or how much time has passed since the WAL was 
> opened. The default max WAL data size is 1 GB and the max age is 24 hours.
>
>
>
> --Adam
>
>
>
> On Thu, Sep 23, 2021 at 11:13 AM Ligade, Shailesh [USA] 
>  wrote:
>
> Thanks Adam,
>
>
>
> I am setting accumulo.name property in accumulo-site.xml. I think this 
> property must be set to “Instance Name” value, I tried to set to “primary” 
> and I saw error statin

RE: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Ligade, Shailesh [USA]
Thanks, Appreciate your help.

It can be confusing, as I am waiting here but it is not replicating. I will 
reduce those values temporarily and see what happens. The interesting part is 
that Files Needing Replication is stuck at 3, so it is possible that the 
problem is somewhere else.

Thanks again
-S

From: Adam J. Shook 
Sent: Thursday, September 23, 2021 1:50 PM
To: user@accumulo.apache.org
Subject: Re: [External] RE: accumulo 1.10 replication issue

Yes, if it is not heavily used then you may see a significant delay. You can 
change the defaults using tserver.walog.max.age [1] and tserver.walog.max.size 
[2]. If I recall you can change these via the shell and a restart is not 
required.

If you aren't seeing much ingestion, then the max age would be what you want to 
set to ensure data is replicated within the window you want it to be 
replicated.  Please keep in mind that setting either of these values to very 
low thresholds will cause the WALs to roll over frequently and could negatively 
impact the system, particularly for large Accumulo clusters.

In my experience, using Accumulo replication is a good fit when you have longer 
SLAs on replication.  If you are looking for anything in the near-real-time 
realm (milliseconds to seconds to maybe even a few minutes), you'd be better 
off double writing to multiple Accumulo instances.

--Adam

[1] 
https://accumulo.apache.org/1.10/accumulo_user_manual.html#_tserver_walog_max_age
[2] 
https://accumulo.apache.org/1.10/accumulo_user_manual.html#_tserver_walog_max_size

On Thu, Sep 23, 2021 at 1:31 PM Ligade, Shailesh [USA] 
<ligade_shail...@bah.com> wrote:
Thanks Adam,

System is not heavily used, Does that mean it will wait for 1G data in wal 
file, (or 24 hours) before it will replicate?

I don’t see any error In any log source master, tserver or target master,tserver

Monitor replication page has correct status, and once in while I see 
In-Progress Replication section flashing by. But don’t see any new data in the 
target table. ☹

-S

From: Adam J. Shook <adamjsh...@gmail.com>
Sent: Thursday, September 23, 2021 12:10 PM
To: user@accumulo.apache.org
Subject: Re: [External] RE: accumulo 1.10 replication issue

Yes, inserting via the shell will be enough to test it.

Note that the replication system uses the write-ahead logs (WAL) to replicate 
the data.  These logs must be closed before any replication can occur, so there 
will be a delay before it shows up in the peer table.  How long of a delay 
depends on how much data is actively being written to the TabletServers (and 
therefore the WAL) and/or how much time has passed since the WAL was opened. 
The default max WAL data size is 1 GB and the max age is 24 hours.

--Adam

On Thu, Sep 23, 2021 at 11:13 AM Ligade, Shailesh [USA] 
<ligade_shail...@bah.com> wrote:
Thanks Adam,

I am setting accumulo.name property in accumulo-site.xml. I think this property 
must be set to “Instance Name” value, I tried to set to “primary” and I saw 
error stating that instance id was not found in zookeeper

I have few tables to replicate so I am thinking I will set all others 
properties using shell config command

To test this, I just insert value using shell right? Or do I need to flush or 
compact on the table to see those values on the other side?

-S

From: Adam J. Shook <adamjsh...@gmail.com>
Sent: Thursday, September 23, 2021 11:08 AM
To: user@accumulo.apache.org
Subject: Re: [External] RE: accumulo 1.10 replication issue

Your configurations look correct to me, and it sounds like it is partially 
working as you are seeing files that need replicated in the Accumulo Monitor. I 
do have the replication.name and all replication.peer.* properties defined in 
accumulo-site.xml. Do you 
have all these properties defined there?  If not, try setting them in 
accumulo-site.xml and restarting your Accumulo services, particularly the 
Master and TabletServers.  The Master may not be queuing work and/or the 
TabletServers may not be looking for work.

You should see DEBUG-level logs in the TabletServers that say "Looking for work 
in /accumulo//replication/workqueue", so enable debug logging if 
you haven't done so already in the generic_logg

Re: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Adam J. Shook
Yes, if it is not heavily used then you may see a significant delay. You
can change the defaults using tserver.walog.max.age [1] and
tserver.walog.max.size [2]. If I recall you can change these via the shell
and a restart is not required.

If you aren't seeing much ingestion, then the max age would be what you
want to set to ensure data is replicated within the window you want it to
be replicated.  Please keep in mind that setting either of these values to
very low thresholds will cause the WALs to roll over frequently and could
negatively impact the system, particularly for large Accumulo clusters.

In my experience, using Accumulo replication is a good fit when you have
longer SLAs on replication.  If you are looking for anything in the
near-real-time realm (milliseconds to seconds to maybe even a few minutes),
you'd be better off double writing to multiple Accumulo instances.

--Adam

[1]
https://accumulo.apache.org/1.10/accumulo_user_manual.html#_tserver_walog_max_age
[2]
https://accumulo.apache.org/1.10/accumulo_user_manual.html#_tserver_walog_max_size

On Thu, Sep 23, 2021 at 1:31 PM Ligade, Shailesh [USA] <
ligade_shail...@bah.com> wrote:

> Thanks Adam,
>
>
>
> System is not heavily used, Does that mean it will wait for 1G data in wal
> file, (or 24 hours) before it will replicate?
>
>
>
> I don’t see any error In any log source master, tserver or target
> master,tserver
>
>
>
> Monitor replication page has correct status, and once in while I see
> In-Progress Replication section flashing by. But don’t see any new data in
> the target table. ☹
>
>
>
> -S
>
>
>
> *From:* Adam J. Shook 
> *Sent:* Thursday, September 23, 2021 12:10 PM
> *To:* user@accumulo.apache.org
> *Subject:* Re: [External] RE: accumulo 1.10 replication issue
>
>
>
> Yes, inserting via the shell will be enough to test it.
>
>
>
> Note that the replication system uses the write-ahead logs (WAL) to
> replicate the data.  These logs must be closed before any replication can
> occur, so there will be a delay before it shows up in the peer table.  How
> long of a delay depends on how much data is actively being written to the
> TabletServers (and therefore the WAL) and/or how much time has passed since
> the WAL was opened. The default max WAL data size is 1 GB and the max age
> is 24 hours.
>
>
>
> --Adam
>
>
>
> On Thu, Sep 23, 2021 at 11:13 AM Ligade, Shailesh [USA] <
> ligade_shail...@bah.com> wrote:
>
> Thanks Adam,
>
>
>
> I am setting accumulo.name
> property in accumulo-site.xml. I think this property must be set to
> “Instance Name” value, I tried to set to “primary” and I saw error stating
> that instance id was not found in zookeeper
>
>
>
> I have few tables to replicate so I am thinking I will set all others
> properties using shell config command
>
>
>
> To test this, I just insert value using shell right? Or do I need to flush
> or compact on the table to see those values on the other side?
>
>
>
> -S
>
>
>
> *From:* Adam J. Shook 
> *Sent:* Thursday, September 23, 2021 11:08 AM
> *To:* user@accumulo.apache.org
> *Subject:* Re: [External] RE: accumulo 1.10 replication issue
>
>
>
> Your configurations look correct to me, and it sounds like it is partially
> working as you are seeing files that need replicated in the Accumulo
> Monitor. I do have the replication.name
> and all replication.peer.* properties defined in accumulo-site.xml. Do you
> have all these properties defined there?  If not, try setting them in
> accumulo-site.xml and restarting your Accumulo services, particularly the
> Master and TabletServers.  The Master may not be queuing work and/or the
> TabletServers may not be looking for work.
>
>
>
> You should see DEBUG-level logs in the TabletServers that say "Looking for
> work in /accumulo//replication/workqueue", so enable debug
> logging if you haven't done so already in the generic_logger.xml file.
>
>
>
> --Adam
>
>
>
> On Thu, Sep 23, 2021 at 6:53 AM Ligade, Shailesh [USA] <
> ligade_shail...@bah.com> wrote:
>
> Thanks for reply
>
>
>
> I am using insert command from shell to insert data.
>
>
>
> Also, a quick question, replication.name
> property can it be set using cli? Will that work or it must 

RE: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Ligade, Shailesh [USA]
Thanks Adam,

System is not heavily used. Does that mean it will wait for 1G of data in the 
WAL file (or 24 hours) before it will replicate?

I don’t see any errors in any log: source master, tserver, or target master, 
tserver.

The Monitor replication page has the correct status, and once in a while I see 
the In-Progress Replication section flashing by. But I don’t see any new data 
in the target table. ☹

-S

From: Adam J. Shook 
Sent: Thursday, September 23, 2021 12:10 PM
To: user@accumulo.apache.org
Subject: Re: [External] RE: accumulo 1.10 replication issue

Yes, inserting via the shell will be enough to test it.

Note that the replication system uses the write-ahead logs (WAL) to replicate 
the data.  These logs must be closed before any replication can occur, so there 
will be a delay before it shows up in the peer table.  How long of a delay 
depends on how much data is actively being written to the TabletServers (and 
therefore the WAL) and/or how much time has passed since the WAL was opened. 
The default max WAL data size is 1 GB and the max age is 24 hours.

--Adam

On Thu, Sep 23, 2021 at 11:13 AM Ligade, Shailesh [USA] 
<ligade_shail...@bah.com> wrote:
Thanks Adam,

I am setting accumulo.name property in accumulo-site.xml. I think this property 
must be set to “Instance Name” value, I tried to set to “primary” and I saw 
error stating that instance id was not found in zookeeper

I have few tables to replicate so I am thinking I will set all others 
properties using shell config command

To test this, I just insert value using shell right? Or do I need to flush or 
compact on the table to see those values on the other side?

-S

From: Adam J. Shook <adamjsh...@gmail.com>
Sent: Thursday, September 23, 2021 11:08 AM
To: user@accumulo.apache.org
Subject: Re: [External] RE: accumulo 1.10 replication issue

Your configurations look correct to me, and it sounds like it is partially 
working as you are seeing files that need replicated in the Accumulo Monitor. I 
do have the replication.name and all replication.peer.* properties defined in 
accumulo-site.xml. Do you 
have all these properties defined there?  If not, try setting them in 
accumulo-site.xml and restarting your Accumulo services, particularly the 
Master and TabletServers.  The Master may not be queuing work and/or the 
TabletServers may not be looking for work.

You should see DEBUG-level logs in the TabletServers that say "Looking for work 
in /accumulo//replication/workqueue", so enable debug logging if 
you haven't done so already in the generic_logger.xml file.

--Adam

On Thu, Sep 23, 2021 at 6:53 AM Ligade, Shailesh [USA] 
<ligade_shail...@bah.com> wrote:
Thanks for reply

I am using insert command from shell to insert data.

Also, a quick question, can the replication.name property be set using the 
CLI? Will that work, or must it be defined in accumulo-site.xml?

Thanks
-S


From: d...@etcoleman.com
Sent: Thursday, September 23, 2021 6:50 AM
To: user@accumulo.apache.org
Subject: [External] RE: accumulo 1.10 replication issue

How are you inserting the data?

From: Ligade, Shailesh [USA] <ligade_shail...@bah.com>
Sent: Wednesday, September 22, 2021 10:22 PM
To: user@accumulo.apache.org
Subject: accumulo 1.10 replication issue

Hello,

I am following
Apache Accumulo® User Manual Version 1.10 
<https://accumulo.apache.org/1.10/accumulo_user_manual.html#_replication>

I want to setup replication from accumulo instance inst1, table source, TO 
inst2, table target
I created a replication user,( same password) on both instances and grant 
Table.READ/WRITE for source and target respectively

I set replication.name property to be same as inst on both instances

On inst1 Set following properties

replication.peer.inst1=org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,inst2,inst2zoo1:2181,inst2zoo2:2181,inst2zoo3:2181
replication.peer.user.inst2=replication
replication.peer.password.inst2=replication

set the source table for replication
config -t source -s table.replication=true
config -t source -s table.replication.target.inst2=(number I got for target 

Re: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Adam J. Shook
Yes, inserting via the shell will be enough to test it.

Note that the replication system uses the write-ahead logs (WAL) to
replicate the data.  These logs must be closed before any replication can
occur, so there will be a delay before it shows up in the peer table.  How
long of a delay depends on how much data is actively being written to the
TabletServers (and therefore the WAL) and/or how much time has passed since
the WAL was opened. The default max WAL data size is 1 GB and the max age
is 24 hours.
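
To make the delay concrete, here is a rough back-of-the-envelope sketch in
Python. It is only a simplified model: the function name
seconds_until_wal_closes is hypothetical, "1 GB" is assumed to mean 1024^3
bytes, ingest is assumed to be steady into a single WAL, and other events
that may close a WAL (a TabletServer restart, for example) are ignored.

```python
def seconds_until_wal_closes(ingest_bytes_per_sec: float,
                             max_size_bytes: int = 1024 ** 3,
                             max_age_sec: int = 24 * 3600) -> float:
    """Rough upper bound on how long a freshly opened WAL stays open.

    The WAL is assumed to close when either the size limit or the age
    limit is reached, whichever comes first.
    """
    if ingest_bytes_per_sec <= 0:
        # No ingest at all: only the age limit can close the WAL.
        return float(max_age_sec)
    return min(max_age_sec, max_size_bytes / ingest_bytes_per_sec)


# Steady 1 MiB/s of ingest: the size limit is reached first.
busy = seconds_until_wal_closes(1024 ** 2)      # 1024.0 seconds

# A handful of shell inserts (about 100 bytes/s): the age limit dominates.
idle = seconds_until_wal_closes(100)            # 86400 seconds (24 hours)
```

Under this model, a lightly used system that only receives a few test
inserts would wait for the full max age before the data becomes eligible
for replication.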

--Adam

On Thu, Sep 23, 2021 at 11:13 AM Ligade, Shailesh [USA] <
ligade_shail...@bah.com> wrote:

> Thanks Adam,
>
>
>
> I am setting accumulo.name property in accumulo-site.xml. I think this
> property must be set to “Instance Name” value, I tried to set to “primary”
> and I saw error stating that instance id was not found in zookeeper
>
>
>
> I have few tables to replicate so I am thinking I will set all others
> properties using shell config command
>
>
>
> To test this, I just insert value using shell right? Or do I need to flush
> or compact on the table to see those values on the other side?
>
>
>
> -S
>
>
>
> *From:* Adam J. Shook 
> *Sent:* Thursday, September 23, 2021 11:08 AM
> *To:* user@accumulo.apache.org
> *Subject:* Re: [External] RE: accumulo 1.10 replication issue
>
>
>
> Your configurations look correct to me, and it sounds like it is partially
> working as you are seeing files that need replicated in the Accumulo
> Monitor. I do have the replication.name
> and all replication.peer.* properties defined in accumulo-site.xml. Do you
> have all these properties defined there?  If not, try setting them in
> accumulo-site.xml and restarting your Accumulo services, particularly the
> Master and TabletServers.  The Master may not be queuing work and/or the
> TabletServers may not be looking for work.
>
>
>
> You should see DEBUG-level logs in the TabletServers that say "Looking for
> work in /accumulo//replication/workqueue", so enable debug
> logging if you haven't done so already in the generic_logger.xml file.
>
>
>
> --Adam
>
>
>
> On Thu, Sep 23, 2021 at 6:53 AM Ligade, Shailesh [USA] <
> ligade_shail...@bah.com> wrote:
>
> Thanks for reply
>
>
>
> I am using insert command from shell to insert data.
>
>
>
> Also, a quick question, can the replication.name property be set using the
> CLI? Will that work, or must it be defined in accumulo-site.xml?
>
>
>
> Thanks
>
> -S
>
>
>
>
>
> *From:* d...@etcoleman.com 
> *Sent:* Thursday, September 23, 2021 6:50 AM
> *To:* user@accumulo.apache.org
> *Subject:* [External] RE: accumulo 1.10 replication issue
>
>
>
> How are you inserting the data?
>
>
>
> *From:* Ligade, Shailesh [USA] 
> *Sent:* Wednesday, September 22, 2021 10:22 PM
> *To:* user@accumulo.apache.org
> *Subject:* accumulo 1.10 replication issue
>
>
>
> Hello,
>
>
>
> I am following
>
> Apache Accumulo® User Manual Version 1.10
> <https://accumulo.apache.org/1.10/accumulo_user_manual.html#_replication>
>
>
>
> I want to setup replication from accumulo instance inst1, table source, TO
> inst2, table target
>
> I created a replication user,( same password) on both instances and grant
> Table.READ/WRITE for source and target respectively
>
>
>
> I set replication.name
> property to be same as inst on both instances
>
>
>
> On inst1 Set following properties
>
>
>
>
> replication.peer.inst1=org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,inst2,inst2zoo1:2181,inst2zoo2:2181,inst2zoo3:2181
>
> replication.peer.user.inst2=replication
>
> replication.peer.password.inst2=replication
>
>
>
> set the source table for replication
>
> config -t source -s table.replication=true
>
> config -t source -s table.replication.target.inst2=(number I got for
> target table from inst2 tables -l command)
>
>
>
> and finally I did
>
> online accumulo.replication
>
>
>
> Now when I insert data in source, I get files needing replication 1 on
> the monitor replication section. All other values are correct, TABLE –
> source, PEER – inst2 REMOTE ID as number I set
>
>
>
> However my In-Progress Replication always stay empty and I don’t see any
> data in inst2 target table
>
>
>
> No errors that I can see in master log or tserver log where tablet exist.
>
>
>
> Any idea what may be wrong? Is there any way to debug this?
>
>
>
> -S
>
>
>
>
>
>
>
>


RE: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Ligade, Shailesh [USA]
Thanks Adam,

I am setting the accumulo.name property in accumulo-site.xml. I think this 
property must be set to the “Instance Name” value. I tried to set it to 
“primary” and I saw an error stating that the instance id was not found in 
ZooKeeper.

I have a few tables to replicate, so I am thinking I will set all the other 
properties using the shell config command.

To test this, I just insert a value using the shell, right? Or do I need to 
flush or compact the table to see those values on the other side?

-S

From: Adam J. Shook 
Sent: Thursday, September 23, 2021 11:08 AM
To: user@accumulo.apache.org
Subject: Re: [External] RE: accumulo 1.10 replication issue

Your configurations look correct to me, and it sounds like it is partially 
working as you are seeing files that need replicated in the Accumulo Monitor. I 
do have the replication.name and all replication.peer.* properties defined in 
accumulo-site.xml. Do you 
have all these properties defined there?  If not, try setting them in 
accumulo-site.xml and restarting your Accumulo services, particularly the 
Master and TabletServers.  The Master may not be queuing work and/or the 
TabletServers may not be looking for work.

You should see DEBUG-level logs in the TabletServers that say "Looking for work 
in /accumulo//replication/workqueue", so enable debug logging if 
you haven't done so already in the generic_logger.xml file.

--Adam

On Thu, Sep 23, 2021 at 6:53 AM Ligade, Shailesh [USA] 
<ligade_shail...@bah.com> wrote:
Thanks for reply

I am using insert command from shell to insert data.

Also, a quick question, can the replication.name property be set using the 
CLI? Will that work, or must it be defined in accumulo-site.xml?

Thanks
-S


From: d...@etcoleman.com
Sent: Thursday, September 23, 2021 6:50 AM
To: user@accumulo.apache.org
Subject: [External] RE: accumulo 1.10 replication issue

How are you inserting the data?

From: Ligade, Shailesh [USA] <ligade_shail...@bah.com>
Sent: Wednesday, September 22, 2021 10:22 PM
To: user@accumulo.apache.org
Subject: accumulo 1.10 replication issue

Hello,

I am following
Apache Accumulo® User Manual Version 1.10 
<https://accumulo.apache.org/1.10/accumulo_user_manual.html#_replication>

I want to setup replication from accumulo instance inst1, table source, TO 
inst2, table target
I created a replication user,( same password) on both instances and grant 
Table.READ/WRITE for source and target respectively

I set 
replication.name<https://urldefense.com/v3/__http:/replication.name__;!!May37g!YaShfRxRNA1m14PM-_NQTaWWuL-fcis6RlUI9RKQU68Q2oWUTZuh-Q1EkHA0XIO8vA$>
 property to be same as inst on both instances

On inst1 Set following properties

replication.peer.inst1=org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,inst2,inst2zoo1:2181,inst2zoo2:2181,inst2zoo3:2181
replication.peer.user.inst2=replication
replication.peer.password.inst2=replication

I then configured the source table for replication:
config -t source -s table.replication=true
config -t source -s table.replication.target.inst2=(number I got for target 
table from inst2 tables -l command)

and finally I did
online accumulo.replication

Now when I insert data into source, the monitor's replication section shows 1 
under files needing replication. All other values are correct: TABLE – source, 
PEER – inst2, REMOTE ID – the number I set.

However, my In-Progress Replication always stays empty and I don’t see any 
data in the inst2 target table.

There are no errors that I can see in the master log or in the tserver log 
where the tablet exists.

Any idea what may be wrong? Is there any way to debug this?

-S





Re: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Adam J. Shook
Your configurations look correct to me, and it sounds like it is partially
working as you are seeing files that need replicated in the Accumulo
Monitor. I do have the replication.name and all replication.peer.*
properties defined in accumulo-site.xml. Do you have all these properties
defined there?  If not, try setting them in accumulo-site.xml and
restarting your Accumulo services, particularly the Master and
TabletServers.  The Master may not be queuing work and/or the TabletServers
may not be looking for work.
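
For reference, a minimal sketch of how these properties might look in
accumulo-site.xml on inst1, using the values quoted from the original post
below. One assumption is labeled loudly: the peer is keyed as inst2 here so
that it matches the replication.peer.user.inst2,
replication.peer.password.inst2 and table.replication.target.inst2 suffixes,
whereas the original post lists the first key as replication.peer.inst1.
Treat this as an illustration, not a verified working configuration.

```xml
<configuration>
  <!-- Name this instance uses to identify itself to its peers -->
  <property>
    <name>replication.name</name>
    <value>inst1</value>
  </property>
  <!-- ASSUMPTION: key suffix "inst2" chosen to match the user/password keys;
       the value is class,remote instance name,remote ZooKeeper quorum -->
  <property>
    <name>replication.peer.inst2</name>
    <value>org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,inst2,inst2zoo1:2181,inst2zoo2:2181,inst2zoo3:2181</value>
  </property>
  <!-- Credentials used to connect to the peer named "inst2" -->
  <property>
    <name>replication.peer.user.inst2</name>
    <value>replication</value>
  </property>
  <property>
    <name>replication.peer.password.inst2</name>
    <value>replication</value>
  </property>
</configuration>
```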

You should see DEBUG-level logs in the TabletServers that say "Looking for
work in /accumulo//replication/workqueue", so enable debug
logging if you haven't done so already in the generic_logger.xml file.
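
A minimal sketch of such a logging change, assuming generic_logger.xml is in
the log4j 1.x XML format. The broad org.apache.accumulo logger is an
assumption made for simplicity; a narrower logger category would be less
noisy, but the exact category that emits the message is not stated here.

```xml
<!-- Sketch: place inside the existing <log4j:configuration> element
     of generic_logger.xml, then restart the TabletServers -->
<logger name="org.apache.accumulo">
  <!-- ASSUMPTION: DEBUG for all Accumulo classes, not only replication -->
  <level value="DEBUG"/>
</logger>
```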

--Adam

On Thu, Sep 23, 2021 at 6:53 AM Ligade, Shailesh [USA] <
ligade_shail...@bah.com> wrote:

> Thanks for the reply.
>
> I am using the insert command from the shell to insert data.
>
> Also, a quick question: can the replication.name property be set using the
> CLI? Will that work, or must it be defined in accumulo-site.xml?
>
>
>
> Thanks
>
> -S
>
>
>
>
>
> *From:* d...@etcoleman.com 
> *Sent:* Thursday, September 23, 2021 6:50 AM
> *To:* user@accumulo.apache.org
> *Subject:* [External] RE: accumulo 1.10 replication issue
>
>
>
> How are you inserting the data?
>
>
>
> *From:* Ligade, Shailesh [USA] 
> *Sent:* Wednesday, September 22, 2021 10:22 PM
> *To:* user@accumulo.apache.org
> *Subject:* accumulo 1.10 replication issue
>
>
>
> Hello,
>
>
>
> I am following the Apache Accumulo® User Manual, Version 1.10:
> https://accumulo.apache.org/1.10/accumulo_user_manual.html#_replication
>
> I want to set up replication from Accumulo instance inst1, table source, to
> inst2, table target.
>
> I created a replication user (same password) on both instances and granted
> Table.READ/WRITE on source and target respectively.
>
> I set the replication.name property to match the instance name on each
> instance.
>
> On inst1, I set the following properties:
>
>
>
>
> replication.peer.inst1=org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,inst2,inst2zoo1:2181,inst2zoo2:2181,inst2zoo3:2181
>
> replication.peer.user.inst2=replication
>
> replication.peer.password.inst2=replication
>
>
>
> I then configured the source table for replication:
>
> config -t source -s table.replication=true
>
> config -t source -s table.replication.target.inst2=(number I got for
> target table from inst2 tables -l command)
>
>
>
> and finally I did
>
> online accumulo.replication
>
>
>
> Now when I insert data into source, the monitor's replication section shows
> 1 under files needing replication. All other values are correct: TABLE –
> source, PEER – inst2, REMOTE ID – the number I set.
>
> However, my In-Progress Replication always stays empty and I don’t see any
> data in the inst2 target table.
>
> There are no errors that I can see in the master log or in the tserver log
> where the tablet exists.
>
>
>
> Any idea what may be wrong? Is there any way to debug this?
>
>
>
> -S


RE: [External] RE: accumulo 1.10 replication issue

2021-09-23 Thread Ligade, Shailesh [USA]
Thanks for the reply.

I am using the insert command from the shell to insert data.

Also, a quick question: can the replication.name property be set using the 
CLI? Will that work, or must it be defined in accumulo-site.xml?

Thanks
-S


From: d...@etcoleman.com 
Sent: Thursday, September 23, 2021 6:50 AM
To: user@accumulo.apache.org
Subject: [External] RE: accumulo 1.10 replication issue

How are you inserting the data?

From: Ligade, Shailesh [USA] <ligade_shail...@bah.com>
Sent: Wednesday, September 22, 2021 10:22 PM
To: user@accumulo.apache.org
Subject: accumulo 1.10 replication issue

Hello,

I am following the Apache Accumulo® User Manual, Version 1.10:
https://accumulo.apache.org/1.10/accumulo_user_manual.html#_replication

I want to set up replication from Accumulo instance inst1, table source, to 
inst2, table target.
I created a replication user (same password) on both instances and granted 
Table.READ/WRITE on source and target respectively.

I set the replication.name property to match the instance name on each 
instance.

On inst1, I set the following properties:

replication.peer.inst1=org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,inst2,inst2zoo1:2181,inst2zoo2:2181,inst2zoo3:2181
replication.peer.user.inst2=replication
replication.peer.password.inst2=replication

I then configured the source table for replication:
config -t source -s table.replication=true
config -t source -s table.replication.target.inst2=(number I got for target 
table from inst2 tables -l command)

and finally I did
online accumulo.replication

Now when I insert data into source, the monitor's replication section shows 1 
under files needing replication. All other values are correct: TABLE - source, 
PEER - inst2, REMOTE ID - the number I set.

However, my In-Progress Replication always stays empty and I don't see any 
data in the inst2 target table.

There are no errors that I can see in the master log or in the tserver log 
where the tablet exists.

Any idea what may be wrong? Is there any way to debug this?

-S