Yes, if it is not heavily used then you may see a significant delay. You
can change the defaults using tserver.walog.max.age [1] and
tserver.walog.max.size [2]. If I recall correctly, you can change these via
the shell, and a restart is not required.

If you aren't seeing much ingestion, then the max age is the one to set so
that data is replicated within the window you need.  Please keep in mind
that setting either of these values to a very low threshold will cause the
WALs to roll over frequently and could negatively impact the system,
particularly for large Accumulo clusters.
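
As a rough sketch of what that could look like in the Accumulo shell on the
source instance (the 1h and 256M values are purely illustrative assumptions,
not recommendations; pick values that match your own replication window):

```
config -s tserver.walog.max.age=1h
config -s tserver.walog.max.size=256M
config -f tserver.walog.max
```

The last command only lists the matching properties, so you can confirm the
new values are in place.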

In my experience, using Accumulo replication is a good fit when you have
longer SLAs on replication.  If you are looking for anything in the
near-real-time realm (milliseconds to seconds to maybe even a few minutes),
you'd be better off double writing to multiple Accumulo instances.

--Adam

[1] https://accumulo.apache.org/1.10/accumulo_user_manual.html#_tserver_walog_max_age
[2] https://accumulo.apache.org/1.10/accumulo_user_manual.html#_tserver_walog_max_size

On Thu, Sep 23, 2021 at 1:31 PM Ligade, Shailesh [USA] <
ligade_shail...@bah.com> wrote:

> Thanks Adam,
>
>
>
> The system is not heavily used. Does that mean it will wait for 1 GB of
> data in the WAL file (or 24 hours) before it replicates?
>
>
>
> I don’t see any errors in any log: source master and tserver, or target
> master and tserver.
>
>
>
> The Monitor replication page shows the correct status, and once in a while
> I see the In-Progress Replication section flashing by. But I don’t see any
> new data in the target table. ☹
>
>
>
> -S
>
>
>
> *From:* Adam J. Shook <adamjsh...@gmail.com>
> *Sent:* Thursday, September 23, 2021 12:10 PM
> *To:* user@accumulo.apache.org
> *Subject:* Re: [External] RE: accumulo 1.10 replication issue
>
>
>
> Yes, inserting via the shell will be enough to test it.
>
>
>
> Note that the replication system uses the write-ahead logs (WAL) to
> replicate the data.  These logs must be closed before any replication can
> occur, so there will be a delay before it shows up in the peer table.  How
> long of a delay depends on how much data is actively being written to the
> TabletServers (and therefore the WAL) and/or how much time has passed since
> the WAL was opened. The default max WAL data size is 1 GB and the max age
> is 24 hours.
>
>
>
> --Adam
>
>
>
> On Thu, Sep 23, 2021 at 11:13 AM Ligade, Shailesh [USA] <
> ligade_shail...@bah.com> wrote:
>
> Thanks Adam,
>
>
>
> I am setting the accumulo.name property in accumulo-site.xml. I think this
> property must be set to the “Instance Name” value; I tried setting it to
> “primary” and saw an error stating that the instance id was not found in
> ZooKeeper.
>
>
>
> I have a few tables to replicate, so I am thinking I will set all the other
> properties using the shell config command.
>
>
>
> To test this, I just insert a value using the shell, right? Or do I need to
> flush or compact the table to see those values on the other side?
>
>
>
> -S
>
>
>
> *From:* Adam J. Shook <adamjsh...@gmail.com>
> *Sent:* Thursday, September 23, 2021 11:08 AM
> *To:* user@accumulo.apache.org
> *Subject:* Re: [External] RE: accumulo 1.10 replication issue
>
>
>
> Your configurations look correct to me, and it sounds like it is partially
> working as you are seeing files that need to be replicated in the Accumulo
> Monitor. I do have the replication.name and all replication.peer.*
> properties defined in accumulo-site.xml. Do you have all these properties
> defined there?  If not, try setting them in accumulo-site.xml and
> restarting your Accumulo services, particularly the Master and
> TabletServers.  The Master may not be queuing work and/or the
> TabletServers may not be looking for work.
>
>
>
> You should see DEBUG-level logs in the TabletServers that say "Looking for
> work in /accumulo/<instanceId>/replication/workqueue", so enable debug
> logging if you haven't done so already in the generic_logger.xml file.
>
>
>
> --Adam
>
>
>
> On Thu, Sep 23, 2021 at 6:53 AM Ligade, Shailesh [USA] <
> ligade_shail...@bah.com> wrote:
>
> Thanks for reply
>
>
>
> I am using insert command from shell to insert data.
>
>
>
> Also, a quick question: can the replication.name property be set using the
> CLI? Will that work, or must it be defined in accumulo-site.xml?
>
>
>
> Thanks
>
> -S
>
>
>
>
>
> *From:* d...@etcoleman.com <d...@etcoleman.com>
> *Sent:* Thursday, September 23, 2021 6:50 AM
> *To:* user@accumulo.apache.org
> *Subject:* [External] RE: accumulo 1.10 replication issue
>
>
>
> How are you inserting the data?
>
>
>
> *From:* Ligade, Shailesh [USA] <ligade_shail...@bah.com>
> *Sent:* Wednesday, September 22, 2021 10:22 PM
> *To:* user@accumulo.apache.org
> *Subject:* accumulo 1.10 replication issue
>
>
>
> Hello,
>
>
>
> I am following
>
> Apache Accumulo® User Manual Version 1.10
> <https://accumulo.apache.org/1.10/accumulo_user_manual.html#_replication>
>
>
>
> I want to set up replication from Accumulo instance inst1, table source, to
> inst2, table target.
>
> I created a replication user (same password) on both instances and granted
> Table.READ/WRITE for source and target respectively.
>
>
>
> I set the replication.name property to be the same as the instance name on
> both instances.
>
>
>
> On inst1, I set the following properties:
>
>
>
>
> replication.peer.inst1=org.apache.accumulo.tserver.replication.AccumuloReplicaSystem,inst2,inst2zoo1:2181,inst2zoo2:2181,inst2zoo3:2181
>
> replication.peer.user.inst2=replication
>
> replication.peer.password.inst2=replication
>
>
>
> set the source table for replication
>
> config -t source -s table.replication=true
>
> config -t source -s table.replication.target.inst2=(number I got for
> target table from inst2 tables -l command)
>
>
>
> and finally I did
>
> online accumulo.replication
>
>
>
> Now when I insert data in source, I see files needing replication = 1 in
> the monitor's replication section. All other values are correct: TABLE –
> source, PEER – inst2, REMOTE ID – the number I set.
>
>
>
> However, my In-Progress Replication section always stays empty, and I don’t
> see any data in the inst2 target table.
>
>
>
> I can't see any errors in the master log or in the log of the tserver where
> the tablet exists.
>
>
>
> Any idea what may be wrong? Is there any way to debug this?
>
>
>
> -S
>
>
>
>
>
>
>
>
