RE: Problems with accumulo replication
You can also use the tserver.walog.max.age property to ensure that the walogs roll even when there is no activity. The default is 24h, and the property was backported to 1.7.2. See ACCUMULO-4004 for more info.

-----Original Message-----
From: Josh Elser [mailto:els...@apache.org]
Sent: Friday, December 29, 2017 10:08 AM
To: user@accumulo.apache.org
Subject: Re: Problems with accumulo replication

If the system is reporting files that need to be replicated, it's probably one of two problems:

* The WALs are still in use by the TabletServers. In the current implementation, WALs are not replicated until the TabletServers no longer reference them. A WAL is released either once enough data has been written or when the TabletServer is restarted. You can try investigating either of these.
* Replication is being attempted but fails. You can look at the TabletServer logs on the primary instance to see if there are any reported exceptions around sending the data to the peer.

On 12/29/17 8:24 AM, vLex Systems wrote:
> Hi,
>
> I've configured replication between two instances of accumulo: one is
> the primary accumulo and the other is a peer created from a restore of
> the backup of the primary.
>
> I've followed the instructions in the manual
> (https://accumulo.apache.org/1.7/accumulo_user_manual#_replication)
> and I can see the 4 tables I've configured to replicate in the
> Accumulo Monitor but they do not replicate. They have 1 or 2 "Files
> needing replication" and this number never decreases.
>
> I've also tried inserting data in one of the tables and the data does
> not replicate to the accumulo peer instance.
>
> In the master log I see many entries like this one:
> 2017-12-29 13:22:25,490 [replication.RemoveCompleteReplicationRecords]
> INFO : Removed 0 complete replication entries from the table
> accumulo.replication
>
> Does anyone know what could be happening?
>
> Thanks.
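The two conditions discussed in this reply (enough data written, or the walog getting older than tserver.walog.max.age) can be pictured with a small toy model. This is only a sketch of the reasoning, not Accumulo's actual code: the `Wal` class, the `should_roll` function, and the 1 GiB size threshold are all hypothetical names and values chosen for illustration. Only the 24h default age comes from the message above.

```python
from dataclasses import dataclass

DAY_SECONDS = 24 * 60 * 60  # the 24h default mentioned for tserver.walog.max.age


@dataclass
class Wal:
    """Hypothetical stand-in for a write-ahead log still held by a TabletServer."""
    bytes_written: int
    age_seconds: float


def should_roll(wal: Wal, max_size_bytes: int, max_age_seconds: float) -> bool:
    """Toy rule: the WAL rolls once it is large enough OR old enough."""
    return (wal.bytes_written >= max_size_bytes
            or wal.age_seconds >= max_age_seconds)


# A nearly idle table: only 4 KiB written, WAL open for two hours.
idle_wal = Wal(bytes_written=4096, age_seconds=2 * 60 * 60)

# Assumed 1 GiB size threshold (illustrative only).
ONE_GIB = 1 << 30

print(should_roll(idle_wal, ONE_GIB, DAY_SECONDS))  # False: too small, too young
print(should_roll(idle_wal, ONE_GIB, 60 * 60))      # True: older than a 1h max age
```

Under this model, a table with very little write activity keeps its WAL in use until the age limit is reached, which is consistent with a "Files needing replication" count that stays at 1 or 2.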
Re: Problems with accumulo replication
If the system is reporting files that need to be replicated, it's probably one of two problems:

* The WALs are still in use by the TabletServers. In the current implementation, WALs are not replicated until the TabletServers no longer reference them. A WAL is released either once enough data has been written or when the TabletServer is restarted. You can try investigating either of these.
* Replication is being attempted but fails. You can look at the TabletServer logs on the primary instance to see if there are any reported exceptions around sending the data to the peer.

On 12/29/17 8:24 AM, vLex Systems wrote:
> Hi,
>
> I've configured replication between two instances of accumulo: one is
> the primary accumulo and the other is a peer created from a restore of
> the backup of the primary.
>
> I've followed the instructions in the manual
> (https://accumulo.apache.org/1.7/accumulo_user_manual#_replication)
> and I can see the 4 tables I've configured to replicate in the
> Accumulo Monitor but they do not replicate. They have 1 or 2 "Files
> needing replication" and this number never decreases.
>
> I've also tried inserting data in one of the tables and the data does
> not replicate to the accumulo peer instance.
>
> In the master log I see many entries like this one:
> 2017-12-29 13:22:25,490 [replication.RemoveCompleteReplicationRecords]
> INFO : Removed 0 complete replication entries from the table
> accumulo.replication
>
> Does anyone know what could be happening?
>
> Thanks.
Problems with accumulo replication
Hi,

I've configured replication between two Accumulo instances: one is the primary, and the other is a peer created by restoring a backup of the primary.

I've followed the instructions in the manual (https://accumulo.apache.org/1.7/accumulo_user_manual#_replication) and I can see the 4 tables I've configured for replication in the Accumulo Monitor, but they do not replicate. Each shows 1 or 2 "Files needing replication" and this number never decreases.

I've also tried inserting data into one of the tables, and the data does not replicate to the peer instance.

In the master log I see many entries like this one:

2017-12-29 13:22:25,490 [replication.RemoveCompleteReplicationRecords] INFO : Removed 0 complete replication entries from the table accumulo.replication

Does anyone know what could be happening?

Thanks.
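To check whether the count in those master log entries ever becomes non-zero, the line quoted in this message can be parsed with a short script. This is a minimal sketch: the regular expression is derived only from the single log line shown above, so it assumes other entries follow the same layout, and the `removed_count` helper is a hypothetical name.

```python
import re

# Pattern built from the one log entry quoted in the message; whitespace
# around "INFO" and ":" is matched loosely in case the spacing varies.
REMOVED_ENTRY = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\[replication\.RemoveCompleteReplicationRecords\]\s+"
    r"INFO\s*:\s+Removed (?P<count>\d+) complete replication entries"
)


def removed_count(line: str):
    """Return the number of removed entries, or None if the line does not match."""
    match = REMOVED_ENTRY.match(line.strip())
    return int(match.group("count")) if match else None


sample = (
    "2017-12-29 13:22:25,490 [replication.RemoveCompleteReplicationRecords] "
    "INFO : Removed 0 complete replication entries from the table "
    "accumulo.replication"
)

print(removed_count(sample))  # 0
```

Running this over the whole master log and filtering for counts greater than zero would show whether any replication record has ever completed.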