Hey Amrit, thanks for the reply! I checked out SOLR-12036, but it doesn't look like it has to do with CDCR, and the patch that is attached doesn't look CDCR related. Are you sure that's the correct JIRA number?
Thanks,
Chris

On Wed, Mar 7, 2018 at 11:21 AM, Amrit Sarkar <sarkaramr...@gmail.com> wrote:

> Hey Chris,
>
> I found a separate issue while working on CDCR which may relate to your
> problem. Please see jira: *SOLR-12063*
> <https://issues.apache.org/jira/projects/SOLR/issues/SOLR-12063>. This is a
> bug that got introduced when we added support for the bidirectional approach,
> where an extra flag is added to the tlog entry for CDCR.
>
> This part of the code is messing up:
> *UpdateLog.java.RecentUpdates::update()::*
>
> switch (oper) {
>   case UpdateLog.ADD:
>   case UpdateLog.UPDATE_INPLACE:
>   case UpdateLog.DELETE:
>   case UpdateLog.DELETE_BY_QUERY:
>     Update update = new Update();
>     update.log = oldLog;
>     update.pointer = reader.position();
>     update.version = version;
>
>     if (oper == UpdateLog.UPDATE_INPLACE && entry.size() == 5) {
>       update.previousVersion = (Long) entry.get(UpdateLog.PREV_VERSION_IDX);
>     }
>     updatesForLog.add(update);
>     updates.put(version, update);
>
>     if (oper == UpdateLog.DELETE_BY_QUERY) {
>       deleteByQueryList.add(update);
>     } else if (oper == UpdateLog.DELETE) {
>       deleteList.add(new DeleteUpdate(version, (byte[])entry.get(entry.size()-1)));
>     }
>
>     break;
>
>   case UpdateLog.COMMIT:
>     break;
>   default:
>     throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
>         "Unknown Operation! " + oper);
> }
>
> deleteList.add(new DeleteUpdate(version, (byte[])entry.get(entry.size()-1)));
>
> is expecting the last entry to be the payload, but everywhere else in the
> project *pos:[2]* is the index for the payload, while in Solr 7.2 and later
> the last entry is a *boolean* denoting whether the update is CDCR-forwarded
> or a typical one. UpdateLog.java.RecentUpdates is used in CDCR sync and
> checkpoint operations, hence it is a legit bug that slipped past the tests
> I wrote.
>
> The immediate fix patch is uploaded and I am awaiting feedback on that.
> Meanwhile, if it is possible for you to apply the patch, build the jar and
> try it out, please do and let us know.
>
> For *SOLR-9394* <https://issues.apache.org/jira/browse/SOLR-9394>, if you
> can comment on the JIRA and post the sample docs, Solr logs, and relevant
> information, I can give it a thorough look.
>
> Amrit Sarkar
> Search Engineer
> Lucidworks, Inc.
> 415-589-9269
> www.lucidworks.com
> Twitter http://twitter.com/lucidworks
> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
> Medium: https://medium.com/@sarkaramrit2
>
> On Wed, Mar 7, 2018 at 1:35 AM, Chris Troullis <cptroul...@gmail.com> wrote:
>
> > Hi all,
> >
> > We recently upgraded to Solr 7.2.0 as we saw that there were some CDCR bug
> > fixes and features added that would finally let us make use of it
> > (bi-directional syncing was the big one). The first time we tried to
> > implement it we ran into all kinds of errors, but this time we were able to
> > get it mostly working.
> >
> > The issue we seem to be having now is that any time a document is deleted
> > via deleteById from a collection on the primary node, we are flooded with
> > "Invalid Number" errors followed by a random sequence of characters when
> > CDCR tries to sync the update to the backup site. This happens on all of
> > our collections where our id fields are defined as longs (in some of them
> > the ids are compound keys and are strings).
> >
> > Here's a sample exception:
> >
> > org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://ip/solr/collection_shard1_replica_n1: Invalid Number: ]
> > -s
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:549)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1012)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:883)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
> >     at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:816)
> >     at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
> >     at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
> >     at org.apache.solr.handler.CdcrReplicator.sendRequest(CdcrReplicator.java:140)
> >     at org.apache.solr.handler.CdcrReplicator.run(CdcrReplicator.java:104)
> >     at org.apache.solr.handler.CdcrReplicatorScheduler.lambda$null$0(CdcrReplicatorScheduler.java:81)
> >     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> >     at java.lang.Thread.run(Thread.java:748)
> >
> > I'm scratching my head as to the cause of this. It's like it is trying to
> > deleteById for the value "]", even though that is not the ID for the
> > document that was deleted from the primary. So I don't know if it is
> > pulling this from the wrong field somehow or where that value is coming
> > from.
> >
> > I found this issue: https://issues.apache.org/jira/browse/SOLR-9394 which
> > looks related, but doesn't look like it has any traction.
> >
> > Has anyone else experienced this issue with CDCR, or does anyone have any
> > ideas as to what could be causing it?
> >
> > Thanks,
> >
> > Chris
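
For anyone following the thread, here is a minimal sketch of the index mismatch described in Amrit's quoted message: reading the payload as "the last element" stops working once a trailing boolean flag is appended to the entry. This is not Solr code. The class name, the helper methods, the `PAYLOAD_IDX` constant, and the entry layout (`[operation, version, payload]` plus an optional trailing flag) are all assumptions made up for illustration, based only on the description above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from Solr.
class TlogDeleteEntrySketch {
    // Assumed payload position, taken from the "pos:[2]" remark in the thread.
    static final int PAYLOAD_IDX = 2;

    // Same pattern as the quoted snippet: treat the last element as the payload.
    static Object payloadFromLast(List<Object> entry) {
        return entry.get(entry.size() - 1);
    }

    // Alternative: read the payload from its fixed position.
    static Object payloadFromFixedIndex(List<Object> entry) {
        return entry.get(PAYLOAD_IDX);
    }

    public static void main(String[] args) {
        byte[] id = "12345".getBytes(StandardCharsets.UTF_8);

        // Assumed layouts: without and with a trailing boolean flag.
        List<Object> withoutFlag = Arrays.<Object>asList(2, 1594L, id);
        List<Object> withFlag = Arrays.<Object>asList(2, 1594L, id, Boolean.FALSE);

        System.out.println(payloadFromLast(withoutFlag) instanceof byte[]);    // true
        System.out.println(payloadFromLast(withFlag) instanceof byte[]);       // false: it is the Boolean
        System.out.println(payloadFromFixedIndex(withFlag) instanceof byte[]); // true
    }
}
```

In the real code the last element is cast straight to `byte[]`, so how the wrong value surfaces there may differ from this toy version; the sketch only shows why "last element" and "index 2" stop agreeing once the flag is present.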