Hey Amrit,

Did you happen to see my last reply?  Is SOLR-12036 the correct JIRA?

Thanks,

Chris

On Wed, Mar 7, 2018 at 1:52 PM, Chris Troullis <cptroul...@gmail.com> wrote:

> Hey Amrit, thanks for the reply!
>
> I checked out SOLR-12036, but it doesn't look like it has to do with CDCR,
> and the patch that is attached doesn't look CDCR related. Are you sure
> that's the correct JIRA number?
>
> Thanks,
>
> Chris
>
> On Wed, Mar 7, 2018 at 11:21 AM, Amrit Sarkar <sarkaramr...@gmail.com>
> wrote:
>
>> Hey Chris,
>>
>> I found a separate issue while working on CDCR that may be related to your
>> problem. Please see JIRA: *SOLR-12063*
>> <https://issues.apache.org/jira/projects/SOLR/issues/SOLR-12063>. This
>> bug was introduced when we added support for the bidirectional approach,
>> which adds an extra flag to the tlog entry for CDCR.
>>
>> This part of the code is messing up:
>> *UpdateLog.java.RecentUpdates::update()::*
>>
>> switch (oper) {
>>   case UpdateLog.ADD:
>>   case UpdateLog.UPDATE_INPLACE:
>>   case UpdateLog.DELETE:
>>   case UpdateLog.DELETE_BY_QUERY:
>>     Update update = new Update();
>>     update.log = oldLog;
>>     update.pointer = reader.position();
>>     update.version = version;
>>
>>     if (oper == UpdateLog.UPDATE_INPLACE && entry.size() == 5) {
>>       update.previousVersion = (Long) entry.get(UpdateLog.PREV_VERSION_IDX);
>>     }
>>     updatesForLog.add(update);
>>     updates.put(version, update);
>>
>>     if (oper == UpdateLog.DELETE_BY_QUERY) {
>>       deleteByQueryList.add(update);
>>     } else if (oper == UpdateLog.DELETE) {
>>       deleteList.add(new DeleteUpdate(version, (byte[])entry.get(entry.size()-1)));
>>     }
>>
>>     break;
>>
>>   case UpdateLog.COMMIT:
>>     break;
>>   default:
>>     throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Unknown Operation! " + oper);
>> }
>>
>> deleteList.add(new DeleteUpdate(version, (byte[])entry.get(entry.size()-1)));
>>
>> expects the last entry to be the payload, but everywhere else in the
>> project, *pos:[2]* is the index for the payload. In the source code for
>> Solr 7.2 and later, the last entry is a *boolean* denoting whether the
>> update is CDCR-forwarded or a typical one. UpdateLog.java.RecentUpdates is
>> used in CDCR sync and checkpoint operations, so this is a legitimate bug
>> that slipped past the tests I wrote.
>>
>> The immediate fix patch is uploaded and I am awaiting feedback on that.
>> Meanwhile if it is possible for you to apply the patch, build the jar and
>> try it out, please do and let us know.
>>
>> For *SOLR-9394* <https://issues.apache.org/jira/browse/SOLR-9394>, if you
>> can comment on the JIRA and post the sample docs, Solr logs, and other
>> relevant information, I can give it a thorough look.
>>
>> Amrit Sarkar
>> Search Engineer
>> Lucidworks, Inc.
>> 415-589-9269
>> www.lucidworks.com
>> Twitter http://twitter.com/lucidworks
>> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>> Medium: https://medium.com/@sarkaramrit2
>>
>> On Wed, Mar 7, 2018 at 1:35 AM, Chris Troullis <cptroul...@gmail.com>
>> wrote:
>>
>> > Hi all,
>> >
>> > We recently upgraded to Solr 7.2.0 as we saw that there were some CDCR bug
>> > fixes and features added that would finally let us make use of it
>> > (bi-directional syncing was the big one). The first time we tried to
>> > implement it we ran into all kinds of errors, but this time we were able to
>> > get it mostly working.
>> >
>> > The issue we seem to be having now is that any time a document is deleted
>> > via deleteById from a collection on the primary node, we are flooded with
>> > "Invalid Number" errors followed by a random sequence of characters when
>> > CDCR tries to sync the update to the backup site. This happens on all of
>> > our collections where the id field is defined as a long (in some of our
>> > collections the ids are compound keys and are strings).
>> >
>> > Here's a sample exception:
>> >
>> > org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>> > from server at http://ip/solr/collection_shard1_replica_n1: Invalid
>> > Number:  ]
>> > -s
>> >         at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:549)
>> >         at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1012)
>> >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:883)
>> >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
>> >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
>> >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
>> >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
>> >         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
>> >         at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:816)
>> >         at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
>> >         at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
>> >         at org.apache.solr.handler.CdcrReplicator.sendRequest(CdcrReplicator.java:140)
>> >         at org.apache.solr.handler.CdcrReplicator.run(CdcrReplicator.java:104)
>> >         at org.apache.solr.handler.CdcrReplicatorScheduler.lambda$null$0(CdcrReplicatorScheduler.java:81)
>> >         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> >         at java.lang.Thread.run(Thread.java:748)
>> >
>> >
>> > I'm scratching my head as to the cause of this. It's as if it is trying to
>> > deleteById for the value "]", even though that is not the ID of the
>> > document that was deleted from the primary. So I don't know whether it is
>> > pulling this from the wrong field somehow, or where that value is coming
>> > from.
>> >
>> > I found this issue: https://issues.apache.org/jira/browse/SOLR-9394 which
>> > looks related, but it doesn't appear to have any traction.
>> >
>> > Has anyone else experienced this issue with CDCR, or have any ideas as to
>> > what could be causing it?
>> >
>> > Thanks,
>> >
>> > Chris
>> >
>>
>
>
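
For anyone following the thread, the indexing mismatch Amrit describes in his quoted reply (reading the payload from the last position of the tlog entry rather than from position 2) can be illustrated with a small standalone Java sketch. This is not Solr code: the class name TlogEntryDemo, the helper methods, the operation code, and the simplified entry layout are all assumptions made up for illustration. It only shows why a "last element" lookup stops returning the payload once a trailing boolean flag is appended.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class TlogEntryDemo {
    // Hypothetical constant for this sketch: fixed position of the payload.
    public static final int PAYLOAD_IDX = 2;

    // Builds a simplified, made-up delete entry: [operation, version, payload]
    // and, when cdcrFlag is non-null, a trailing boolean flag as well.
    public static List<Object> deleteEntry(long version, String id, Boolean cdcrFlag) {
        byte[] payload = id.getBytes(StandardCharsets.UTF_8);
        if (cdcrFlag == null) {
            return Arrays.<Object>asList(2, version, payload);
        }
        return Arrays.<Object>asList(2, version, payload, cdcrFlag);
    }

    // Mirrors the lookup pattern quoted in the thread: take the last element.
    public static Object payloadByLastIndex(List<Object> entry) {
        return entry.get(entry.size() - 1);
    }

    // Alternative lookup: read the payload from its fixed position.
    public static Object payloadByFixedIndex(List<Object> entry) {
        return entry.get(PAYLOAD_IDX);
    }

    public static void main(String[] args) {
        List<Object> withoutFlag = deleteEntry(100L, "12345", null);
        List<Object> withFlag = deleteEntry(101L, "12345", Boolean.FALSE);

        // Without the trailing flag, both lookups find the byte[] payload.
        System.out.println(payloadByLastIndex(withoutFlag) instanceof byte[]);  // true
        // With the flag appended, the last element is a Boolean, not the payload.
        System.out.println(payloadByLastIndex(withFlag) instanceof byte[]);     // false
        // The fixed-position lookup still finds the payload.
        System.out.println(payloadByFixedIndex(withFlag) instanceof byte[]);    // true
    }
}
```

Whether the actual patch on the JIRA takes this fixed-index approach is not stated in the thread, so treat the sketch as an illustration of the symptom rather than of the fix.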
