nickva commented on issue #3773: URL: https://github.com/apache/couchdb/issues/3773#issuecomment-937959517
Great analysis, @jcoglan. I think you may be right that it has to do with how intermediate results are stored and how the unicode collator compares them. We had a recent fix that may be related: https://github.com/apache/couchdb/commit/4f33f14deb733f36cf0df03aedceebc746716d8c. There we noticed that the collation rules on the shards differ from the collation rules used when aggregating rows in the coordinator (fabric). I wonder which representation is the correct one: should these two rows be considered equal (does unicode collation consider them equivalent), or is it correct that they are emitted as separate rows? If the first is correct, then the coordinator reduce step (in fabric) may have a bug where it matches keys exactly instead of using unicode collation.
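
To make the question concrete, here is a rough Python sketch of the two merge behaviours being contrasted. It is not CouchDB or fabric code. The function names (`collation_key`, `merge_exact`, `merge_collated`) and the sample rows are hypothetical, and NFD normalization is only a stand-in for a real ICU collation key, on the assumption that the two keys in question are canonically equivalent strings.

```python
import unicodedata


def collation_key(key):
    # Assumption: stand-in for an ICU sort key. NFD normalization maps
    # canonically equivalent strings to the same code point sequence.
    return unicodedata.normalize("NFD", key)


def merge_exact(rows):
    # Merge (key, value) rows from shards, grouping on exact key equality.
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return sorted(totals.items())


def merge_collated(rows):
    # Merge rows, grouping keys that share the same (stand-in) collation key.
    # The first key seen in each group is kept as the emitted key.
    totals = {}
    for key, value in rows:
        ck = collation_key(key)
        if ck in totals:
            first_key, total = totals[ck]
            totals[ck] = (first_key, total + value)
        else:
            totals[ck] = (key, value)
    return [(k, v) for _, (k, v) in sorted(totals.items())]


# Hypothetical shard output: the same visible character "é", once precomposed
# (U+00E9) and once as "e" followed by a combining acute accent (U+0301).
shard_rows = [("\u00e9", 1), ("e\u0301", 2)]

exact_rows = merge_exact(shard_rows)        # two separate rows
collated_rows = merge_collated(shard_rows)  # one combined row
```

With exact matching the two keys stay as separate rows, while the collation-aware merge folds them into a single row with the summed value. Which of the two is the right behaviour is exactly the open question above.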
