This is an automated email from the ASF dual-hosted git repository.

skperez pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-nexus.git


The following commit(s) were added to refs/heads/master by this push:
     new 675f7c2  SDAP 412 - Solution to Duplicate Primary Issue in /match_spark Endpoint (#216)
675f7c2 is described below

commit 675f7c2afe2beb9898243be1a3024a72bdab7ca0
Author: Riley Kuttruff <[email protected]>
AuthorDate: Thu Mar 23 09:27:59 2023 -0700

    SDAP 412 - Solution to Duplicate Primary Issue in /match_spark Endpoint (#216)
    
    * Explicitly defined equality for DomsPoint.
    
    This prevents the duplicate primary points from appearing in the final results by merging them in the combineByKey step.
    
    * Lazy hashing for domspoint
    
    * Moved changelog entry
    
    * Simplified __eq__ and __hash__
    
    * Updated changelog entry to better reflect reasoning for fix
    
    * Switched equality field from data_id to object id @ construction time
    
    ---------
    
    Co-authored-by: rileykk <[email protected]>
---
 CHANGELOG.md                                    | 1 +
 analysis/webservice/algorithms_spark/Matchup.py | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index a99ee89..dc3baeb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -28,6 +28,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - SDAP-449: Fixed 404 error when populating datasets; script was still using `/domslist`
 - SDAP-415: Fixed bug where mask was incorrectly combined across all variables for multi-variable satellite to satellite matchup
 - SDAP-434: Fix for webapp Docker image build failure
+- SDAP-412: Explicit definition of `__eq__` and `__hash__` in matchup `DomsPoint` class. This ensures all primary-secondary pairs with the same primary point are merged in the `combineByKey` step.
 ### Security
 
 ## [1.0.0] - 2022-12-05
diff --git a/analysis/webservice/algorithms_spark/Matchup.py b/analysis/webservice/algorithms_spark/Matchup.py
index 2bc91c3..1274b64 100644
--- a/analysis/webservice/algorithms_spark/Matchup.py
+++ b/analysis/webservice/algorithms_spark/Matchup.py
@@ -368,9 +368,17 @@ class DomsPoint(object):
         self.device = None
         self.file_url = None
 
+        self.__id = id(self)
+
     def __repr__(self):
         return str(self.__dict__)
 
+    def __eq__(self, other):
+        return isinstance(other, DomsPoint) and other.__id == self.__id
+
+    def __hash__(self):
+        return hash(self.data_id) if self.data_id else id(self)
+
     @staticmethod
     def _variables_to_device(variables):
         """

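The effect of the change above can be illustrated with a minimal sketch. It assumes that the duplicates arose because copies of one primary point (for example, after Spark serializes objects between stages) are distinct Python objects and so do not compare equal by default. The `Point` class is a hypothetical stand-in for `DomsPoint`, `copy.copy` stands in for the serialization round trip, and the dictionary grouping stands in for the `combineByKey` merge; none of this is the project's actual code.

```python
import copy


class Point:
    """Hypothetical stand-in for DomsPoint with only the fields needed here."""

    def __init__(self, data_id=None):
        self.data_id = data_id
        # Identity is captured once, at construction time. It is stored in
        # the instance dict, so copies of this object carry the same value.
        self.__id = id(self)

    def __eq__(self, other):
        return isinstance(other, Point) and other.__id == self.__id

    def __hash__(self):
        return hash(self.data_id) if self.data_id else id(self)


primary = Point(data_id="p1")

# Two copies of the same point (assumption: analogous to deserialized copies).
copy_a = copy.copy(primary)
copy_b = copy.copy(primary)

copies_are_distinct_objects = copy_a is not copy_b
copies_compare_equal = copy_a == copy_b

# Grouping by key merges both copies into a single entry.
merged = {}
for key, secondary in [(copy_a, "s1"), (copy_b, "s2")]:
    merged.setdefault(key, []).append(secondary)

# A separately constructed point with the same data_id stays distinct.
other = Point(data_id="p1")
```

In this sketch both copies land under one key, while `other` does not compare equal to `primary` even though the two share a `data_id`. Note that when `data_id` is unset the hash falls back to `id(self)`, so copies of such a point would hash differently.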