SeasonPilot opened a new pull request, #707:
URL: https://github.com/apache/geaflow/pull/707
### What changes were proposed in this pull request?
This PR adds comprehensive test coverage for the Jaccard Similarity
algorithm implementation to resolve Issue #42.
Changes:
- Added 8 new test methods in GQLAlgorithmTest.java to thoroughly test the
Jaccard Similarity algorithm
- Created 7 new SQL test query files covering various scenarios
- Added 7 corresponding expected result files for test validation
- Created a custom test graph with 9 vertices and 11 edges, including edge
cases (a self-loop and an isolated vertex)
- Added 2 test data files (vertices and edges) for the custom graph
Test Coverage:
1. Basic Functionality: Standard similarity calculation between different
vertices
2. No Common Neighbors: Vertices with disjoint neighbor sets (Jaccard = 0.0)
3. Identical Vertices: Self-comparison case (returns empty by design)
4. High Similarity: Vertices with significant neighbor overlap
5. Complete Overlap: Vertices whose neighbor sets are identical
6. Disjoint Sets: Vertices from different connected components
7. Self-Loop: Vertex with self-loop edge
8. Isolated Vertex: Vertex with no neighbors
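As a quick illustration of scenarios 2, 4 and 5, the arithmetic below applies the Jaccard formula to three pairs of neighbor sets. The sets N(u) and N(v) and their members a, b, c, d are hypothetical and are not taken from this PR's test graph:

```latex
\begin{align*}
\text{No common neighbors:}\quad & N(u)=\{a,b\},\; N(v)=\{c,d\}
  & J &= \frac{|\varnothing|}{|\{a,b,c,d\}|} = \frac{0}{4} = 0 \\
\text{High similarity:}\quad & N(u)=\{a,b,c\},\; N(v)=\{b,c,d\}
  & J &= \frac{|\{b,c\}|}{|\{a,b,c,d\}|} = \frac{2}{4} = 0.5 \\
\text{Complete overlap:}\quad & N(u)=N(v)=\{a,b\}
  & J &= \frac{|\{a,b\}|}{|\{a,b\}|} = \frac{2}{2} = 1
\end{align*}
```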
Files Added (18 total):
- 7 test query SQL files in src/test/resources/query/
- 7 expected result files in src/test/resources/expect/
- 2 test data files in src/test/resources/data/
- 2 test graph definition files
- 7 test methods added to GQLAlgorithmTest.java (70 lines)
### How was this PR tested?
- [ ] Tests have been added for the changes
- [ ] Production environment verified
<img width="3020" height="1770" alt="image"
src="https://github.com/user-attachments/assets/e8e221a7-a69f-4249-9fa1-34cdf68c13b3"
/>
Test Execution Results:
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0
Build: SUCCESS
Total time: 44.103 seconds
All 8 test cases passed. The tests validate:
- Correct Jaccard coefficient calculations following the formula:
J(A,B) = |A ∩ B| / |A ∪ B|
- Proper handling of edge cases (self-loops, isolated vertices, identical
vertices)
- Undirected graph semantics using EdgeDirection.BOTH
- Empty result behavior for self-comparisons (intentional algorithm design)
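
For reference, the checks above can be illustrated with a minimal standalone Java sketch of the formula over undirected neighbor sets. This is not GeaFlow's implementation: the class `JaccardSketch`, its methods, and the sample edges are hypothetical, and returning 0.0 when both neighbor sets are empty is an assumption made here, not behavior confirmed by this PR.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of GeaFlow.
class JaccardSketch {

    // Build undirected neighbor sets: each edge adds both endpoints
    // to each other's set, mirroring "both directions" semantics.
    static Map<Integer, Set<Integer>> neighbors(int[][] edges) {
        Map<Integer, Set<Integer>> adj = new HashMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        return adj;
    }

    // J(A,B) = |A ∩ B| / |A ∪ B|.
    // Assumption: an empty union (two isolated vertices) yields 0.0.
    static double jaccard(Set<Integer> a, Set<Integer> b) {
        Set<Integer> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<Integer> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Sample edges, including a self-loop on vertex 6.
        // Vertex 7 is isolated: it appears in no edge.
        int[][] edges = {{1, 3}, {1, 4}, {2, 3}, {2, 4}, {2, 5}, {6, 6}};
        Map<Integer, Set<Integer>> adj = neighbors(edges);
        Set<Integer> isolated = adj.getOrDefault(7, Collections.emptySet());

        // N(1) = {3, 4}, N(2) = {3, 4, 5}: intersection 2, union 3.
        System.out.println("J(1,2) = " + jaccard(adj.get(1), adj.get(2)));
        // N(6) = {6} because of the self-loop; disjoint from N(1).
        System.out.println("J(1,6) = " + jaccard(adj.get(1), adj.get(6)));
        // An isolated vertex has an empty neighbor set.
        System.out.println("J(1,7) = " + jaccard(adj.get(1), isolated));
    }
}
```

In this sketch a self-loop places the vertex in its own neighbor set; whether the algorithm under test treats self-loops the same way is for the expected result files to confirm.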
Test Framework: TestNG with GeaFlow's QueryTester infrastructure
Resolves #42
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]