Kontinuation commented on PR #792:
URL: https://github.com/apache/sedona/pull/792#issuecomment-1461205662

   We can mock the static methods in `GeometrySerializer` and count the number 
of invocations. I've tried out [Mockito](https://site.mockito.org/) and found 
it quite suitable for this case.
   
   To verify that serialization calls were actually eliminated, we can 
construct a simple Catalyst expression and evaluate it directly instead of 
running a full-fledged Spark job. We mock `GeometrySerializer` while evaluating 
the expression, then verify the invocation counts afterwards.
   
   Here is an example of using `org.mockito:mockito-core:5.1.1` to test the 
`SerdeAware` mechanism:
   
   ```scala
   import org.apache.sedona.common.geometrySerde.GeometrySerializer
   import org.apache.spark.sql.catalyst.expressions.Literal
   import org.apache.spark.sql.sedona_sql.expressions._
   import org.locationtech.jts.geom.{Coordinate, Geometry, GeometryFactory}
   import org.mockito.ArgumentMatchers.any
   import org.mockito.Mockito.{atMost, mockStatic}
   import org.scalatest.funspec.AnyFunSpec

   class SerdeAwareFunctionSpec extends AnyFunSpec {
   
     describe("SerdeAwareFunction") {
       it("should save us some serialize and deserialize cost") {
         // Mock GeometrySerializer
         val factory = new GeometryFactory
         val stubGeom = factory.createPoint(new Coordinate(1, 2))
         val mocked = mockStatic(classOf[GeometrySerializer])
         mocked.when(() => GeometrySerializer.deserialize(any(classOf[Array[Byte]])))
           .thenReturn(stubGeom)
         mocked.when(() => GeometrySerializer.serialize(any(classOf[Geometry])))
           .thenReturn(Array[Byte](1, 2, 3))
   
         val expr = ST_Union(Seq(
           ST_Buffer(Seq(
             ST_GeomFromText(Seq(Literal("POINT (1 2)"), Literal(0))),
             Literal(1.0))),
           ST_Point(Seq(Literal(1.0), Literal(2.0), Literal(null)))
         ))
   
         try {
           // Evaluate an expression
           expr.eval(null)
   
           // Verify number of invocations
           mocked.verify(
             () => GeometrySerializer.deserialize(any(classOf[Array[Byte]])),
             atMost(0))
           mocked.verify(
             () => GeometrySerializer.serialize(any(classOf[Geometry])),
             atMost(1))
         } finally {
           // Undo the mock
           mocked.close()
         }
       }
     }
   }
   ```
   
   This test passes on this branch and fails on the master branch with the 
following error:
   
   ```
   Wanted at most 0 times but was 3
   org.mockito.exceptions.verification.MoreThanAllowedActualInvocations: 
   Wanted at most 0 times but was 3
        at org.apache.sedona.common.geometrySerde.GeometrySerializer.deserialize(GeometrySerializer.java:64)
        at org.apache.sedona.sql.functions.SerdeAwareFunctionSpec.$anonfun$new$5(SerdeAwareFunctionSpec.scala:52)
        at org.apache.sedona.sql.functions.SerdeAwareFunctionSpec.$anonfun$new$2(SerdeAwareFunctionSpec.scala:53)
   ...
   ```
   
   Note that we've mocked `GeometrySerializer` to return constant results, so 
the result of `eval` is incorrect. That does not matter, since the purpose of 
this test is to observe which code path was actually executed. Any stub values 
that let the evaluation run to completion should be fine.
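   
   To make the counting idea concrete without Mockito or Spark, here is a toy model of it. This is only a sketch and not Sedona's implementation: `CountingSerde`, `Expr`, `Const`, `NaiveAdd` and `AwareAdd` are hypothetical names, and doubles stand in for geometries. Naive nodes pass serialized bytes between each other, while serde-aware nodes pass in-memory objects and serialize only the final result.
   
   ```scala
   import java.nio.ByteBuffer
   
   // Hypothetical stand-in for a serializer; it only counts how often it is called.
   object CountingSerde {
     var serializeCalls = 0
     var deserializeCalls = 0
   
     def serialize(value: Double): Array[Byte] = {
       serializeCalls += 1
       ByteBuffer.allocate(8).putDouble(value).array()
     }
   
     def deserialize(bytes: Array[Byte]): Double = {
       deserializeCalls += 1
       ByteBuffer.wrap(bytes).getDouble()
     }
   
     def reset(): Unit = { serializeCalls = 0; deserializeCalls = 0 }
   }
   
   sealed trait Expr {
     def eval(): Array[Byte]   // result in serialized form
     def evalObject(): Double  // result as an in-memory object
   }
   
   final case class Const(v: Double) extends Expr {
     def eval(): Array[Byte] = CountingSerde.serialize(v)
     def evalObject(): Double = v
   }
   
   // Naive node: asks children for bytes and deserializes them again.
   final case class NaiveAdd(l: Expr, r: Expr) extends Expr {
     def evalObject(): Double =
       CountingSerde.deserialize(l.eval()) + CountingSerde.deserialize(r.eval())
     def eval(): Array[Byte] = CountingSerde.serialize(evalObject())
   }
   
   // Serde-aware node: asks children for objects, serializes only its own result.
   final case class AwareAdd(l: Expr, r: Expr) extends Expr {
     def evalObject(): Double = l.evalObject() + r.evalObject()
     def eval(): Array[Byte] = CountingSerde.serialize(evalObject())
   }
   
   object SerdeCountDemo {
     // Evaluates the expression and returns (serialize calls, deserialize calls).
     def countCalls(expr: Expr): (Int, Int) = {
       CountingSerde.reset()
       expr.eval()
       (CountingSerde.serializeCalls, CountingSerde.deserializeCalls)
     }
   
     def main(args: Array[String]): Unit = {
       val naive = NaiveAdd(NaiveAdd(Const(1.0), Const(2.0)), Const(3.0))
       val aware = AwareAdd(AwareAdd(Const(1.0), Const(2.0)), Const(3.0))
       println(countCalls(naive)) // prints (5,4)
       println(countCalls(aware)) // prints (1,0)
     }
   }
   ```
   
   The serde-aware tree ends up with one serialization and no deserialization, which is the same shape of assertion as the `atMost(1)` and `atMost(0)` checks in the test above. The values flowing through the tree are irrelevant to the counts.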
   

