Kontinuation commented on PR #792: URL: https://github.com/apache/sedona/pull/792#issuecomment-1461205662
We can mock the static methods in `GeometrySerializer` and count the number of invocations. I've tried out [Mockito](https://site.mockito.org/) and found it quite suitable for this case.

To verify that serialization calls were actually eliminated, we can construct a simple Catalyst expression and evaluate it directly instead of running a full-fledged Spark job. We mock `GeometrySerializer` while evaluating the expression, then verify the invocation counts after `eval`. Here is an example of using `org.mockito:mockito-core:5.1.1` to test the `SerdeAware` mechanism:

```scala
import org.apache.sedona.common.geometrySerde.GeometrySerializer
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.sedona_sql.expressions.{ST_Buffer, ST_GeomFromText, ST_Point, ST_Union}
import org.locationtech.jts.geom.{Coordinate, Geometry, GeometryFactory}
import org.mockito.ArgumentMatchers.any
import org.mockito.Mockito.{atMost, mockStatic}
import org.scalatest.funspec.AnyFunSpec

class SerdeAwareFunctionSpec extends AnyFunSpec {

  describe("SerdeAwareFunction") {
    it("should save us some serialize and deserialize cost") {
      // Mock GeometrySerializer
      val factory = new GeometryFactory
      val stubGeom = factory.createPoint(new Coordinate(1, 2))
      val mocked = mockStatic(classOf[GeometrySerializer])
      mocked.when(() => GeometrySerializer.deserialize(any(classOf[Array[Byte]]))).thenReturn(stubGeom)
      mocked.when(() => GeometrySerializer.serialize(any(classOf[Geometry]))).thenReturn(Array[Byte](1, 2, 3))

      val expr = ST_Union(Seq(
        ST_Buffer(Seq(ST_GeomFromText(Seq(Literal("POINT (1 2)"), Literal(0))), Literal(1.0))),
        ST_Point(Seq(Literal(1.0), Literal(2.0), Literal(null)))
      ))

      try {
        // Evaluate an expression
        expr.eval(null)

        // Verify number of invocations
        mocked.verify(
          () => GeometrySerializer.deserialize(any(classOf[Array[Byte]])),
          atMost(0))
        mocked.verify(
          () => GeometrySerializer.serialize(any(classOf[Geometry])),
          atMost(1))
      } finally {
        // Undo the mock
        mocked.close()
      }
    }
  }
}
```

This test passes on this branch and fails on the master branch:

```
Wanted at most 0 times but was 3
org.mockito.exceptions.verification.MoreThanAllowedActualInvocations: Wanted at most 0 times but was 3
	at org.apache.sedona.common.geometrySerde.GeometrySerializer.deserialize(GeometrySerializer.java:64)
	at org.apache.sedona.sql.functions.SerdeAwareFunctionSpec.$anonfun$new$5(SerdeAwareFunctionSpec.scala:52)
	at org.apache.sedona.sql.functions.SerdeAwareFunctionSpec.$anonfun$new$2(SerdeAwareFunctionSpec.scala:53)
	...
```

Note that we've mocked `GeometrySerializer` to return constant results, so the result of `eval` is incorrect. This does not matter, because the purpose of the test is only to collect information about the code path that actually got executed. Any stub values that let the evaluation run to completion should be fine.
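
To illustrate why counting serializer calls reveals the executed code path, here is a minimal, self-contained sketch that uses neither Mockito nor Sedona. Every name in it (`Geom`, `CountingSerde`, `Expr`, `NaiveBuffer`, `AwareBuffer`, and so on) is hypothetical and exists only for this toy model; it is not how Sedona's expressions are implemented. The "naive" nodes pass serialized bytes between parent and child, while the "aware" nodes hand over the in-memory object and serialize only at the root.

```scala
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical stand-in for a geometry object (not a JTS or Sedona type).
final case class Geom(wkt: String)

// Toy serializer that counts how often each direction is invoked.
object CountingSerde {
  var serializeCalls = 0
  var deserializeCalls = 0

  def serialize(g: Geom): Array[Byte] = {
    serializeCalls += 1
    g.wkt.getBytes(UTF_8)
  }

  def deserialize(bytes: Array[Byte]): Geom = {
    deserializeCalls += 1
    Geom(new String(bytes, UTF_8))
  }

  def reset(): Unit = {
    serializeCalls = 0
    deserializeCalls = 0
  }
}

sealed trait Expr {
  // Produce the in-memory geometry.
  def evalGeom(): Geom
  // Produce the serialized form, which is what a non-aware parent asks for.
  def evalBytes(): Array[Byte] = CountingSerde.serialize(evalGeom())
}

final case class Point(x: Double, y: Double) extends Expr {
  def evalGeom(): Geom = Geom(s"POINT ($x $y)")
}

// Naive nodes: every child result makes a round trip through the serializer.
final case class NaiveBuffer(child: Expr) extends Expr {
  def evalGeom(): Geom =
    Geom(s"BUFFER(${CountingSerde.deserialize(child.evalBytes()).wkt})")
}

final case class NaiveUnion(left: Expr, right: Expr) extends Expr {
  def evalGeom(): Geom = {
    val l = CountingSerde.deserialize(left.evalBytes())
    val r = CountingSerde.deserialize(right.evalBytes())
    Geom(s"UNION(${l.wkt}, ${r.wkt})")
  }
}

// Aware nodes: children hand over the object directly, skipping the serializer.
final case class AwareBuffer(child: Expr) extends Expr {
  def evalGeom(): Geom = Geom(s"BUFFER(${child.evalGeom().wkt})")
}

final case class AwareUnion(left: Expr, right: Expr) extends Expr {
  def evalGeom(): Geom =
    Geom(s"UNION(${left.evalGeom().wkt}, ${right.evalGeom().wkt})")
}

object SerdeCountDemo {
  // Evaluate the root to bytes and report (serialize calls, deserialize calls).
  def counts(root: Expr): (Int, Int) = {
    CountingSerde.reset()
    root.evalBytes()
    (CountingSerde.serializeCalls, CountingSerde.deserializeCalls)
  }

  def main(args: Array[String]): Unit = {
    val naive = NaiveUnion(NaiveBuffer(Point(1.0, 2.0)), Point(1.0, 2.0))
    val aware = AwareUnion(AwareBuffer(Point(1.0, 2.0)), Point(1.0, 2.0))
    println(counts(naive)) // prints (4,3)
    println(counts(aware)) // prints (1,0)
  }
}
```

In this toy model the aware tree performs zero deserializations and a single serialization at the root, which is the same shape of assertion as `atMost(0)` and `atMost(1)` in the Mockito test. The Mockito approach has the advantage of not requiring any counting code in the production serializer.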
