zero323 commented on a change in pull request #34273:
URL: https://github.com/apache/spark/pull/34273#discussion_r745561614
##########
File path: examples/src/main/python/avro_inputformat.py
##########
@@ -75,7 +75,7 @@
schema_rdd = sc.textFile(sys.argv[2], 1).collect()
conf = {"avro.schema.input.key": reduce(lambda x, y: x + y,
schema_rdd)}
- avro_rdd = sc.newAPIHadoopFile(
+ avro_rdd = sc.newAPIHadoopFile( # type: ignore[var-annotated]
Review comment:
> How about just not running checks on examples if they're to be treated
differently?
That's an option, but one I'd personally like to avoid, because the
examples most likely to cause problems are the ones corresponding to the
"old" APIs, which are the trickiest to annotate (they depend heavily on
generics, including generic `self`, interacting with user code) and have the
least comprehensive coverage.
Just to stress this point ‒ my main motivation for adding these tests is not
to validate `examples` (these are stable, highly exposed, and the majority of
them have been around long enough to assert that they're OK), but to ensure
that the annotations are consistent with common usage patterns.
I definitely won't fight over that, but I'm really not convinced that
targeted ignores are a problem, especially when we already have tons of them
(and a huge bag of `cast`s) within the pyspark code.
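To illustrate the trade-off, here is a minimal, hypothetical sketch (the `legacy_load` helper is invented, not a Spark API) of what a scoped ignore does: the `[var-annotated]` error code limits the suppression to a single mypy check on a single line, so the rest of the file, and every other example, still gets fully type-checked ‒ unlike excluding `examples/` from the run entirely.

```python
from typing import Any


def legacy_load(path: str) -> Any:
    """Hypothetical stand-in for an old, loosely-typed API such as
    ``sc.newAPIHadoopFile`` ‒ its return type is too generic for mypy
    to infer a useful variable type at the call site."""
    return [("key", 1), ("key", 2)]


# The scoped ignore silences only the ``var-annotated`` error code on
# this one line; any other mypy error here or elsewhere still surfaces.
records = legacy_load("data.avro")  # type: ignore[var-annotated]

print(len(records))
```

A bare `# type: ignore` would also work, but naming the error code keeps the suppression from masking unrelated future errors on the same line.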
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]