zero323 commented on a change in pull request #34273:
URL: https://github.com/apache/spark/pull/34273#discussion_r745561614
##########
File path: examples/src/main/python/avro_inputformat.py
##########
@@ -75,7 +75,7 @@
schema_rdd = sc.textFile(sys.argv[2], 1).collect()
conf = {"avro.schema.input.key": reduce(lambda x, y: x + y,
schema_rdd)}
- avro_rdd = sc.newAPIHadoopFile(
+ avro_rdd = sc.newAPIHadoopFile( # type: ignore[var-annotated]
Review comment:
> How about just not running checks on examples if they're to be treated
differently?
That's an option, but one I'd personally like to avoid, because the
examples most likely to cause problems are the ones corresponding to the
"old" APIs, which are the trickiest to annotate (they depend heavily on
generics, including generic `self`, interacting with user code) and have the
least comprehensive coverage.
Just to stress this point ‒ my main motivation for adding these tests is not
to validate `examples` (these are stable, highly exposed, and the majority of
them have been around long enough to assert that they're OK), but to ensure
that the annotations are consistent with common usage patterns.
I definitely won't fight over that, but I'm really not convinced that
targeted ignores are a problem, especially when we already have tons of them
(and a huge bag of `cast`s) within the pyspark code.
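To illustrate the trade-off, here is a minimal, hypothetical sketch (the `legacy_load` helper is invented, not a Spark API) of what a scoped ignore does: the `[var-annotated]` error code limits the suppression to a single mypy check on a single line, so the rest of the file, and every other example, still gets fully type-checked ‒ unlike excluding `examples/` from the run entirely.

```python
from typing import Any


def legacy_load(path: str) -> Any:
    """Hypothetical stand-in for an old, loosely-typed API such as
    ``sc.newAPIHadoopFile`` ‒ its return type is too generic for mypy
    to infer a useful variable type at the call site."""
    return [("key", 1), ("key", 2)]


# The scoped ignore silences only the ``var-annotated`` error code on
# this one line; any other mypy error here or elsewhere still surfaces.
records = legacy_load("data.avro")  # type: ignore[var-annotated]

print(len(records))
```

A bare `# type: ignore` would also work, but naming the error code keeps the suppression from masking unrelated future errors on the same line.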
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]