Hi all,

I'm working on a setup that uses Apache Flink in an assignment for a bachelor-level Big Data university course, and I'm interested in your views on this. To sketch the situation:

- more than 200 students follow this course
- students have to write some (simple) Flink applications using the DataStream API; the focus is on writing the transformation code
- students need to write Scala code
- we provide a dataset and a template (a Scala class) with function signatures and a detailed description per application, e.g.:
  def assignment_one(input: DataStream[Event]): DataStream[(String, Int)] = ???
- we provide some setup code, such as parsing the data and setting up the streaming environment
- assignments need to be auto-graded, based on correct results
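For illustration, here is a plain-Scala analogue of the kind of transformation such an assignment might ask for (counting events per key). The `Event` type and its `category` field are hypothetical stand-ins for the course's parsed input record; a real solution would express the same logic with Flink DataStream operators instead of collection methods:

```scala
// Hypothetical Event type standing in for the course's parsed input record.
case class Event(category: String, payload: String)

// Plain-Scala analogue of assignment_one: count events per category.
// A Flink version would use the same map/key/aggregate shape on a DataStream.
def assignmentOneLocal(input: List[Event]): List[(String, Int)] =
  input
    .map(e => (e.category, 1))                                  // key each event
    .groupBy(_._1)                                              // group by category
    .map { case (cat, pairs) => (cat, pairs.map(_._2).sum) }    // sum the counts
    .toList
    .sortBy(_._1)                                               // stable order for comparison

val sample = List(Event("click", "a"), Event("view", "b"), Event("click", "c"))
println(assignmentOneLocal(sample)) // List((click,2), (view,1))
```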
In last year's course edition we approached this with a custom Docker container. The container first compiled the students' code, ran all the Flink applications against a different dataset, and then verified the output against our reference solutions. The result was turned into a grade and reported back to the student. Although this approach worked, I think we can do better.

I'm wondering whether any of you have experience with using Apache Flink in a university course (or have seen this done somewhere), as well as with assessing Flink code.

Thanks a lot!

Kind regards,
Wouter Zorgdrager
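P.S. For concreteness, the output-verification step could be sketched as below. All names and the line format are illustrative, not our actual grader; the key idea is comparing outputs as multisets so the order in which records are emitted does not affect the grade:

```scala
// Sketch of result verification: compare student output against the reference
// solution as multisets of lines, so emission order is ignored but duplicate
// counts still matter. Line format and whitespace handling are illustrative.
def gradeAssignment(studentLines: Seq[String], referenceLines: Seq[String]): Boolean = {
  def asMultiset(lines: Seq[String]): Map[String, Int] =
    lines.map(_.trim).filter(_.nonEmpty).groupBy(identity).map { case (l, xs) => (l, xs.size) }
  asMultiset(studentLines) == asMultiset(referenceLines)
}

val reference = Seq("(click,2)", "(view,1)")
println(gradeAssignment(Seq("(view,1)", "(click,2)"), reference)) // true: order ignored
println(gradeAssignment(Seq("(click,1)", "(view,1)"), reference)) // false: wrong count
```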