I am new to Spark Streaming and am trying to understand the Spark UI and how to optimize my job.
1. Processing at the executors took less time than at the driver. How can I make the driver-side work faster?

2. We use dstream.repartition(defaultParallelism*3) to increase parallelism, which causes heavy shuffling. Is there a way to get enough parallelism without repartitioning manually, so we can reduce data shuffles?

3. I am also trying to understand how the 6 tasks in stage 1 and the 199 tasks in stage 2 were created.

Hardware configuration:
- executor-cores: 3
- driver-cores: 3
- dynamicAllocation: true
- initial/min/maxExecutors: 25

Screenshots are on StackOverflow: https://stackoverflow.com/questions/62993030/spark-dstream-help-needed-to-understand-ui-and-how-to-set-parallelism-or-defau

Thanks in advance.
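Regarding question 2: one alternative I have read about in the Spark Streaming tuning guide is to control the number of partitions created per batch through configuration instead of calling repartition. This is a sketch of how I understand it, assuming a receiver-based DStream; the concrete values below are hypothetical, not from our job:

```
# spark-defaults.conf (hypothetical values, receiver-based DStream assumed)

# For receiver-based DStreams, tasks per batch ~= batchInterval / blockInterval,
# so lowering blockInterval below its 200ms default increases parallelism
# without the shuffle that dstream.repartition() incurs.
spark.streaming.blockInterval   50ms

# Default partition count for shuffle transformations (e.g. reduceByKey)
# when no explicit numPartitions is passed. 225 = 25 executors * 3 cores * 3,
# matching the defaultParallelism*3 we currently pass to repartition.
spark.default.parallelism       225
```

Is this a reasonable replacement for the manual repartition, or are there cases where repartition is still needed?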