This is hard to say without testing. It depends on the type of computation, and also on the Spark version. In general, option 2 could be much faster if Spark / the JVM is able to apply vectorization / SIMD to it.
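To make the comparison concrete, here is a minimal sketch of the two options from the question. Everything in it is an assumption for illustration: the rule names, the `amount` and `country` columns, and the `events` streaming DataFrame are hypothetical, and passing a plain function to `writeStream.foreach` in PySpark requires Spark 2.4 or later (older versions only offer the Scala/Java `ForeachWriter`).

```python
# Hypothetical rules: names, columns, and thresholds are illustrative only.
# Each rule is written twice: as a Python predicate (option 1) and as a
# SQL expression string that Catalyst can analyze and optimize (option 2).
RULES = {
    "high_amount": {"sql": "amount > 1000", "fn": lambda r: r["amount"] > 1000},
    "foreign": {"sql": "country <> 'US'", "fn": lambda r: r["country"] != "US"},
}


def matched_rules(row):
    """Option 1 core: evaluate every rule against a single row (a dict)."""
    return [name for name, rule in RULES.items() if rule["fn"](row)]


def start_foreach_query(events):
    """Option 1: a single streaming query that checks all rules per row.

    `events` is assumed to be a streaming DataFrame. The function body is
    opaque to Catalyst, so it runs row by row on the executors.
    """
    def process(row):
        hits = matched_rules(row.asDict())
        if hits:
            print(row, hits)  # stand-in for a real sink

    return events.writeStream.foreach(process).start()


def start_sql_queries(events):
    """Option 2: one streaming query per rule, expressed as a SQL filter."""
    return [
        events.filter(rule["sql"])
        .writeStream.format("console")
        .queryName(name)
        .start()
        for name, rule in RULES.items()
    ]
```

Option 1 reads the stream once but hides the rule logic from the optimizer; option 2 exposes it to Catalyst but starts one query per rule. Which trade-off wins is exactly what would need to be measured on the real workload.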

> On 9. Aug 2017, at 07:05, Raghavendra Pandey <raghavendra.pan...@gmail.com> wrote:
>
> I am using structured streaming to evaluate multiple rules on same running stream.
> I have two options to do that. One is to use forEach and evaluate all the rules on the row.
> The other option is to express rules in spark sql dsl and run multiple queries.
> I was wondering if option 1 will result in better performance even though I can get catalyst optimization in option 2.
>
> Thanks
> Raghav