Hi Jacek, I know that it is not currently doing so, but should it be? The algorithm isn’t complicated and could be applied to both OR and AND logical operators with comparison operators as children. My users write programs to generate queries that aren’t checked for this sort of thing. We’re probably going to write our own org.apache.spark.sql.catalyst.rules.Rule to handle it.
~ Shawn

From: Jacek Laskowski [mailto:ja...@japila.pl]
Sent: Wednesday, April 26, 2017 2:55 AM
To: Lavelle, Shawn <shawn.lave...@osii.com>
Cc: user <user@spark.apache.org>
Subject: Re: Spark-SQL Query Optimization: overlapping ranges

explain it and you'll know what happens under the covers. i.e. Use explain on the Dataset.

Jacek

On 25 Apr 2017 12:46 a.m., "Lavelle, Shawn" <shawn.lave...@osii.com> wrote:

Hello Spark Users!

Does the Spark Optimization engine reduce overlapping column ranges? If so, should it push this down to a Data Source?

Example, This:

    Select * from table where col between 3 and 7 OR col between 5 and 9

Reduces to:

    Select * from table where col between 3 and 9

Thanks for your insight!

~ Shawn M Lavelle

Shawn Lavelle
Software Development
4101 Arrowhead Drive
Medina, Minnesota 55340-9457
Phone: 763 551 0559
Fax: 763 551 0750
Email: shawn.lave...@osii.com
Website: www.osii.com
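
For readers following the thread, the range reduction described above can be sketched outside of Spark. This is only an illustration of the interval logic, not a Catalyst rule and not anything Spark does today. The names `merge_or` and `merge_and` are hypothetical, and the sketch assumes every predicate is an inclusive BETWEEN on the same column with literal bounds.

```python
from typing import List, Optional, Tuple

# An inclusive range [low, high], mirroring SQL's "col BETWEEN low AND high".
Range = Tuple[int, int]


def merge_or(ranges: List[Range]) -> List[Range]:
    """Union of ranges joined by OR: overlapping ranges collapse into one.

    Only truly overlapping ranges are merged. Adjacent ranges such as
    (3, 5) and (6, 9) are left alone, because merging them is only valid
    for integer columns.
    """
    merged: List[Range] = []
    # Drop empty ranges (low > high match no rows), then sort by lower bound.
    for low, high in sorted(r for r in ranges if r[0] <= r[1]):
        if merged and low <= merged[-1][1]:
            # Overlaps the previous range: extend its upper bound if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged


def merge_and(ranges: List[Range]) -> Optional[Range]:
    """Intersection of ranges joined by AND, or None if no value can match.

    Assumes a non-empty list of ranges.
    """
    low = max(r[0] for r in ranges)
    high = min(r[1] for r in ranges)
    return (low, high) if low <= high else None
```

With the example from the original question, `merge_or([(3, 7), (5, 9)])` gives `[(3, 9)]`, and the AND form `merge_and([(3, 7), (5, 9)])` gives `(5, 7)`. A custom org.apache.spark.sql.catalyst.rules.Rule would still have to match the comparison expressions in the plan and rebuild the predicate; that part is not shown here.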