On 3 Feb 2017, at 21:28, Jacek Laskowski <ja...@japila.pl> wrote:
> Hi Sean,
>
> Given that 3.0.0 is coming, removing the unused versions would be a huge benefit from a maintenance point of view. I'd support removing support for 2.5 and earlier.
>
> Speaking of Hadoop support, is anyone considering 3.0.0 support? Can't find any JIRA for this.

As it stands, Hive 1.2.x rejects Hadoop 3 as a supported version, so DataFrames won't work: https://issues.apache.org/jira/browse/SPARK-18673 (a sketch of the offending check is below).

There's a quick fix to get Hadoop to lie about its version and keep Hive quiet: build Hadoop with -Ddeclared.hadoop.version=2.11 to force it to declare itself as 2.11 (build command below). That's not production, but it does at least verify that nobody has broken any of the APIs (excluding those called via reflection, on codepaths not exercised in unit testing).

The full Hive patch is very much a WIP, and it's aimed at Hive 2: https://issues.apache.org/jira/browse/HIVE-15016

...which means either backporting it to the org.spark-project Hive 1.2 fork or moving up to Hive 2, which is inevitably going to be a major change.
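For anyone wondering where the rejection happens: it's Hive's ShimLoader, which maps the Hadoop major version onto a shims class and throws for anything it doesn't recognise. A paraphrased sketch of the Hive 1.2 check, from memory rather than the verbatim source (class and constant names approximate):

  import org.apache.hadoop.util.VersionInfo;

  public class ShimLoaderVersionCheck {
    static final String HADOOP20SVERSIONNAME = "0.20S";  // legacy Hadoop 1.x shims
    static final String HADOOP23VERSIONNAME = "2";       // Hadoop 2.x shims

    public static String getMajorVersion() {
      String vers = VersionInfo.getVersion();            // e.g. "3.0.0-alpha2"
      String[] parts = vers.split("\\.");
      if (parts.length < 2) {
        throw new RuntimeException("Illegal Hadoop Version: " + vers
            + " (expected A.B.* format)");
      }
      switch (Integer.parseInt(parts[0])) {
        case 1:
          return HADOOP20SVERSIONNAME;
        case 2:
          return HADOOP23VERSIONNAME;
        default:
          // anything reporting a major version of 3 lands here
          throw new IllegalArgumentException(
              "Unrecognized Hadoop major version number: " + vers);
      }
    }
  }

Since a Hadoop 3 client reports a major version of 3, it falls through to the default branch, which is why the failure shows up before Hive does any actual work.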
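And for concreteness, the "lie about the version" workaround is just the normal Hadoop build with the declared version property overridden; something like the following, with the exact module list and profiles elided (-DskipTests is my addition, not required):

  mvn clean install -DskipTests -Ddeclared.hadoop.version=2.11

All this changes is the version string that VersionInfo reports back to Hive; the bytecode is still Hadoop 3, which is why it's a test-only trick and not a production answer.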