Thank you for your reply! I have already made the change locally, so changing it is fine on my end. I just wanted to be sure which way is correct.

On 9 Dec 2015 18:20, "Fengdong Yu" <fengdo...@everstring.com> wrote:

> I don't think there is a performance difference between the 1.x API and the 2.x API.
>
> But it's not a big issue for your change; only
> com.databricks.hadoop.mapreduce.lib.input.XmlInputFormat.java
> <https://github.com/databricks/spark-xml/blob/master/src/main/java/com/databricks/hadoop/mapreduce/lib/input/XmlInputFormat.java>
> needs to change, right?
>
> It's not a big change to the 2.x API. If you agree, I can do it, but I cannot
> promise to finish within one or two weeks because of my daily job.
>
>
> On Dec 9, 2015, at 5:01 PM, Hyukjin Kwon <gurwls...@gmail.com> wrote:
>
> Hi all,
>
> I am writing this email to both the user group and the dev group since it is
> applicable to both.
>
> I am now working on the Spark XML datasource (
> https://github.com/databricks/spark-xml).
> It uses an InputFormat implementation which I downgraded to the Hadoop 1.x
> API for version compatibility.
>
> However, I found that the internal JSON datasource and others at Databricks
> use the Hadoop 2.x API, instantiating TaskAttemptContextImpl by reflection,
> because TaskAttemptContext is a class in Hadoop 1.x but an interface in
> Hadoop 2.x.
>
> So, I looked through the code for advantages of the Hadoop 2.x API, but
> I couldn't find any.
> I wonder if there are advantages to using the Hadoop 2.x API.
>
> I understand that it is still preferable to use the Hadoop 2.x API, at least
> because of future differences between the versions, but somehow I feel it
> should not be necessary to reach for Hadoop 2.x via reflection.
>
> I would appreciate it if you could leave a comment at
> https://github.com/databricks/spark-xml/pull/14 as well as send back a
> reply if there is a good explanation.
>
> Thanks!
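For reference, the reflection approach mentioned above looks roughly like the sketch below. The Hadoop class names are real, but the TaskAttemptContextFactory helper is a hypothetical name of mine, not actual spark-xml or Spark code; it only illustrates the technique of picking the right concrete class at runtime.

import java.lang.reflect.Constructor;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.TaskAttemptID;

// Hypothetical helper: builds a TaskAttemptContext regardless of whether
// the Hadoop version on the classpath defines it as a class (1.x) or an
// interface (2.x).
public final class TaskAttemptContextFactory {

  public static TaskAttemptContext create(Configuration conf, TaskAttemptID id)
      throws Exception {
    Class<?> impl;
    try {
      // Hadoop 2.x: TaskAttemptContext is an interface; the concrete
      // implementation lives in the o.a.h.mapreduce.task package.
      impl = Class.forName("org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl");
    } catch (ClassNotFoundException e) {
      // Hadoop 1.x: TaskAttemptContext itself is a concrete class.
      impl = Class.forName("org.apache.hadoop.mapreduce.TaskAttemptContext");
    }
    // Both variants expose a (Configuration, TaskAttemptID) constructor.
    Constructor<?> ctor =
        impl.getDeclaredConstructor(Configuration.class, TaskAttemptID.class);
    return (TaskAttemptContext) ctor.newInstance(conf, id);
  }

  private TaskAttemptContextFactory() {}
}

Compiled once, this runs against either Hadoop line, which is the advantage being weighed against simply writing to the 1.x API directly.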