It proceeded with the jars I mentioned. However, no data is getting loaded into the DataFrame...
sob sob :(
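For reference, a minimal way to check whether anything parsed at all (a sketch, not the actual code from this thread: the harness class name and local master are placeholders, and it assumes the records in A.xml really are <row> elements):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

// Hypothetical harness class; only the read and the two checks matter here.
public class XmlLoadCheck {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("XmlLoadCheck").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // rowTag must name the repeating record element inside A.xml
        DataFrame df = sqlContext.read()
                .format("com.databricks.spark.xml")
                .option("rowTag", "row")
                .load("A.xml");

        df.printSchema();                           // an empty schema means no <row> records were found
        System.out.println("rows: " + df.count());  // 0 rows also points at a rowTag mismatch

        sc.stop();
    }
}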
On Fri, Jun 17, 2016 at 4:25 PM, VG <vlin...@gmail.com> wrote:

> Hi Siva,
>
> This is what I have for jars. Did you manage to run with these or
> different versions?
>
> <dependency>
>     <groupId>org.apache.spark</groupId>
>     <artifactId>spark-core_2.10</artifactId>
>     <version>1.6.1</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.spark</groupId>
>     <artifactId>spark-sql_2.10</artifactId>
>     <version>1.6.1</version>
> </dependency>
> <dependency>
>     <groupId>com.databricks</groupId>
>     <artifactId>spark-xml_2.10</artifactId>
>     <version>0.2.0</version>
> </dependency>
> <dependency>
>     <groupId>org.scala-lang</groupId>
>     <artifactId>scala-library</artifactId>
>     <version>2.10.6</version>
> </dependency>
>
> Thanks
> VG
>
> On Fri, Jun 17, 2016 at 4:16 PM, Siva A <siva9940261...@gmail.com> wrote:
>
>> Hi Marco,
>>
>> I did run it in the IDE (IntelliJ) as well. It works fine.
>> VG, make sure the right jar is in the classpath.
>>
>> --Siva
>>
>> On Fri, Jun 17, 2016 at 4:11 PM, Marco Mistroni <mmistr...@gmail.com> wrote:
>>
>>> And your Eclipse path is correct?
>>> I suggest, as Siva did before, building your jar and running it via
>>> spark-submit, specifying the --packages option. It's as simple as
>>> running this command:
>>>
>>> spark-submit --packages com.databricks:spark-xml_<scalaversion>:<packageversion> --class <name of the class containing main> <path to your jar>
>>>
>>> Indeed, if you have only these lines to run, why don't you try them
>>> in spark-shell?
>>>
>>> hth
>>>
>>> On Fri, Jun 17, 2016 at 11:32 AM, VG <vlin...@gmail.com> wrote:
>>>
>>>> nopes. eclipse.
>>>>
>>>> On Fri, Jun 17, 2016 at 3:58 PM, Siva A <siva9940261...@gmail.com> wrote:
>>>>
>>>>> If you are running from an IDE, are you using IntelliJ?
>>>>>
>>>>> On Fri, Jun 17, 2016 at 3:20 PM, Siva A <siva9940261...@gmail.com> wrote:
>>>>>
>>>>>> Can you try to package it as a jar and run it using spark-submit?
>>>>>>
>>>>>> Siva
>>>>>>
>>>>>> On Fri, Jun 17, 2016 at 3:17 PM, VG <vlin...@gmail.com> wrote:
>>>>>>
>>>>>>> I am trying to run from the IDE, and everything else is working fine.
>>>>>>> I added the spark-xml jar and now I ended up with this dependency error:
>>>>>>>
>>>>>>> 16/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
>>>>>>>     at org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.<init>(ddl.scala:150)
>>>>>>>     at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
>>>>>>>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>>>>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>>>>     at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>>>> Caused by: java.lang.ClassNotFoundException: scala.collection.GenTraversableOnce$class
>>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>>>     ... 5 more
>>>>>>> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown hook
>>>>>>>
>>>>>>> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni <mmistr...@gmail.com> wrote:
>>>>>>>
>>>>>>>> So you are using spark-submit or spark-shell?
>>>>>>>>
>>>>>>>> You will need to launch either one by passing the --packages
>>>>>>>> option (like in the example below). You will need to know:
>>>>>>>>
>>>>>>>> --packages com.databricks:spark-xml_<scala.version>:<package version>
>>>>>>>>
>>>>>>>> hth
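>>>>>>>> For example (the spark-xml version here is illustrative and
>>>>>>>> must match your Scala version; the jar path is a placeholder
>>>>>>>> for whatever your build produces):
>>>>>>>>
>>>>>>>> spark-submit \
>>>>>>>>   --packages com.databricks:spark-xml_2.10:0.2.0 \
>>>>>>>>   --class org.ariba.spark.PostsProcessing \
>>>>>>>>   target/posts-processing-1.0.jar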
>>>>>>>>
>>>>>>>> On Fri, Jun 17, 2016 at 10:20 AM, VG <vlin...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Apologies for that.
>>>>>>>>> I am trying to use spark-xml to load data from an XML file.
>>>>>>>>>
>>>>>>>>> Here is the exception:
>>>>>>>>>
>>>>>>>>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: org.apache.spark.xml. Please find packages at http://spark-packages.org
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>>>>>>>>>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>>>>>>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>>>>>>     at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>>>>>> Caused by: java.lang.ClassNotFoundException: org.apache.spark.xml.DefaultSource
>>>>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>>>>>>>     at scala.util.Try$.apply(Try.scala:192)
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>>>>>>>>     at scala.util.Try.orElse(Try.scala:84)
>>>>>>>>>     at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
>>>>>>>>>     ... 4 more
>>>>>>>>>
>>>>>>>>> Code:
>>>>>>>>>
>>>>>>>>> SQLContext sqlContext = new SQLContext(sc);
>>>>>>>>> DataFrame df = sqlContext.read()
>>>>>>>>>     .format("org.apache.spark.xml")
>>>>>>>>>     .option("rowTag", "row")
>>>>>>>>>     .load("A.xml");
>>>>>>>>>
>>>>>>>>> Any suggestions, please...
>>>>>>>>>
>>>>>>>>> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni <mmistr...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Too little info.
>>>>>>>>>> It'll help if you can post the exception, show your sbt file
>>>>>>>>>> (if you are using sbt), and provide minimal details on what
>>>>>>>>>> you are doing.
>>>>>>>>>>
>>>>>>>>>> kr
>>>>>>>>>>
>>>>>>>>>> On Fri, Jun 17, 2016 at 10:08 AM, VG <vlin...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Failed to find data source: com.databricks.spark.xml
>>>>>>>>>>>
>>>>>>>>>>> Any suggestions to resolve this?
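For reference, the two "failed to find data source" errors above have different causes: the original com.databricks.spark.xml failure means the spark-xml jar was not on the classpath, while the later org.apache.spark.xml failure is a wrong format string, since the package registers itself as com.databricks.spark.xml. A corrected read, sketched against the same Spark 1.6 Java API (sc as in the code above):

SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read()
    .format("com.databricks.spark.xml")   // not "org.apache.spark.xml"
    .option("rowTag", "row")
    .load("A.xml");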