I found that to solve this problem I needed to use the BUILD-SNAPSHOT version of
elasticsearch-hadoop.
After adding the entries below to my pom.xml, it started to work.
...
<repository>
  <id>sonatype-oss</id>
  <url>http://oss.sonatype.org/content/repositories/snapshots</url>
  <snapshots><enabled>true</enabled></snapshots>
</repository>
...
<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-hadoop</artifactId>
  <version>2.1.0.BUILD-SNAPSHOT</version>
</dependency>
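
To double-check that the right jar is being picked up, a quick sanity check (a
minimal sketch, not part of the connector API; paste it into spark-shell or a
Scala REPL launched with the connector jar on the classpath) is to try loading
the data source class that Spark SQL looks for:

    // Spark SQL resolves "org.elasticsearch.spark.sql" by also trying
    // "<provider>.DefaultSource" (see the stack trace and question below).
    // With the Beta3 jars this should print "missing"; with the
    // 2.1.0.BUILD-SNAPSHOT jar it should print "found".
    val provider = "org.elasticsearch.spark.sql"
    try {
      Class.forName(provider + ".DefaultSource")
      println("found " + provider + ".DefaultSource")
    } catch {
      case _: ClassNotFoundException =>
        println("missing " + provider + ".DefaultSource")
    }
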
On Thursday, March 26, 2015 at 10:12:25 AM UTC-4, Dmitriy Fingerman wrote:
>
> Hi,
>
> I am trying to read from Elasticsearch using Spark SQL and getting the
> exception below.
> My environment is CDH 5.3 with Spark 1.2.0 and Elasticsearch 1.4.4.
> Since Spark SQL is not officially supported on CDH 5.3, I added the Hive
> jars to the Spark classpath in compute-classpath.sh.
> I also added elasticsearch-hadoop-2.1.0.Beta3.jar to the Spark classpath
> in compute-classpath.sh.
> I also tried adding the Hive, elasticsearch-hadoop and elasticsearch-spark
> jars to the SPARK_CLASSPATH environment variable before running spark-submit,
> but got the same exception.
>
> Exception in thread "main" java.lang.RuntimeException: Failed to load class for data source: org.elasticsearch.spark.sql
> at scala.sys.package$.error(package.scala:27)
> at org.apache.spark.sql.sources.CreateTableUsing.run(ddl.scala:99)
> at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:67)
> at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:67)
> at org.apache.spark.sql.execution.ExecutedCommand.execute(commands.scala:75)
> at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
> at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
> at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
> at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
> at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:303)
> at com.informatica.sats.datamgtsrv.Percolator$.main(Percolator.scala:29)
> at com.informatica.sats.datamgtsrv.Percolator.main(Percolator.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> The code I am trying to run:
>
> import org.apache.spark._
> import org.apache.spark.sql._
> import org.apache.spark.SparkContext._
> import org.elasticsearch.spark._
> import org.elasticsearch.spark.sql._
>
> object MyTest
> {
>   def main(args: Array[String])
>   {
>     val sparkConf = new SparkConf().setAppName("MyTest")
>     val sc = new SparkContext(sparkConf)
>     val sqlContext = new SQLContext(sc)
>
>     sqlContext.sql("CREATE TEMPORARY TABLE INTERVALS " +
>                    "USING org.elasticsearch.spark.sql " +
>                    "OPTIONS (resource 'events/intervals') ")
>
>     val allRDD = sqlContext.sql("SELECT * FROM INTERVALS")
>
>     allRDD.foreach(row => row.foreach(elem => print(elem + "\n\n")))
>   }
> }
>
> I checked in the Spark source code (org/apache/spark/sql/sources/ddl.scala) and
> saw that the run method of the CreateTableUsing class expects to find a
> DefaultSource class under the package named in the USING clause (see the
> simplified sketch after the jar list below).
> However, there is no such class in the org.elasticsearch.spark.sql package of
> the official Elasticsearch builds.
> I checked the following jars:
>
> elasticsearch-spark_2.10-2.1.0.Beta3.jar
> elasticsearch-spark_2.10-2.1.0.Beta2.jar
> elasticsearch-spark_2.10-2.1.0.Beta1.jar
> elasticsearch-hadoop-2.1.0.Beta3.jar
>
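> From my (admittedly simplified) reading of ddl.scala, the lookup in
> CreateTableUsing.run appears to be roughly the following paraphrase (not the
> exact Spark code), which would explain why the missing DefaultSource class
> produces exactly this error:
>
>     val provider = "org.elasticsearch.spark.sql"  // the name given in USING
>     val loader = Thread.currentThread().getContextClassLoader
>     val clazz: Class[_] =
>       try loader.loadClass(provider)
>       catch { case _: ClassNotFoundException =>
>         try loader.loadClass(provider + ".DefaultSource")
>         catch { case _: ClassNotFoundException =>
>           sys.error("Failed to load class for data source: " + provider)
>         }
>       }
>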
> Can you please advise why this problem happens and how to resolve it?
>
> Thanks,
> Dmitriy Fingerman
>