Hi, I use Gradle and I don't think it really has "provided", but I was able to google around and create the following build file. The same error still persists, though.

group 'com.company'
version '1.0-SNAPSHOT'

apply plugin: 'java'
apply plugin: 'idea'

repositories {
    mavenCentral()
    mavenLocal()
}

configurations {
    provided
}

sourceSets {
    main {
        compileClasspath += configurations.provided
        test.compileClasspath += configurations.provided
        test.runtimeClasspath += configurations.provided
    }
}

idea {
    module {
        scopes.PROVIDED.plus += [configurations.provided]
    }
}

dependencies {
    compile 'org.slf4j:slf4j-log4j12:1.7.12'
    provided group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.0.0'
    provided group: 'org.apache.spark', name: 'spark-streaming_2.11', version: '2.0.0'
    provided group: 'org.apache.spark', name: 'spark-sql_2.11', version: '2.0.0'
    provided group: 'com.datastax.spark', name: 'spark-cassandra-connector_2.11', version: '2.0.0-M3'
}
jar {
    from {
        configurations.provided.collect { it.isDirectory() ? it : zipTree(it) }
    }
    // with jar
    from sourceSets.test.output
    manifest {
        attributes 'Main-Class': "com.company.batchprocessing.Hello"
    }
    exclude 'META-INF/*.RSA', 'META-INF/*.SF', 'META-INF/*.DSA'
    zip64 true
}

This successfully creates the jar, but the error still persists.

On Sun, Oct 9, 2016 11:44 PM, Shixiong(Ryan) Zhu shixi...@databricks.com wrote:

It seems the runtime Spark is different from the one you compiled against. You should mark the Spark components "provided". See https://issues.apache.org/jira/browse/SPARK-9219

On Sun, Oct 9, 2016 at 8:13 PM, kant kodali <kanth...@gmail.com> wrote:

I tried spanBy, but it looks like there is a strange error happening no matter which way I try it, like the one described here for the Java solution: http://qaoverflow.com/question/how-to-use-spanby-in-java/

java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD

JavaPairRDD<ByteBuffer, Iterable<CassandraRow>> cassandraRowsRDD = javaFunctions(sc)
    .cassandraTable("test", "hello")
    .select("col1", "col2", "col3")
    .spanBy(new Function<CassandraRow, ByteBuffer>() {
        @Override
        public ByteBuffer call(CassandraRow v1) {
            return v1.getBytes("rowkey");
        }
    }, ByteBuffer.class);

And then here is where the problem occurs:

List<Tuple2<ByteBuffer, Iterable<CassandraRow>>> listOftuples = cassandraRowsRDD.collect(); // ERROR OCCURS HERE
Tuple2<ByteBuffer, Iterable<CassandraRow>> tuple = listOftuples.iterator().next();
ByteBuffer partitionKey = tuple._1();
for (CassandraRow cassandraRow : tuple._2()) {
    System.out.println(cassandraRow.getLong("col1"));
}

So I tried this, and got the same error:

Iterable<Tuple2<ByteBuffer, Iterable<CassandraRow>>> listOftuples = cassandraRowsRDD.collect(); // ERROR OCCURS HERE
Tuple2<ByteBuffer, Iterable<CassandraRow>> tuple = listOftuples.iterator().next();
ByteBuffer partitionKey = tuple._1();
for (CassandraRow cassandraRow : tuple._2()) {
    System.out.println(cassandraRow.getLong("col1"));
}

Although I understand that ByteBuffers aren't serializable, I didn't get any not-serializable exception; still, I went ahead and changed everything to byte[] so there are no more ByteBuffers in the code. I have also tried cassandraRowsRDD.collect().forEach() and cassandraRowsRDD.stream().forEachPartition(), and the same exact error occurs.

1) I am running everything locally in standalone mode, so my Spark cluster is just running on localhost.

Scala code runner version 2.11.8 // when I run scala -version or even ./spark-shell

compile group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.0.0'
compile group: 'org.apache.spark', name: 'spark-streaming_2.11', version: '2.0.0'
compile group: 'org.apache.spark', name: 'spark-sql_2.11', version: '2.0.0'
compile group: 'com.datastax.spark', name: 'spark-cassandra-connector_2.11', version: '2.0.0-M3'

So I don't see anything wrong with these versions.

2) I am bundling everything into one jar, and so far it has worked out well except for this error. I am using Java 8 and Gradle.

Any ideas on how I can fix this?
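A side note on the byte[] change, separate from the ClassCastException (which the earlier reply attributes to a runtime/compile-time Spark mismatch): byte[] inherits identity-based equals/hashCode from Object, so keying or grouping by byte[] does not behave like keying by ByteBuffer, which compares content. A minimal plain-Java illustration, no Spark needed:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class KeyEquality {
    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};

        // byte[] uses Object.equals: identity, not content.
        System.out.println(a.equals(b));          // false
        System.out.println(Arrays.equals(a, b));  // true

        // ByteBuffer.equals compares the remaining bytes by content.
        System.out.println(ByteBuffer.wrap(a).equals(ByteBuffer.wrap(b))); // true
    }
}
```

So if the byte[] version ever runs but produces one group per row, content-vs-identity equality of the key is the thing to check.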
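For readers unfamiliar with spanBy: unlike groupBy, it groups *consecutive* elements that share a key, which works for Cassandra because rows of the same partition key arrive clustered together. This is a hypothetical plain-collections sketch of that semantics (not the connector's implementation), just to make the behavior concrete:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class SpanBySketch {
    // Groups consecutive elements mapping to the same key into one "span".
    // A key that reappears later, non-adjacently, starts a new span.
    static <K, V> List<Map.Entry<K, List<V>>> spanBy(List<V> items, Function<V, K> keyFn) {
        List<Map.Entry<K, List<V>>> spans = new ArrayList<>();
        for (V item : items) {
            K key = keyFn.apply(item);
            if (!spans.isEmpty() && spans.get(spans.size() - 1).getKey().equals(key)) {
                spans.get(spans.size() - 1).getValue().add(item);
            } else {
                List<V> group = new ArrayList<>();
                group.add(item);
                spans.add(new AbstractMap.SimpleEntry<>(key, group));
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a:1", "a:2", "b:1", "a:3");
        List<Map.Entry<String, List<String>>> spans = spanBy(rows, r -> r.split(":")[0]);
        // "a:3" starts a new span because it is not adjacent to the other "a" rows.
        System.out.println(spans.size()); // 3
    }
}
```

The example keys are made up; the point is only that span boundaries follow adjacency, not global key equality.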