Hello, I have a simple word count example in Java that I can run in Eclipse (code at the bottom).
I then create a jar file from it and try to run it from cmd:

java -jar C:\Users\Owner\Desktop\wordcount.jar Data/testfile.txt

but I get the error below. I think the main error is:

Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: text

Any advice on how to run this jar file in Spark would be appreciated; I have also put a couple of alternatives I am considering after the code at the bottom.

Full output:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/12/07 15:16:41 INFO SparkContext: Running Spark version 2.0.2
16/12/07 15:16:42 INFO SecurityManager: Changing view acls to: Owner
16/12/07 15:16:42 INFO SecurityManager: Changing modify acls to: Owner
16/12/07 15:16:42 INFO SecurityManager: Changing view acls groups to:
16/12/07 15:16:42 INFO SecurityManager: Changing modify acls groups to:
16/12/07 15:16:42 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Owner); groups with view permissions: Set(); users with modify permissions: Set(Owner); groups with modify permissions: Set()
16/12/07 15:16:44 INFO Utils: Successfully started service 'sparkDriver' on port 10211.
16/12/07 15:16:44 INFO SparkEnv: Registering MapOutputTracker
16/12/07 15:16:44 INFO SparkEnv: Registering BlockManagerMaster
16/12/07 15:16:44 INFO DiskBlockManager: Created local directory at C:\Users\Owner\AppData\Local\Temp\blockmgr-b4b1960b-08fc-44fd-a75e-1a0450556873
16/12/07 15:16:44 INFO MemoryStore: MemoryStore started with capacity 1984.5 MB
16/12/07 15:16:45 INFO SparkEnv: Registering OutputCommitCoordinator
16/12/07 15:16:45 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/12/07 15:16:45 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.19.2:4040
16/12/07 15:16:45 INFO Executor: Starting executor ID driver on host localhost
16/12/07 15:16:45 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 10252.
16/12/07 15:16:45 INFO NettyBlockTransferService: Server created on 192.168.19.2:10252
16/12/07 15:16:45 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.19.2, 10252)
16/12/07 15:16:45 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.19.2:10252 with 1984.5 MB RAM, BlockManagerId(driver, 192.168.19.2, 10252)
16/12/07 15:16:45 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.19.2, 10252)
16/12/07 15:16:46 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
16/12/07 15:16:46 INFO SharedState: Warehouse path is 'file:/C:/Users/Owner/spark-warehouse'.
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: text.
Please find packages at https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects
    at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:148)
    at org.apache.spark.sql.execution.datasources.DataSource.providingClass$lzycompute(DataSource.scala:79)
    at org.apache.spark.sql.execution.datasources.DataSource.providingClass(DataSource.scala:79)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
    at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:504)
    at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:540)
    at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:513)
    at JavaWordCount.main(JavaWordCount.java:57)
Caused by: java.lang.ClassNotFoundException: text.DefaultSource
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:132)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:132)
    at scala.util.Try$.apply(Try.scala:192)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:132)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:132)
    at scala.util.Try.orElse(Try.scala:84)
    at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:132)
    ... 8 more
16/12/07 15:16:46 INFO SparkContext: Invoking stop() from shutdown hook
16/12/07 15:16:46 INFO SparkUI: Stopped Spark web UI at http://192.168.19.2:4040
16/12/07 15:16:46 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/12/07 15:16:46 INFO MemoryStore: MemoryStore cleared
16/12/07 15:16:46 INFO BlockManager: BlockManager stopped
16/12/07 15:16:46 INFO BlockManagerMaster: BlockManagerMaster stopped
16/12/07 15:16:46 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/12/07 15:16:46 INFO SparkContext: Successfully stopped SparkContext
16/12/07 15:16:46 INFO ShutdownHookManager: Shutdown hook called
16/12/07 15:16:46 INFO ShutdownHookManager: Deleting directory C:\Users\Owner\AppData\Local\Temp\spark-dab2587b-a794-4947-ac13-d40056cf71d8
C:\Users\Owner>

Here is the code:

import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Pattern;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.sql.SparkSession;

import scala.Tuple2;

public final class JavaWordCount {
  private static final Pattern SPACE = Pattern.compile(" ");

  public static void main(String[] args) throws Exception {
    if (args.length < 1) {
      System.err.println("Usage: JavaWordCount <file>");
      System.exit(1);
    }

    // Boilerplate needed to run locally.
    SparkConf conf = new SparkConf().setAppName("Word Count Application").setMaster("local[*]");
    JavaSparkContext sc = new JavaSparkContext(conf);

    // The builder reuses the SparkContext created above (hence the WARN in the log).
    SparkSession spark = SparkSession
        .builder()
        .appName("Word Count")
        .getOrCreate()
        .newSession();

    // This is the line the stack trace points at (JavaWordCount.java:57).
    JavaRDD<String> lines = spark.read().textFile(args[0]).javaRDD();

    // Split each line on single spaces into individual words.
    JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
      @Override
      public Iterator<String> call(String s) {
        return Arrays.asList(SPACE.split(s)).iterator();
      }
    });

    // Pair each word with a count of 1.
    JavaPairRDD<String, Integer> ones = words.mapToPair(
        new PairFunction<String, String, Integer>() {
          @Override
          public Tuple2<String, Integer> call(String s) {
            return new Tuple2<>(s, 1);
          }
        });

    // Sum the counts per word.
    JavaPairRDD<String, Integer> counts = ones.reduceByKey(
        new Function2<Integer, Integer, Integer>() {
          @Override
          public Integer call(Integer i1, Integer i2) {
            return i1 + i2;
          }
        });

    List<Tuple2<String, Integer>> output = counts.collect();
    for (Tuple2<?, ?> tuple : output) {
      System.out.println(tuple._1() + ": " + tuple._2());
    }

    spark.stop();
  }
}
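P.S. The two alternatives I am considering, in case they are relevant:

1) From reading around, Spark applications are normally launched through spark-submit rather than plain java -jar, so that Spark's own jars (including the built-in data sources such as "text") end up on the classpath. A minimal sketch of what I believe the command would look like with my paths from above (the exact flags are my assumption, not something I have verified on this machine):

    spark-submit --class JavaWordCount --master local[*] C:\Users\Owner\Desktop\wordcount.jar Data/testfile.txt

2) Reading the input through the JavaSparkContext I already create, instead of through SparkSession's DataFrameReader, which might sidestep the "text" data source lookup entirely. Only the line that loads the input would change:

    // Hypothetical variant: load the file with the core RDD API rather than
    // the SQL "text" data source that the stack trace fails to resolve.
    JavaRDD<String> lines = sc.textFile(args[0]);

Would either of these be the right way to go?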