I'm trying to deal with some code that runs differently on Spark stand-alone mode and Spark running on a cluster. Basically, for each item in an RDD, I'm trying to add it to a list, and once this is done, I want to send this list to Solr.
This works perfectly fine when I run the following code in stand-alone mode of Spark, but does not work when the same code is run on a cluster. When I run the same code on a cluster, it is like "send to Solr" part of the code is executed before the list to be sent to Solr is filled with items. I try to force the execution by solrInputDocumentJavaRDD.collect(); after foreach, but it seems like it does not have any effect. // For each RDD solrInputDocumentJavaDStream.foreachRDD( new Function<JavaRDD<SolrInputDocument>, Void>() { @Override public Void call(JavaRDD<SolrInputDocument> solrInputDocumentJavaRDD) throws Exception { // For each item in a single RDD solrInputDocumentJavaRDD.foreach( new VoidFunction<SolrInputDocument>() { @Override public void call(SolrInputDocument solrInputDocument) { // Add the solrInputDocument to the list of SolrInputDocuments SolrIndexerDriver.solrInputDocumentList.add(solrInputDocument); } }); // Try to force execution solrInputDocumentJavaRDD.collect(); // After having finished adding every SolrInputDocument to the list // add it to the solrServer, and commit, waiting for the commit to be flushed try { // Seems like when run in cluster mode, the list size is zero, // therefore the following part is never executed if (SolrIndexerDriver.solrInputDocumentList != null && SolrIndexerDriver.solrInputDocumentList.size() > 0) { SolrIndexerDriver.solrServer.add(SolrIndexerDriver.solrInputDocumentList); SolrIndexerDriver.solrServer.commit(true, true); SolrIndexerDriver.solrInputDocumentList.clear(); } } catch (SolrServerException | IOException e) { e.printStackTrace(); } return null; } } ); What should I do, so that sending-to-Solr part executes after the list of SolrDocuments are added to solrInputDocumentList (and works also in cluster mode)? -- Emre Sevinç