Awesome.

Thanks
Best Regards
On Tue, Jan 13, 2015 at 10:35 PM, Ankur Srivastava
<ankur.srivast...@gmail.com> wrote:

> I realized that I was running the cluster with
> spark.cassandra.output.concurrent.writes=2; changing it to 1 did the
> trick. The issue was that Spark was producing data at a much higher rate
> than our small Cassandra cluster could write, so setting the property to
> 1 fixed it for us.
>
> Thanks
> Ankur
>
> On Mon, Jan 12, 2015 at 9:04 AM, Ankur Srivastava
> <ankur.srivast...@gmail.com> wrote:
>
>> Hi Akhil,
>>
>> Thank you for the pointers. Below is how we are saving data to
>> Cassandra:
>>
>> javaFunctions(rddToSave)
>>     .writerBuilder(datapipelineKeyspace, datapipelineOutputTable,
>>         mapToRow(Sample.class))
>>     .saveToCassandra();
>>
>> The data we are saving at this stage is ~200 million rows.
>>
>> How do we control application threads in Spark so that it does not
>> exceed "rpc_max_threads"? We are running with the default value of this
>> property in cassandra.yaml. I have already set these two properties for
>> the Spark-Cassandra connector:
>>
>> spark.cassandra.output.batch.size.rows=1
>> spark.cassandra.output.concurrent.writes=1
>>
>> Thanks
>> - Ankur
>>
>> On Sun, Jan 11, 2015 at 10:16 PM, Akhil Das <ak...@sigmoidanalytics.com>
>> wrote:
>>
>>> I see, can you paste the piece of code? It is probably because you are
>>> exceeding the number of connections specified in the rpc_max_threads
>>> property. Make sure you close all the connections properly.
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Mon, Jan 12, 2015 at 7:45 AM, Ankur Srivastava
>>> <ankur.srivast...@gmail.com> wrote:
>>>
>>>> Hi Akhil, thank you for your response.
>>>>
>>>> Actually we are first reading from Cassandra and then writing back
>>>> after doing some processing. All the reader stages succeed with no
>>>> errors, and many writer stages also succeed, but many fail as well.
>>>>
>>>> Thanks
>>>> Ankur
>>>>
>>>> On Sat, Jan 10, 2015 at 10:15 PM, Akhil Das
>>>> <ak...@sigmoidanalytics.com> wrote:
>>>>
>>>>> Just make sure you are not connecting to the old RPC port (9160);
>>>>> the new binary protocol port is 9042.
>>>>>
>>>>> What is your rpc_address listed in cassandra.yaml? Also make sure
>>>>> you have start_native_transport: true in the yaml file.
>>>>>
>>>>> Thanks
>>>>> Best Regards
>>>>>
>>>>> On Sat, Jan 10, 2015 at 8:44 AM, Ankur Srivastava
>>>>> <ankur.srivast...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We are currently using Spark to join data in Cassandra and then
>>>>>> write the results back into Cassandra. While reads happen without
>>>>>> any error, during the writes we see many exceptions like the one
>>>>>> below.
>>>>>>
>>>>>> Our environment details are:
>>>>>>
>>>>>> - Spark v1.1.0
>>>>>> - spark-cassandra-connector-java_2.10 v1.1.0
>>>>>>
>>>>>> We are using the settings below for the writer:
>>>>>>
>>>>>> spark.cassandra.output.batch.size.rows=1
>>>>>> spark.cassandra.output.concurrent.writes=1
>>>>>>
>>>>>> com.datastax.driver.core.exceptions.NoHostAvailableException: All
>>>>>> host(s) tried for query failed (tried: [] - use getErrors() for
>>>>>> details)
>>>>>>     at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108)
>>>>>>     at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:179)
>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>
>>>>>> Thanks
>>>>>> Ankur
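
For anyone hitting the same NoHostAvailableException, below is a minimal
sketch of the write path with the throttled settings from this thread,
using the spark-cassandra-connector-java 1.1 API shown above. The Sample
bean shape, the keyspace/table string values, and the connection host are
assumptions (the thread only shows the variable names, not their values):

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

import java.io.Serializable;
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class ThrottledCassandraWrite {

    // Minimal stand-in for the Sample bean from the thread; the real
    // column layout is not shown, so this shape is an assumption.
    public static class Sample implements Serializable {
        private String id;
        private String value;

        public Sample() {}

        public Sample(String id, String value) {
            this.id = id;
            this.value = value;
        }

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public String getValue() { return value; }
        public void setValue(String value) { this.value = value; }
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("cassandra-writer")
                // Placeholder host; point this at your own cluster.
                .set("spark.cassandra.connection.host", "127.0.0.1")
                // The settings that resolved the issue in this thread:
                // one row per batch and one in-flight batch per task,
                // throttling writes to what a small cluster can absorb.
                .set("spark.cassandra.output.batch.size.rows", "1")
                .set("spark.cassandra.output.concurrent.writes", "1");

        JavaSparkContext sc = new JavaSparkContext(conf);

        // Stand-in for the ~200 million row RDD produced by the join step.
        JavaRDD<Sample> rddToSave =
                sc.parallelize(Arrays.asList(new Sample("k1", "v1")));

        // Placeholder keyspace/table values for the variables named in
        // the thread (datapipelineKeyspace, datapipelineOutputTable).
        String datapipelineKeyspace = "datapipeline";
        String datapipelineOutputTable = "samples";

        javaFunctions(rddToSave)
                .writerBuilder(datapipelineKeyspace, datapipelineOutputTable,
                        mapToRow(Sample.class))
                .saveToCassandra();

        sc.stop();
    }
}

Note that concurrent.writes=1 trades throughput for stability; on a
cluster sized for the load you can likely raise it again.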