Re: Increasing size of Batch of prepared statements
Thanks Jens for the comments. As I am using the cassandra-stress tool, does that mean the tool is executing batches of INSERT statements (probably hundreds, or thousands) against Cassandra, for the sake of stressing it? best, /Shahab
Re: Increasing size of Batch of prepared statements
Hi again Shabab, Yes, it seems that way. I have no experience with the cassandra-stress tool, but I wouldn't be surprised if the batch size could be tweaked. Cheers, Jens — Jens Rantil, Backend engineer, Tink AB
Re: Increasing size of Batch of prepared statements
OK, Thanks again Jens. best, /Shahab
Re: Increasing size of Batch of prepared statements
CASSANDRA-8091 (Stress tool creates too large batches) is relevant: https://issues.apache.org/jira/browse/CASSANDRA-8091 -- Tyler Hobbs, DataStax http://datastax.com/
Re: Increasing size of Batch of prepared statements
Thanks Tyler for sharing this. It is exactly what I was looking for. best, /Shahab
Re: Increasing size of Batch of prepared statements
Shabab, Apologies for the late answer. On Mon, Oct 6, 2014 at 2:38 PM, shahab shahab.mok...@gmail.com wrote:
> But do you mean that inserting columns with large size (let's say a text with 20-30 K) is potentially problematic in Cassandra?
AFAIK, the size _warning_ you are getting relates to the size of the batch of prepared statements (INSERT INTO mykeyspace.mytable VALUES (?,?,?,?)). That is, it has nothing to do with the actual content of your row; 20-30 K shouldn't be a problem. But it is considered good practice to split larger values (say, anything above roughly 5 MB) into chunks, since that makes operations easier on your cluster and makes the data more likely to spread evenly across it.
> What shall I do if I want columns with large size?
Just don't insert too many rows in a single batch and you should be fine. Like Shane's JIRA ticket said, the warning is to let you know you are not following best practice when adding too many rows in a single batch; it can create bottlenecks on a single Cassandra node. Cheers, Jens -- Jens Rantil Backend engineer Tink AB Email: jens.ran...@tink.se Phone: +46 708 84 18 32 Web: www.tink.se
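The chunking advice above could look like this in practice. A minimal sketch in Python, assuming a large byte blob is split into rows of at most ~5 MB each; the helper names (`split_blob`, `join_chunks`) and the row-keying scheme in the comment are hypothetical, not from the thread:

```python
CHUNK_SIZE = 5 * 1024 * 1024  # ~5 MB per chunk, as suggested above

def split_blob(blob: bytes, chunk_size: int = CHUNK_SIZE):
    """Split a large value into fixed-size chunks so each row stays small.

    Each chunk can then be stored as its own row, e.g. keyed by
    (blob_id, chunk_index), and reassembled on read.
    """
    return [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]

def join_chunks(chunks):
    """Reassemble the original value by concatenating chunks in order."""
    return b"".join(chunks)
```

The point is only that no single row (and hence no single write) carries the whole multi-megabyte value.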
Re: Increasing size of Batch of prepared statements
Thanks Jens for the comment. Actually I am using the Cassandra Stress Tool, and it is the tool that issues such large statements. But do you mean that inserting columns with large size (let's say a text with 20-30 K) is potentially problematic in Cassandra? What shall I do if I want columns with large size? best, /Shahab
Re: Increasing size of Batch of prepared statements
Thanks Shane. best, /Shahab
Re: Increasing size of Batch of prepared statements
Shabab, If you are hitting this limit because you are inserting a lot of (CQL) rows in a single batch, I suggest you split the statement up into multiple smaller batches. Generally, large inserts like this will not perform very well. Cheers, Jens — Sent from Mailbox
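The suggestion to split one big batch into several smaller ones can be sketched as follows. This is an illustration, not code from the thread: the helper name `batches` and the size limit are made up, and the commented-out calls assume the DataStax Python driver:

```python
def batches(rows, max_batch_size=50):
    """Yield the rows in groups of at most max_batch_size, so that no
    single batch grows past Cassandra's batch size warning threshold."""
    for i in range(0, len(rows), max_batch_size):
        yield rows[i:i + max_batch_size]

# With the DataStax Python driver, each group would then be executed as
# its own batch rather than sending all rows in one statement, e.g.:
#
#   insert = session.prepare("INSERT INTO mykeyspace.mytable VALUES (?,?,?,?)")
#   for group in batches(all_rows):
#       batch = BatchStatement(batch_type=BatchType.UNLOGGED)
#       for row in group:
#           batch.add(insert, row)
#       session.execute(batch)
```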
Increasing size of Batch of prepared statements
Hi, I am getting the following warning in the cassandra log:
BatchStatement.java:258 - Batch of prepared statements for [mydb.mycf] is of size 3272725, exceeding specified threshold of 5120 by 3267605.
Apparently it relates to the (default) size threshold for a batch of prepared insert statements. Is there any way to change the default value? thanks /Shahab
Re: Increasing size of Batch of prepared statements
It appears to be configurable in cassandra.yaml using batch_size_warn_threshold_in_kb: https://issues.apache.org/jira/browse/CASSANDRA-6487
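For reference, the relevant cassandra.yaml fragment would look roughly like this. The 5 KB value matches the default 5120-byte threshold quoted in the warning above; the exact option name may vary between Cassandra versions, so treat this as an assumption to check against your own cassandra.yaml:

```yaml
# cassandra.yaml
# Log a WARN for any single batch whose serialized size exceeds this
# threshold. Raising it silences the warning; it does not change how
# Cassandra executes the batch.
batch_size_warn_threshold_in_kb: 5
```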