[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches
[ https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251101#comment-15251101 ] Stefania commented on CASSANDRA-11474: -- Thank you, committed as eb072a0fa05292dd347e96d3bc45b445995227ec and pull request merged.
> cqlsh: COPY FROM should use regular inserts for single statement batches
>
> Key: CASSANDRA-11474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11474
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Reporter: Stefania
> Assignee: Stefania
> Priority: Minor
> Labels: lhf
> Fix For: 2.2.x, 3.0.x, 3.x
>
> I haven't reproduced it with a test yet but, from code inspection, if CQL rows are larger than {{batch_size_fail_threshold_in_kb}} and this parameter cannot be changed, then data import will fail.
> Users can control the batch size by setting MAXBATCHSIZE.
> If a batch contains a single statement, there is no need to use a batch and we should use normal inserts instead or, alternatively, we should skip the batch size check for unlogged batches with only one statement.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
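As a rough illustration of the behaviour proposed in the issue description, the following minimal sketch groups rows into chunks of at most a given size and plans a plain insert whenever a chunk holds a single row. The names `chunk_rows`, `plan_writes` and `max_batch_size` are hypothetical and are not taken from the cqlsh source. The sketch only decides which kind of write to issue; it does not connect to a cluster and does not reproduce how COPY FROM actually groups rows.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical representation of one parsed input row.
Row = Tuple


def chunk_rows(rows: Iterable[Row], max_batch_size: int) -> Iterator[List[Row]]:
    """Yield lists of at most max_batch_size rows, preserving input order."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    chunk: List[Row] = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == max_batch_size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield chunk


def plan_writes(rows: Iterable[Row], max_batch_size: int) -> List[Tuple[str, List[Row]]]:
    """Label each chunk "insert" if it has exactly one row, otherwise "batch"."""
    plan = []
    for chunk in chunk_rows(rows, max_batch_size):
        kind = "insert" if len(chunk) == 1 else "batch"
        plan.append((kind, chunk))
    return plan
```

With a maximum batch size of 1, every chunk in this sketch is planned as a regular insert, so no batch statement would be needed at all.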
[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches
[ https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249720#comment-15249720 ] Paulo Motta commented on CASSANDRA-11474: - LGTM. I don't think we need to restrict {{test_reading_counter_without_batching}} to versions <= 3.5: even though it doesn't currently exercise a specific code path on trunk, it can catch problems in the future. If you agree, can you remove that restriction in the PR? After that you can mark this as ready to commit. Thank you!
[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches
[ https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249338#comment-15249338 ] Stefania commented on CASSANDRA-11474: -- CI results are good. [~pauloricardomg] are you OK with the trunk patch as well now?
[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches
[ https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242254#comment-15242254 ] Stefania commented on CASSANDRA-11474: -- Thanks for the review. I've removed the single insert special case from the patch on trunk but left the fix for reporting errors during the initialization phase of worker processes. I've also prepared the ticket for commit by editing CHANGES.txt and the commit messages; please check whether the description seems reasonable. I've restarted CI, results are still pending. Also, here is the dtest [pull request|https://github.com/riptano/cassandra-dtest/pull/928].
[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches
[ https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241121#comment-15241121 ] Paulo Motta commented on CASSANDRA-11474: - LGTM, but given that CASSANDRA-10876 removed this limitation for single partition batches on trunk, special casing single mutation batches in COPY FROM doesn't bring much additional benefit there, so I think we should include this only on 2.2 and 3.0 for code simplicity. Could you provide a trunk patch without the single insert optimization, containing only the error report and empty chunk fixes?
[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches
[ https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229940#comment-15229940 ] Stefania commented on CASSANDRA-11474: -- Thanks for pointing this out, it wasn't clear to me. Until now I've assumed {{batch_size_fail_threshold_in_kb}} was in place to protect against LOGGED batches that are too large. Batching was introduced only recently in COPY FROM (CASSANDRA-9302), so my concern was to restore any functionality that might have been removed by it.
[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches
[ https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229911#comment-15229911 ] Sylvain Lebresne commented on CASSANDRA-11474: -- I haven't looked carefully at the patch as I'm not too familiar with that code, and I'll let someone more familiar review it. But I'll note for the record that it's imo wrong that {{batch_size_fail_threshold_in_kb}} doesn't apply to {{INSERT}} too (I've created CASSANDRA-11522 for the details) and that {{COPY FROM}} shouldn't be in the business of avoiding server protections: if the user configures their server so that inserts above a certain size are rejected, then an import that violates that threshold _should_ fail. With that said, I suppose it makes sense on principle to avoid batches when we don't have to, so I don't object to this patch in practice; I just disagree with the justification (and thus think this is pretty minor in practice).
[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches
[ https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229696#comment-15229696 ] Stefania commented on CASSANDRA-11474: -- CI looks good, this is ready for review.
[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches
[ https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229555#comment-15229555 ] Stefania commented on CASSANDRA-11474: -- Patches and CI available here:
||2.2||3.0||trunk||
|[patch|https://github.com/stef1927/cassandra/commits/11474-2.2]|[patch|https://github.com/stef1927/cassandra/commits/11474-3.0]|[patch|https://github.com/stef1927/cassandra/commits/11474]|
|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11474-2.2-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11474-3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11474-dtest/]|
There is a conflict from 2.2 to 3.0, whilst the 3.0 patch merges cleanly into trunk. I should also note that this is a pretty serious limitation of COPY FROM, albeit very unlikely to occur. However, we don't need the fix in 2.1 because {{batch_size_fail_threshold_in_kb}} is only available in 2.2+. Additional dtests to reproduce the problem are available [here|https://github.com/stef1927/cassandra-dtest/tree/11474].
[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches
[ https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15223589#comment-15223589 ] Stefania commented on CASSANDRA-11474: -- We should also make sure that if a child process aborts during startup, then the error message is displayed to the user. At the moment the only way to see exceptions thrown by worker processes during startup is via {{--debug}}.
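The requirement in this comment, surfacing a child's startup failure to the user, could be sketched roughly as follows. This is an illustration under stated assumptions, not the cqlsh implementation: it launches the child with the standard `subprocess` module purely to keep the example self-contained, and `start_worker` and its `startup_code` argument are hypothetical names.

```python
import subprocess
import sys


def start_worker(startup_code: str) -> str:
    """Run a child Python process and return a status line meant for the user."""
    result = subprocess.run(
        [sys.executable, "-c", startup_code],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Keep the last non-empty stderr line, which for an uncaught Python
        # exception is normally the "ExceptionType: message" line.
        lines = [line for line in result.stderr.splitlines() if line.strip()]
        reason = lines[-1] if lines else "exit code %d" % result.returncode
        return "Worker failed during startup: " + reason
    return "Worker started"
```

The design point is only that the parent inspects the child's exit status and error output and turns them into a visible message, rather than requiring a debug flag to see why a worker died.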