[
https://issues.apache.org/jira/browse/CASSANDRA-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728422#comment-14728422
]
Stefania commented on CASSANDRA-9304:
-------------------------------------
The {{RateLimiter}} still seems a bit off. It looked somewhat wrong before too, as
you pointed out. It's not terribly important, but I think this line,
{{self.current_rate = (self.current_rate + new_rate) / 2.0}}, was meant as an
average of the current rate and the new one. So the first time, when
{{current_rate}} is zero, it should not divide by 2, or else we report half the
actual rate. Secondly, when we calculate the new rate as {{n / difference}}, we may
miss records, because {{n}} is the number of records passed to each call whilst
{{difference}} is the time elapsed since we last logged. I also wouldn't
calculate the rate on every call, but only when logging it. If
{{current_record}} cannot be reset to zero after logging (maybe this was the
initial intention of the existing code), then we need a new counter that gives
the number of records accumulated between log points.
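To illustrate, here is a minimal sketch of what I have in mind. The class and attribute names are hypothetical, not the actual cqlsh code; the point is that records are accumulated between log points and the rate is computed over the whole interval, only when logging:
{code}
import time

class RateMeter(object):
    """Counts records between log points; computes the rate only when logging."""

    def __init__(self, log_interval=1.0, time_source=time.time):
        self.time_source = time_source      # injectable clock, to make testing easy
        self.log_interval = log_interval    # seconds between log points
        self.last_log_time = time_source()
        self.records_since_log = 0          # accumulated between log points
        self.total_records = 0
        self.current_rate = 0.0

    def increment(self, n=1):
        self.records_since_log += n
        self.total_records += n
        now = self.time_source()
        difference = now - self.last_log_time
        if difference >= self.log_interval:
            # Rate over the whole interval, using every record seen in it,
            # not just the n passed to this particular call.
            self.current_rate = self.records_since_log / difference
            self.last_log_time = now
            self.records_since_log = 0      # reset the per-interval counter
{code}
With a fake clock, two calls of 100 records spread over 2 seconds report a rate of 100 records/s, with no halving on the first report and no records lost between calls.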
It's great that we now test with all partitioners, but we only export 1 record
in {{test_all_datatypes_round_trip}}, so a better candidate would have been
{{test_round_trip}}, where we export at least 10K records. Would you mind
adapting {{test_round_trip}} to also run with every partitioner?
In fact, it would be good to have a bulk round-trip test as well (only for the
default partitioner), where we export and import 1M records. We would need to
use cassandra-stress to write the records, and then we would just check the
counts. This is just a suggestion.
I had problems when running the cqlsh_tests locally:
{code}
nosetests -s cqlsh_tests
{code}
{code}
======================================================================
ERROR: test_source_glass (cqlsh_tests.cqlsh_tests.TestCqlsh)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/stefania/git/cstar/cassandra-dtest/tools.py", line 252, in wrapped
f(obj)
File "/home/stefania/git/cstar/cassandra-dtest/cqlsh_tests/cqlsh_tests.py",
line 341, in test_source_glass
self.verify_glass(node1)
File "/home/stefania/git/cstar/cassandra-dtest/cqlsh_tests/cqlsh_tests.py",
line 102, in verify_glass
'I can eat glass and it does not hurt me': 'Is'
File "/home/stefania/git/cstar/cassandra-dtest/cqlsh_tests/cqlsh_tests.py",
line 95, in verify_varcharmap
got = {k.encode("utf-8"): v for k, v in rows[0][0].iteritems()}
IndexError: list index out of range
-------------------- >> begin captured logging << --------------------
dtest: DEBUG: cluster ccm directory: /tmp/dtest-Ldxvcq
--------------------- >> end captured logging << ---------------------
======================================================================
FAIL: test_all_datatypes_read (cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File
"/home/stefania/git/cstar/cassandra-dtest/cqlsh_tests/cqlsh_copy_tests.py",
line 690, in test_all_datatypes_read
self.assertCsvResultEqual(self.tempfile.name, results)
File
"/home/stefania/git/cstar/cassandra-dtest/cqlsh_tests/cqlsh_copy_tests.py",
line 153, in assertCsvResultEqual
raise e
AssertionError: Element counts were not equal:
First has 1, Second has 0: ['ascii', '1099511627776', '0xbeef', 'True',
'3.140000000000000124344978758017532527446746826171875', '2.444', '1.1',
'127.0.0.1', '25',
'\xe3\x83\xbd(\xc2\xb4\xe3\x83\xbc\xef\xbd\x80)\xe3\x83\x8e', '2005-07-14
12:30:00', '2b4e32ce-51de-11e5-85b7-0050b67e8b2f',
'830bc4cd-a790-4ac2-85f9-648b0a71306b', 'asdf', '36893488147419103232']
First has 0, Second has 1: ['ascii', '1099511627776', '0xbeef', 'True',
'3.140000000000000124344978758017532527446746826171875', '2.444', '1.1',
'127.0.0.1', '25',
'\xe3\x83\xbd(\xc2\xb4\xe3\x83\xbc\xef\xbd\x80)\xe3\x83\x8e', '2005-07-14
04:30:00', '2b4e32ce-51de-11e5-85b7-0050b67e8b2f',
'830bc4cd-a790-4ac2-85f9-648b0a71306b', 'asdf', '36893488147419103232']
-------------------- >> begin captured logging << --------------------
dtest: DEBUG: cluster ccm directory: /tmp/dtest-cSohP9
dtest: DEBUG: Importing from csv file: /tmp/tmpJgdPJc
dtest: WARNING: Mismatch at index: 10
dtest: WARNING: Value in csv: 2005-07-14 12:30:00
dtest: WARNING: Value in result: 2005-07-14 04:30:00
--------------------- >> end captured logging << ---------------------
----------------------------------------------------------------------
Ran 69 tests in 1161.775s
FAILED (SKIP=5, errors=1, failures=1)
{code}
I scheduled new CI jobs on my view:
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9304-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9304-dtest/
Let's see if they too report the problems I had locally.
> COPY TO improvements
> --------------------
>
> Key: CASSANDRA-9304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9304
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: David Kua
> Priority: Minor
> Labels: cqlsh
> Fix For: 2.1.x
>
>
> COPY FROM has gotten a lot of love. COPY TO not so much. One obvious
> improvement could be to parallelize reading and writing (write one page of
> data while fetching the next).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)