[ 
https://issues.apache.org/jira/browse/CASSANDRA-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476666#comment-16476666
 ] 

Patrick Bannister edited comment on CASSANDRA-14298 at 5/16/18 3:05 AM:
------------------------------------------------------------------------

I've made a lot of progress porting cqlshlib to Python 3. Along the way I've 
been taking notes on all the areas that I think would require extra effort for 
cross compatibility with Python 2.

I don't have a complete plan yet, but I have some observations.

In terms of level of effort and complexity, this is not going to be as simple 
as running 2to3 and then adding a few imports from future and six. However, we 
won't need to rearchitect the library either. So far I've found that existing 
classes and functions work with just a few tweaks to their implementation, 
mostly around IO and strings vs. bytes.
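
As a rough illustration of the kind of strings vs. bytes tweak mentioned 
above, a helper along these lines can normalize values at IO boundaries under 
either interpreter. The name to_text and the UTF-8 default are assumptions 
made for this sketch; it is not code from cqlshlib.

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding it first if it is a byte string.

    Hypothetical helper for illustration only, not part of cqlshlib.
    """
    if isinstance(value, bytes):
        # Python 3 bytes, or Python 2 str (where bytes is an alias of str)
        return value.decode(encoding)
    # Already text (or some non-string value): pass it through unchanged
    return value
```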

The biggest challenge, regardless of whether we go straight Python 3 or 
cross-compatible, is going to be adequately testing the result. The cqlshlib 
unit tests and the cqlsh_tests have been useful for finding bugs, but I'm not 
confident that they cover enough of the code to exercise everything. We 
would need a strategy for more comprehensive testing.

Some specifics:
 * The SaferScanner class in saferscanner.py requires a slightly different 
implementation in Python 2 vs. Python 3, because of changes in the internals of 
the re module for regular expressions.
 * copyutil.py, formatting.py, and displaying.py have needed the most work so 
far, since they have a lot of IO and serialization.
 * The formatter for blobs in formatting.py needs a different implementation in 
Python 2 vs. Python 3, because of changes in the behavior of binascii.hexlify.
 * On the dtests side, several tests fail intermittently because the expected 
and observed results come back in different orders. Whether such a test passes 
depends on whatever order the results happen to arrive in. I've been able to 
get these tests to pass consistently by sorting results just before asserting 
equality.
 * Another notable dtest issue: in the cqlsh_copy_tests, the bulk_round_trip 
tests that use the blogposts profile are failing because of a limitation of 
Python's csv.reader, which is used in cqlshlib3 and in the bulk_round_trip 
tests. csv.reader chokes on newlines and null characters, but the 
cassandra-stress tool's Strings Generator subclass generates both in text 
fields. (Edit: this may be a combination of my own misuse of csv.reader and a 
failure to properly port the formatting of text data.)
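
Three of the specifics above (the blob formatter, the order-sensitive 
assertions, and csv handling of embedded newlines) can be sketched in a few 
lines each. These are hedged illustrations, not the actual cqlshlib or dtest 
code: the names format_blob, assert_same_rows, and csv_round_trip are 
hypothetical, and the csv part assumes Python 3.

```python
import binascii
import csv
import io


def format_blob(value):
    """Render a blob as '0x...' text (hypothetical helper)."""
    hexed = binascii.hexlify(value)
    # hexlify returns bytes on Python 3 but str on Python 2
    if not isinstance(hexed, str):
        hexed = hexed.decode('ascii')
    return '0x' + hexed


def assert_same_rows(expected, observed):
    """Compare two row collections while ignoring row order."""
    # key=repr gives a stable order even when rows mix types such as
    # None and int, which Python 3 refuses to compare directly
    assert sorted(expected, key=repr) == sorted(observed, key=repr)


def csv_round_trip(rows):
    """Write rows with csv.writer, then read them back with csv.reader."""
    # newline='' leaves line endings untouched, so a quoted field that
    # contains a newline survives the round trip
    buf = io.StringIO(newline='')
    csv.writer(buf).writerows(rows)
    buf.seek(0)
    return list(csv.reader(buf))
```

With newline='' in place, a text field containing an embedded newline reads 
back unchanged on Python 3, which is consistent with the suspicion in the edit 
note that csv.reader was being misused. Null characters are a separate matter 
and are not covered by this sketch.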


> cqlshlib tests broken on b.a.o
> ------------------------------
>
>                 Key: CASSANDRA-14298
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14298
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Build, Testing
>            Reporter: Stefan Podkowinski
>            Assignee: Patrick Bannister
>            Priority: Major
>              Labels: cqlsh, dtest
>         Attachments: CASSANDRA-14298-old.txt, CASSANDRA-14298.txt, 
> cqlsh_tests_notes.md
>
>
> It appears that cqlsh-tests on builds.apache.org on all branches stopped 
> working since we removed nosetests from the system environment. See e.g. 
> [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-cqlsh-tests/458/cython=no,jdk=JDK%201.8%20(latest),label=cassandra/console].
>  Looks like we either have to make nosetests available again or migrate to 
> pytest as we did with dtests. Giving pytest a quick try resulted in many 
> errors locally, but I haven't inspected them in detail yet. 


