[jira] [Updated] (CASSANDRA-14176) Cannot export & import data containing no-break space (U+00A0)

2018-07-09 Thread Joshua McKenzie (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-14176:

Component/s: Tools

> Cannot export & import data containing no-break space (U+00A0)
> --
>
> Key: CASSANDRA-14176
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14176
> Project: Cassandra
> Issue Type: Bug
> Components: Libraries, Tools
> Reporter: Marcel Dopita
> Assignee: Marcel Dopita
>Priority: Major
> Attachments: fix.patch
>
>
> We were unable to export data from Cassandra and then import the same data 
> when it contained characters such as line breaks or the no-break space (U+00A0).
> Adding v.decode() to copyutil.py fixed most of these characters, including line breaks.
> Only with the included patch applied was the character U+00A0 stored correctly 
> in Cassandra and successfully (verifiably) exported, imported, and exported 
> again.
>  
> {code:java}
> diff --git a/pylib/cqlshlib/copyutil.py b/pylib/cqlshlib/copyutil.py
> index 7f97b49..883c957 100644
> --- a/pylib/cqlshlib/copyutil.py
> +++ b/pylib/cqlshlib/copyutil.py
> @@ -1871,7 +1871,7 @@ class ImportConversion(object):
>      return bytearray.fromhex(v[2:])
>  
>  def convert_text(v, **_):
> -    return v
> +    return v.decode('string_escape')
>  
>  def convert_uuid(v, **_):
>      return UUID(v)
> diff --git a/pylib/cqlshlib/formatting.py b/pylib/cqlshlib/formatting.py
> index 803ea63..79eb691 100644
> --- a/pylib/cqlshlib/formatting.py
> +++ b/pylib/cqlshlib/formatting.py
> @@ -33,7 +33,7 @@ from util import UTC
>  
>  is_win = platform.system() == 'Windows'
>  
> -unicode_controlchars_re = re.compile(r'[\x00-\x31\x7f-\xa0]')
> +unicode_controlchars_re = re.compile(r'[\x00-\x31]')
>  controlchars_re = re.compile(r'[\x00-\x31\x7f-\xff]')
> {code}
>  
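
The effect of the formatting.py change in the patch above can be checked in isolation. The following is a minimal sketch, not part of the attached patch: it assumes Python 3 (the patch itself relies on the Python 2 only 'string_escape' codec) and copies just the two character classes from the diff. The names before_patch, after_patch, and needs_escaping are made up for illustration.

```python
import re

# Character classes copied verbatim from the diff to pylib/cqlshlib/formatting.py.
before_patch = re.compile(r'[\x00-\x31\x7f-\xa0]')
after_patch = re.compile(r'[\x00-\x31]')

NBSP = '\u00a0'  # no-break space, the character named in the issue title


def needs_escaping(pattern, text):
    """Return True if the pattern matches any character of text."""
    return pattern.search(text) is not None


# Before the patch the class reaches up to \xa0, so U+00A0 is matched.
print(needs_escaping(before_patch, NBSP))  # True
# After the patch the upper range is gone, so U+00A0 is no longer matched.
print(needs_escaping(after_patch, NBSP))   # False
```

Under these assumptions, the old class matches U+00A0 as if it were a control character, while the patched class leaves it alone, which is consistent with the behaviour described in the report.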



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14176) Cannot export & import data containing no-break space (U+00A0)

2018-04-05 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14176:

Reviewer: Paulo Motta

Good catch, would you mind adding a regression test to the 
[cqlsh_copy_tests|https://github.com/apache/cassandra-dtest/blob/master/cqlsh_tests/cqlsh_copy_tests.py]
 suite reproducing the problem?
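
A regression test for this issue would presumably follow an export, import, compare pattern, mirroring the "exported & imported & exported" check in the report. The sketch below shows only that shape in plain Python 3. The functions export_rows and import_rows are hypothetical stand-ins built on the standard csv module, not the cqlsh COPY TO / COPY FROM commands, and nothing here uses the cassandra-dtest helpers; a real test would run COPY against a live node instead.

```python
import csv
import io


def export_rows(rows):
    """Hypothetical stand-in for COPY TO: serialise rows to CSV text."""
    buf = io.StringIO(newline='')
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def import_rows(text):
    """Hypothetical stand-in for COPY FROM: parse CSV text back into rows."""
    return [row for row in csv.reader(io.StringIO(text, newline=''))]


# Values of the kind named in the report: a line break and U+00A0.
original = [
    ['1', 'line one\nline two'],
    ['2', 'no-break\u00a0space'],
]

first_export = export_rows(original)
round_tripped = import_rows(first_export)
second_export = export_rows(round_tripped)

# The data must survive the round trip unchanged, and a second export
# must be byte-for-byte identical to the first.
assert round_tripped == original
assert second_export == first_export
```

Comparing both the re-imported rows and the second export catches a conversion that is lossy in only one direction.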

> Cannot export & import data containing no-break space (U+00A0)
> --
>
> Key: CASSANDRA-14176
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14176
> Project: Cassandra
> Issue Type: Bug
> Components: Libraries
> Reporter: Marcel Dopita
> Assignee: Marcel Dopita
>Priority: Major
> Attachments: fix.patch
>
>






[jira] [Updated] (CASSANDRA-14176) Cannot export & import data containing no-break space (U+00A0)

2018-04-05 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14176:

Status: Open  (was: Patch Available)

> Cannot export & import data containing no-break space (U+00A0)
> --
>
> Key: CASSANDRA-14176
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14176
> Project: Cassandra
> Issue Type: Bug
> Components: Libraries
> Reporter: Marcel Dopita
> Assignee: Marcel Dopita
>Priority: Major
> Attachments: fix.patch
>
>






[jira] [Updated] (CASSANDRA-14176) Cannot export & import data containing no-break space (U+00A0)

2018-01-19 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14176:
---
Assignee: Marcel Dopita
  Status: Patch Available  (was: Open)

Marking patch available. 

> Cannot export & import data containing no-break space (U+00A0)
> --
>
> Key: CASSANDRA-14176
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14176
> Project: Cassandra
> Issue Type: Bug
> Components: Libraries
> Reporter: Marcel Dopita
> Assignee: Marcel Dopita
>Priority: Major
> Attachments: fix.patch
>
>


