It seems I didn't communicate the problem and my solution clearly enough, so let me try again. (Also, I learned from the discussions!)

Currently, both the client and the server parse CSV options. The client does *not* re-encode the CSV before sending; the parsing is really done twice. This means e.g. that you need 3 backslashes to escape a literal comma: after the client-side split, '\\\,' becomes '\,'; which after the server-side split becomes ','.


Since CSV is specific to the command-line, and the client is responsible for translating command-line input to XML-RPC (which has its own syntax for lists), the ideal fix will be to move CSV processing entirely to the client. This will be a rather invasive change, mainly because some parts of the UI now expect the server-side parsing (but they don't escape CSV, so values containing commas or backslashes are broken). So it won't make it to the upcoming release. My patch provides a quick fix: when a call comes from the command-line client, disable the server-side parsing.

I can't get away from moving split_csv() (which is not idempotent) out of normalize() (which is, and gets called lots of times); this is the patch's major change in therms of LOC.


I'll note again that this only affects values with backslashes or double quotes. Exactly these options are currently broken (=need double escaping). The "normal" uses of CSV are completely unaffected.


Attaching updated patch; the change vs. the original is that the "don't parse again" flag is now only set at the server, when a XMLRPC call is received, making the client fully forward-compatible (the flag doesn't get sent through the wire).


The ticket is https://fedorahosted.org/freeipa/ticket/2227, but this patch is only the first step in fixing it.

--
PetrĀ³
From b8a091514ee74283bbcb02d8b54fdfe4b49a7f04 Mon Sep 17 00:00:00 2001
From: Petr Viktorin <pvikt...@redhat.com>
Date: Thu, 23 Feb 2012 07:29:47 -0500
Subject: [PATCH] Only split CSV strings once

When a call comes from XML-RPMC, set a flag that prevents the server
from parsing CSV.

Splitting on commas is not an idempotent operation:
'a,b\,c' -> ('a', 'b,c') -> ('a', 'b', 'c')
That means we can't do it when the call is forwarded. But currently
we do: both the client and the server parse CSV, so in the end it's
split twice, requiring double escaping for commas and backslashes.

Since the CSV fomat for options is specific to the CLI, the ideal
fix to this problem is to only parse it in the client. This requires
some work and testing at the UI side, however -- currently some
pieces of the UI rely on the server to parse CSV, instead of using
JSON lists. This patch implements a quick hack whose main virtue
is that it's forward-compatible with the ideal fix.

Also, document Param's csv arguments, and update tests.
---
 ipalib/frontend.py                   |   25 +++++++++++++++++
 ipalib/parameters.py                 |   50 +++++++++++++++++++++++++++-------
 ipaserver/rpcserver.py               |    1 +
 tests/test_ipalib/test_parameters.py |   24 ++++++++--------
 4 files changed, 78 insertions(+), 22 deletions(-)

diff --git a/ipalib/frontend.py b/ipalib/frontend.py
index da25b4c1aef100cb54d7e248ba4e2ea5dc250cef..9c898b9c05686ddfd2dea4e5c6f6a894c4fff1e7 100644
--- a/ipalib/frontend.py
+++ b/ipalib/frontend.py
@@ -414,6 +414,7 @@ class Command(HasParam):
         If not in a server context, the call will be forwarded over
         XML-RPC and the executed an the nearest IPA server.
         """
+        dont_split_csv = options.get('_dont_split_csv', False)
         self.ensure_finalized()
         params = self.args_options_2_params(*args, **options)
         self.debug(
@@ -425,6 +426,10 @@ class Command(HasParam):
                 break
             params.update(default)
         params = self.normalize(**params)
+        if not dont_split_csv:
+            # split_csv() needs to run exactly once, wherever the command
+            # originated (validate() expects the strings to be split already).
+            params = self.split_csv(**params)
         params = self.convert(**params)
         self.debug(
             '%s(%s)', self.name, ', '.join(self._repr_iter(**params))
@@ -557,6 +562,26 @@ class Command(HasParam):
             if name in params:
                 yield(name, params[name])
 
+    def split_csv(self, **kw):
+        """
+        Return a dictionary of values where values are decoded from CSV.
+
+        For example:
+
+        >>> class my_command(Command):
+        ...     takes_options = (
+        ...         Param('flags', multivalue=True, csv=True),
+        ...     )
+        ...
+        >>> c = my_command()
+        >>> c.finalize()
+        >>> c.split_csv(flags=u'public,replicated')
+        {'flags': (u'public', u'replicated')}
+        """
+        return dict(
+            (k, self.params[k].split_csv(v)) for (k, v) in kw.iteritems()
+        )
+
     def normalize(self, **kw):
         """
         Return a dictionary of normalized values.
diff --git a/ipalib/parameters.py b/ipalib/parameters.py
index 755d04dd9446a6622bfe99e899158a1ab04d1790..cac3441506bdea65cc1fcf9c85164f56d1618808 100644
--- a/ipalib/parameters.py
+++ b/ipalib/parameters.py
@@ -345,11 +345,16 @@ class Param(ReadOnly):
               is not `required`. Applied for all crud.Update based commands
             * req_update: The parameter is `required` in all crud.Update based
               commands
-      - hint: This attribute is currently not used
+      - hint: this attribute is currently not used
       - alwaysask: when enabled, CLI asks for parameter value even when the
         parameter is not `required`
       - sortorder: used to sort a list of parameters for Command. See
         `Command.finalize()` for further information
+      - csv: this multivalue attribute is given in CSV format
+      - csv_separator: character that separates values in CSV (comma by
+        default)
+      - csv_skipspace: if true, leading whitespace will be ignored in
+        individual CSV values
     """
 
     # This is a dummy type so that most of the functionality of Param can be
@@ -691,6 +696,40 @@ class Param(ReadOnly):
             # decode UTF-8 back to Unicode, cell by cell:
             yield [unicode(cell, 'utf-8') for cell in row]
 
+    def split_csv(self, value):
+        """Split CSV strings into individual values.
+
+        For CSV params, ``value`` is a tuple of strings. Each of these is split
+        on commas, and the results are concatenated into one tuple.
+
+        For example::
+
+            >>> param = Param('telephones', multivalue=True, csv=True)
+            >>> param.split_csv((u'1, 2', u'3', u'4, 5, 6'))
+            (u'1', u'2', u'3', u'4', u'5', u'6')
+
+        If ``value`` is not a tuple (or list), it is only split::
+
+            >>> param = Param('telephones', multivalue=True, csv=True)
+            >>> param.split_csv(u'1, 2, 3')
+            (u'1', u'2', u'3')
+
+        For non-CSV params, return the value unchanged.
+        """
+        if self.csv:
+            if type(value) not in (tuple, list):
+                value = (value,)
+            newval = ()
+            for v in value:
+                if isinstance(v, basestring):
+                    csvreader = self.__unicode_csv_reader([unicode(v)])
+                    newval += tuple(csvreader.next()) #pylint: disable=E1101
+                else:
+                    newval += (v,)
+            return newval
+        else:
+            return value
+
     def normalize(self, value):
         """
         Normalize ``value`` using normalizer callback.
@@ -717,15 +756,6 @@ class Param(ReadOnly):
         if self.multivalue:
             if type(value) not in (tuple, list):
                 value = (value,)
-            if self.csv:
-                newval = ()
-                for v in value:
-                    if isinstance(v, basestring):
-                        csvreader = self.__unicode_csv_reader([unicode(v)])
-                        newval += tuple(csvreader.next()) #pylint: disable=E1101
-                    else:
-                        newval += (v,)
-                value = newval
         if self.multivalue:
             return tuple(
                 self._normalize_scalar(v) for v in value
diff --git a/ipaserver/rpcserver.py b/ipaserver/rpcserver.py
index 205dc7655235fdaa749b711b8a268aeb044a5274..9d70455281ec400af073a2eb942e70bf52a301ab 100644
--- a/ipaserver/rpcserver.py
+++ b/ipaserver/rpcserver.py
@@ -359,6 +359,7 @@ class xmlserver(WSGIExecutioner):
     def unmarshal(self, data):
         (params, name) = xml_loads(data)
         (args, options) = params_2_args_options(params)
+        options['_dont_split_csv'] = True
         return (name, args, options, None)
 
     def marshal(self, result, error, _id=None):
diff --git a/tests/test_ipalib/test_parameters.py b/tests/test_ipalib/test_parameters.py
index ad8d8404496f1631cfdfc2db4e6f849ecdd09afe..83c33ddfce073d3aa90691b62cab30f30ffcee1e 100644
--- a/tests/test_ipalib/test_parameters.py
+++ b/tests/test_ipalib/test_parameters.py
@@ -635,44 +635,44 @@ class test_Param(ClassChecker):
         assert o._convert_scalar.value is default
         assert o.normalizer.value is default
 
-    def test_csv_normalize(self):
+    def test_split_csv(self):
         """
-        Test the `ipalib.parameters.Param.normalize` method with csv.
+        Test the `ipalib.parameters.Param.split_csv` method with csv.
         """
         o = self.cls('my_list+', csv=True)
-        n = o.normalize('a,b')
+        n = o.split_csv('a,b')
         assert type(n) is tuple
         assert len(n) is 2
 
-        n = o.normalize('bar,   "hi, there",foo')
+        n = o.split_csv('bar,   "hi, there",foo')
         assert type(n) is tuple
         assert len(n) is 3
 
-    def test_csv_normalize_separator(self):
+    def test_split_csv_separator(self):
         """
-        Test the `ipalib.parameters.Param.normalize` method with csv and a separator.
+        Test the `ipalib.parameters.Param.split_csv` method with csv and a separator.
         """
         o = self.cls('my_list+', csv=True, csv_separator='|')
 
-        n = o.normalize('a')
+        n = o.split_csv('a')
         assert type(n) is tuple
         assert len(n) is 1
 
-        n = o.normalize('a|b')
+        n = o.split_csv('a|b')
         assert type(n) is tuple
         assert len(n) is 2
 
-    def test_csv_normalize_skipspace(self):
+    def test_split_csv_skipspace(self):
         """
-        Test the `ipalib.parameters.Param.normalize` method with csv without skipping spaces.
+        Test the `ipalib.parameters.Param.split_csv` method with csv without skipping spaces.
         """
         o = self.cls('my_list+', csv=True, csv_skipspace=False)
 
-        n = o.normalize('a')
+        n = o.split_csv('a')
         assert type(n) is tuple
         assert len(n) is 1
 
-        n = o.normalize('a, "b,c", d')
+        n = o.split_csv('a, "b,c", d')
         assert type(n) is tuple
         # the output w/o skipspace is ['a',' "b','c"',' d']
         assert len(n) is 4
-- 
1.7.7.6

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

Reply via email to