Re: [Freeipa-devel] [PATCH] 0015 Only split CSV strings once

Petr Viktorin Wed, 21 Mar 2012 07:31:31 -0700

On 03/20/2012 10:08 PM, Rob Crittenden wrote:

Petr Viktorin wrote:

On 03/16/2012 12:55 PM, Petr Viktorin wrote:

On 03/15/2012 08:55 PM, Rob Crittenden wrote:

Petr Viktorin wrote:

https://fedorahosted.org/freeipa/ticket/2227 (Unable to add certain
sudo
commands to groups). What an interesting bug to get :)



One problem with our CSV splitting is that it's not idempotent
(baskslashes are "eaten" when there are escaped commas), but when we
forward a call it gets done on both the client and the server. The
attached patch fixes that in a somewhat hacky way. It's not a complete
fix for the issue though, for that I need to ask what to do.


Some Params use the CSV format when we just need a list of
comma-separated values. Our flavor of CSV does escaping in two
different
ways. This is pretty bad for predictability, least-surprise,
documentation, ...

To recap, the two ways are escaping (escaped commas are ignored,
backslashes also need to be escaped) and quoting (values are
optionally
enclosed in double quotes, to include a '"' in a quoted string, use
'""').
Escaping is good because users are used to it, but currently literal
backslashes need to be doubled, making multivalue syntax different
from
single-value syntax, even if there are no commas in the value.
Quoting is weird, but complete by itself so it doesn't also need
escaping.

Do we need to use both? Is this documented somewhere? Some of our
tests
check only for one, some assume the other. Users are probably even
more
confused.


If we only keep one of those, the fix for #2227 should be quite easy.
If not (backwards compatibility), we need to document this properly,
test all the corner cases, and fix the UI to handle CSV escaping.
Since it's subtly broken in lots of places, maybe it's best to break
backwards compatibility and go for a simple solution now.

Anyway, if I want to do a proper fix, CSV handling should be only done
on the client. Unfortunately turning off CSV handling in the server
will
break the UI, which at places uses `join(',')` (no escaping commas, no
single place to change it), even though it can just send a list.


I'm with Honza, not too keen on the _forwarded_call part. I'd rather
you
do a mix of comparing self.env.in_server and whether the value is a
basestring to determine if we need to split it.


This was a hack necessary because the WebUI relied on server-side
splitting (in a few cases, and it did not escape correctly). Now that
Petr Vobornik's patch 99 is in, it is unnecessary.
This discussion thread has an updated patch (0015-03); what I'm
attaching now builds on top of that to add several new changes in tests.
To summarize it: the splitting is only done in the client; from the RPC
call on no CSV magic is involved at all.

Please tell me your reasons for basestring checking. I think it's
entirely unnecessary and just adds complexity.
In the rest of the framework a string, as a parameter value, has the
same effect as a list of that one string. Why break this?
We are concerned about the API here, not the command line. Shortcuts to
save the user's keystrokes in the common case should *not* win over
predictability and making sure the easier way is the correct, robust way
to do it. Compare -- with basestring checking:

some_command(arg=escape_commas(some_string))
- is correct

some_command(arg=[some_string])
- is also correct

some_command(arg=some_string)
- is WRONG because the string is not escaped
- but is the easiest solution!

The problem is much worse because it is subtle: it only affects values
with commas and backslashes, so *the wrong solution would work* in most
of the cases.
And I can assure you plugin writers *would* pick the wrong way if it
would usually work. Our own web UI had examples.

Contrast to my proposed behavior:
some_command(arg=some_string)
- is correct

some_command(arg=[some_string])
- is also correct

some_command(arg=escape_commas(some_string))
- is wrong, but doesn't make much sense because at this level you don't
have to worry about CSV at all

Basestring checking would add unnecessary complexity for both us *and*
the plugin writer. The only case it would make it easier for the plugin
writer is if the plugin took CSV values -- but why force a random
format, tailored for our specific frontend, into RPC clients?

The client should translate the command line into a RPC call. There's no
reason to arbitrarily expect the server to do part of the job.

I'm not sure why this is broken out of normalize. Isn't this
normalizing
the incoming values into something more sane?


"Normalization" implies it's an idempotent operation, which CSV
splitting is not (at least its current incarnation, without the
basestring checking):
r"a,b\,c" splits to ["a", "b,c"] which splits to ["a", "b", "c"]


Just to make things clear: freeipa-pviktori-0015-04 is the latest
version of this patch.
It's complemented by freeipa-pviktori-0021-03 which adds the tests for
this (and other things) (but contains one more refactor).


If and when that's all reviewed and pushed, I'd like to discuss
freeipa-pviktori-0018 (Simplify CSV escaping syntax). I'm not saying my
solution there is perfect, but the flavor of Python's CSV parsing we're
using now is quite quirky. It would be nice to get the syntax right, or
at least better, while we're fixing this from a totally-broken state.
Later there'll be backwards compatibility problems.
But, don't let that hold back this patch.


I'm still not completely convinced. This patch looks ok and I'm sold on
client side only parsing but this doesn't actually fix any of the CSV
bugs we have open.

I'd like to evaluate this in more context.

I think the simplest fix is to drop escapechar processing in the
csvreader and have users quote strings with commas instead, this is how
I originally intended it to work. The UI won't have to do anything
special since it will already be split and it isn't a lot to ask to put
quotes around a value on the CLI.

This is my proposed fix on top of your patch:

diff --git a/ipalib/parameters.py b/ipalib/parameters.py
index 80b175d..ec6020e 100644
--- a/ipalib/parameters.py
+++ b/ipalib/parameters.py
@@ -702,7 +702,7 @@ class Param(ReadOnly):
# csv.py doesn't do Unicode; encode temporarily as UTF-8:
csv_reader = csv.reader(self.__utf_8_encoder(unicode_csv_data),
dialect=dialect,
- delimiter=self.csv_separator, escapechar='\\',
+ delimiter=self.csv_separator, quotechar='"',
skipinitialspace=self.csv_skipspace,
**kwargs)
for row in csv_reader:

With this change:
- all the self-tests pass
- https://fedorahosted.org/freeipa/ticket/2417 is fixed
- https://fedorahosted.org/freeipa/ticket/2227 is fixed

I tested with:

$ ipa sudocmdgroup-add-member test --sudocmd='/bin/chown -R
apache\:developers /var/www/*/shared/log'

$ ipa role-add-privilege --privileges='"u,r stuff"' test

rob


Yes, that works well. ACK on your change; here it is squashed in.

https://fedorahosted.org/freeipa/ticket/2402 can wait for 3.0, then?

--
Petr³

From e0780d270d3549ac82ea0a798c50babb5f586d3f Mon Sep 17 00:00:00 2001
From: Petr Viktorin <pvikt...@redhat.com>
Date: Thu, 23 Feb 2012 07:29:47 -0500
Subject: [PATCH 15/16] Only split CSV in the client

Neither the Web UI nor other XML-RPC or JSON clients other than the CLI
need CSV handling (which includes escaping values properly, creating an
additional burden for frontend writers -- and worse: if this is not
done, common cases will work).
Only parse CSV in the CLI client, where it's needed.

Also, document Param's csv arguments, and update tests.
---
 ipalib/cli.py                                |    3 +-
 ipalib/frontend.py                           |   20 ++++++++++
 ipalib/parameters.py                         |   52 ++++++++++++++++++++-----
 tests/test_ipalib/test_parameters.py         |   24 ++++++------
 tests/test_xmlrpc/test_delegation_plugin.py  |    4 +-
 tests/test_xmlrpc/test_dns_plugin.py         |    5 +-
 tests/test_xmlrpc/test_privilege_plugin.py   |    2 +-
 tests/test_xmlrpc/test_role_plugin.py        |    9 +---
 tests/test_xmlrpc/test_selfservice_plugin.py |    8 ++--
 9 files changed, 88 insertions(+), 39 deletions(-)

diff --git a/ipalib/cli.py b/ipalib/cli.py
index 7af63761a361af432015d49c88bdb85d0042c1e8..ea320cf652e309592f9906831e3de2d0beb10198 100644
--- a/ipalib/cli.py
+++ b/ipalib/cli.py
@@ -1025,9 +1025,10 @@ class cli(backend.Executioner):
         if not isinstance(cmd, frontend.Local):
             self.create_context()
         kw = self.parse(cmd, argv)
-        kw['version'] = API_VERSION
         if self.env.interactive:
             self.prompt_interactively(cmd, kw)
+        kw = cmd.split_csv(**kw)
+        kw['version'] = API_VERSION
         self.load_files(cmd, kw)
         try:
             result = self.execute(name, **kw)
diff --git a/ipalib/frontend.py b/ipalib/frontend.py
index da25b4c1aef100cb54d7e248ba4e2ea5dc250cef..47c72d1ab2543e27b143adc3f8a1c64063c0038e 100644
--- a/ipalib/frontend.py
+++ b/ipalib/frontend.py
@@ -557,6 +557,26 @@ class Command(HasParam):
             if name in params:
                 yield(name, params[name])
 
+    def split_csv(self, **kw):
+        """
+        Return a dictionary of values where values are decoded from CSV.
+
+        For example:
+
+        >>> class my_command(Command):
+        ...     takes_options = (
+        ...         Param('flags', multivalue=True, csv=True),
+        ...     )
+        ...
+        >>> c = my_command()
+        >>> c.finalize()
+        >>> c.split_csv(flags=u'public,replicated')
+        {'flags': (u'public', u'replicated')}
+        """
+        return dict(
+            (k, self.params[k].split_csv(v)) for (k, v) in kw.iteritems()
+        )
+
     def normalize(self, **kw):
         """
         Return a dictionary of normalized values.
diff --git a/ipalib/parameters.py b/ipalib/parameters.py
index 94f11d912c010fb644d9683d7630a2d753f06371..ec6020e520fdfb3986d693c805f9f3031fcecd7b 100644
--- a/ipalib/parameters.py
+++ b/ipalib/parameters.py
@@ -345,11 +345,16 @@ class Param(ReadOnly):
               is not `required`. Applied for all crud.Update based commands
             * req_update: The parameter is `required` in all crud.Update based
               commands
-      - hint: This attribute is currently not used
+      - hint: this attribute is currently not used
       - alwaysask: when enabled, CLI asks for parameter value even when the
         parameter is not `required`
       - sortorder: used to sort a list of parameters for Command. See
         `Command.finalize()` for further information
+      - csv: this multivalue attribute is given in CSV format
+      - csv_separator: character that separates values in CSV (comma by
+        default)
+      - csv_skipspace: if true, leading whitespace will be ignored in
+        individual CSV values
     """
 
     # This is a dummy type so that most of the functionality of Param can be
@@ -697,13 +702,47 @@ class Param(ReadOnly):
         # csv.py doesn't do Unicode; encode temporarily as UTF-8:
         csv_reader = csv.reader(self.__utf_8_encoder(unicode_csv_data),
                                 dialect=dialect,
-                                delimiter=self.csv_separator, escapechar='\\',
+                                delimiter=self.csv_separator, quotechar='"',
                                 skipinitialspace=self.csv_skipspace,
                                 **kwargs)
         for row in csv_reader:
             # decode UTF-8 back to Unicode, cell by cell:
             yield [unicode(cell, 'utf-8') for cell in row]
 
+    def split_csv(self, value):
+        """Split CSV strings into individual values.
+
+        For CSV params, ``value`` is a tuple of strings. Each of these is split
+        on commas, and the results are concatenated into one tuple.
+
+        For example::
+
+            >>> param = Param('telephones', multivalue=True, csv=True)
+            >>> param.split_csv((u'1, 2', u'3', u'4, 5, 6'))
+            (u'1', u'2', u'3', u'4', u'5', u'6')
+
+        If ``value`` is not a tuple (or list), it is only split::
+
+            >>> param = Param('telephones', multivalue=True, csv=True)
+            >>> param.split_csv(u'1, 2, 3')
+            (u'1', u'2', u'3')
+
+        For non-CSV params, return the value unchanged.
+        """
+        if self.csv:
+            if type(value) not in (tuple, list):
+                value = (value,)
+            newval = ()
+            for v in value:
+                if isinstance(v, basestring):
+                    csvreader = self.__unicode_csv_reader([unicode(v)])
+                    newval += tuple(csvreader.next()) #pylint: disable=E1101
+                else:
+                    newval += (v,)
+            return newval
+        else:
+            return value
+
     def normalize(self, value):
         """
         Normalize ``value`` using normalizer callback.
@@ -730,15 +769,6 @@ class Param(ReadOnly):
         if self.multivalue:
             if type(value) not in (tuple, list):
                 value = (value,)
-            if self.csv:
-                newval = ()
-                for v in value:
-                    if isinstance(v, basestring):
-                        csvreader = self.__unicode_csv_reader([unicode(v)])
-                        newval += tuple(csvreader.next()) #pylint: disable=E1101
-                    else:
-                        newval += (v,)
-                value = newval
         if self.multivalue:
             return tuple(
                 self._normalize_scalar(v) for v in value
diff --git a/tests/test_ipalib/test_parameters.py b/tests/test_ipalib/test_parameters.py
index ad8d8404496f1631cfdfc2db4e6f849ecdd09afe..83c33ddfce073d3aa90691b62cab30f30ffcee1e 100644
--- a/tests/test_ipalib/test_parameters.py
+++ b/tests/test_ipalib/test_parameters.py
@@ -635,44 +635,44 @@ class test_Param(ClassChecker):
         assert o._convert_scalar.value is default
         assert o.normalizer.value is default
 
-    def test_csv_normalize(self):
+    def test_split_csv(self):
         """
-        Test the `ipalib.parameters.Param.normalize` method with csv.
+        Test the `ipalib.parameters.Param.split_csv` method with csv.
         """
         o = self.cls('my_list+', csv=True)
-        n = o.normalize('a,b')
+        n = o.split_csv('a,b')
         assert type(n) is tuple
         assert len(n) is 2
 
-        n = o.normalize('bar,   "hi, there",foo')
+        n = o.split_csv('bar,   "hi, there",foo')
         assert type(n) is tuple
         assert len(n) is 3
 
-    def test_csv_normalize_separator(self):
+    def test_split_csv_separator(self):
         """
-        Test the `ipalib.parameters.Param.normalize` method with csv and a separator.
+        Test the `ipalib.parameters.Param.split_csv` method with csv and a separator.
         """
         o = self.cls('my_list+', csv=True, csv_separator='|')
 
-        n = o.normalize('a')
+        n = o.split_csv('a')
         assert type(n) is tuple
         assert len(n) is 1
 
-        n = o.normalize('a|b')
+        n = o.split_csv('a|b')
         assert type(n) is tuple
         assert len(n) is 2
 
-    def test_csv_normalize_skipspace(self):
+    def test_split_csv_skipspace(self):
         """
-        Test the `ipalib.parameters.Param.normalize` method with csv without skipping spaces.
+        Test the `ipalib.parameters.Param.split_csv` method with csv without skipping spaces.
         """
         o = self.cls('my_list+', csv=True, csv_skipspace=False)
 
-        n = o.normalize('a')
+        n = o.split_csv('a')
         assert type(n) is tuple
         assert len(n) is 1
 
-        n = o.normalize('a, "b,c", d')
+        n = o.split_csv('a, "b,c", d')
         assert type(n) is tuple
         # the output w/o skipspace is ['a',' "b','c"',' d']
         assert len(n) is 4
diff --git a/tests/test_xmlrpc/test_delegation_plugin.py b/tests/test_xmlrpc/test_delegation_plugin.py
index db5f7186527d2e0c6567dd5a727e878144bd3020..5030f8bc26e869c0553d331c290e481a94514c9e 100644
--- a/tests/test_xmlrpc/test_delegation_plugin.py
+++ b/tests/test_xmlrpc/test_delegation_plugin.py
@@ -87,7 +87,7 @@ class test_delegation(Declarative):
             desc='Create %r' % delegation1,
             command=(
                 'delegation_add', [delegation1], dict(
-                     attrs=u'street,c,l,st,postalCode',
+                     attrs=[u'street', u'c', u'l', u'st', u'postalCode'],
                      permissions=u'write',
                      group=u'editors',
                      memberof=u'admins',
@@ -111,7 +111,7 @@ class test_delegation(Declarative):
             desc='Try to create duplicate %r' % delegation1,
             command=(
                 'delegation_add', [delegation1], dict(
-                     attrs=u'street,c,l,st,postalCode',
+                     attrs=[u'street', u'c', u'l', u'st', u'postalCode'],
                      permissions=u'write',
                      group=u'editors',
                      memberof=u'admins',
diff --git a/tests/test_xmlrpc/test_dns_plugin.py b/tests/test_xmlrpc/test_dns_plugin.py
index e310d31947c71a7e1fc4215e2c069904c1588003..ec6565197880a2e2dafa07315cd1c715c120cfcd 100644
--- a/tests/test_xmlrpc/test_dns_plugin.py
+++ b/tests/test_xmlrpc/test_dns_plugin.py
@@ -724,8 +724,9 @@ class test_dns(Declarative):
 
         dict(
             desc='Add NSEC record to %r using dnsrecord_add' % (dnsres1),
-            command=('dnsrecord_add', [dnszone1, dnsres1], {'nsec_part_next': dnszone1,
-                                                            'nsec_part_types' : ['TXT', 'A']}),
+            command=('dnsrecord_add', [dnszone1, dnsres1], {
+                'nsec_part_next': dnszone1,
+                'nsec_part_types' : [u'TXT', u'A']}),
             expected={
                 'value': dnsres1,
                 'summary': None,
diff --git a/tests/test_xmlrpc/test_privilege_plugin.py b/tests/test_xmlrpc/test_privilege_plugin.py
index 58dbff85bfd824d3d7e45877d99fc10e14871d80..944225585227a677e4915d4ddbada2dedaa5524c 100644
--- a/tests/test_xmlrpc/test_privilege_plugin.py
+++ b/tests/test_xmlrpc/test_privilege_plugin.py
@@ -87,7 +87,7 @@ class test_privilege(Declarative):
             command=(
                 'permission_add', [permission1], dict(
                      type=u'user',
-                     permissions=u'add, delete',
+                     permissions=[u'add', u'delete'],
                 )
             ),
             expected=dict(
diff --git a/tests/test_xmlrpc/test_role_plugin.py b/tests/test_xmlrpc/test_role_plugin.py
index f871e2683a993b4bd65223fb0ae1c20af977e047..6f6de32147e0fe969cfc63ba3ae42bffdf64eb8a 100644
--- a/tests/test_xmlrpc/test_role_plugin.py
+++ b/tests/test_xmlrpc/test_role_plugin.py
@@ -47,9 +47,6 @@ privilege1 = u'r,w privilege 1'
 privilege1_dn = DN(('cn', privilege1), DN(api.env.container_privilege),
                    api.env.basedn)
 
-def escape_comma(value):
-    return value.replace(',', '\\,')
-
 class test_role(Declarative):
 
     cleanup_commands = [
@@ -184,7 +181,7 @@ class test_role(Declarative):
         dict(
             desc='Add privilege %r to role %r' % (privilege1, role1),
             command=('role_add_privilege', [role1],
-                dict(privilege=escape_comma(privilege1))
+                dict(privilege=privilege1)
             ),
             expected=dict(
                 completed=1,
@@ -465,7 +462,7 @@ class test_role(Declarative):
         dict(
             desc='Remove privilege %r from role %r' % (privilege1, role1),
             command=('role_remove_privilege', [role1],
-                dict(privilege=escape_comma(privilege1))
+                dict(privilege=privilege1)
             ),
             expected=dict(
                 completed=1,
@@ -486,7 +483,7 @@ class test_role(Declarative):
         dict(
             desc='Remove privilege %r from role %r again' % (privilege1, role1),
             command=('role_remove_privilege', [role1],
-                dict(privilege=escape_comma(privilege1))
+                dict(privilege=privilege1)
             ),
             expected=dict(
                 completed=0,
diff --git a/tests/test_xmlrpc/test_selfservice_plugin.py b/tests/test_xmlrpc/test_selfservice_plugin.py
index 2ddff50ea1c0a9c6dae3e15e6bb4af35b225878e..d457627c5c5f8318af8c89c606b6ca037e7559c6 100644
--- a/tests/test_xmlrpc/test_selfservice_plugin.py
+++ b/tests/test_xmlrpc/test_selfservice_plugin.py
@@ -74,8 +74,8 @@ class test_selfservice(Declarative):
             desc='Create %r' % selfservice1,
             command=(
                 'selfservice_add', [selfservice1], dict(
-                     attrs=u'street,c,l,st,postalCode',
-                     permissions=u'write',
+                    attrs=[u'street', u'c', u'l', u'st', u'postalcode'],
+                    permissions=u'write',
                 )
             ),
             expected=dict(
@@ -95,8 +95,8 @@ class test_selfservice(Declarative):
             desc='Try to create duplicate %r' % selfservice1,
             command=(
                 'selfservice_add', [selfservice1], dict(
-                     attrs=u'street,c,l,st,postalCode',
-                     permissions=u'write',
+                    attrs=[u'street', u'c', u'l', u'st', u'postalcode'],
+                    permissions=u'write',
                 ),
             ),
             expected=errors.DuplicateEntry(),
-- 
1.7.7.6

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

Re: [Freeipa-devel] [PATCH] 0015 Only split CSV strings once

Reply via email to