Dne 27.5.2015 v 07:39 Jan Cholasta napsal(a):
Dne 27.5.2015 v 02:38 Endi Sukma Dewata napsal(a):
Please take a look at the attached patch to add vault-archive/retrieve
commands.

On 4/20/2015 1:12 AM, Jan Cholasta wrote:
16) You do way too much stuff in vault_add.forward(). Only code that
must be done on the client needs to be there, i.e. handling of the
"data", "text" and "in" options.

The vault_archive call must be in vault_add.execute(), otherwise
a) we
will be making 2 RPC calls from the client and b) it won't be
called at
all when api.env.in_server is True.

This is done by design. The vault_add.forward() generates the salt
and
the keys. The vault_archive.forward() will encrypt the data. These
operations have to be done on the client side to secure the
transport of
the data from the client through the server and finally to KRA. This
mechanism prevents the server from looking at the unencrypted data.

OK, but that does not justify that it's broken in server-side API. It
can and should be done so that it works the same way on both client
and
server.

I think the best solution would be to split the command into two
commands, server-side vault_archive_raw to archive already encrypted
data, and client-side vault_archive to encrypt data and archive them
with vault_archive_raw in its .execute(). Same thing for
vault_retrieve.

Actually I think it's better to just merge the add and archive,
reducing
the number of RPC calls. The vault_archive now will support two
types of
operations:

(a) Archive data into a new vault (it will create the vault just before
archiving the data):

   $ ipa vault-archive testvault --create --in data ...

(b) Archive data into an existing vault:

   $ ipa vault-archive testvault --in data ...

The vault_add is now just a wrapper for the vault_archive(a).

If that's just an implementation detail, OK.

If it's possible to modify existing vault objects using vault-add or
create new objects using vault-archive, then NACK.

The vault-archive is now completely separate from vault-add. You can no
longer archive data while creating a vault:

   $ ipa vault-add test
   $ ipa vault-archive test --in secret.bin

OK.


BTW, I also think it would be better if there were 2 separate sets of
commands for binary and textual data
(vault_{archive,retrieve}_{data,text}) rather than trying to handle
everything in vault_{archive,retrieve}.

I don't think we want to provide a separate command of every possible
data type & operation combination. Users would get confused. The
archive
& retrieve commands should be able to handle all current & future data
types with options.

A command with two sets of mutually exclusive options is really two
commands in disguise, which is a good sign it should be divided into two
actual commands.

Who are you to say users would get confused? I say they would be more
confused by a command with a plethora of mutually exclusive "options".

What other possible data types are there?

The patch has been simplified to support only binary data. The data can
be specified using either of these options:

   $ ipa vault-archive test --data <base-64 encoded data>
   $ ipa vault-archive test --in <input file>

I don't think we want to provide two separate commands for these options
although they are mutually exclusive, do we?

No, these are not really 2 different options, but rather 2 forms of the
same option, which for the lack of better support for files in the
framework have to be done as 2 options.


The add & archive combination was added for convenience, not for
optimization. This way you would be able to archive data into a new
vault using a single command. Without this, you'd have to execute two
separate commands: add & archive, which will result in 2 RPC calls
anyway.

I think I would prefer if it was separate, as that would be consistent
with other plugins (e.g. for objects with members, we don't allow
adding
members directly in -add, you have to use -add-member after -add).

The vault data is similar to group description, not group members. When
creating a group we can supply the description. If not specified it
will
be blank. Archiving vault data is similar to updating the group
description.

It's similar to group members because there are separate commands to
manipulate them.

Just because there are separate commands doesn't mean vault data
(single-valued) is similar to group members (multi-valued). It uses
separate commands because of the encryption steps involved while
archiving/retrieving data.

That was not the point, but whatever.


You have to choose one of:

   a) vault data is settable using vault-add and vault-mod and gettable
using vault-mod, vault-show and vault-find

   b) vault data is settable using vault-archive and gettable using
vault-retrieve

Anything in between is not permitted.

As mentioned above, the add and archive commands are now separate.

21) vault_archive is not a retrieve operation, it should be based on
LDAPUpdate instead of LDAPRetrieve. Or Command actually, since it
does
not do anything with LDAP. The same applies to vault_retrieve.

The vault_archive does not actually modify the LDAP entry because it
stores the data in KRA. It is actually an LDAPRetrieve operation
because
it needs to get the vault info before it can perform the archival
operation. Same thing with vault_retrieve.

It is not a LDAPRetrieve operation, because it has different
semantics.
Please use Command as base class and either use ldap2 for direct
LDAP or
call vault_show instead of hacking around LDAPRetrieve.

It's been changed to inherit from LDAPQuery instead.

NACK, it's not a LDAPQuery operation, because it has different
semantics. There is more to a command than executing code, so you should
use a correct base class.

Changed to inherit from Command as requested. Now these commands no
longer have a direct access to the vault object (self.obj) although they
are accessing vault objects like other vault commands. Also now the
vault name argument has to be added explicitly on each command.

You can inherit from crud.Retrieve and crud.Update to get self.obj and
the argument back.


There are existing commands that inherit from LDAPQuery while doing
other things too, so the 'semantic' seems to be kind of arbitrarily
defined and subjective.

There is a lot of existing stuff in IPA that is bad in one way or
another, but it doesn't mean new code should be bad as well.


22) vault_archive will break with binary data that is not UTF-8
encoded
text.

This is where it occurs:

+        vault_data[u'data'] = unicode(data)

Generally, don't use unicode() on str values and str() on unicode
values
directly, always use .decode() and .encode().

The unicode(s, encoding) is actually equivalent to s.decode(encoding),
so the following code will not solve the problem:

   vault_data[u'data'] = data.decode()

As you said, decode() will only work if the data being decoded actually
follows the encoding rules (i.e. already UTF-8 encoded).

It needs to be a Unicode because json.dumps() doesn't work with
binary
data. Fixed by adding base-64 encoding.

The base-64 encoding is necessary to convert random binaries into ASCII
so it can be decoded into Unicode. Here is the current code:

   vault_data[u'data'] = unicode(base64.b64encode(data))

which is equivalent to:

   vault_data[u'data'] = base64.b64encode(data).decode()

If you read a little bit further, you would get to the point, which is
certainly not calling .decode() without arguments, but *always
explicitly specify the encoding*.

Added the explicit encoding name although it's not necessary since the
data being encoded/decoded is base-64 encoded (i.e. ASCII).

It may not be necessary but it doesn't hurt either, anyway my main
concern was with the other uses of unicode() in the original patch.


If something str needs to be unicode, you should use .decode() to
explicitly specify the encoding, instead of relying on unicode() to
pick
the correct one.

Since we know this is ASCII data we can now specify UTF-8 encoding.

   vault_data[u'data'] = base64.b64encode(data).decode('utf-8')

But for anything that comes from user input (e.g. filenames,
passwords),
it's better to use the default encoding because that can be configured
by the user.

You are confusing user's configured encoding with Python's default
encoding. Default encoding in Python isn't derived from user's
localization settings.

Anyway, anything that comes from user input is already decoded using
user's configured encoding when it enters the framework so I don't know
why are you even bringing it up here.

It's irrelevant now that the command only supports binary data.

OK.


Anyway, I think a better solution than base64 would be to use the
"raw_unicode_escape" encoding:

As explained above, base-64 encoding is necessary because random
binaries don't follow any encoding rules. I'd rather not use
raw_unicode_escape because it's not really a text data.

The result of decoding binary data using raw_unicode_escape is perfectly
valid unicode data which doesn't eat 33% more space like base64 encoded
binary does, hence my suggestion.

Anyway, it turns out when encoded in JSON, raw_unicode_escape string
generally takes more space than base64 encoded string because of JSON
escaping, so base64 is indeed better.

Unchanged. It still uses base-64 encoding.

Right.


Here's how it's
now implemented:

     if data:
         data = data.decode('raw_unicode_escape')

Input data is already in binaries, no conversion needed.

     elif text:
         data = text

Input text will be converted to binaries with default encoding:

   data = text.encode()

See what the default encoding actually is and why you shouldn't rely on
it above.

Irrelevant now.

     elif input_file:
         with open(input_file, 'rb') as f:
             data = f.read()
         data = data.decode('raw_unicode_escape')

Input contains binary data, no conversion needed.

     else:
         data = u''

If not specified, the data will be empty string:

   data = ''

The data needs to be converted into binaries so it can be encrypted
before transport (depending on the vault type):

   data = self.obj.encrypt(data, ...)

     vault_data[u'data'] = data

Then for transport the data is base-64 encoded first, then converted
into Unicode:

   vault_data[u'data'] = base64.b64encode(data).decode('utf-8')


If KRA is not installed, vault-archive and vault-retrieve fail with
internal error.

On a related note, since KRA is optional, can we move the vaults container to cn=kra,cn=vaults? This is the convetion used by the other optional components (DNS and recently CA).



The commands still behave differently based on whether they were called
from API which was initialized with in_server set to True or False.


There is no point in exposing the session_key, nonce and vault_data
options in CLI when their value is always overwritten in forward().


Will this always succeed?

+        # deactivate vault record in KRA
+        response = kra_client.keys.list_keys(
+            client_key_id, pki.key.KeyClient.KEY_STATUS_ACTIVE)
+
+        for key_info in response.key_infos:
+            kra_client.keys.modify_key_status(
+                key_info.get_key_id(),
+                pki.key.KeyClient.KEY_STATUS_INACTIVE)

If not, we might get into an inconsistent state, where the vault is
deleted in LDAP but still active in KRA. (I'm not sure if this is
actually a problem or not.)



--
Jan Cholasta

--
Manage your subscription for the Freeipa-devel mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-devel
Contribute to FreeIPA: http://www.freeipa.org/page/Contribute/Code

Reply via email to