Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-24 Thread Petr Viktorin

On 07/23/2012 10:46 PM, John Dennis wrote:
[...]


The only thing holding up the ACK is the question of why po-files now
has update_pot as a dependency.



If files simply depend on $(DOMAIN).pot, then they are considered 
up-to-date even after they're changed (e.g. with strip-po). They need to 
depend on a rule that always runs so that they get merged.


There's another alternative to achieve this: adding them to .PHONY. The 
attached version does that, perhaps it's cleaner.



--
Petr³


From 87d94d673a7647ffe508a11c985e76f575180971 Mon Sep 17 00:00:00 2001
From: Petr Viktorin pvikt...@redhat.com
Date: Wed, 20 Jun 2012 06:38:16 -0400
Subject: [PATCH] Arrange stripping .po files

The .po files we use for translations have two shortcomings when used in Git:
- They include file locations, which change each time the source is updated.
  This results in large, unreadable diffs that don't merge well.
- They include source strings for untranslated messages, wasting space
  unnecessarily.

Update the Makefile so that the extraneous information is stripped when the
files are updated or pulled form Transifex, and empty translation files are
removed entirely.
Also, translations are normalized to a common style. This should help diffs
and merges.

The validator requires file location comments to identify the programming
language, and to produce good error reports.
To make this work, merge the comments in before validation.

First patch for: https://fedorahosted.org/freeipa/ticket/2435
---
 install/configure.ac   |5 +
 install/po/Makefile.in |   22 --
 install/po/README  |   16 ++--
 tests/i18n.py  |   12 ++--
 4 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/install/configure.ac b/install/configure.ac
index 827ddbab411a4aa8abbdd4488e217ce67046bd6b..9e781a684429191b3c5eb46aed4fceecc9be6586 100644
--- a/install/configure.ac
+++ b/install/configure.ac
@@ -48,6 +48,11 @@ if test x$MSGCMP = xno; then
 AC_MSG_ERROR([msgcmp not found, install gettext])
 fi
 
+AC_PATH_PROG(MSGATTRIB, msgattrib, [no])
+if test x$MSGATTRIB = xno; then
+AC_MSG_ERROR([msgattrib not found, install gettext])
+fi
+
 AC_PATH_PROG(TX, tx, [/usr/bin/tx])
 
 AC_ARG_WITH([gettext_domain],
diff --git a/install/po/Makefile.in b/install/po/Makefile.in
index 9a3dde78a20a6beb35ab08230331f28b7ea3161d..bc91a933b9e10e4178cb4190e62140549da06591 100644
--- a/install/po/Makefile.in
+++ b/install/po/Makefile.in
@@ -14,6 +14,7 @@ MSGFMT = @MSGFMT@
 MSGINIT = @MSGINIT@
 MSGMERGE = @MSGMERGE@
 MSGCMP = @MSGCMP@
+MSGATTRIB = @MSGATTRIB@
 TX = @TX@
 IPA_TEST_I18N = ../../tests/i18n.py
 
@@ -67,7 +68,7 @@ C_POTFILES = $(C_FILES) $(H_FILES)
 
 .SUFFIXES:
 .SUFFIXES: .po .mo
-.PHONY: all create-po update-po update-pot install mostlyclean clean distclean test mo-files debug
+.PHONY: all create-po update-po update-pot install mostlyclean clean distclean test mo-files debug strip-po merge-po $(po_files)
 
 all:
 
@@ -86,6 +87,19 @@ $(po_files): $(DOMAIN).pot
 	echo Merging $(DOMAIN).pot into $@; \
 	$(MSGMERGE) --no-fuzzy-matching -o $@ $@ $(DOMAIN).pot
 
+strip-po:
+	@for po_file in $(po_files); do \
+		echo Stripping $$po_file; \
+		$(MSGATTRIB) --translated --no-fuzzy --no-location $$po_file  $$po_file.tmp; \
+		mv $$po_file.tmp $$po_file; \
+	done
+	@export FILES_TO_REMOVE=`find . -name '*.po' -empty`; \
+	if [ $$FILES_TO_REMOVE !=  ]; then \
+		echo Removing empty translation files; \
+		rm -v $$FILES_TO_REMOVE; \
+		echo; echo Please remove the deleted files from LINGUAS!; echo; \
+	fi
+
 create-po: $(DOMAIN).pot
 	@for po_file in $(po_files); do \
 	if [ ! -e $$po_file ]; then \
@@ -98,10 +112,14 @@ create-po: $(DOMAIN).pot
 
 pull-po:
 	cd ../..; $(TX) pull -f
+	$(MAKE) strip-po
 
-update-po: update-pot
+merge-po: update-pot
 	$(MAKE) $(po_files)
 
+update-po: merge-po
+	$(MAKE) strip-po
+
 update-pot:
 	@rm -f $(DOMAIN).pot.update
 	@pushd ../.. ; \
diff --git a/install/po/README b/install/po/README
index ada7df40e3f294b204a5d44c267ee57ebe734042..6894a06337fac68675cb1a852ca828c54da74f96 100644
--- a/install/po/README
+++ b/install/po/README
@@ -6,28 +6,40 @@ A: Edit Makefile.in and add the source file to the appropriate *_POTFILES list.
NOTE: Now this i only necessary for python files that lack the .py
  extension. All .py, .c and .h files are automatically sourced.
 
+Q: Untranslated strings and file locations are missing from my .po file.
+   How do I add them?
+
+A: make merge-po
+   Untranslated strings are left out of the files in SCM. The merge-po command
+   runs msgmerge to add them again.
+
 Q: How do I pick up new strings to translate from the source files after the
source have been modified?
 
-A: make update-po
+A: make merge-po
This regenerates the pot template file by scanning all the source files.
Then the new strings are merged into each .po file from the new pot file.
 
 Q: How do I just regenerate the pot template file without 

[Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

2012-07-24 Thread Alexander Bokovoy

Hi,

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an
integer value.  Instead, we can take time and attribute name. To get
reasonable 'randomness' these values are then hashed with sha1 and use
the resulting string as task name.

SHA1 may technically be an overkill here as we could simply use

  indextask_$date_$attribute

where $date is a value of time.time() but SHA1 gives a resonable
'randomness' into the string.

I was hit by this issue today, see https://fedorahosted.org/freeipa/ticket/2942

--
/ Alexander Bokovoy
From 60ec577e92ac9d2eabb78b27a877fa2dbd96741d Mon Sep 17 00:00:00 2001
From: Alexander Bokovoy aboko...@redhat.com
Date: Tue, 24 Jul 2012 12:07:23 +0300
Subject: [PATCH 3/3] Rework task naming in LDAP updates to avoid conflicting
 names in certain cases

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an integer value.
Instead, we can take time and attribute name. To get reasonable 'randomness'
these values are then hashed with sha1 and use the resulting string as task 
name.

https://fedorahosted.org/freeipa/ticket/2942
---
 ipaserver/install/ldapupdate.py |   18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/ipaserver/install/ldapupdate.py b/ipaserver/install/ldapupdate.py
index 
c64139889d9f84866ac0cd358ed3a3a7d95af7dc..a28ffb8c8fa86d4989db282ea60721db6d0a7c9a
 100644
--- a/ipaserver/install/ldapupdate.py
+++ b/ipaserver/install/ldapupdate.py
@@ -29,9 +29,11 @@ from ipaserver.install import installutils
 from ipaserver.install import service
 from ipaserver import ipaldap
 from ipapython import entity, ipautil
+from ipapython.compat import sha1
 from ipalib import util
 from ipalib import errors
 from ipalib import api
+from ipalib.dn import DN
 import ldap
 from ldap.dn import escape_dn_chars
 from ipapython.ipa_log_manager import *
@@ -328,16 +330,14 @@ class LDAPUpdate:
 if self.live_run:
 time.sleep(5)
 
-r = random.SystemRandom()
+tasktime = str(time.time())
+self.sub_dict['IDXHASH'] = sha1('_'.join(
+  [tasktime, attribute, tasktime, attribute ])).hexdigest()
 
-# Refresh the time to make uniqueness more probable. Add on some
-# randomness for good measure.
-self.sub_dict['TIME'] = int(time.time()) + r.randint(0,1)
+cn = self._template_str(indextask_$IDXHASH)
+dn = DN(('cn', cn), ('cn', 'index'), ('cn', 'tasks'), ('cn', 'config'))
 
-cn = self._template_str(indextask_$TIME)
-dn = cn=%s, cn=index, cn=tasks, cn=config % cn
-
-e = ipaldap.Entry(dn)
+e = ipaldap.Entry(str(dn))
 
 e.setValues('objectClass', ['top', 'extensibleObject'])
 e.setValue('cn', cn)
@@ -345,7 +345,7 @@ class LDAPUpdate:
 e.setValues('nsIndexAttribute', attribute)
 
 root_logger.info(Creating task to index attribute: %s, attribute)
-root_logger.debug(Task id: %s, dn)
+root_logger.debug(Task id: %s, str(dn))
 
 if self.live_run:
 self.conn.addEntry(e)
-- 
1.7.10.4

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-24 Thread Petr Viktorin

On 07/24/2012 01:12 AM, John Dennis wrote:

On 07/23/2012 06:27 AM, Petr Viktorin wrote:

As a translator (for another project), I don't like Transifex and prefer
to send good old Git pull requests. I understand a traditional
workflow is hard to coordinate with others that use Transifex, but still
I'd hate it if we became dependent on Tx.


For better or worse we are dependent on TX (Transifex). Fedora has
adopted TX as it's translation tool, RHEL's translation tools integrate
with TX (as well as other translation portals). And SSSD and IPA have
made a a commitment to TX based on the direction of Fedora and RHEL.

Given that we've adopted TX I don't see the value in maintaining tools
that support both TX and non-TX workflows. I'd rather see us delete the
non-TX elements. If we have just one workflow it's easier to understand
and maintain the code. If we ever decide we need to go back to a non-TX
workflow we can always retrieve the deleted code from git.



This means you have to be a member of a Fedora translation team to 
translate. It makes it harder for people to fork the project. A workflow 
with a mandatory central repository makes it impossible to experiment 
locally.
I'm all for having a standard way to receive contributions, but limiting 
how people can create those contributions isn't good.


I'm all for deleting unused code, but here I think it would be a bad move.


--
Petr³


___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

Re: [Freeipa-devel] [PATCH] 1033 renew CA subsystem certificates

2012-07-24 Thread Petr Viktorin

On 07/23/2012 10:03 PM, Rob Crittenden wrote:

Rob Crittenden wrote:

Andrew Wnuk wrote:

On 07/16/2012 01:35 PM, Rob Crittenden wrote:

Nalin Dahyabhai wrote:

On Mon, Jul 16, 2012 at 09:23:24AM -0400, Rob Crittenden wrote:

Use the new certmonger capability to be able to renew the dogtag
subsystem certificates (audit, OCSP, etc).


Are the copies of the certificates in the pki-ca CS.cfg file being
updated elsewhere?  Or is it not turning out to be a problem if they
aren't?


I didn't test validating OCSP signatures but the audit subsystem
seemed fine (it complained wildly when I had the wrong trust in the
NSS db).

Andrew, do I need to update CS.cfg as well?


Yes, you may need update CS.cfg too.


Ok, added a bit to update CS.cfg with the new certificate.


This should fix some SELinux issues preventing certmonger from
monitoring the dogtag certificate database in /var/lib/pki-ca/alias.

rob


I don't know enough about dogtag/certmonger to comment on the 
functionality, but there are minor issues I can find. Attaching a patch 
to fix them.



`make rpms` fails:

rpmbuild --define _topdir /rpmbuild -ba freeipa.spec
error: %changelog not in descending chronological order
make: *** [rpms] Error 1



`git am` complains:

Applying: Use certmonger to renew CA subsystem certificates
/home/pviktori/freeipa/.git/rebase-apply/patch:576: new blank line at EOF.
+
/home/pviktori/freeipa/.git/rebase-apply/patch:645: new blank line at EOF.
+
warning: 2 lines add whitespace errors.


--
Petr³
From 047a1f7dc78c632b7f9882ab21f1fe5dc82fb006 Mon Sep 17 00:00:00 2001
From: Petr Viktorin pvikt...@redhat.com
Date: Tue, 24 Jul 2012 04:43:28 -0400
Subject: [PATCH] fixes for rcrit-1033-03

---
 freeipa.spec.in|2 +-
 install/share/default-aci.ldif |1 -
 install/updates/21-ca_renewal_container.update |1 -
 3 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/freeipa.spec.in b/freeipa.spec.in
index ce6c21aa3d9a6d92f6125e36905df5d3cf7b1a74..002a70a4385a502edc0c99b9b56ebbe02ef392ad 100644
--- a/freeipa.spec.in
+++ b/freeipa.spec.in
@@ -745,7 +745,7 @@ fi
 %ghost %attr(0644,root,apache) %config(noreplace) %{_sysconfdir}/ipa/ca.crt
 
 %changelog
-* Fri Jul  6 2012 Rob Crittenden rcrit...@redhat.com - 2.99.0-39
+* Tue Jul 24 2012 Rob Crittenden rcrit...@redhat.com - 2.99.0-39
 - Set minimum certmonger to 0.58 for dogtag cert renewal
 
 * Wed Jul 18 2012 Alexander Bokovoy aboko...@redhat.com - 2.99.0-38
diff --git a/install/share/default-aci.ldif b/install/share/default-aci.ldif
index 2fc04f667d3abe3a831c7d116c130c903f8e5106..6199ae5a68f648ca4e9fa6ded8083e5dcc07cb78 100644
--- a/install/share/default-aci.ldif
+++ b/install/share/default-aci.ldif
@@ -96,4 +96,3 @@ dn: cn=ipa,cn=etc,$SUFFIX
 changetype: modify
 add: aci
 aci: (target=ldap:///cn=*,cn=ca_renewal,cn=ipa,cn=etc,$SUFFIX;)(targetattr=userCertificate)(version 3.0; acl Modify CA Certificates for renewals; allow(write) userdn = host/$FQDN@$REALM;)
-
diff --git a/install/updates/21-ca_renewal_container.update b/install/updates/21-ca_renewal_container.update
index edb8f3e37bf8f6dee191782b8b2519f198fb3cd1..50b92d73d8af75cbc782769c45b6c439b07bbb2d 100644
--- a/install/updates/21-ca_renewal_container.update
+++ b/install/updates/21-ca_renewal_container.update
@@ -6,4 +6,3 @@ dn: cn=ca_renewal,cn=ipa,cn=etc,$SUFFIX
 add:objectClass: top
 add:objectClass: nsContainer
 add:cn: ca_renewal
-
-- 
1.7.10.4

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

Re: [Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

2012-07-24 Thread Petr Viktorin

On 07/24/2012 12:01 PM, Alexander Bokovoy wrote:

Hi,

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an
integer value.  Instead, we can take time and attribute name. To get
reasonable 'randomness' these values are then hashed with sha1 and use
the resulting string as task name.

SHA1 may technically be an overkill here as we could simply use

   indextask_$date_$attribute

where $date is a value of time.time() but SHA1 gives a resonable
'randomness' into the string.


What kind of randomness do you mean? SHA1 is deterministic, it doesn't 
add any randomness at all. It just obscures what's really happening.

Same with repeating [tasktime, attribute] two times.



 -root_logger.debug(Task id: %s, dn)
 +root_logger.debug(Task id: %s, str(dn))
This change is unnecessary; the %s means convert to str.




I was hit by this issue today, see
https://fedorahosted.org/freeipa/ticket/2942





--
Petr³


___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

2012-07-24 Thread Alexander Bokovoy

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 12:01 PM, Alexander Bokovoy wrote:

Hi,

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an
integer value.  Instead, we can take time and attribute name. To get
reasonable 'randomness' these values are then hashed with sha1 and use
the resulting string as task name.

SHA1 may technically be an overkill here as we could simply use

  indextask_$date_$attribute

where $date is a value of time.time() but SHA1 gives a resonable
'randomness' into the string.


What kind of randomness do you mean? SHA1 is deterministic, it 
doesn't add any randomness at all. It just obscures what's really 
happening.

Hence using quotes to describe it. We don't need randomness in the task
names, we need something that avoids collisions.

An issue here is in time.time() -- it may give us sub-second resolution
if underlying OS supports it, it may not. Having a second-level
resolution is not enough, especially on fast machines, so we can't
simply use int(times.time()) as it was in the original version.

indextask_$date_$attribute has this issue that we don't have enough
guarantee for $date (time.time()) to be unique in sufficiently tight
conditions, thus use of SHA-1 to generate something that has better
chances to avoid collisions than $data_$attribute.


Same with repeating [tasktime, attribute] two times.

This can be reduced as SHA-1 output does not depend on size of the
input message.


--
/ Alexander Bokovoy

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

2012-07-24 Thread Petr Viktorin

On 07/24/2012 02:06 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 12:01 PM, Alexander Bokovoy wrote:

Hi,

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an
integer value.  Instead, we can take time and attribute name. To get
reasonable 'randomness' these values are then hashed with sha1 and use
the resulting string as task name.

SHA1 may technically be an overkill here as we could simply use

  indextask_$date_$attribute

where $date is a value of time.time() but SHA1 gives a resonable
'randomness' into the string.


What kind of randomness do you mean? SHA1 is deterministic, it doesn't
add any randomness at all. It just obscures what's really happening.

Hence using quotes to describe it. We don't need randomness in the task
names, we need something that avoids collisions.

An issue here is in time.time() -- it may give us sub-second resolution
if underlying OS supports it, it may not. Having a second-level
resolution is not enough, especially on fast machines, so we can't
simply use int(times.time()) as it was in the original version.

indextask_$date_$attribute has this issue that we don't have enough
guarantee for $date (time.time()) to be unique in sufficiently tight
conditions, thus use of SHA-1 to generate something that has better
chances to avoid collisions than $data_$attribute.


My point is that if indextask_$date_$attribute is not unique, neither 
is SHA1(indextask_$date_$attribute). Hashing has no effect on the 
chance of collisions.


You could use Python's pseudorandom number generator (random.randint) 
instead of random.SystemRandom. It's not cryptographically secure but 
it's enough to avoid collisions, and it doesn't use up system entropy 
(except for initial seeding, through `import random`).

indextask_$date_$attribute_$pseudorandomvalue should be unique enough.


Same with repeating [tasktime, attribute] two times.

This can be reduced as SHA-1 output does not depend on size of the
input message.



--
Petr³


___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

2012-07-24 Thread Alexander Bokovoy

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 02:06 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 12:01 PM, Alexander Bokovoy wrote:

Hi,

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an
integer value.  Instead, we can take time and attribute name. To get
reasonable 'randomness' these values are then hashed with sha1 and use
the resulting string as task name.

SHA1 may technically be an overkill here as we could simply use

 indextask_$date_$attribute

where $date is a value of time.time() but SHA1 gives a resonable
'randomness' into the string.


What kind of randomness do you mean? SHA1 is deterministic, it doesn't
add any randomness at all. It just obscures what's really happening.

Hence using quotes to describe it. We don't need randomness in the task
names, we need something that avoids collisions.

An issue here is in time.time() -- it may give us sub-second resolution
if underlying OS supports it, it may not. Having a second-level
resolution is not enough, especially on fast machines, so we can't
simply use int(times.time()) as it was in the original version.

indextask_$date_$attribute has this issue that we don't have enough
guarantee for $date (time.time()) to be unique in sufficiently tight
conditions, thus use of SHA-1 to generate something that has better
chances to avoid collisions than $data_$attribute.


My point is that if indextask_$date_$attribute is not unique, 
neither is SHA1(indextask_$date_$attribute). Hashing has no effect 
on the chance of collisions.


You could use Python's pseudorandom number generator (random.randint) 
instead of random.SystemRandom. It's not cryptographically secure but 
it's enough to avoid collisions, and it doesn't use up system entropy 
(except for initial seeding, through `import random`).

indextask_$date_$attribute_$pseudorandomvalue should be unique enough.

Using random here is really bad. What we ideally need is a method to
increment sequential calls for the same attribute. We use seconds to 
differentiate
between all tasks but that is not really required, tasks that were
processed will be removed.

--
/ Alexander Bokovoy

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

2012-07-24 Thread Petr Viktorin

On 07/24/2012 02:49 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 02:06 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 12:01 PM, Alexander Bokovoy wrote:

Hi,

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an
integer value.  Instead, we can take time and attribute name. To get
reasonable 'randomness' these values are then hashed with sha1 and use
the resulting string as task name.

SHA1 may technically be an overkill here as we could simply use

 indextask_$date_$attribute

where $date is a value of time.time() but SHA1 gives a resonable
'randomness' into the string.


What kind of randomness do you mean? SHA1 is deterministic, it doesn't
add any randomness at all. It just obscures what's really happening.

Hence using quotes to describe it. We don't need randomness in the task
names, we need something that avoids collisions.

An issue here is in time.time() -- it may give us sub-second resolution
if underlying OS supports it, it may not. Having a second-level
resolution is not enough, especially on fast machines, so we can't
simply use int(times.time()) as it was in the original version.

indextask_$date_$attribute has this issue that we don't have enough
guarantee for $date (time.time()) to be unique in sufficiently tight
conditions, thus use of SHA-1 to generate something that has better
chances to avoid collisions than $data_$attribute.


My point is that if indextask_$date_$attribute is not unique,
neither is SHA1(indextask_$date_$attribute). Hashing has no effect
on the chance of collisions.

You could use Python's pseudorandom number generator (random.randint)
instead of random.SystemRandom. It's not cryptographically secure but
it's enough to avoid collisions, and it doesn't use up system entropy
(except for initial seeding, through `import random`).
indextask_$date_$attribute_$pseudorandomvalue should be unique enough.

Using random here is really bad.
What we ideally need is a method to increment sequential calls for
the same attribute.We use seconds to differentiate
between all tasks but that is not really required, tasks that were
processed will be removed.



Or maybe use $date_$attribute and just warn (ignore the error) if 
there's a duplicate -- if a reindex task for the same attribute already 
exists from the same second, do we really need to start a new one?


--
Petr³


___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

2012-07-24 Thread Rob Crittenden

Petr Viktorin wrote:

On 07/24/2012 02:49 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 02:06 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 12:01 PM, Alexander Bokovoy wrote:

Hi,

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an
integer value.  Instead, we can take time and attribute name. To get
reasonable 'randomness' these values are then hashed with sha1 and
use
the resulting string as task name.

SHA1 may technically be an overkill here as we could simply use

 indextask_$date_$attribute

where $date is a value of time.time() but SHA1 gives a resonable
'randomness' into the string.


What kind of randomness do you mean? SHA1 is deterministic, it doesn't
add any randomness at all. It just obscures what's really happening.

Hence using quotes to describe it. We don't need randomness in the task
names, we need something that avoids collisions.

An issue here is in time.time() -- it may give us sub-second resolution
if underlying OS supports it, it may not. Having a second-level
resolution is not enough, especially on fast machines, so we can't
simply use int(times.time()) as it was in the original version.

indextask_$date_$attribute has this issue that we don't have enough
guarantee for $date (time.time()) to be unique in sufficiently tight
conditions, thus use of SHA-1 to generate something that has better
chances to avoid collisions than $data_$attribute.


My point is that if indextask_$date_$attribute is not unique,
neither is SHA1(indextask_$date_$attribute). Hashing has no effect
on the chance of collisions.

You could use Python's pseudorandom number generator (random.randint)
instead of random.SystemRandom. It's not cryptographically secure but
it's enough to avoid collisions, and it doesn't use up system entropy
(except for initial seeding, through `import random`).
indextask_$date_$attribute_$pseudorandomvalue should be unique enough.

Using random here is really bad.
What we ideally need is a method to increment sequential calls for
the same attribute.We use seconds to differentiate
between all tasks but that is not really required, tasks that were
processed will be removed.



Or maybe use $date_$attribute and just warn (ignore the error) if
there's a duplicate -- if a reindex task for the same attribute already
exists from the same second, do we really need to start a new one?



That is true.

We can generate a UUID, I think that is probably a better/safer thing to 
use overall.


rob

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


[Freeipa-devel] [PATCH] 0071 Create /etc/sysconfig/network if it doesn't exist

2012-07-24 Thread Petr Viktorin
When --hostname is given, client installation tries to write to 
/etc/sysconfig/network, which is not guaranteed to exist. Create it if 
it doesn't exist and we need to write to it.


https://fedorahosted.org/freeipa/ticket/2840

--
Petr³


From 826700fe8a695f3dffdd83398cebd3526f0a1640 Mon Sep 17 00:00:00 2001
From: Petr Viktorin pvikt...@redhat.com
Date: Thu, 19 Jul 2012 09:07:23 -0400
Subject: [PATCH] Create /etc/sysconfig/network if it doesn't exist

When the --hostname option is given to ipa-client-install, we
write HOSTNAME to /etc/sysconfig/network. When that file didn't exist,
the installer crashed.

Create the file if it doesn't exist and we need to write to it.

https://fedorahosted.org/freeipa/ticket/2840
---
 ipapython/ipautil.py |4 +++-
 ipapython/platform/redhat.py |   16 +---
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/ipapython/ipautil.py b/ipapython/ipautil.py
index 22c8e2937a731252ffcac77c9bd7e0bf636e61ff..d4cbcb3b437e26521f5abd50c0ba3c23e0b5f24f 100644
--- a/ipapython/ipautil.py
+++ b/ipapython/ipautil.py
@@ -990,13 +990,15 @@ def add_options(config, replacevars, appendvars, oldvars):
 
 return old_values
 
-def backup_config_and_replace_variables(fstore, filepath, replacevars=dict(), appendvars=dict()):
+def backup_config_and_replace_variables(
+fstore, filepath, replacevars=dict(), appendvars=dict()):
 
 Take a key=value based configuration file, back up it, and
 write new version with certain values replaced or appended
 
 All (key,value) pairs from replacevars and appendvars that
 were not found in the configuration file, will be added there.
+The file must exist before this function is called.
 
 It is responsibility of a caller to ensure that replacevars and
 appendvars do not overlap.
diff --git a/ipapython/platform/redhat.py b/ipapython/platform/redhat.py
index d3c23ab0debbcfa58eefa9607fe5cac8fcbf592b..3f35cfcc9607bd0c9a05659d936a9903ec198f1b 100644
--- a/ipapython/platform/redhat.py
+++ b/ipapython/platform/redhat.py
@@ -24,6 +24,8 @@
 import stat
 import sys
 import socket
+import stat
+
 from ipapython import ipautil
 from ipapython.platform import base
 from ipalib import api
@@ -187,10 +189,18 @@ def backup_and_replace_hostname(fstore, statestore, hostname):
 except ipautil.CalledProcessError, e:
 print sys.stderr, Failed to set this machine hostname to %s (%s). % (hostname, str(e))
 replacevars = {'HOSTNAME':hostname}
-old_values = ipautil.backup_config_and_replace_variables(fstore,
-  /etc/sysconfig/network,
-  replacevars=replacevars)
+
+filepath = '/etc/sysconfig/network'
+if not os.path.exists(filepath):
+# file doesn't exist; create it with correct ownership  mode
+open(filepath, 'a').close()
+os.chmod(filepath,
+stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH)
+os.chown(filepath, 0, 0)
+old_values = ipautil.backup_config_and_replace_variables(
+fstore, filepath, replacevars=replacevars)
 restore_context(/etc/sysconfig/network)
+
 if 'HOSTNAME' in old_values:
 statestore.backup_state('network', 'hostname', old_values['HOSTNAME'])
 else:
-- 
1.7.10.4

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

Re: [Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

2012-07-24 Thread Alexander Bokovoy

On Tue, 24 Jul 2012, Rob Crittenden wrote:

Petr Viktorin wrote:

On 07/24/2012 02:49 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 02:06 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 12:01 PM, Alexander Bokovoy wrote:

Hi,

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an
integer value.  Instead, we can take time and attribute name. To get
reasonable 'randomness' these values are then hashed with sha1 and
use
the resulting string as task name.

SHA1 may technically be an overkill here as we could simply use

indextask_$date_$attribute

where $date is a value of time.time() but SHA1 gives a resonable
'randomness' into the string.


What kind of randomness do you mean? SHA1 is deterministic, it doesn't
add any randomness at all. It just obscures what's really happening.

Hence using quotes to describe it. We don't need randomness in the task
names, we need something that avoids collisions.

An issue here is in time.time() -- it may give us sub-second resolution
if underlying OS supports it, it may not. Having a second-level
resolution is not enough, especially on fast machines, so we can't
simply use int(times.time()) as it was in the original version.

indextask_$date_$attribute has this issue that we don't have enough
guarantee for $date (time.time()) to be unique in sufficiently tight
conditions, thus use of SHA-1 to generate something that has better
chances to avoid collisions than $data_$attribute.


My point is that if indextask_$date_$attribute is not unique,
neither is SHA1(indextask_$date_$attribute). Hashing has no effect
on the chance of collisions.

You could use Python's pseudorandom number generator (random.randint)
instead of random.SystemRandom. It's not cryptographically secure but
it's enough to avoid collisions, and it doesn't use up system entropy
(except for initial seeding, through `import random`).
indextask_$date_$attribute_$pseudorandomvalue should be unique enough.

Using random here is really bad.
What we ideally need is a method to increment sequential calls for
the same attribute.We use seconds to differentiate
between all tasks but that is not really required, tasks that were
processed will be removed.



Or maybe use $date_$attribute and just warn (ignore the error) if
there's a duplicate -- if a reindex task for the same attribute already
exists from the same second, do we really need to start a new one?



That is true.

We can generate a UUID, I think that is probably a better/safer thing 
to use overall.

Updated patch attached.

2012-07-24T14:36:31Z INFO Creating task to index attribute: memberOf
2012-07-24T14:36:31Z DEBUG Task id: 
cn=indextask_memberOf_135624333918176170_14302,cn=index,cn=tasks,cn=config
2012-07-24T14:36:32Z INFO Indexing finished
...
2012-07-24T14:36:43Z INFO Creating task to index attribute: memberUser
2012-07-24T14:36:43Z DEBUG Task id: 
cn=indextask_memberUser_135624334038532670_14302,cn=index,cn=tasks,cn=config
2012-07-24T14:36:44Z INFO Indexing finished

You can note that clock_seq does not change, it is because uuid.uuid1()
uses nanosecond resolution and uuid_generate_time() if the latter is
available. If we happen to ask for uuids within sub-nanosecond interval,
clock_seq will be different then.

I'm extracting only time and clock_seq instead of pasting full uuid
value because uuid_generate_time() will leak out ethernet MAC address
for 48-bit node. We don't need more bits here so I drop these 48 bits
and avoid publishing the ethernet MAC address, even in logs.

--
/ Alexander Bokovoy
From e5262b5625e8e3b2deaf9228ca8a53dcbea90593 Mon Sep 17 00:00:00 2001
From: Alexander Bokovoy aboko...@redhat.com
Date: Tue, 24 Jul 2012 12:07:23 +0300
Subject: [PATCH 3/3] Rework task naming in LDAP updates to avoid conflicting
 names in certain cases

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an integer value.
Instead, we generate an UUID and use its 60-bit time, 14-bit sequential number,
and attribute name.

https://fedorahosted.org/freeipa/ticket/2942
---
 ipaserver/install/ldapupdate.py |   17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/ipaserver/install/ldapupdate.py b/ipaserver/install/ldapupdate.py
index 

Re: [Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

2012-07-24 Thread Petr Viktorin

On 07/24/2012 04:50 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Rob Crittenden wrote:

Petr Viktorin wrote:

On 07/24/2012 02:49 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 02:06 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 12:01 PM, Alexander Bokovoy wrote:

Hi,

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an
integer value.  Instead, we can take time and attribute name. To
get
reasonable 'randomness' these values are then hashed with sha1 and
use
the resulting string as task name.

SHA1 may technically be an overkill here as we could simply use

indextask_$date_$attribute

where $date is a value of time.time() but SHA1 gives a resonable
'randomness' into the string.


What kind of randomness do you mean? SHA1 is deterministic, it
doesn't
add any randomness at all. It just obscures what's really happening.

Hence using quotes to describe it. We don't need randomness in the
task
names, we need something that avoids collisions.

An issue here is in time.time() -- it may give us sub-second
resolution
if underlying OS supports it, it may not. Having a second-level
resolution is not enough, especially on fast machines, so we can't
simply use int(times.time()) as it was in the original version.

indextask_$date_$attribute has this issue that we don't have enough
guarantee for $date (time.time()) to be unique in sufficiently tight
conditions, thus use of SHA-1 to generate something that has better
chances to avoid collisions than $data_$attribute.


My point is that if indextask_$date_$attribute is not unique,
neither is SHA1(indextask_$date_$attribute). Hashing has no effect
on the chance of collisions.

You could use Python's pseudorandom number generator (random.randint)
instead of random.SystemRandom. It's not cryptographically secure but
it's enough to avoid collisions, and it doesn't use up system entropy
(except for initial seeding, through `import random`).
indextask_$date_$attribute_$pseudorandomvalue should be unique
enough.

Using random here is really bad.
What we ideally need is a method to increment sequential calls for
the same attribute.We use seconds to differentiate
between all tasks but that is not really required, tasks that were
processed will be removed.



Or maybe use $date_$attribute and just warn (ignore the error) if
there's a duplicate -- if a reindex task for the same attribute already
exists from the same second, do we really need to start a new one?



That is true.

We can generate a UUID, I think that is probably a better/safer thing
to use overall.

Updated patch attached.

2012-07-24T14:36:31Z INFO Creating task to index attribute: memberOf
2012-07-24T14:36:31Z DEBUG Task id:
cn=indextask_memberOf_135624333918176170_14302,cn=index,cn=tasks,cn=config
2012-07-24T14:36:32Z INFO Indexing finished
...
2012-07-24T14:36:43Z INFO Creating task to index attribute: memberUser
2012-07-24T14:36:43Z DEBUG Task id:
cn=indextask_memberUser_135624334038532670_14302,cn=index,cn=tasks,cn=config

2012-07-24T14:36:44Z INFO Indexing finished

You can note that clock_seq does not change, it is because uuid.uuid1()
uses nanosecond resolution and uuid_generate_time() if the latter is
available. If we happen to ask for uuids within sub-nanosecond interval,
clock_seq will be different then.

I'm extracting only time and clock_seq instead of pasting full uuid
value because uuid_generate_time() will leak out ethernet MAC address
for 48-bit node. We don't need more bits here so I drop these 48 bits
and avoid publishing the ethernet MAC address, even in logs.

--
/ Alexander Bokovoy


Yes, this approach will work. Just some technical comments:



+self.sub_dict['TIME'] = cn_uuid.get_time()
+self.sub_dict['SEQ'] = cn_uuid.get_clock_seq()


Use the attributes, `cn_uuid.time` and `cn_uuid.clock_seq`. The accessor 
methods are undocumented.



+self.sub_dict['ATTR'] = attribute

-# Refresh the time to make uniqueness more probable. Add on some
-# randomness for good measure.
-self.sub_dict['TIME'] = int(time.time()) + r.randint(0,1)
+cn = self._template_str(indextask_${ATTR}_${TIME}_${SEQ})


You're overwriting sub_dict['TIME'], which is used elsewhere in the 
class. That could cause trouble.

There's no reason to use sub_dict here; you can simply do:

cn = 'indextask_%s_%s_%s' % (attribute, cn_uuid.time, cn_uuid.clock_seq)



--
Petr³


___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] slow response

2012-07-24 Thread John Dennis

On 07/24/2012 12:03 PM, Stephen Ingram wrote:

On Mon, Jul 23, 2012 at 11:25 AM, John Dennis jden...@redhat.com wrote:

On 07/23/2012 12:37 PM, John Dennis wrote:


Note the timestamps only have 1 second resolution, something
which would be easy to fix, so bear that in mind when looking at the
output of the script.



This is partially as a note to myself, but also to others who want to test.
The following patch should produce timestamps in the debug output with
microsecond resolution.


No one seems to be jumping in on this. Should I run the test again
with this patch? I'm really interested in getting this resolved as I
can't help but think others are experiencing the same problem. I do
have a user in the system that is scheduled for deletion. Would it
help for you to grab a ticket with those credentials so you could
actually see the slowness in action?


It would be good to re-run your test with better timing information. 
Attached is a patch that adds more timing information to the debug log 
output. Please apply the patch and re-run your test the same way you did 
before and email me the contents of /var/log/httpd/error_log. In the 
error_log you should see lines with elapsed_time=, e.g.


INFO: ad...@xxx.xxx.xxx.xxx.xx: batch(({'params': ((), {}), 'method': 
'ping'},)): SUCCESS elapsed_time=0.001663


To apply the patch you'll need to be root to modify the files, this 
should work:


% su
% cd /usr/lib/python2.7/site-packages
% patch -p1 -b  path_to_saved_patch_file

If you don't have patch installed:

% sudo yum install patch

Then restart httpd and re-run the test.


--
John Dennis jden...@redhat.com

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/
diff --git a/ipalib/session.py b/ipalib/session.py
index 4b783bb..1cff222 100644
--- a/ipalib/session.py
+++ b/ipalib/session.py
@@ -628,9 +628,9 @@ mod_auth_kerb. Everything else remains the same.
 
 default_max_session_duration = 60*60 # number of seconds
 
-ISO8601_DATETIME_FMT = '%Y-%m-%dT%H:%M:%S' # FIXME jrd, this should be defined elsewhere
+ISO8601_DATETIME_FMT = '%Y-%m-%dT%H:%M:%S.%f' # FIXME jrd, this should be defined elsewhere
 def fmt_time(timestamp):
-return time.strftime(ISO8601_DATETIME_FMT, time.localtime(timestamp))
+return datetime.fromtimestamp(timestamp).strftime(ISO8601_DATETIME_FMT)
 
 #---
 
diff --git a/ipaserver/plugins/ldap2.py b/ipaserver/plugins/ldap2.py
index 6a3d216..c615a7d 100644
--- a/ipaserver/plugins/ldap2.py
+++ b/ipaserver/plugins/ldap2.py
@@ -715,6 +715,7 @@ class ldap2(CrudBackend, Encoder):
 attrs_list = list(set(attrs_list))
 
 # pass arguments to python-ldap
+start_time = time.time()
 try:
 id = self.conn.search_ext(
 base_dn, scope, filter, attrs_list, timeout=time_limit,
@@ -733,6 +734,9 @@ class ldap2(CrudBackend, Encoder):
 except _ldap.LDAPError, e:
 _handle_errors(e)
 
+end_time = time.time()
+root_logger.debug('ldap2 find_entries: elapsed_time=%f', end_time-start_time)
+
 if not res and not truncated:
 raise errors.NotFound(reason='no such entry')
 
@@ -1363,4 +1367,3 @@ class ldap2(CrudBackend, Encoder):
 return (len(output), output)
 
 api.register(ldap2)
-
diff --git a/ipaserver/rpcserver.py b/ipaserver/rpcserver.py
index c770290..d290276 100644
--- a/ipaserver/rpcserver.py
+++ b/ipaserver/rpcserver.py
@@ -305,6 +305,7 @@ class WSGIExecutioner(Executioner):
 args = ()
 options = {}
 
+start_time = time.time()
 if not 'HTTP_REFERER' in environ:
 return self.marshal(result, RefererError(referer='missing'), _id)
 if not environ['HTTP_REFERER'].startswith('https://%s/ipa' % self.api.env.host) and not self.env.in_tree:
@@ -349,10 +350,11 @@ class WSGIExecutioner(Executioner):
 # get at least some context of what is going on
 params = options
 principal = getattr(context, 'principal', 'UNKNOWN')
+end_time = time.time()
 if error:
 self.info('%s: %s(%s): %s', principal, name, ', '.join(self.Command[name]._repr_iter(**params)), e.__class__.__name__)
 else:
-self.info('%s: %s(%s): SUCCESS', principal, name, ', '.join(self.Command[name]._repr_iter(**params)))
+self.info('%s: %s(%s): SUCCESS elapsed_time=%f', principal, name, ', '.join(self.Command[name]._repr_iter(**params)), end_time-start_time)
 else:
 self.info('%s: %s', context.principal, e.__class__.__name__)
 return self.marshal(result, error, _id)
@@ -791,6 +793,7 @@ class jsonserver_session(jsonserver, KerberosSession):
 '''
 '''
 
+start_time = time.time()
 self.debug('WSGI jsonserver_session.__call__:')
 
 # Load the session data
@@ -852,6 +855,8 @@ class 

[Freeipa-devel] Pushed one liner for spec file

2012-07-24 Thread Simo Sorce
FYI,
I pushed the following as a one-liner so that automatica builders gets
dependencies right:
http://git.fedorahosted.org/git/?p=freeipa.git;a=commitdiff;h=dedb180ddc773baf42bb31efc07a1dda698631bb

Simo.

-- 
Simo Sorce * Red Hat, Inc * New York

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

2012-07-24 Thread Alexander Bokovoy

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 04:50 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Rob Crittenden wrote:

Petr Viktorin wrote:

On 07/24/2012 02:49 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 02:06 PM, Alexander Bokovoy wrote:

On Tue, 24 Jul 2012, Petr Viktorin wrote:

On 07/24/2012 12:01 PM, Alexander Bokovoy wrote:

Hi,

There are two problems in task naming in LDAP updates:

1. Randomness may be scarce in virtual machines
2. Random number is added to the time value rounded to a second

The second issue leads to values that may repeat themselves as time
only grows and random number is non-negative as well, so
t2+r2 can be equal to t1+t2 generated earlier.

Since task name is a DN, there is no strict requirement to use an
integer value.  Instead, we can take time and attribute name. To
get
reasonable 'randomness' these values are then hashed with sha1 and
use
the resulting string as task name.

SHA1 may technically be an overkill here as we could simply use

indextask_$date_$attribute

where $date is a value of time.time() but SHA1 gives a resonable
'randomness' into the string.


What kind of randomness do you mean? SHA1 is deterministic, it
doesn't
add any randomness at all. It just obscures what's really happening.

Hence using quotes to describe it. We don't need randomness in the
task
names, we need something that avoids collisions.

An issue here is in time.time() -- it may give us sub-second
resolution
if underlying OS supports it, it may not. Having a second-level
resolution is not enough, especially on fast machines, so we can't
simply use int(times.time()) as it was in the original version.

indextask_$date_$attribute has this issue that we don't have enough
guarantee for $date (time.time()) to be unique in sufficiently tight
conditions, thus use of SHA-1 to generate something that has better
chances to avoid collisions than $data_$attribute.


My point is that if indextask_$date_$attribute is not unique,
neither is SHA1(indextask_$date_$attribute). Hashing has no effect
on the chance of collisions.

You could use Python's pseudorandom number generator (random.randint)
instead of random.SystemRandom. It's not cryptographically secure but
it's enough to avoid collisions, and it doesn't use up system entropy
(except for initial seeding, through `import random`).
indextask_$date_$attribute_$pseudorandomvalue should be unique
enough.

Using random here is really bad.
What we ideally need is a method to increment sequential calls for
the same attribute.We use seconds to differentiate
between all tasks but that is not really required, tasks that were
processed will be removed.



Or maybe use $date_$attribute and just warn (ignore the error) if
there's a duplicate -- if a reindex task for the same attribute already
exists from the same second, do we really need to start a new one?



That is true.

We can generate a UUID, I think that is probably a better/safer thing
to use overall.

Updated patch attached.

2012-07-24T14:36:31Z INFO Creating task to index attribute: memberOf
2012-07-24T14:36:31Z DEBUG Task id:
cn=indextask_memberOf_135624333918176170_14302,cn=index,cn=tasks,cn=config
2012-07-24T14:36:32Z INFO Indexing finished
...
2012-07-24T14:36:43Z INFO Creating task to index attribute: memberUser
2012-07-24T14:36:43Z DEBUG Task id:
cn=indextask_memberUser_135624334038532670_14302,cn=index,cn=tasks,cn=config

2012-07-24T14:36:44Z INFO Indexing finished

You can note that clock_seq does not change, it is because uuid.uuid1()
uses nanosecond resolution and uuid_generate_time() if the latter is
available. If we happen to ask for uuids within sub-nanosecond interval,
clock_seq will be different then.

I'm extracting only time and clock_seq instead of pasting full uuid
value because uuid_generate_time() will leak out ethernet MAC address
for 48-bit node. We don't need more bits here so I drop these 48 bits
and avoid publishing the ethernet MAC address, even in logs.

--
/ Alexander Bokovoy


Yes, this approach will work. Just some technical comments:



+self.sub_dict['TIME'] = cn_uuid.get_time()
+self.sub_dict['SEQ'] = cn_uuid.get_clock_seq()


Use the attributes, `cn_uuid.time` and `cn_uuid.clock_seq`. The 
accessor methods are undocumented.

Fixed.




+self.sub_dict['ATTR'] = attribute

-# Refresh the time to make uniqueness more probable. Add on some
-# randomness for good measure.
-self.sub_dict['TIME'] = int(time.time()) + r.randint(0,1)
+cn = self._template_str(indextask_${ATTR}_${TIME}_${SEQ})


You're overwriting sub_dict['TIME'], which is used elsewhere in the 
class. That could cause trouble.

Since it was set here and others are using it, I kept it updated in a
new version as well, based on cn_uuid.time. But since cn_uuid.time is in
nanoseconds resolution, I have to divide the value by 1e9.


There's no reason to use sub_dict here; you can simply do:

cn 

[Freeipa-devel] [PATCH] 1039 fix selinux usermap config options

2012-07-24 Thread Rob Crittenden
The configuration options for the default user and map order were a bit 
broken in several ways.


I wasn't handling the case where one of the values was coming from LDAP 
so was a list vs as an option which was a string, so all sorts of bad 
interesting things were happening.


There is also the setattr problem. We would normally handle that in a 
validator so it is not a problem but in this case we may need to compare 
two options passed in and we can't do that in a validator. So 
potentially changes may come in as a option, in entry_attrs or from config.


I added a few tests to help keep this robust.

When testing this remember that the user map order list needs to be 
quoted otherwise the shell is going to interpret the $.


rob
From db602a401955dd65b2608afd7e3e90750fba590e Mon Sep 17 00:00:00 2001
From: Rob Crittenden rcrit...@redhat.com
Date: Tue, 24 Jul 2012 22:55:27 -0400
Subject: [PATCH] Fix validator for SELinux user map settings in config
 plugin.

We need to compare two values and need to be aware of where those
values are coming from. They may come from options, setattr or
existing config. The format of that data is going to be different
depending on its source (always a list internally).

One may also set both at the same time so a standard validator cannot
be used because it lacks the context of the other value being set.

https://fedorahosted.org/freeipa/ticket/2938
https://fedorahosted.org/freeipa/ticket/2940
---
 ipalib/plugins/config.py|   36 ++-
 tests/test_xmlrpc/test_config_plugin.py |   28 
 2 files changed, 49 insertions(+), 15 deletions(-)

diff --git a/ipalib/plugins/config.py b/ipalib/plugins/config.py
index d4d6ba7b58f473fe1f8990f6dbfb8a71ab395cc9..a94241b170457126987d6289202dfb5b462c859e 100644
--- a/ipalib/plugins/config.py
+++ b/ipalib/plugins/config.py
@@ -18,6 +18,7 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see http://www.gnu.org/licenses/.
 
+import copy
 from ipalib import api
 from ipalib import Bool, Int, Str, IA5Str, StrEnum
 from ipalib.plugins.baseldap import *
@@ -257,30 +258,35 @@ class config_mod(LDAPUpdate):
 error=_('%(obj)s default attribute %(attr)s would not be allowed!') \
 % dict(obj=obj, attr=obj_attr))
 
-if 'ipaselinuxusermapdefault' in options and options['ipaselinuxusermapdefault'] is None:
-raise errors.ValidationError(name='ipaselinuxusermapdefault',
-error=_('SELinux user map default user may not be empty'))
-
-# Make sure the default user is in the list
-if 'ipaselinuxusermapdefault' in options or \
-  'ipaselinuxusermaporder' in options:
+# Combine the current entry and options into a single object to
+# evaluate. This covers changes via setattr and options.
+# Note: this is not done in a validator because we may be changing
+#   the default user and map list at the same time and we don't
+#   have both values in a validator.
+validate = copy.deepcopy(options)
+validate.update(entry_attrs)
+if 'ipaselinuxusermapdefault' in validate or \
+  'ipaselinuxusermaporder' in validate:
 config = None
-if 'ipaselinuxusermapdefault' in options:
-defaultuser = options['ipaselinuxusermapdefault']
+failedattr = 'ipaselinuxusermaporder'
+if 'ipaselinuxusermapdefault' in validate:
+defaultuser = validate['ipaselinuxusermapdefault']
+failedattr = 'ipaselinuxusermapdefault'
 else:
 config = ldap.get_ipa_config()[1]
-defaultuser = config['ipaselinuxusermapdefault']
+defaultuser = config['ipaselinuxusermapdefault'][0]
 
-if 'ipaselinuxusermaporder' in options:
-order = options['ipaselinuxusermaporder']
+if 'ipaselinuxusermaporder' in validate:
+order = validate['ipaselinuxusermaporder']
+userlist = order.split('$')
 else:
 if not config:
 config = ldap.get_ipa_config()[1]
 order = config['ipaselinuxusermaporder']
-userlist = order[0].split('$')
+userlist = order[0].split('$')
 if defaultuser not in userlist:
-raise errors.ValidationError(name='ipaselinuxusermaporder',
-error=_('Default SELinux user map default user not in order list'))
+raise errors.ValidationError(name=failedattr,
+error=_('SELinux user map default user not in order list'))
 
 return dn
 
diff --git a/tests/test_xmlrpc/test_config_plugin.py b/tests/test_xmlrpc/test_config_plugin.py
index