Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-25 Thread John Dennis

On 07/23/2009 09:01 PM, Dmitri Pal wrote:

We are considering combining the log monitor with audit collection.
These are two separate functions it would be logical
to have them in separate threads.


The architecture is and will be event driven with a main event loop. In 
most cases event loops are superior to threads, both from a performance 
and complexity perspective. The only reason I could envision needing to 
use threads is if some operation was subject to blocking. But then again 
event loops are particularly adept at handling non-blocking IO so the 
only reason I could see a need for threads is if there were some reason 
we couldn't use asynchronous non-blocking methods.


Asynchronous non-blocking methods are to the best of my knowledge the 
preferred approach in the project.


--
John Dennis 

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-24 Thread Simo Sorce
On Fri, 2009-07-24 at 11:53 -0400, John Dennis wrote:
> Our coding standard recommends no braces when there is only a single 
> expression. However I strongly disagree with this, in fact many
> coding 
> standard recommend exactly the opposite, to always use braces! Why? 
> Because I've seen plenty examples of bugs introduced when a
> programmer 
> added a new expression to the conditional and totally broke the
> program 
> logic. Usually these mistakes aren't caught until much later when
> things 
> mysterously fail. I consider the use of braces a simple and effective 
> example of defensive programming.
> 

Please propose to amend the coding style in a separate mail.
I have had mixed feelings about this, but I think I might agree with you
on this point, so maybe we will get enough support to change the coding
style on this point.

Simo.

-- 
Simo Sorce * Red Hat, Inc * New York

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-24 Thread John Dennis

On 07/23/2009 09:01 PM, Dmitri Pal wrote:

1) There is generally a lack of comments in the code. it is very hard to
read.
IMO the lack of comments is a style issue rather then maturity of the
code issue.


> 13) I was amazed how well the path_util interface is defined and
> documented .

As you observe I generally document my code well, you can see similar 
good documentation the C++ logger I just submitted as well. FWIW I 
usually document once the code stablizes. To me this is more an issue of 
being asked to submit the code for review before I was ready to 
"publish" it.



2) In many places the spaces are missing around commas, "-", "+".


Will fix in next patch.


3) Heavy use of the complex constructs:


O.K. I'll break these up as I encounter them.


  4) In multiple places there is just one line of code inside the  {}
after if for example:


Our coding standard recommends no braces when there is only a single 
expression. However I strongly disagree with this, in fact many coding 
standard recommend exactly the opposite, to always use braces! Why? 
Because I've seen plenty examples of bugs introduced when a programmer 
added a new expression to the conditional and totally broke the program 
logic. Usually these mistakes aren't caught until much later when things 
mysterously fail. I consider the use of braces a simple and effective 
example of defensive programming.



7) It should be a practice of using thread safe versions of system
functions


 Just good defensive programming practice.

O.K. I'll switch over as time allows. FWIW the code wasn't designed to 
be thread safe, I never imagined it being run in seperate threads. But 
perhaps that should be one of our coding standards, that all code needs 
to be thread safe irrespective of it's intended usage. I'm aware of some 
other areas which are not thread safe at the moment, it was always my 
intention to clean those up anyway.




8) A lot of static global data. It would be beneficial to combine these
things together in some kind of context structure and pass it around.


Actually it's not static, but you're right a number of the major data 
structures are singletons. At the moment I can't think of a reason why 
we would need multiple independent copies of the state, but perhaps down 
the road such a need might come up and having them hung off of a context 
would be necessary in that case. It's not needed now, but if it's not 
too difficult I'll gather them up in a context.



Even if you keep it static for now it would be easier to refine code
later. Currently too many low level functions rely the global data
so it would require a significant refactoring effort to get rid of those
down the road.

9) SSSD and Common and Audit all three use tree different ways of
tracing/debugging things.
Shouldn't we get to some consistency there?


yes, consistency would be better, the audit server adds a 4th (but in 
the case of the audit server I believe it's logging must be independent 
of ELAPI etc. because otherwise it creates a vicious cycle).


I would have used ELAPI but it wasn't ready when the code was written.


It would be better if the log_msg be a macro from the very beginning.


It's a macro now :-)


11) From our design discussions I remember that  you mentioned that
there was an issue with recursive directories and monitoring nested
paths or something like this.
It seems that find_existing_directory_ancestor() has to do with this issue.
I was expecting to see some extended comment about this somewhere around
create_watch() function.


Yup, as indicated above, more extensive documentation is on the to-do list.


This is all for now.


Thank you for your review and attention. I'm preparing a new patch. I've 
got all of Stephen's issues fixed and I'll move on now to folding in 
your suggestions, the patch will follow.


--
John Dennis 

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-24 Thread Dmitri Pal
John Dennis wrote:
>>> 10)  Such code will not produce right results since there are multiple
>>> ways why function can return NULL and not in all cases
>>> the errno is set.
>>>
>>>  if ((pw = new_path_watch(existing_directory_ancestor)) ==
>>> NULL) {
>>> +error = errno;
>>> +log_msg(LOG_ERROR, _("%s: could not allocate new watch for
>>> \"%s\" (%s)\n"),
>>> +__FUNCTION__, existing_directory_ancestor,
>>> error_string(error));
>>> +return error;
>>>
>>> Better approach to such cases would be either
>>>
>>> pw = new_path_watch(existing_directory_ancestor,&error);
>>> if (!pw) {
>>> ...
>>>
>>> or which I prefer more:
>>>
>>> error = new_path_watch(&pw, existing_directory_ancestor);
>>> if(error) {
>>> ...
>>
>> This second approach is more in line with the rest of the code in the
>> SSSD. We do this all over the code. Also, our preference is for this
>> error code to remain in line with the errno set, so things like ENOENT
>> and ENOMEM are standard. (We also defined EOK which is identical to
>> EXIT_SUCCESS for convenience)
>
> Hopefully you'll have noticed the 2nd approach, the preferred one used
> in SSSD, is the one I used for virtually all function definitions. The
> function cited above is an anomaly. It's good your review caught it
> (BTW, the reason it was coded that way originally was it was
> originally just an "allocator" function whose only failure would have
> been no memory, but it grew new functionality and it should have been
> modified at that point to return an error status just like everything
> else does.
>
>>> 12) In search for some functions I looked at path_utils.
>>> First in general I am a bit scared about using static buffers to
>>> hold paths.
>>> Is it still an Ok practice? In the SSSD all code is allocated in
>>> collections
>>> and ini the only place  where the static buffer was used was some
>>> parsing of the
>>> file where the reasonable length of the line in the ini file is less
>>> than 100 bytes
>>> while buffer is about 64K. Should we use dynamic memory for paths?
>>> Start with malloc(MAX_PATH) and then realloc if needed?
>>>
>>
>> The idea behind MAX_PATH is that it's defined on every system to be the
>> largest legitimate value of a path that the system can handle
>> internally. malloc(MAX_PATH) doesn't gain anything over path[MAX_PATH]
>> except a need to manage the memory. I generally prefer, however, only to
>> allocate memory as-needed. This will come in handy in the future if
>> anyone should want to include this daemon on a limited-memory device
>> (such as a Linux-based router). That said, for speed and ease of use,
>> you can't beat this static buffer. I just balk at always using a full
>> memory page for every path, when in most cases you're going to need<
>> 100 bytes. Considering that the whole purpose of this tool is to easily
>> manage large numbers of paths, that 4kiB (on Linux; 8kiB or more on some
>> other systems) can really add up.
>
> I think one can make two equally valid arguments with respect to
> storing path names. Pre-declaring a variable or structure member of
> sufficient size to hold any pathname has the advantage of simplicity,
> simplicity tends to lead toward more robust and efficient code (less
> mistakes to make and fewer cycles spent trying to optimize memory use).
>
> However, Stephen is right on the money when he observes in most cases
> the bulk of the memory allocated won't be used and this *may* be a
> liability.
>
> First of all one needs to be aware that much of the use of the
> construct "path[MAX_PATH]" occurs in automatic variables on the stack.
> In this context it's hard to argue the unused memory in the string is
> a liability, the memory comes into existence and is freed by a simple
> adjustment of the stack pointer and has a very short lifetime (yes, it
> does cost extra stack space). The alternative is more complex memory
> management, either alloca on the stack or from the heap, both require
> more complex logic with a greater opportunity for program error and
> many more processor cycles (don't forget these path variables only
> come into existence because we need to manipulate them which means one
> must always be doing bounds checking and reallocations on virtually
> all operations).
>
> There are places where buffers of size MAX_PATH are members of heap
> allocated objects. These are better candidates for memory
> optimization, however it does incur much more complex program logic.
> In fact the problem really reduces to:
>
> Dynamic String Objects
>
> E.g. in C++ std::string. It's not just a matter of dynamic buffer
> management, it also requires functions to be bound to the underlying
> buffer which can perform various string operations.
>
> To the best of my knowledge we don't have a library available to us in
> C which has been blessed for our use which implements this. Just like
> we were required to write our own hash table implementation we w

Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-24 Thread John Dennis

10)  Such code will not produce right results since there are multiple
ways why function can return NULL and not in all cases
the errno is set.

 if ((pw = new_path_watch(existing_directory_ancestor)) == NULL) {
+error = errno;
+log_msg(LOG_ERROR, _("%s: could not allocate new watch for
\"%s\" (%s)\n"),
+__FUNCTION__, existing_directory_ancestor,
error_string(error));
+return error;

Better approach to such cases would be either

pw = new_path_watch(existing_directory_ancestor,&error);
if (!pw) {
...

or which I prefer more:

error = new_path_watch(&pw, existing_directory_ancestor);
if(error) {
...


This second approach is more in line with the rest of the code in the
SSSD. We do this all over the code. Also, our preference is for this
error code to remain in line with the errno set, so things like ENOENT
and ENOMEM are standard. (We also defined EOK which is identical to
EXIT_SUCCESS for convenience)


Hopefully you'll have noticed the 2nd approach, the preferred one used 
in SSSD, is the one I used for virtually all function definitions. The 
function cited above is an anomaly. It's good your review caught it 
(BTW, the reason it was coded that way originally was it was originally 
just an "allocator" function whose only failure would have been no 
memory, but it grew new functionality and it should have been modified 
at that point to return an error status just like everything else does.



12) In search for some functions I looked at path_utils.
First in general I am a bit scared about using static buffers to hold paths.
Is it still an Ok practice? In the SSSD all code is allocated in
collections
and ini the only place  where the static buffer was used was some
parsing of the
file where the reasonable length of the line in the ini file is less
than 100 bytes
while buffer is about 64K. Should we use dynamic memory for paths?
Start with malloc(MAX_PATH) and then realloc if needed?



The idea behind MAX_PATH is that it's defined on every system to be the
largest legitimate value of a path that the system can handle
internally. malloc(MAX_PATH) doesn't gain anything over path[MAX_PATH]
except a need to manage the memory. I generally prefer, however, only to
allocate memory as-needed. This will come in handy in the future if
anyone should want to include this daemon on a limited-memory device
(such as a Linux-based router). That said, for speed and ease of use,
you can't beat this static buffer. I just balk at always using a full
memory page for every path, when in most cases you're going to need<
100 bytes. Considering that the whole purpose of this tool is to easily
manage large numbers of paths, that 4kiB (on Linux; 8kiB or more on some
other systems) can really add up.


I think one can make two equally valid arguments with respect to storing 
path names. Pre-declaring a variable or structure member of sufficient 
size to hold any pathname has the advantage of simplicity, simplicity 
tends to lead toward more robust and efficient code (less mistakes to 
make and fewer cycles spent trying to optimize memory use).


However, Stephen is right on the money when he observes in most cases 
the bulk of the memory allocated won't be used and this *may* be a 
liability.


First of all one needs to be aware that much of the use of the construct 
"path[MAX_PATH]" occurs in automatic variables on the stack. In this 
context it's hard to argue the unused memory in the string is a 
liability, the memory comes into existence and is freed by a simple 
adjustment of the stack pointer and has a very short lifetime (yes, it 
does cost extra stack space). The alternative is more complex memory 
management, either alloca on the stack or from the heap, both require 
more complex logic with a greater opportunity for program error and many 
more processor cycles (don't forget these path variables only come into 
existence because we need to manipulate them which means one must always 
be doing bounds checking and reallocations on virtually all operations).


There are places where buffers of size MAX_PATH are members of heap 
allocated objects. These are better candidates for memory optimization, 
however it does incur much more complex program logic. In fact the 
problem really reduces to:


Dynamic String Objects

E.g. in C++ std::string. It's not just a matter of dynamic buffer 
management, it also requires functions to be bound to the underlying 
buffer which can perform various string operations.


To the best of my knowledge we don't have a library available to us in C 
which has been blessed for our use which implements this. Just like we 
were required to write our own hash table implementation we would be 
required to write our own dynamic string library.


So when the choice came as to whether to write a new string library or 
declare paths using MAX_PATH and utilize the existing clib string 
functions (plus some customs functions) it seemed to me to

Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-24 Thread Dmitri Pal

>
> Debug messages should NEVER be conditionally compiled. For an
> executable, they should always be set by a runtime flag, and for a
> library they should always be set by an argument to the initialization
> function or by an environment variable. It should always be possible for
> someone running the code in a production environment to get additional
> information about a crash without requiring an instrumented build.
>
> Furthermore, "tracing" should be a special case of debugging (i.e. it
> should just be a very high debug_level.
Steve,

Here we disagree.
Let us set terminology first:
logging - unconditional output of the program about its normal operation
or issues it encountered.
debugging - conditional output for testing at runtime
tracing - conditionally compiled low level output for debugging specific
area of the code.

There should be a good balance between all three.
logging and debugging is needed for any service as you said to be able
to  understand what is happening and troubleshoot at run time.

Tracing is more for the low level libraries. Keep in mind that it
becomes extremely costly to have tracing always compiled in.
The libraries should be delivered as .so so building an so with enabled
tracing and swapping it should not be a big problem.

Spo tracing is not a special form of debugging. This is "one size fits
all approach" and it is wrong in this case.
A caution and discretion should be used.

This however does not mean that we can't standardize on ways to do each
of the three things.
ELAPI should eventually become our logging api, it can be used for
debugging too so it will cover these two.
But tracing should be very low level and very simple conditionally
compiled printout statements that would allow tracing the ELAPI itself.

-- 
Thank you,
Dmitri Pal

Engineering Manager IPA project,
Red Hat Inc.


---
Looking to carve out IT costs?
www.redhat.com/carveoutcosts/

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-24 Thread Stephen Gallagher
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/23/2009 09:01 PM, Dmitri Pal wrote:
> 
> 9) SSSD and Common and Audit all three use tree different ways of
> tracing/debugging things.
> Shouldn't we get to some consistency there?
> It would be better if the log_msg be a macro from the very beginning.
> This way it would be easier
> to make it conditionally compiled debugging. Currently is is function
> and would require significant
> effort to go through the code and make it consistent. I would argue that
> a) Tracing/debugging should be conditionally compiled day 1
> b) Debugging strings do not require translation
> c) Tracing pattern should be consistent - every error should be traced
> otherwise it is hard to find problems.
> 

Debug messages should NEVER be conditionally compiled. For an
executable, they should always be set by a runtime flag, and for a
library they should always be set by an argument to the initialization
function or by an environment variable. It should always be possible for
someone running the code in a production environment to get additional
information about a crash without requiring an instrumented build.

Furthermore, "tracing" should be a special case of debugging (i.e. it
should just be a very high debug_level.

> Just something to think about...
> 
> 10)  Such code will not produce right results since there are multiple
> ways why function can return NULL and not in all cases
> the errno is set.
>
> if ((pw = new_path_watch(existing_directory_ancestor)) == NULL) {
> +error = errno;
> +log_msg(LOG_ERROR, _("%s: could not allocate new watch for
> \"%s\" (%s)\n"),
> +__FUNCTION__, existing_directory_ancestor,
> error_string(error));
> +return error;
> 
> Better approach to such cases would be either
> 
> pw = new_path_watch(existing_directory_ancestor, &error);
> if (!pw) {
> ...
> 
> or which I prefer more:
> 
> error = new_path_watch(&pw, existing_directory_ancestor);
> if(error) {
> ...

This second approach is more in line with the rest of the code in the
SSSD. We do this all over the code. Also, our preference is for this
error code to remain in line with the errno set, so things like ENOENT
and ENOMEM are standard. (We also defined EOK which is identical to
EXIT_SUCCESS for convenience)

> 
> 11) From our design discussions I remember that  you mentioned that
> there was an issue with recursive directories and monitoring nested
> paths or something like this.
> It seems that find_existing_directory_ancestor() has to do with this issue.
> I was expecting to see some extended comment about this somewhere around
> create_watch() function.
> 
> 12) In search for some functions I looked at path_utils.
> First in general I am a bit scared about using static buffers to hold paths.
> Is it still an Ok practice? In the SSSD all code is allocated in
> collections
> and ini the only place  where the static buffer was used was some
> parsing of the
> file where the reasonable length of the line in the ini file is less
> than 100 bytes
> while buffer is about 64K. Should we use dynamic memory for paths?
> Start with malloc(MAX_PATH) and then realloc if needed?
> 

The idea behind MAX_PATH is that it's defined on every system to be the
largest legitimate value of a path that the system can handle
internally. malloc(MAX_PATH) doesn't gain anything over path[MAX_PATH]
except a need to manage the memory. I generally prefer, however, only to
allocate memory as-needed. This will come in handy in the future if
anyone should want to include this daemon on a limited-memory device
(such as a Linux-based router). That said, for speed and ease of use,
you can't beat this static buffer. I just balk at always using a full
memory page for every path, when in most cases you're going to need <
100 bytes. Considering that the whole purpose of this tool is to easily
manage large numbers of paths, that 4kiB (on Linux; 8kiB or more on some
other systems) can really add up.

> 13) I was amazed how well the path_util interface is defined and
> documented .
> I will consider using it whenever I need to deal with paths.
> 
> This is all for now.
> A general comment is that without running the code and playing with it
> for some time
> it is hard to find any specific issues (if any) in logic. On the surface
> things
> seem well thought through and the functions are not big and modular but
> lack of comments makes it really hard to digest.
> 


- -- 
Stephen Gallagher
RHCE 804006346421761

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAkppn0UACgkQeiVVYja6o6Ps8wCgkDcTyeAVLbUMYilhGGoJBDiL
Cr4AnAlRVq+GYqLHzp8x/XPLDTfJghAk
=SEqh
-END PGP SIGNATURE-

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/l

Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-23 Thread Dmitri Pal
John Dennis wrote:
> This is a big patch, sorry, but there just isn't a realistic way to
> develop a major piece of code into a working state given our checkin
> policy which requires peer review for every change, it's just easier
> to develop elsewhere and submit the working result.
>
> This is the basic framework for the client audit code, it produces an
> executable called lwatch. In it's default mode lwatch watches a set of
> (log) files. It understands how log files are rotated. When a log file
> reaches a modification threshold it's newly appended data is prepared
> for transport to the audit server. When the log file is rotated,
> created, renamed, etc. the appropriate actions take place.
>
> lwatch maintains it's persistent state in a sqlite database. You can
> start and stop lwatch and it self synchronizes based on what is in the
> database and what it finds in the file system.
>
> lwatch is also capable of listing every log under a directory root,
> this can be handy for initializing the set of log files to watch.
>
> lwatch can also dump in a pretty format the current state of the SQL
> database.
>
> lwatch is not completely finished, but it's been in reasonable working
> shape for a while now, but sitting in my own git repository, it's time
> to get into the project's repository. What it needs next is better
> configuration options and to link in the code for audit server
> transmission (coming soon).
>
> Just one note about the patch, it includes a trivial one line fix to
> dhash.h which was missing a const declaration. I realize that should
> have been in a separate patch, but it got included here.
>
> 
>
> ___
> Freeipa-devel mailing list
> Freeipa-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/freeipa-devel
John,

Steve wanted me to take a look at this code too.
Just to have a second pair of eyes.
I am doing it and will write notes as I go.
I do not know if any of those are blockers and would trigger nacks and
which can be
treated as "work" in progress. Let us use bast judgment.

I have not read Steven's note on purpose so that I do not narrow my vision.

Here we go:
1) There is generally a lack of comments in the code. it is very hard to
read.
IMO the lack of comments is a style issue rather then maturity of the
code issue. 
It seems funny that function that trims spaces is commented. It would
have been very easy to understand what the function does by its name.
And the code is pretty simple.   While more complex functions around do
not have any line on comments.
It would think that the following  practice should be employed:
Each function is commented on in the header or in the module (better in
both).
It least in one of these two places it is necessary to explain what the
function does, why it is needed and where it used.
For example:
/* Allocates and builds a memory buffer out the hash table entries. Used
when ... */
char *pathname_table_string(hash_table_t *table)

The more complex the function is the better it should be commented.

2) In many places the spaces are missing around commas, "-", "+".
It is hard to read. Again would be nice for it to follow standard.
I find these things in my code to and try to correct them as I see them.
Just a  suggestion to try a bit harder make things  more readable.
There are many places that can be improved.

3) Heavy use of the complex constructs:
if ((tbl_string = realloc(tbl_string, alloc_len =
alloc_len+MAX(needed_len,alloc_increment))) == NULL) {
Would be much more readable if things like that are split into two lines.
 alloc_len += MAX(needed_len, alloc_increment);
if ((tbl_string = realloc(tbl_string, alloc_len)) == NULL) {

 4) In multiple places there is just one line of code inside the  {}
after if for example:
if ((tbl_string = malloc(alloc_len = alloc_increment)) == NULL) {
return NULL;
}

should be:

if ((tbl_string = malloc(alloc_len = alloc_increment)) == NULL) 
return NULL;

but I would even prefer something like:

alloc_len = alloc_increment;
tbl_string = malloc(alloc_len);
if (!tbl_string) {
  
 return NULL;
}

5) Variables are not initialized at the top of the function.
It is generally a good practice to initialize things at the top of the
function.

6) In several places I noticed you miss va_end macro where you use
va_list and va_start().
>From vsnprintf man page:
   These functions do not call the va_end
   macro. Consequently, the value of ap is undefined after the 
call.  The
   application should call va_end(ap) itself afterwards.

7) It should be a practice of using thread safe versions of system
functions like localtime (localtime_r) , getgrgid (getgrgid_r) etc.
We are considering combining the log monitor with audit collection.
These are two separate functions it would be logical
to have them

Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-23 Thread Stephen Gallagher
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/16/2009 04:49 PM, John Dennis wrote:
> This is a big patch, sorry, but there just isn't a realistic way to
> develop a major piece of code into a working state given our checkin
> policy which requires peer review for every change, it's just easier to
> develop elsewhere and submit the working result.
> 
> This is the basic framework for the client audit code, it produces an
> executable called lwatch. In it's default mode lwatch watches a set of
> (log) files. It understands how log files are rotated. When a log file
> reaches a modification threshold it's newly appended data is prepared
> for transport to the audit server. When the log file is rotated,
> created, renamed, etc. the appropriate actions take place.
> 
> lwatch maintains it's persistent state in a sqlite database. You can
> start and stop lwatch and it self synchronizes based on what is in the
> database and what it finds in the file system.
> 
> lwatch is also capable of listing every log under a directory root, this
> can be handy for initializing the set of log files to watch.
> 
> lwatch can also dump in a pretty format the current state of the SQL
> database.
> 
> lwatch is not completely finished, but it's been in reasonable working
> shape for a while now, but sitting in my own git repository, it's time
> to get into the project's repository. What it needs next is better
> configuration options and to link in the code for audit server
> transmission (coming soon).
> 
> Just one note about the patch, it includes a trivial one line fix to
> dhash.h which was missing a const declaration. I realize that should
> have been in a separate patch, but it got included here.
> 
> 
> 
> 
> ___
> Freeipa-devel mailing list
> Freeipa-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/freeipa-devel


Review Part II

lwatch/watch_database.c:
sqlite3_result_string() - typo in the default case. "unkown" -> "unknown"

table_exists() - You're looping through results to see if the table
exists, but the SELECT query suggests that you should only be expecting
a single result. If you get more than one result back, shouldn't that
imply that the database has been corrupted and be treated as an error?

populate_path_info() - misuse of strncpy(). Not ensuring null
termination. You do this in many other places, so I will not repeat this
complaint again.
It would be prudent to have the positions of the values in the statement
be #defines instead of magic numbers, to make it easier to change later
if you add or remove columns.

path_info_flags_string() - Same snprintf() misuse as I mentioned in my
earlier review. Repeated in several other functions, so I will not
repeat myself.


util.c:
pathname_table_string() - You are misusing realloc(). If realloc()
returns NULL, you have leaked the original memory for tbl_string.

regexp.c:
Please prefer using the fully expanded error string PCRE_ERROR_* (as you
do with the SQLite error strings in watch_database.c)

inotify_watch.c:
I don't see any additional problems other than those indicated above.

This concludes my review of this iteration of the patch.

- -- 
Stephen Gallagher
RHCE 804006346421761

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAkpotnUACgkQeiVVYja6o6MImgCfZf3qaMt+8wugE+WQTl5lKzYV
f3IAnjzMmsLz+g0s8iwH8NhexE6yoeya
=Utcx
-END PGP SIGNATURE-

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-23 Thread Stephen Gallagher
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/22/2009 03:58 PM, John Dennis wrote:
> On 07/22/2009 03:05 PM, Stephen Gallagher wrote:
>> Nack. The following things must be corrected before this patch will be
>> accepted.
>>
>> General:
>> The log file /tmp/lwatch.log is not usable by two users simultaneously.
>> Each user should have his/her own log file or there should be a
>> mechanism in place to log safely to the common log.
>> There is a similar problem with the sqlite database. The first user to
>> execute lwatch gains permissions on the database. No other users have
>> write access after that (except root).
> 
> There is no concept of per user, this will morph into a system daemon
> thus there is no need for any of the above.
> 
> The use of of /tmp/lwatch.log is just pre-production staging, final
> production code will move it to a configured logging location (default
> /var/log/lwatch.log)
> 

>> DEBUG messages should be controlled by a commandline parameter.
> 
> of course :-)
> 
>>
>> /lwatch/configure.ac:
>> You need to have a configure test for sqlite3.
> 
> good point and I am aware of it.
> 
>> /lwatch/src/Makefile.am:
>> Don't use $(srcdir) as a path to built binaries. This doesn't support
>> parallel build trees. For linking dhash and path_utils, it needs to be
>> relative to $(builddir).
> 
> O.K., but note some of the existing Makefile.am's don't do this.
>

Can you point me at examples of this? I build exclusively in parallel
trees (mainly to catch cases where it fails) and the build system is
working correctly.

$(builddir) should be used wherever built files are needed, and
$(srcdir) should be used wherever the path needs to be relative to the
uncompiled sources.

>> Creation of /var/lib/lwatch needs to be part of 'make install' (e.g.
>> create an install-exec-hook rule)
> 
> FWIW I think the location of the sqlite database should be a (yet to be
> created) config parameter. I do realize things like install rules and
> configuration parameters are absent, it's pre-production, but I will add
> them now rather than later.
> 

Thanks. I agree that it should be a configure parameter. I was just
going to let that one slide for first commit.

>>
>> /lwatch/src/lwatch.c:
>> The return values from file_event_string should be localized.
>> log_msg() should be a macro. It's very ugly reading __FUNCTION__ all
>> over the code.
>> In process_file_rename() and many other places, you are using strncpy
>> unsafely. If the source string is equal or longer than the destination
>> string, strncpy() does not null-terminate. As a result, reading from it
>> could wander off the page. It's always wise to do strncpy(dest, orig,
>> sizeof(dest)-1);
>> dest[sizeof(dest)-1] = '\0';
>> lwatch_event_string() is using snprintf() unsafely. If the string (after
>> substitution) exceeds "buf_end-p", the function will only write
>> buf_end-p values, but it will return the TOTAL size that would have been
>> written if it had not been truncated. You need to test for that.
>>
>>
>>
>> Things that aren't necessarily worth holding up the commit for:
>>
>> I'd like to recommend that you switch to using tevent for processing
>> your mainloop and signal handlers. It's much safer for the latter than
>> just catching signals, as it will just queue the handler processing for
>> the next loop, instead of breaking execution.
> 
> yes, that's on the to-do list.
> 
>>
>> compare_stat_info() makes an assumption that if the filesize is larger,
>> then it's been appended to. It's possible that a larger file has been
>> moved into place atop the older file. Perhaps you can check whether the
>> inode number has changed?
> 
> It does check:
> 
> /* If the device or inode is different it must be a different file */
> if ((stat_info->st_dev != path_info->dev) || (stat_info->st_ino !=
> path_info->inode)) {
> result = FILE_REPLACED;
> }

Sorry missed that in my review.

> 
> 
> Thank you for your review Stephen, I'll push out a updated patch shortly
> to address your concerns.
> 

Thanks for the replies. I'm still working on my review. Expect more
comments later today.

- -- 
Stephen Gallagher
RHCE 804006346421761

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAkpofNgACgkQeiVVYja6o6OI9QCcC2yS/QY4IcI2qwJ2bm3jGoKS
DB0AnRZ1yOgA/IZoLncXItWpMmz8a4GJ
=KyHJ
-END PGP SIGNATURE-

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-22 Thread John Dennis

On 07/22/2009 03:05 PM, Stephen Gallagher wrote:

Nack. The following things must be corrected before this patch will be
accepted.

General:
The log file /tmp/lwatch.log is not usable by two users simultaneously.
Each user should have his/her own log file or there should be a
mechanism in place to log safely to the common log.
There is a similar problem with the sqlite database. The first user to
execute lwatch gains permissions on the database. No other users have
write access after that (except root).


There is no concept of per user, this will morph into a system daemon 
thus there is no need for any of the above.


The use of of /tmp/lwatch.log is just pre-production staging, final 
production code will move it to a configured logging location (default 
/var/log/lwatch.log)



DEBUG messages should be controlled by a commandline parameter.


of course :-)



/lwatch/configure.ac:
You need to have a configure test for sqlite3.


good point and I am aware of it.


/lwatch/src/Makefile.am:
Don't use $(srcdir) as a path to built binaries. This doesn't support
parallel build trees. For linking dhash and path_utils, it needs to be
relative to $(builddir).


O.K., but note some of the existing Makefile.am's don't do this.


Creation of /var/lib/lwatch needs to be part of 'make install' (e.g.
create an install-exec-hook rule)


FWIW I think the location of the sqlite database should be a (yet to be 
created) config parameter. I do realize things like install rules and 
configuration parameters are absent, it's pre-production, but I will add 
them now rather than later.




/lwatch/src/lwatch.c:
The return values from file_event_string should be localized.
log_msg() should be a macro. It's very ugly reading __FUNCTION__ all
over the code.
In process_file_rename() and many other places, you are using strncpy
unsafely. If the source string is equal or longer than the destination
string, strncpy() does not null-terminate. As a result, reading from it
could wander off the page. It's always wise to do strncpy(dest, orig,
sizeof(dest)-1);
dest[sizeof(dest)-1] = '\0';
lwatch_event_string() is using snprintf() unsafely. If the string (after
substitution) exceeds "buf_end-p", the function will only write
buf_end-p values, but it will return the TOTAL size that would have been
written if it had not been truncated. You need to test for that.



Things that aren't necessarily worth holding up the commit for:

I'd like to recommend that you switch to using tevent for processing
your mainloop and signal handlers. It's much safer for the latter than
just catching signals, as it will just queue the handler processing for
the next loop, instead of breaking execution.


yes, that's on the to-do list.



compare_stat_info() makes an assumption that if the filesize is larger,
then it's been appended to. It's possible that a larger file has been
moved into place atop the older file. Perhaps you can check whether the
inode number has changed?


It does check:

/* If the device or inode is different it must be a different file */
if ((stat_info->st_dev != path_info->dev) || (stat_info->st_ino != 
path_info->inode)) {

result = FILE_REPLACED;
}


Thank you for your review Stephen, I'll push out a updated patch shortly 
to address your concerns.


--
John Dennis 

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

2009-07-22 Thread Stephen Gallagher
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/16/2009 04:49 PM, John Dennis wrote:
> This is a big patch, sorry, but there just isn't a realistic way to
> develop a major piece of code into a working state given our checkin
> policy which requires peer review for every change, it's just easier to
> develop elsewhere and submit the working result.
> 
> This is the basic framework for the client audit code, it produces an
> executable called lwatch. In it's default mode lwatch watches a set of
> (log) files. It understands how log files are rotated. When a log file
> reaches a modification threshold it's newly appended data is prepared
> for transport to the audit server. When the log file is rotated,
> created, renamed, etc. the appropriate actions take place.
> 
> lwatch maintains it's persistent state in a sqlite database. You can
> start and stop lwatch and it self synchronizes based on what is in the
> database and what it finds in the file system.
> 
> lwatch is also capable of listing every log under a directory root, this
> can be handy for initializing the set of log files to watch.
> 
> lwatch can also dump in a pretty format the current state of the SQL
> database.
> 
> lwatch is not completely finished, but it's been in reasonable working
> shape for a while now, but sitting in my own git repository, it's time
> to get into the project's repository. What it needs next is better
> configuration options and to link in the code for audit server
> transmission (coming soon).
> 
> Just one note about the patch, it includes a trivial one line fix to
> dhash.h which was missing a const declaration. I realize that should
> have been in a separate patch, but it got included here.
> 
> 
> 
> 
> ___
> Freeipa-devel mailing list
> Freeipa-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/freeipa-devel

Nack. The following things must be corrected before this patch will be
accepted.

General:
The log file /tmp/lwatch.log is not usable by two users simultaneously.
Each user should have his/her own log file or there should be a
mechanism in place to log safely to the common log.
There is a similar problem with the sqlite database. The first user to
execute lwatch gains permissions on the database. No other users have
write access after that (except root).
DEBUG messages should be controlled by a commandline parameter.

/lwatch/configure.ac:
You need to have a configure test for sqlite3.

/lwatch/src/Makefile.am:
Don't use $(srcdir) as a path to built binaries. This doesn't support
parallel build trees. For linking dhash and path_utils, it needs to be
relative to $(builddir).
Creation of /var/lib/lwatch needs to be part of 'make install' (e.g.
create an install-exec-hook rule)

/lwatch/src/lwatch.c:
The return values from file_event_string should be localized.
log_msg() should be a macro. It's very ugly reading __FUNCTION__ all
over the code.
In process_file_rename() and many other places, you are using strncpy
unsafely. If the source string is equal or longer than the destination
string, strncpy() does not null-terminate. As a result, reading from it
could wander off the page. It's always wise to do strncpy(dest, orig,
sizeof(dest)-1);
dest[sizeof(dest)-1] = '\0';
lwatch_event_string() is using snprintf() unsafely. If the string (after
substitution) exceeds "buf_end-p", the function will only write
buf_end-p values, but it will return the TOTAL size that would have been
written if it had not been truncated. You need to test for that.



Things that aren't necessarily worth holding up the commit for:

I'd like to recommend that you switch to using tevent for processing
your mainloop and signal handlers. It's much safer for the latter than
just catching signals, as it will just queue the handler processing for
the next loop, instead of breaking execution.

compare_stat_info() makes an assumption that if the filesize is larger,
then it's been appended to. It's possible that a larger file has been
moved into place atop the older file. Perhaps you can check whether the
inode number has changed?


I have so far only reviewed lwatch.c. I will review other components
over the next day or two, but I figured I'd send back these comments first.
- -- 
Stephen Gallagher
RHCE 804006346421761

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iEYEARECAAYFAkpnYuEACgkQeiVVYja6o6OOEACeIW5DTfraYeShxwMeRVb81Z63
hPwAnjr6jYd9/nMAnZ6qkN7SeTlkS6qG
=qL61
-END PGP SIGNATURE-

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel