Re: [Gluster-devel] Modifying gluster's logging mechanism

2019-11-26 Thread Amar Tumballi
Hi Barak,

My replies inline.

On Thu, Nov 21, 2019 at 6:34 PM Barak Sason Rofman 
wrote:

> Hello Gluster community,
>
> My name is Barak and I’ve joined RH gluster development in August.
> Shortly after my arrival, I’ve identified a potential problem with
> gluster’s logging mechanism and I’d like to bring the matter up for
> discussion.
>
> The general concept of the current mechanism is that every worker thread
> that needs to log a message has to contend for a mutex which guards the log
> file, write the message, flush the data and then release the mutex.
> I see two design / implementation problems with that mechanism:
>
>    1. The mutex that guards the log file is likely under constant contention.
>    2. Each worker thread performs the IO itself, thus slowing its "real" work.
>
>
>
Both of the above points are true, and they can have an impact when there is
a lot of logging. While some of us would say we knew the impact, we had not
picked this up as a priority item to fix, for the reasons below.

* First of all, when we looked at logging very early in the project's life,
our idea was based mostly on kernel logs (/var/log/messages). We decided
that, since a file system is very active with I/O and should run for years
without failing, there should be NO log messages when the system is healthy,
which should be 99%+ of the time.

* Now, if there are no logs when everything is healthy, and most things are
healthy 99% of the time, the focus was naturally not the 'performance' of
the logging infra, but its correctness. This is where the strict ordering
through locks, which preserves the timestamps of logs and keeps them
organized, came from.


> Initial tests, done by *removing logging from the regression testing,
> show an improvement of about 20% in run time*. This indicates we’re
> taking a pretty heavy performance hit just because of the logging activity.
>
>
That is an interesting observation. For this alone, can we have an option to
disable all logging during regression? That would speed things up for normal
runs immediately.


> In addition to these problems, the logging module is due for an upgrade:
>
>    1. There are dozens of APIs in the logger, many of them deprecated -
>    this makes it very hard for new developers to keep evolving the project.
>    2. One of the key points for Gluster-X, presented in October at
>    Bangalore, is the switch to structured logging all across gluster.
>
+1


> Given these points, I believe we’re in a position that allows us to
> upgrade the logging mechanism by both switching to structured logging
> across the project AND replacing the logging system itself, thus “killing
> two birds with one stone”.
>
> Moreover, if the upgrade is successful, the new logger mechanism might be
> adopted by other teams in Red Hat, which would lead to uniform logging
> activity across different products.
>
>
This, in my opinion, is a good reason to undertake this activity, mainly
because our logging infra should be similar to that of other tools, so that
one doesn't face a learning curve just to understand gluster's logging.


Re: [Gluster-devel] Modifying gluster's logging mechanism

2019-11-22 Thread Yaniv Kaul
On Fri, Nov 22, 2019 at 11:45 AM Barak Sason Rofman 
wrote:

> Thank you for your input Atin and Xie Changlong.
>
> Regarding log ordering - my initial thought was to do it offline using a
> dedicated tool. It should be straightforward, as the logs have a timestamp
> composed of seconds and microseconds, so ordering them by this value is
> definitely possible.
> This is actually one of the main reasons I wanted to bring this up for
> discussion - will it be fine with the community to run a dedicated tool to
> reorder the logs offline?
> Reordering the logs offline will allow us to gain the most performance
> improvement, as keeping the logs ordered online will have some cost (probably
> through stronger synchronization).
> Moreover, we can take log viewing one step further and maybe create some
> GUI system (JAVA based?) to view and handle logs (e.g. one window to show
> the combined ordered logs, other windows to show logs per thread, etc.).
>

If you need an external tool (please not Java - let's not add another
language to the project), you might as well move to binary logging.
I believe we need to be able to do this sorting using standard Linux
command-line tools only.

>
> Regarding the test method - my initial testing was done by removing all
> logging from regression. Modifying the method "skip_logging" to return
> 'true' in all cases seems to remove most of the logs (though not all; "to
> be on the safe side", really removing all logging-related methods is
> probably even better).
>

This is not a fair comparison:
1. The regression tests run with debug logging enabled.
2. Not logging at all != replacing the logging framework - the new one will
have its own overhead as well.
3. I believe a lot of the code there does not reflect real-life scenarios -
for example, dumping a lot of data, which causes a lot of calls to
inode_find() and friends.
Y.

___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Modifying gluster's logging mechanism

2019-11-22 Thread Xie Changlong


On 2019/11/22 17:43, Barak Sason Rofman wrote:

Thank you for your input Atin and Xie Changlong.

Regarding log ordering - my initial thought was to do it offline using
a dedicated tool. It should be straightforward, as the logs have a
timestamp composed of seconds and microseconds, so ordering them using
this value is definitely possible.
This is actually one of the main reasons I wanted to bring this up for
discussion - would the community be fine with running a dedicated
tool to reorder the logs offline?
Reordering the logs offline will allow us to gain the most performance
improvement, as keeping the logs ordered online will have some cost
(probably through stronger synchronization).
Moreover, we can take log viewing one step further and maybe create
some GUI system (JAVA based?) to view and handle logs (e.g. one window
to show the combined ordered logs, other windows to show logs per
thread, etc.).


Regarding the test method - my initial testing was done by removing
all logging from regression. Modifying the method "skip_logging" to
return 'true' in all cases seems to remove most of the logs (though
not all; "to be on the safe side", really removing all logging-related
methods is probably even better).



Thanks for sharing, I'll go back to your method and do some perf tests :)







Re: [Gluster-devel] Modifying gluster's logging mechanism

2019-11-22 Thread Barak Sason Rofman
Thank you for your input Atin and Xie Changlong.

Regarding log ordering - my initial thought was to do it offline using a
dedicated tool. It should be straightforward, as the logs have a timestamp
composed of seconds and microseconds, so ordering them using this value is
definitely possible.
This is actually one of the main reasons I wanted to bring this up for
discussion - would the community be fine with running a dedicated tool to
reorder the logs offline?
Reordering the logs offline will allow us to gain the most performance
improvement, as keeping the logs ordered online will have some cost (probably
through stronger synchronization).
Moreover, we can take log viewing one step further and maybe create some
GUI system (JAVA based?) to view and handle logs (e.g. one window to show
the combined ordered logs, other windows to show logs per thread, etc.).
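The offline reordering step could be sketched roughly as below. This is a
minimal illustration, not part of the Lockless_Logger project: the
`log_entry` layout is hypothetical, and a real tool would first parse the
seconds/microseconds out of each log line before sorting.

```c
/* Sketch of offline log reordering: sort collected entries by their
 * (seconds, microseconds) timestamp. The entry layout is hypothetical. */
#include <stdlib.h>

struct log_entry {
    long sec;        /* seconds part of the timestamp */
    long usec;       /* microseconds part */
    const char *msg; /* the rest of the log line */
};

static int cmp_entry(const void *a, const void *b)
{
    const struct log_entry *x = a, *y = b;
    if (x->sec != y->sec)
        return (x->sec > y->sec) - (x->sec < y->sec);
    return (x->usec > y->usec) - (x->usec < y->usec);
}

/* Restore global order across entries drained from different buffers. */
void reorder_log(struct log_entry *entries, size_t n)
{
    qsort(entries, n, sizeof(*entries), cmp_entry);
}
```

Since the timestamps are totally ordered, a stable global order falls out of
a plain sort, regardless of which buffer each message was drained from.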

Regarding the test method - my initial testing was done by removing all
logging from regression. Modifying the method "skip_logging" to return
'true' in all cases seems to remove most of the logs (though not all; "to
be on the safe side", really removing all logging-related methods is
probably even better).
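The shape of that experiment can be sketched as below. Note this is purely
illustrative: "skip_logging" is the method named above, but the signature
and the log_message() caller here are invented for the sketch and do not
match the actual gluster code.

```c
/* Hypothetical sketch of the experiment: force the logger's skip check
 * to drop every message before any formatting or locking happens. */
static int skip_logging(const char *domain, int msg_level, int file_level)
{
    (void)domain;
    (void)msg_level;
    (void)file_level;
    return 1; /* always skip, as in the regression experiment */
}

/* An illustrative caller bails out before doing any I/O work. */
int log_message(const char *domain, int msg_level, int file_level,
                const char *msg, int *written)
{
    (void)msg;
    if (skip_logging(domain, msg_level, file_level))
        return 0;  /* message dropped, no I/O performed */
    *written = 1;  /* placeholder for the real write path */
    return 1;
}
```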
As regression tests are mostly single-node tests, some additional testing
was needed. I've written a couple of very basic scripts to create a large
number of files / big files, read / write to / from them, move them around
and perform some other basic functionality.
I'd actually be glad to test this in some 'real world' use cases - if you
have specific use cases that you use frequently, we can model them and
benchmark against them - this will likely offer an even more accurate benchmark.


-- 
*Barak Sason Rofman*

Gluster Storage Development

Red Hat Israel 

34 Jerusalem rd. Ra'anana, 43501

bsaso...@redhat.com T: *+972-9-7692304*
M: *+972-52-4326355*




Re: [Gluster-devel] Modifying gluster's logging mechanism

2019-11-21 Thread Xie Changlong


On 2019/11/21 21:04, Barak Sason Rofman wrote:


I see two design / implementation problems with that mechanism:

1. The mutex that guards the log file is likely under constant
contention.

2. Each worker thread performs the IO itself, thus slowing
its "real" work.


Initial tests, done by *removing logging from the regression testing, 
show an improvement of about 20% in run time*. This indicates we’re
taking a pretty heavy performance hit just because of the logging 
activity.



Hi Barak Sason Rofman. Amazing perf improvement! Could you show me the
detailed test method?


Thanks

    -Xie


In addition to these problems, the logging module is due for an upgrade:

1. There are dozens of APIs in the logger, many of them
deprecated - this makes it very hard for new developers to keep
evolving the project.

2. One of the key points for Gluster-X, presented in October at
Bangalore, is the switch to structured logging all across gluster.




Re: [Gluster-devel] Modifying gluster's logging mechanism

2019-11-21 Thread Atin Mukherjee
This is definitely a good start. In fact, the experiment you have done, which
indicates a 20% runtime perf improvement without the logger, puts this work
in the 'worth a try' category for sure. The only thing we need to be mindful
of here is the ordering of the logs, provided either through a tool or by the
logger itself taking care of it.

On Thu, 21 Nov 2019 at 18:34, Barak Sason Rofman 
wrote:


[Gluster-devel] Modifying gluster's logging mechanism

2019-11-21 Thread Barak Sason Rofman
Hello Gluster community,

My name is Barak and I’ve joined RH gluster development in August.
Shortly after my arrival, I’ve identified a potential problem with
gluster’s logging mechanism and I’d like to bring the matter up for
discussion.

The general concept of the current mechanism is that every worker thread
that needs to log a message has to contend for a mutex which guards the log
file, write the message, flush the data and then release the mutex.
I see two design / implementation problems with that mechanism:

   1. The mutex that guards the log file is likely under constant contention.

   2. Each worker thread performs the IO itself, thus slowing its "real" work.
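The current scheme, as described above, boils down to something like the
following minimal C sketch (names are illustrative, not the actual gluster
logging code):

```c
/* Sketch of the current mechanism: every writer thread takes a global
 * mutex, writes its message, flushes, and releases the mutex. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

void log_message(FILE *logfile, const char *msg)
{
    pthread_mutex_lock(&log_lock);   /* all writers contend here */
    fprintf(logfile, "%s\n", msg);
    fflush(logfile);                 /* each writer pays for its own flush */
    pthread_mutex_unlock(&log_lock);
}
```

Both problems are visible in the sketch: the single `log_lock` serializes
every logging thread, and the `fprintf`/`fflush` I/O happens on the worker
thread's own time.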


Initial tests, done by *removing logging from the regression testing, show
an improvement of about 20% in run time*. This indicates we’re taking a
pretty heavy performance hit just because of the logging activity.

In addition to these problems, the logging module is due for an upgrade:

   1. There are dozens of APIs in the logger, many of them deprecated -
   this makes it very hard for new developers to keep evolving the project.

   2. One of the key points for Gluster-X, presented in October at Bangalore,
   is the switch to structured logging all across gluster.


Given these points, I believe we’re in a position that allows us to upgrade
the logging mechanism by both switching to structured logging across the
project AND replacing the logging system itself, thus “killing two birds
with one stone”.

Moreover, if the upgrade is successful, the new logger mechanism might be
adopted by other teams in Red Hat, which would lead to uniform logging
activity across different products.

I’d like to propose a logging utility I’ve been working on for the past few
weeks.
This project is still a work in progress (and much work still needs to be
done on it), but I'd like to bring this matter up now so that, if the
community wants to advance on that front, we can collaborate and shape the
logger to best suit the community's needs.

An overview of the system:

The logger provides several (number and size are user-defined)
pre-allocated buffers which threads can 'register' to and receive a private
buffer. In addition, a single, shared buffer is also pre-allocated (size is
user-defined). The number of buffers and their size is modifiable at
runtime (not yet implemented).

Worker threads write messages in one of 3 ways that will be described next,
and an internal logger thread constantly iterates over the existing buffers
and drains the data to the log file.

As all allocations are done at the initialization stage, no special
treatment is needed for "out of memory" cases.
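The pre-allocation and registration idea could look roughly like the sketch
below. This is not the Lockless_Logger code; the struct names and sizes are
invented for illustration, and a real implementation would synchronize the
registration scan (this sketch is single-threaded for clarity).

```c
/* Sketch of the buffer pool: a fixed number of pre-allocated private
 * buffers that worker threads claim once at registration time. */
#include <stdbool.h>
#include <stdlib.h>

struct ring_buf {
    char *data;
    size_t size;
    size_t head, tail;
    bool taken;
};

struct logger {
    struct ring_buf *bufs;
    size_t nbufs;
};

/* All allocation happens once, up front - no OOM handling needed later. */
struct logger *logger_init(size_t nbufs, size_t bufsize)
{
    struct logger *lg = calloc(1, sizeof(*lg));
    if (!lg)
        return NULL;
    lg->bufs = calloc(nbufs, sizeof(*lg->bufs));
    if (!lg->bufs) {
        free(lg);
        return NULL;
    }
    lg->nbufs = nbufs;
    for (size_t i = 0; i < nbufs; i++) {
        lg->bufs[i].data = malloc(bufsize);
        lg->bufs[i].size = bufsize;
    }
    return lg;
}

/* A worker 'registers' and receives a private buffer, or NULL when the
 * pool is exhausted (it then falls back to the shared/direct paths). */
struct ring_buf *logger_register(struct logger *lg)
{
    for (size_t i = 0; i < lg->nbufs; i++) {
        if (!lg->bufs[i].taken) {
            lg->bufs[i].taken = true;
            return &lg->bufs[i];
        }
    }
    return NULL;
}
```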

The following writing levels exist:

   1. Level 1 - Lockless writing: Lockless writing is achieved by assigning
   each thread a private ring buffer. A worker thread writes to that buffer
   and the logger thread drains that buffer into the log file.

In case the private ring buffer is full and not yet drained, or in case the
worker thread has not registered for a private buffer, we fall back to the
following writing methods:

   2. Level 2 - Shared buffer writing: The worker thread will write its data
   into a buffer that's shared across all threads. This is done in a
   synchronized manner.

In case the private ring buffer is full and not yet drained AND the shared
ring buffer is full and not yet drained, or in case the worker thread has
not registered for a private buffer, we fall back to the last writing
method:

   3. Level 3 - Direct write: This is the slowest form of writing - the
   worker thread writes directly to the log file.

The idea behind this utility is to reduce as much as possible the impact of
logging on runtime. Part of this reduction comes at the cost of having to
parse and reorganize the messages in the log files using a dedicated tool
(yet to be implemented) as there is no guarantee on the order of logged
messages.
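The three-level fallback described above could be sketched as follows. This
is an illustration only, not the Lockless_Logger code: buffer internals are
reduced to a free-space counter, and the function names are invented.

```c
/* Sketch of the three-level write path: try the private ring buffer,
 * then the shared buffer under a mutex, then write to the file directly. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

enum write_level { WROTE_PRIVATE = 1, WROTE_SHARED = 2, WROTE_DIRECT = 3 };

struct buf { size_t free_bytes; };

static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;

static int try_put(struct buf *b, size_t len)
{
    if (!b || b->free_bytes < len)
        return 0;
    b->free_bytes -= len;  /* a real ring buffer would copy the bytes */
    return 1;
}

int log_write(struct buf *private_buf, struct buf *shared_buf,
              FILE *logfile, const char *msg)
{
    size_t len = strlen(msg) + 1;

    /* Level 1: lockless write into the thread's private ring buffer. */
    if (try_put(private_buf, len))
        return WROTE_PRIVATE;

    /* Level 2: synchronized write into the shared buffer. */
    pthread_mutex_lock(&shared_lock);
    int ok = try_put(shared_buf, len);
    pthread_mutex_unlock(&shared_lock);
    if (ok)
        return WROTE_SHARED;

    /* Level 3: slowest path - write directly to the log file. */
    fprintf(logfile, "%s\n", msg);
    fflush(logfile);
    return WROTE_DIRECT;
}
```

In the common case a message lands in the private buffer with no lock taken
at all; only under back-pressure does a writer touch the shared lock or the
file itself.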

The full logger project is hosted on:
https://github.com/BarakSason/Lockless_Logger

For project documentation visit:
https://baraksason.github.io/Lockless_Logger/

I thank you all for reading through my suggestion and I’m looking forward
to your feedback,
-- 
*Barak Sason Rofman*

Gluster Storage Development

Red Hat Israel 

34 Jerusalem rd. Ra'anana, 43501

bsaso...@redhat.com T: *+972-9-7692304*
M: *+972-52-4326355*
