On 09/11/2013 08:28 AM, Jan Pazdziora wrote:
On Wed, Sep 04, 2013 at 09:41:15AM +0200, Silvio Moioli wrote:
We recently stumbled upon some failing tests downstream and we were
wondering why pltcl was chosen for the new logging schema.
I decided it was the optimal solution for the task: faster than
temporary tables, less invasive than custom_variable_classes.
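
For illustration, the mechanism boils down to something like this (a
minimal sketch; the function bodies and the setter's name are
illustrative, not the exact definitions from the logging schema):

    create or replace function logging.set_log_id(numeric) returns void as $$
        # one Tcl global, shared by all pltcl functions in this session
        global the_log_id
        set the_log_id $1
    $$ language pltcl;

    create or replace function logging._get_log_id() returns numeric as $$
        global the_log_id
        if {![info exists the_log_id]} {
            elog ERROR "log_id is not set in this database session"
        }
        return $the_log_id
    $$ language pltcl;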

Problems arise because:
  - pltcl global variables are used [1];
That's the goal.

  - Postgres allocates a new Tcl interpreter for each database connection
[2];
  - thus, each connection will have its independent set of Tcl global
variables;
That's the hope. We use it for logging and wouldn't want the sessions
to influence each other.
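
That isolation can be seen directly from psql (a sketch, using the
illustrative functions above):

    -- session A
    select logging.set_log_id(42);
    select logging._get_log_id();   -- 42, for as long as this connection lives

    -- session B, a different connection and hence a fresh Tcl interpreter
    select logging._get_log_id();   -- ERROR: log_id is not set ...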

  - Hibernate will by default release a database connection after a
transaction has ended [3];
Which, for typical web application request handling, is what you want,
isn't it? The whole request processing is one transaction, and
releasing the connection at the end gives you nice cleanup.

Of course, you could change the mode to ON_CLOSE if you really
wanted.
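
For reference, that is the hibernate.connection.release_mode setting;
a sketch for the Hibernate 3.x/4.x era:

    # keep the JDBC connection until Session.close() instead of
    # releasing it after each transaction
    hibernate.connection.release_mode=on_close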

Normally the case, yes, but there are at least a couple of instances
where we commit a single request as multiple transactions to avoid
overrunning Oracle's rollback buffer. One example that jumps to mind is
generating yum repodata for channels. I'm not sure whether that's
relevant to this discussion, I just saw that comment and thought I'd
chime in.

-Stephen

  - thus, any database operation occurring after, for example, a call to
ConnectionManager.commitTransaction(), could get a different connection
from the one used before the COMMIT;
Does it happen in our code base? We tried to catch and fix all such
occurrences.

  - possibly, this different connection could be completely new - in that
case no global variables are set;
  - the above might also not happen if, in lucky cases, the same
connection is reused, but this is (at least) Hibernate, c3p0, context,
JVM and load dependent.

Actually, there are not a lot of places in the Spacewalk codebase where
database operations are carried out after a COMMIT; nevertheless, in
those cases trouble can happen. For example, we found out that DELETE
queries on web_contact will fail on a new connection, because
web_contact_log_trig calls logging.get_log_id(), which in turn calls
logging._get_log_id(), which raises an exception because the global
variable the_log_id has not been initialized.
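
A minimal reproduction from psql looks like this (a sketch; the error
text is illustrative):

    -- on a brand-new connection, before any auth info has been set:
    delete from web_contact where id = 1;
    -- ERROR raised from logging._get_log_id(), reached via
    -- web_contact_log_trig, because the Tcl global the_log_id does
    -- not exist in this session's interpreter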

In our case, this after-commit user deletion happens in
BaseTestCaseWithUser.tearDown() ->  OrgFactory.deleteOrg() ->
rhn_org.delete_org() -> rhn_org.delete_user() which is called because
ChannelManagerTest.testListCompatibleBaseChannels() commits in the middle.
This is in tests, right? They need to be fixed to properly initialize
the auth info in the logging infrastructure.

Questions:
  - is it really necessary to track variables per _database connection_
in the logging package?
Is there any other way to get the id of the authenticated user or of
the logging event in, say, an AFTER DELETE TRIGGER?

Since connections should be handled in a completely transparent way by
Hibernate sessions, I believe this should not happen;
Our hope, supported by testing, is that even Hibernate is not so bold
as to run part of its session on one database session and the rest on
another. In other words, underneath the Hibernate session there is
still an RDBMS with its own notion of sessions, and Hibernate respects
that.

  - is it really necessary to add the pltcl language and use globals?
No to the first part (we could use some other way), "I did not
find a better way" to the second part.

Wouldn't a simple database sequence or even plain table suffice?
The problem I faced with database sequences was that I did not find an
easy way to invalidate currval. With the existing logging
infrastructure, we clear the log_id after we are done with it, so if
the database session gets reused, the application logic needs to
reinitialize it or things will fail (basically, what you see in the
failing tests). That is a feature -- we do not want the log_id from a
previous HTTP request, one that happened to be handled on the same
database session, to leak into subsequent operations.
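
Concretely (a sketch, with an illustrative sequence name):

    select nextval('log_id_seq');   -- defines currval for this session
    select currval('log_id_seq');   -- keeps returning that value for the
                                    -- rest of the session; PostgreSQL has
                                    -- no way to make it undefined again,
                                    -- so a reused connection would silently
                                    -- see the previous request's id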

As for plain tables -- I don't think it would work. How would you know
in that AFTER DELETE TRIGGER which record in the table is yours? And
if you did not COMMIT and depended on there being only one record (the
one that your session inserted), how would it work if the application
decided to ROLLBACK? And if the application COMMITed by mistake, the
assumption of there only being one record would immediately break.
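
Sketched with a hypothetical one-row-per-session table, both problems
look like this:

    -- (a) which row is ours? only a session key such as pg_backend_pid()
    --     could tell (logging_current is a hypothetical table):
    select log_id from logging_current where backend_pid = pg_backend_pid();

    -- (b) and transaction boundaries break it anyway:
    begin;
    insert into logging_current values (pg_backend_pid(), 123);
    rollback;   -- the row is gone for the rest of the request, while an
                -- accidental COMMIT would instead leave a stale row behind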

I see the following possible solutions:
  - adopting another way to keep the log counter, possibly at transaction
or whole database level;
It's not about the log counter. There is a sequence there alright.

It's about being able to access, in a reliable way, a value which was
set in the same database session by one of the previous database
operations, and being able to invalidate that information.

  - calling LoggingFactory.clearLogId() after COMMITs and ROLLBACKs, or
I hope we are doing it in the Tomcat context at least.

anyway be sure that it is called before any operation that could involve
logging;
Before the first operation that could involve logging, setLogAuth is
actually more important. ;-)
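
Put together, the intended lifecycle of one database session is roughly
this (a sketch; the SQL-level names are assumed counterparts of the
Java-side LoggingFactory API):

    -- before the first operation that could fire a logging trigger:
    select logging.set_log_auth(...);   -- counterpart of setLogAuth()
    -- ... the request's transaction(s) run; triggers read the log id ...
    commit;
    select logging.clear_log_id();      -- counterpart of clearLogId(), so a
                                        -- reused session cannot leak the id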

  - patching pltcl stored procedures so that they never assume a global
has been initialized.
The fact that the global variables are not initialized means that the
user of the database (the application) broke the contract, which states
that it should provide the auth logging information before doing
anything with the database session, because *any* database table can
have logging enabled and need that information.
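
For completeness, the patched variant suggested above would look
roughly like this (a sketch; as argued, it would mask the broken
contract rather than fix it):

    create or replace function logging._get_log_id() returns numeric as $$
        global the_log_id
        if {![info exists the_log_id]} {
            # swallow the missing initialization instead of failing
            return_null
        }
        return $the_log_id
    $$ language pltcl;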

