On 03.09.2012 21:09, Bert Huijben wrote:


-----Original Message-----
From: Stefan Küng [mailto:tortoise...@gmail.com]
Sent: maandag 3 september 2012 20:48
To: Subversion Development
Subject: thread safe

Hi,

It seems that the svn_config_t structure isn't thread safe, i.e. can't
be shared among multiple threads.
See here for a detailed report on what problems this causes:
http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=757&dsMessa
geId=3003152

Is this by design? I assumed that all svn APIs are thread safe for read
access at least unless noted otherwise (example: the API
svn_utf_initialize states its thread-unsafeness in the note section of
the docs).

I always assume(d) that all functions that are not documented otherwise are
thread safe, but that they also assume that the data is not passed to
multiple functions in multiple threads at once.

So mostly the same thing as what the C runtime or the .Net runtime promise:
Static functions are thread safe, but instances and functions on those are
not.

Well, for the c-runtime functions and structs/data objects it's documented that they are thread safe for read-access. Only for write access you have to write your own thread guards.

As for the svn_config_t object after initializing it (i.e., reading the config from the config file and the registry) I assume that it isn't written to anymore. So I assumed that read-only access from multiple threads wouldn't be a problem.


In my testing via SharpSvn (and the bug reports over the years) this mostly
proved safe. (There were a few issues fixed in early 1.6 development).


In SharpSvn I keep a config instance alive per SvnClient. And in AnkhSVN I
keep a small pool of recently used SvnClient instances at hand to avoid
re-reading the config files over and over again.

Do you have a fixed number of threads that you use?
Because in TSVN we use a variable number of threads, depending on how many processors/cores are available. we'd have to implement a somewhat complicated pool of config objects to re-use.


This keeps svn_client_context_t and everything below thread safe as an
SvnClient is documented to be only usable in one thread at a time.

While we can work around that thread problem a little bit, that
workaround has its own big problems:
* each thread would need its own svn_config_t structure
    --> for each thread the config file is read (open, read, close)
    which results in multiple unnecessary disk accesses
    --> when the config file is read multiple times, sometimes the read
    fails (at least on Windows) because the file is opened already -
    usually because virus scanners open and scan every file that a
    process accesses, even when only for reading.

I think this is the proper way to fix this issue at your side: keep data
thread safe.

How are you handling the authentication cache, and other settings. I would
just keep it safe for future extension.

All objects are created fresh and initialized for each thread. Except the svn_config_t object because of the very, very expensive initialization it requires. And because of the problems with accessing the config file from multiple threads.


Joel Jirak made a patch for subversion/libsvn_subr/config.c which would
fix the problem (see the thread linked above). Would that be an
acceptable solution?

The patch on
http://tortoisesvn.tigris.org/ds/getDSMessageAttachment.do/config.c.hack-fix
.patch?dsForumId=757&dsMessageId=3003152&dsAttachmentId=3606977&dsAttachment
Mime=application/octet-stream
creates a string in a very short living thread pool, in a static function
with a few callers that all have a pool available.

With a bit of cleanup it should be possible to apply this patch with a
proper scratch sub-pool and I don't see why such a patch wouldn't be
accepted.


But I don't think that this will be the solution to make the rest of
Subversion thread safe, or to guarantee that all other memory structures
will be thread safe in the future. (But it might make a read-once config
struct reusable in multiple threads)

I don't expect all memory structures to be thread safe since most of them will be written to in the APIs.

What would also help: a deep-copy function for the svn_config_t object, something similar to e.g. svn_wc_dup_status3. That way I could avoid initializing new objects by re-reading the config file but simply copy the already available data to the new object.

Stefan

--
       ___
  oo  // \\      "De Chelonian Mobile"
 (_,\/ \_/ \     TortoiseSVN
   \ \_/_\_/>    The coolest Interface to (Sub)Version Control
   /_/   \_\     http://tortoisesvn.net

Reply via email to