On 08/12/10 01:59 PM, Shawn Walker wrote:
On 08/11/10 07:00 PM, Brock Pytlik wrote:
On 08/ 9/10 01:03 AM, Shawn Walker wrote:
Greetings,
The following webrev contains changes or fixes for the following items:
16744 repository multi-publisher on-disk format should be formalized
and implemented
[snip]
webrev:
http://cr.opensolaris.org/~swalker/pkg-repo-format/
[snip]
...
I reviewed this mostly looking at search issues and potential collisions
with the manifest signing wad that's out now.
Currently the manifest signing wad adds/removes certs to/from the repo,
I assume that after both wads land, the right answer is to have the
certs added to particular publishers and as part of adding the cert, a
publisher property will be set?
So instead of:
/usr/bin/pkgrepo add-signing-ca-cert path ...
it will be:
/usr/bin/pkgrepo add-signing-ca-cert publisher path ...
Is that correct? I don't think the -p option is the right approach here
because by default, publishers should not be sharing certificates so
making the publisher argument optional seems like it would cause
problems.
You could certainly do it that way, although I'd ask Dan or Danek
about the user interface aspect of it.
If you land before me, I'll double check with those two about requiring
the publisher in the appropriate spot.
As an aside, it'd be nice to find a less wordy name for that
subcommand. Is there a reason it needs to be 'add-signing-ca-cert'
specifically?
Suggestions are welcome. It's add because you're adding, not removing.
It's signing because it's the manifest signing cert, not the certs used
for ssl. It's a ca because it's a CA cert and not an intermediate cert.
I suppose cert could be lopped off, though at this point in the dev
cycle, I'd rather not make changes like this unless they offer clear and
substantial benefits.
modules/indexer.py:
With the locking going in, is there any reason to retain the consistent
opening logic? Are you leaving it around purely for backward
compatibility? If so, can you file a RFE to remove that functionality
once we've decided enough time has passed?
I thought the consistent opening logic was needed because migrate
wasn't an atomic action, so despite the locking, it would still be
needed for read-only consumers.
Ah, I forgot readers weren't participating in the locking.
I think the locking here is fairly complex because they keep getting
picked up and dropped. I think it would be easier to understand and
maintain if we did either what we've done in other places w/ wrappers
which are responsible for taking and dropping the locks, or keep the
existing approach but make each function take the lock only if
self.__locked is False, and then dropping the lock only if it was taken
in that function's invocation.
I don't understand the view of the locking here being complex. The
lock() method is only called in two places. The only bit of logic
here that I could possibly understand that being applied to is the
unlock() that happens during _generic_update_index() if
rebuild_index_from_scratch has to be called.
I could try to do what you describe, but there are some additional
safeguards I'd have to put in place that would actually make this more
complicated. In the case of rebuild_index_from_scratch, since the
index directory is being removed anyway, it's necessary to drop the
lock and obtain it again because of the file lock that's needed.
*shrug* Ok, I had a lot of trouble understanding when locks were taken
and dropped, and how that interacted with the rats nest of calls that
the indexer makes, but if I'm the outlier, that's fine.
Why not have the indexer use the same makedirs as misc provides? The
reason for having it raise search_errors was that it was a module shared
by both client and server, while api_errors were client only. Since this
no longer the case (server modules import misc which imports api_errors)
we might as well consolidate both the exception space and the makedirs
functions.
It was only to preserve the existing behaviour.
I think this is a good time to nuke that behavior.
modules/server/depot.py:
What's the timeout of .5 seconds on line 2026 buy?
I'll add a comment, but that's basically necessary to keep the depot
server from burning through a large amount of CPU and so that the
depot server doesn't hang on shutdown (it waits at least timeout
seconds before exiting, which if you don't specify a timeout means
forever).
I could specify a longer timeout, but that directly affects shutdown
time and the amount of time before the depot might start a queued task.
.5 secs makes sens to me in that case. I wasn't clear whether having no
timeout would cause a hang or not.
Why can't more than one background task be queued?
Because I don't believe more than one should be allowed. It's a far
easier to understand (and debug) model where only one major operation
at a time can be queued.
I think that's the wrong approach (as I said in the other thread). I'll
settle for a low priority RFE filed to improve this.
Do the background processes happen in the same process or are they
forked out by cherrypy? If it's that same process, then that concerns me
as indexing previously happened in a separate process from the depot,
which meant the memory it used was returned to the system when the
indexing was done, instead of being attached to the depot for as long as
it was running.
If the memory usage of the depot server after indexing is a concern
for the user, they can restart it, or alternatively use pkgrepo on the
filesystem path of the repository the depot's serving. I'm happy to
document that in the man pages.
It was intentional that they happen in the same process. Ultimately,
I didn't feel that the complexity or problems involved with forking
were worth it to solve a resource issue.
Ok, please include that information at least in a flag day since it
means that people deploying pkg.depotd should potentially change how
they do things. Have you discussed this w/ Eric to make sure that we'll
be making the appropriate changes?
There's no way to abort a task running in the background?
No, and that's intentional. I don't want to implement an overly
complex mechanism here. There was no way to stop a background
indexing either in the past, so this isn't any worse than what we
already had in this way.
I don't understand why you don't want to allow the user some method of
stopping a potentially very long running action short of nuking their
entire depot. I'm fine with simple mechanisms, but not when the desire
for code simplicity makes users lives more difficult. If they can't
queue up more than one background task, I think some kind of abort
really is necessary (and I'd argue even with the ability to queue up
background processes, abort would be useful).
Is the user told that the operation they just instructed the depot to do
is being done in the background and that they'll need to check the logs
for errors? Suppose a user wanted to refresh-packages and wait to do the
next step until that was finished, how would they accomplish that? Would
they loop polling stauts/0 until the operation they cared about was
marked as finished? It seems that, based on what I understand about how
indexing locking currently works, this wouldn't necessarily work because
the user could have their status/0 request processed in one of the times
between indexing dropping and picking up the lock, meaning the status
would be reported as online instead of indexing.
The same problem existed previously with indexing. Long term, we need
to figure out a better way to provide more detailed status information
to the user but that's not this wad. I'll open an RFE for it.
If you weren't dropping the lock in the middle of indexing, then you
couldn't encounter the race condition I mentioned. Previously, a user
would have started a depot with exit ready. Does that mechanism still
exist? If so, I'm fine with this being put down as an rfe. If not, I
think this needs to be addressed.
[snip]
THanks,
Brock
Thanks,
-Shawn
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss