Danek Duvall wrote:
On Mon, Feb 09, 2009 at 03:33:29AM -0600, Shawn Walker wrote:

http://cr.opensolaris.org/~swalker/pkg-2691/

Any chance you can cook up an auxiliary webrev that does diffs between the
old repository.py and the new depot.py?  It's pretty hard to review these
changes as-is.  I haven't reviewed it yet; if you can't or don't want to do
this, say so, and I'll figure it out myself.

Sorry about that:
http://cr.opensolaris.org/~swalker/pkg-2691/repo-depot-diff.txt

I've removed any thing I've addressed from your reply.

depot.py:
...
action/__init__.py:

  - line 106: why the change from position to position+1?

Because I didn't believe users are going to realise that the position is zero-indexed; however, I'm willing to change this back if you think I'm being too cautious.

  - line 158: perhaps "... and action.attrs.get(ka, None) is None"?  I'd
    speed test both, to see.

Was consistently a few hundred actions/sec. slower.

action/generic.py:

  - line 36: you might use cStringIO instead; it should be faster, and
    maybe more memory efficient.  Though I see at least in one place, you
    changed it from cStringIO, so perhaps you have a reason?

I did it because of the recent rumblings about not using C-based python stuff. However, I've changed this back to cStringIO.


  - line 122: why is this a problem?

It causes the process to hang if pkg.size isn't provided. Call this "slap the hand of people trying to break my stuff on purpose."

  - line 155, 167: these clauses can be pretty bad on the memory usage.

Sorry, I'm dense at the moment.  Explain a bit more?

    Where do we end up using them?  It might help to do something like

The only time this gets hit at the moment is during publishing to a remote server. The receiving server, when it rebuilds the action, hands off a request rfile object to the action class which only supports read(). I was trying to be efficient and allow objects that support seek() and tell() to get their size calculated in a better way than sucking in all the data from a file.

Technically, I think I only need lines 164-166 if the object supports "read" and I could dump the rest for our current needs.

        data = StringIO.StringIO(data.read())
        data.seek(0, 2)
        sz = data.tell()
        data.seek(0)

  - line 160: you might take a look at the suggestion in

        http://mail.python.org/pipermail/python-list/2006-March/375280.html

    to get SEEK_SET, etc.

Why do I need SEEK_SET again?

action/license.py:

  - line 108: why are you doing this?  Ask for forgiveness, etc.  I'd say

Apparently a nun hit my head instead of my hand with a ruler, so I'm not sure I follow completely, but...

I'm going to assume that you meant I should catch an ENOENT instead of checking for os.path.exists here. So changed.

    the same about the size calculations in generic.py, but I see how that
    would get very indented (I'd still suggest you do it that way, though;
    perhaps splitting clauses out into separate functions).

I don't follow the comment about size calculations, unless the number of statements within a try/except is the concern.

publish/transaction.py:

  - line 114: Why is this necessary?  Can't you just use __repo_cache?

NameError: global name '_TransactionFile__repo_cache' is not defined

  - line 130: are there any issues with multiple simultaneous publishers?
    (Not necessarily just here, but in general.)

As long as they're all using the same repository object, no. Otherwise, probably. I suspect you would encounter the same problem as you would when running multiple pkg.depotd processes and publishing to them.

I was assuming bug 2696 would address this indirectly.

  - line 386: could you be overwriting a msg that the server provided?

Shouldn't be, as the server should have returned a 4xx/5xx error triggering the previous clause. However, it could be overwriting a message from urllib2 or httplib. I've changed this to include the original value of 'msg'.

server/catalog.py:

  - line 141: os.path.basename(cmd) in ("pkg.depotd", "depot.py")?  But
    perhaps this should be triggered by some global variable the depot
    sets (environment variable, maybe?), instead of the command name?

I've added a fork_allowed parameter to SvrConfig which it passes on to server/catalog.__init__, which is then checked.

server/config.py:

  - line 54: I'd say that if you have to tell it whether or not to create a
    repo, then it isn't really "auto".  "create_repo", maybe?

Actually, it's correct. You're not telling it whether to create one, you're telling if you want it to automatically create one if one does not already exist at the specified location. Only if auto_create is true, and the repository directory is empty or does not exist, will a repository be created.

  - line 102: we get here if the makedirs() up on line 97 failed, right?

Yes.

    But wouldn't an exception have been raised at that point?  You can also

Not if the directory was somehow removed immediately after it was successfully created; though I know that's unlikely.

    do

        if os.path.exists(d):
            if self.auto_create:
                raise ...
            raise ...

So changed, except *not* os.path.exists ;-)

server/transaction.py:

  - line 117: what needs to be done in init?  And why can't it?

cfg should be assigned in __init__; I don't want to add the extra churn necessary to make it happen at this moment. There is a lot of cleanup work left to do with making server/config sane and moving more of the depot logic into depot.py.

  - line 212: previously, we had a bunch of stub code here that would deal
    with package completeness.  Shouldn't that stay?

Why? I'm not a big fan of stub code; especially stub code that was getting called and doing nothing. If you'd like, I'd be happy to open a bug with a summary of the stubs I removed.

  - line 216: what happens if this fails?  Previously we'd log it and drive
    on; now it throws an exception and the fact that the actual transaction
    succeeded is lost.

I was only considering the logging of the exception, and hadn't realised that it was allowing a Transaction success to go through. The main problem here is that I can't rely on cherrypy to be present, nor do I have any desire to reference a transport specific module in unrelated code. That and it was horribly wrong in catching all exceptions.

The best alternative I have for the moment is to catch EnvironmentError and print to stderr using misc.emsg(). We need a logging framework or some way for the modules to communicate information like this.

  - line 276: I know it was this way before, but how does this raise a
    KeyError?  I could see how line 273 could.

No idea.  Changed.

  - line 357: I'm not sure why the comment that was previously here about
    multiple packages moving to PUBLISHED at the same time was removed.

Overzealous cleaning; added back.

server/repository.py:
  - line 96: should this have a line break in here somewhere?

My opinion was that I don't know what client is using my messages, so I should be using a minimal amount of formatting and relying on the client to format messages as they see fit (think of the GUI wrapping the text to fit its box according to font size for visually impaired users).

  - line 146: again, why the test?  Just try to create it.  If it fails,
    create one from scratch.

There already; but in this case, an EnvironmentError/ENOENT won't be thrown as we're using SafeConfigParser(), so I've changed this a little bit different accordingly.

  - line 214: would "if isinstance(last_modified, basestr):" be better?

basestring, but yes.

  - line 261: probably should have a blank line before it.  But doe these
    calls belong in repository?  Or should they go in depot?  I have the
    same feeling about the search-related stuff (well, maybe indexing makes
    sense here).

I believe the inc_*() calls belong here for now until I can sort out the server/config.py mess.

I'm positive that the search related stuff should stay here as that is a set of functions that operates on repository data and I intended consumers of the repository API to have access to it.

Thanks,
--
Shawn Walker
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss

Reply via email to