Thanks for getting back to me so fast. The updated webrev is at:
http://cr.opensolaris.org/~bpytlik/ips-search/

Comments in line:

[EMAIL PROTECTED] wrote:
> Hey Brock,
>
> I took a look at your webrev. You've written enough new code that I'm
> probably not able to review it as thoroughly as I can with the existing
> code. It would be great if you could write up a description of your
> design, and how the different parts work. That may give some of us who
> aren't experts in search issues a chance to provide overall feedback on
> the design.
>
From a high level, there are two components to this change: the indexer,
which maintains the information needed for search, and the query engine,
which actually performs a search over that information.

The indexer is responsible for creating and updating the indexes and for
ensuring they're always in a consistent state. It does this by maintaining
a set of inverted indexes as text files (details of which can be found in
the comments at the top of indexer.py). On the server side, it's hooked
into the publishing code so that the index is updated each time a package
is published. If indexing is already happening when packages are
published, they're queued and another update to the indexes happens once
the current run is finished. On the client side, it's hooked into the
install, image-update, and uninstall code so that each of those actions is
reflected in the index.

The query engine is responsible for processing the text from the user,
searching for that token in its information, and giving the client code
the information needed for a reasonable response to the user. It must
ensure that the information it uses is in a consistent state. On the
server, an engine is created during server initialization. It reads in the
files it needs and stores the data internally (this is why the server's
RAM footprint increased). When the server gets a search request from a
client, it hands the search token to the query engine.
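As a concrete (and much simplified) illustration of the shape described above, here is a toy query engine that loads a one-entry-per-line inverted index text file into memory and answers token queries from it, rereading the file only when it changes on disk. The file format, the class name, and the mtime-based staleness check are all illustrative assumptions on my part, not the actual indexer.py or query engine code:

```python
import os
import threading

class QueryEngine(object):
    """Toy engine: token -> set of package names, loaded from a text
    file with one "token pkg1 pkg2 ..." entry per line."""

    def __init__(self, index_path):
        self.index_path = index_path
        self.lock = threading.Lock()
        self.mtime = None      # mtime of the copy we hold in memory
        self.index = {}

    def _load(self):
        index = {}
        with open(self.index_path) as f:
            for line in f:
                parts = line.split()
                if parts:
                    index[parts[0]] = set(parts[1:])
        return index

    def _refresh_if_stale(self):
        # Reread only when the on-disk index is newer than what we hold.
        on_disk = os.stat(self.index_path).st_mtime
        if on_disk != self.mtime:
            with self.lock:
                # Recheck under the lock; another thread may have
                # already reread the file.
                if os.stat(self.index_path).st_mtime != self.mtime:
                    self.index = self._load()
                    self.mtime = on_disk

    def search(self, token):
        self._refresh_if_stale()
        return self.index.get(token, set())
```

The client-side analogue would simply call _load() on every invocation, since each pkg run is a fresh process with nothing cached.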
The query engine ensures that it has the most recent information (locking
and rereading the files from disk if necessary) and then searches for the
token in its dictionaries. On the client, the process is the same except
that the indexes are read from disk on each search instead of being held
in memory (since a new instance of pkg is run for each search).

Was that the kind of information you were looking for?

>> Here's the webrev for the new search backend:
>> http://cr.opensolaris.org/~bpytlik/ips-search/
>>
>
> Here's some specific feedback that I've come up with so far:
>
> client.py:
>
> - line 399-416, 500-517, 1432-1451: You've duplicated a lot of the
>   error handling code. Would it make sense to commonize this in the
>   top-level exception handlers that we use to wrap main_func()?
>
The exception handling in 399-416 and 500-517 has to be there. It
critically prevents the exception from reaching the handlers on lines 417
and 518, which conclude that the installation has failed and restore the
boot environment. Indexing used to be the last thing the client did
before returning, which is why the boot environment should remain the new
image: simply reindexing is much faster than redownloading all the
packages. I've changed imageplan so that even if the indexing fails, the
pkg_plans still go through their post-execution steps, so the image
should, in my opinion, still be considered bootable and consistent.

1432-1451 could be pulled out to the top level, but it wouldn't be a case
of making code common, merely relocating it (and catching the same
exceptions at two different levels might well encourage the accidental
removal of the lower-level handlers). I have removed my catch of runtime
errors and moved it to the top level so that it's handled consistently
with other runtime errors.

> Meta-question here too: Some of these errors instruct the user to
> remove directories and then run, or re-run commands.
> Would it make more sense to:
>
> a) Provide a command that repairs the indexes without any further
>    user intervention
>
Yes, I'll make this fix.

> b) If yes to a, then can we just run the repair when we encounter an
>    error instead of returning a message?
>
I think notification is the right way to go here. The permissions error
couldn't be handled internally, as we'd need to reindex with sufficient
permissions to change the directory. The other two errors can only happen
if either indexing crashed during a previous run or indexing is happening
at the same time. For example, imagine someone starts rebuilding the index
for an image in one shell and then manipulates that image in another
shell. I don't have a good way of distinguishing a run that's still
happening from one that broke. I could simply remove the directory
(possibly out from under the running indexer) and start indexing again,
but that makes me feel a bit queasy. If that's the preferred solution,
I'll implement it.

> depot.py:
>
> - line 150: Can we use a lockfile instead of a scary warning here?
>
I've changed the comment slightly to make it clearer that this argument
should never be used by anyone other than the depot when it's reexecing
itself.

> pkg.1.txt:
>
> - I would omit "should be generally unnecessary, but"
>
I've reworded things a bit. I just want to make it clear in the man page
that it's not necessary to run rebuild-index unless something has broken.

> catalog.py:
>
> - line 43: The catalog code is shared between the client and the
>   server. Adding this import means that the client now depends upon
>   the server for correct functionality. I don't think we want to do
>   this.
>
> - line 147: This suggests, perhaps, that we want a subclass of
>   Catalog for the server?
>
I agree; I've moved the server-side search code out to a subclass. I won't
do a full refactoring of catalog in this change, as I think it would just
confuse things more.
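On the question above of telling a run that's still happening from one that broke: one conventional POSIX approach (offered purely as a sketch of an alternative, not something the webrev does) is a pidfile check, where the indexer records its pid when it starts and a later run tests whether that process still exists. The file name and helper are hypothetical:

```python
import os

def indexer_running(pidfile):
    """Return True if the pid recorded in pidfile still exists."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (IOError, ValueError):
        # No pidfile (or garbage in it): no run in progress.
        return False
    try:
        os.kill(pid, 0)   # signal 0: existence check only, no signal sent
        return True
    except OSError:
        # Stale pidfile left behind by a crashed run.
        return False
```

Pid reuse (and EPERM when the pid belongs to another user) makes this imperfect, so it wouldn't fully remove the ambiguity described above; it only narrows it.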
> - line 208 & 209: Is there a reason why the tempfile needs to have the
>   same name for every invocation? Would it be better to use
>   tempfile.mkstemp()?
>
No, there's no reason. If tempfile.mkstemp is the preferred way of doing
things, I'll switch to that. My only argument for the other method is that
the file is then in a known location when something breaks, so it might be
easier to get debugging info from that file rather than trying to figure
out which file in /tmp is the relevant one. But, again, it's not something
I feel strongly about.

> - line 220: Shouldn't we drop the lock here?
>
I don't think so. If we dropped it here, two threads could both acquire
writing file handles to the same file, or, if we moved to
tempfile.mkstemp, one could blow away the changes of the other. It's for
the same reason that I hold the lock past the point where
Catalog.save_pkg_names is called: it would be bad to have two threads both
dumping their pickled content to the same file. Perhaps there's something
I'm not seeing here; if what I've said above doesn't make sense, let me
know and we can talk through it some more.

> - line 249: The hold time on this lock looks long. Who's contending
>   for this?
>
Contention on this lock scales linearly with the number of publishers
publishing simultaneously to the depot. I didn't think this would be a
problem, but if it is, I can take a look at redesigning it.

> - line 349: Who releases searchdb_lock?
>
searchdb_lock is released in child_handler, roughly at line 471.

> More importantly, what's the lock ordering for the update lock
> and searchdb lock? This should be documented. You should also
> document what each lock protects, preferably in the constructor, or
> somewhere conspicuous.
>
After looking carefully at the code, I believe the searchdb lock is
redundant given the other code that's in place. For now I've removed it,
but I could be convinced to put it back if it would help make the code
less brittle.
As designed now, the code never tries to acquire that lock except in
exactly those situations where it can get the lock immediately, which
suggests there's no need for the lock now. As long as any future code that
tries to index does the same check at the top of refresh_index, there
won't be any problems. I've commented this to make it clearer, but I'd be
open to the idea that a symbolic/failsafe lock might be worthwhile.

> - line 376: Shouldn't we drop the update and searchdb locks here?
>
Yes.

> - line 377: Why don't we drop the updatelock for non-posix systems?
>
We should. I've moved the location of the lock release to make sure the
updatelock is released.

> indexer.py:
>
> - line 105: It seems like it would make more sense for the Indexer to
>   be able to determine its max memory use when it's created. You've
>   duplicated the max memory calculation in image and imageplan. I
>   would either have __init__ figure this out, or pass __init__ a
>   function that it can call to figure this out.
>
Good point; I've moved the code around a bit.

> - line 658 / 669: Why do we need separate update index functions for
>   the server and the client?
>
The short answer is that one needs to pass in PKGS(=TRUE) while the other
passes in FMRIS(=FALSE). The longer answer is that the functions, while
generally very similar, do have a few differences. They take different
input data structures: the client provides a list of pkg plans while the
server provides a list of FMRIs. To put it another way, the client has a
way of specifying packages to remove from the index while the server
doesn't have that capability. The client also has filters it needs to
apply while the server doesn't. Rather than expose an interface where a
value is passed in to determine which option to use, I decided to use
separate functions to encode the difference, as I found that to be a
clearer interface.
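The "only index when the lock can be acquired immediately" check described earlier for refresh_index can be sketched with a non-blocking acquire. The names here are hypothetical stand-ins, not the actual depot code:

```python
import threading

_index_lock = threading.Lock()   # stands in for the real update lock

def refresh_index(do_index):
    """Run do_index only if no other run is in progress; never block."""
    if not _index_lock.acquire(False):   # non-blocking attempt
        return False                     # another run already holds it
    try:
        do_index()
        return True
    finally:
        _index_lock.release()
```

Because every would-be indexer does the same non-blocking check at the top, no caller ever waits on the lock, which is the property that makes a second (searchdb) lock look redundant.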
> misc.py:
>
> - line 328,329: Shouldn't we determine these dynamically, or at a
>   minimum not let this number exceed the amount of memory in the
>   machine?
>
Ok, thanks for the pointer to sysconf. It now tries to use sysconf, and if
that blows up it falls back to the defaults I've given.

> search_storage.py:
>
> - lines 492-495: Remove dead code, please.
>
Oops.

> Thanks,
>
> -j

Other changes to the code:

- Added logging on the server side to indicate when indexes became
  available or were updated.

- Added locking around the server-side rereading of dictionaries to
  ensure that multiple search threads won't cause the internal
  representations to be used while they're only partially built. A better
  solution may be for the indexer to update the internal structures as
  well as the file structures; I will give some thought to that for the
  next round of search updates.

_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
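P.S. As a small appendix on the misc.py point above, the sysconf-with-fallback logic might look roughly like this; the constant name, fallback value, and helper are invented for illustration and are not the actual misc.py code:

```python
import os

DEFAULT_MAX_BYTES = 100 * 1024 * 1024   # hypothetical fallback default

def max_index_memory():
    """Cap index memory use by physical RAM, falling back to a fixed
    default if sysconf is unavailable or unsupported on this platform."""
    try:
        phys = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # sysconf missing, name unknown, or call failed: use the default.
        return DEFAULT_MAX_BYTES
    # Never let the limit exceed the amount of memory in the machine.
    return min(DEFAULT_MAX_BYTES, phys)
```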
