Hiya Emma,
Good luck on your project. A couple of things to be wary of are disk I/O, metadata cache backends and overlays.

Disk I/O can be a significant bottleneck. Loading a lot of files from disk (be it the metadata cache or whatever) can take a long time the first time round, but once they're cached in RAM they'll be much faster to access in future.

Portage allows its internal metadata cache to be stored in a variety of formats, as long as there's a backend to support it. This means simple speedups can be achieved using cdb or sqlite (if you Google these and portage you'll get gentoo-wiki tips, which unfortunately you'll have to read from Google's cache at the moment). It also means that if you want to make use of this metadata from within portage, you'll have to rely on the API to tell the backend to get you all the data (and it may be difficult to speed up without writing your own backend).

Finally there are overlays. Since these can change outside of an "emerge --sync" (as indeed can the main tree), you'll have to either reindex them before each search request, or give the user stale data until they manually reindex.

If you're interested in implementing this in python, you may also be interested in another package manager that can handle the main tree, also implemented in python, called pkgcore. From what I understand, it started from a similar code-base to portage, but its internal architecture may have changed a lot.

I hope some of that helps, and isn't off-putting. I look forward to seeing the results! 5:)
        Mike  5:)
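
P.S. On the stale overlay problem, here's a rough sketch of one way to decide whether a reindex is needed without doing it blindly before every search. It has nothing to do with portage's own API: tree_fingerprint and needs_reindex are names I've made up, and it simply assumes that a change to an overlay shows up as a different file count or a newer modification time somewhere under its directory. Bear in mind that walking the tree is disk I/O too, so it's only a win if it's cheaper than the reindex itself.

```python
import os


def tree_fingerprint(root):
    """Return (file_count, newest_mtime_ns) for every file under root.

    Hypothetical helper: a cheap summary of a directory tree that should
    change whenever a file is added, removed or rewritten.
    """
    count = 0
    newest = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # file disappeared while we were walking
            count += 1
            newest = max(newest, st.st_mtime_ns)
    return count, newest


def needs_reindex(root, saved_fingerprint):
    """True if the tree no longer matches the fingerprint saved at index time."""
    return tree_fingerprint(root) != saved_fingerprint
```

The idea would be to store the fingerprint alongside the index when you build it, then call needs_reindex on each overlay path before a search and only rebuild when it returns True. An edit that keeps both the file count and the newest timestamp unchanged would slip through, so treat it as a heuristic rather than a guarantee.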
