Hi Kirill, On Sat, 2010-10-02 at 17:41 +0400, Kirill Smelkov wrote: > First of all I'm sorry it took me so long to reply. No worries, thanks for your reply and your patience.
> On Tue, Sep 28, 2010 at 10:05:22AM +0200, Jelmer Vernooij wrote: > > On Mon, 2010-09-27 at 12:35 +0400, Kirill Smelkov wrote: > > > On Mon, Sep 20, 2010 at 01:16:49PM +0400, Kirill Smelkov wrote: > > > Jelmer, others, what I'm maybe doing wrong here? I just wanted to use > > > tdb from python without major constraints compared to C version. > > Sorry, as Andrew mentioned most of us were at a conference last week so > > I haven't had much time to look at your patches again. > > > > With regard to the chainlock functions; I can see the use in exposing > > these, but am not convinced mapping them one to one from the C functions > > is necessarily the best idea. The header warns to use the chainlock > > functions with care; > That warning dates back to 2000 (see 7e4c4721). To me it's like > low-level locking should be always used with care, because it's much > more easier to use one global lock. But should this prevent exposing > locking functionality? Ok, fair enough. Perhaps I'm being overly cautious, I just got a bit worried reading that comment. I'd just like to make sure it's not too easy for users to shoot themselves in the foot. > > Does using these functions from Python not cause unexpected segfaults in > > some situations? With some more unit tests I'd be happy to accept those > > patches. > I've ported tdbtorture to verify this, and yes, they don't produce > segfaults and also tdb_check says OK. > However, please note - I've got occasional infrequent torture failures > with -k (kill random) for _both_ C tdbtorture and python one. > > I'll send updated patches shortly. Would pytdbtorture be enough or maybe > you've meant some other approach to test it? Sorry for being unclear. What I meant was that I'd like to see some more unit tests for these functions - e.g. testing that trying to chainlock the same key multiple times does the right thing, attempting to remove a chainlock where no exists works, etc. Either way, thanks for providing pytdbtorture, it's nice to have an extra test. > > With regards to some of the other functions, I don't think completeness > > is a valid reason for adding bindings per se. I really don't see why > > tdb_fd would need to be exposed on the Python level (or at all) for > > example. > I thought it's just a matter of consistency and completenes. Many > python's file-like objects have .fileno(), so if C tdb itself exposes > tdb_fd(), why don't we expose it to Python? As I mentioned earlier, I think it's a bad idea to map all C functions one to one to Python functions. E.g. we don't have tdb.traverse() either, we have Python iterators. At the very least, if we're trying to match file-like objects in Python objects we should name this method "fileno" for consistency, not "fd". > Maybe someone crazy enough would want to get fd from open tdb, dup it, > lseek it to file start, and then sendfile it across network. > I agree, this example is maybe a bit artificial, but my point here is > that bindings should not set up policy of what is doable and what is > not. If C tdb exposes some functionality, so should bindings unless > there are strong reaseong _not_ to do it, imho. I think it should be the other way around. There should be a reason for having a method there. I would like to understand why it is in the C bindings before blindly adding Python bindings for it. > > The more functions are exposed the harder it becomes to find > > something by browsing the API and the harder it becomes to change that > > API. > > To me it was exactly vice versa - I've browsed C tdb api, and then > discovered that a lot of functions were missing in pytdb. By the way, > the only description we have is docs/README, so I assume those who want > to use tdb start from there, and then those who turn to pytdb are like > me. I don't see how methods that you don't need to use being missing in the Python API could be an issue. > > You mention corruption, but that would be due to bugs in tdb. I > > don't understand why we'd need to expose any other function than > > tdb_check in this regard. > I've mentioned corruption not because of bugs in tdb, but because of > storage I'm going to keep my tdbs is unreliable -- flashes don't have > sectors/blocks/etc, they have eraseblock (e.g. 128K). > > So for flashes which mimic IDE or SATA drives, there is on-card > controller which translates block-language into what flash understands. > > And when e.g. host writes new sector into flash, that controller first > _erases_ much more bigger eraseblock, and only then writes back old > eraseblock conntent with sector. > > Now, if power suddenly goes down in the middle, we are potentially > getting corruption - becase write operation touches sectors block layer > though it was keeping intact. > > This property means journaling does not work reliably on flash storage > with FTL - this is described in LWN article I've mentioned. Sure, I understand that. What I'm interested in is how you think those methods will help you deal with those issues. > > See my reply about dump_all() etc writing to stdout. I think if we > > expose these functions in Python we should at least make them write to a > > file-like object, not to stdout. That will also make testing easier. > Yes, agree. Also let's postpone dump_all() and friends until we sort out > more important ones, e.g. locking, check, traverse, log and > pytdbtorture. Works for me. I'll merge the patches on which I haven't commented. Cheers, Jelmer
signature.asc
Description: This is a digitally signed message part

