Dear kwant-developers,

I would like to share with you the recent progress on the general Operators module. My GitHub repository has been updated: https://github.com/PhiReck/generalOperator/
Good news first: in my last mail I said that the path finding is O(N^2), or at best O(N log N), since for each hopping of the i-th operator we have to find all connected hoppings in the (i+1)-th operator (connected means hop[1] of the i-th operator is the same as hop[0] of the (i+1)-th operator). However, we can do it in O(N), provided that dictionary access is O(1): for each operator we create a dictionary whose keys are the site IDs of the where[0] sites and whose values are lists of the relative positions in that 'where'. (Note that the summary 'genOp.pdf' on GitHub contained a mistake: it stated that each operator creates the dictionary for the next operator.) For a given site of the i-th operator, we only have to query the dictionary of the (i+1)-th operator (dict[site]), which returns the positions of the connected hoppings of the (i+1)-th operator.

Creating these dictionaries is linear in the number of elements in each operator's 'where', i.e. O(N_sites * N_Ops), where the number of operators, N_Ops, is probably O(1) -- at least at the moment we need at most 2-3 operators. Then, for each hopping of the i-th operator, we look it up in the dictionary of the (i+1)-th operator, which is O(N_sites) as long as dictionary access is O(1).

Indeed, I see that for the energy current (psi^dagger_i H_ij H_jk psi_k), where the path finding is necessary, the initialization of the operator is linear in system size. More precisely, __init__() takes roughly 2 times longer than the system creation in the example I studied, where each of the Hamiltonians is taken over all hoppings in the system. I have not yet tried to optimize the init, i.e. to reduce the prefactor of the linear dependence on system size, so I guess it might end up faster than the system creation. For a plot of the data, see 'init-times-Ecurr.pdf' on my GitHub.
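To make the dictionary idea concrete, here is a minimal pure-Python sketch (function and variable names are my own for illustration, not the ones in the repository):

```python
def build_site_index(where):
    """Map the first site of each hopping to the positions where it occurs.

    'where' is a list of (site_from, site_to) hoppings; building the
    dict is O(len(where)) and each lookup is O(1) on average.
    """
    index = {}
    for pos, (site_from, _site_to) in enumerate(where):
        index.setdefault(site_from, []).append(pos)
    return index


def find_paths(where_i, where_next):
    """Return all (pos_i, pos_next) pairs of connected hoppings.

    Two hoppings are connected if hop[1] of the i-th operator equals
    hop[0] of the (i+1)-th operator.  One dict build plus one linear
    scan gives O(N) overall instead of the naive O(N^2) double loop.
    """
    index_next = build_site_index(where_next)
    pairs = []
    for pos_i, (_site_from, site_to) in enumerate(where_i):
        for pos_next in index_next.get(site_to, []):
            pairs.append((pos_i, pos_next))
    return pairs
```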
Concerning the progress of the project, I have finished including the changes suggested by Joe: instead of alternating onsite-hopping operators, any combination is allowed by supporting products of arbitrary operators. With this change, the 'where' of each operator is attached directly to the operator, instead of passing a separate where-list as in my very first approach.

Before discussing the efficiencies, let me comment on a consequence of this change -- from alternating operators with a separate where-list to individual operators each carrying their own 'where', combined in arbitrary products -- which I had not anticipated. Before, we only had to find the path between hopping operators, whereas now an onsite operator might have a special 'where' (e.g. to be evaluated only in the left half of the system), which in general makes path finding necessary also in products with onsite operators. In some sense, this is a good development: before, the user would have had to generate the where-list by hand so that the restrictions of the onsite 'where' were taken into account, whereas now the code does it itself. However, for simple products of operators like the spin current (sigma_xyz x Hamiltonian), this path finding would increase the calculation time unnecessarily.

Therefore, I decided to give onsite operators the boolean attribute 'willNotBeCalled' (maybe a better name would be 'onlyToBeUsedInProduct'?) for the case that they are defined over the whole system (where=None). This avoids creating 'where=[site for site in syst.sites]'; instead, when the product with another operator is initialized, the other operator's 'where' is reused. Most importantly, no paths have to be found this way. As mentioned before, this is only possible if the onsite operator has no 'where' restrictions, because such restrictions would be ignored by this approach.
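As an illustration of the 'willNotBeCalled' idea, here is a minimal sketch (the class layout and helper names are hypothetical; the actual code in generalOperator.pyx differs):

```python
class Onsite:
    """Sketch of an onsite operator used in a product.

    If 'where' is None, the operator acts on the whole system, so in a
    product it can simply reuse its partner's 'where' and no path
    finding is needed.
    """
    def __init__(self, where=None):
        self.where = where
        # Only a system-wide onsite operator may skip its own 'where'.
        self.willNotBeCalled = where is None


def product_where(onsite, other_where):
    """Choose the hoppings over which the product is evaluated."""
    if onsite.willNotBeCalled:
        # Reuse the partner's 'where'; no path finding required.
        return other_where
    # A restricted onsite 'where' must be honoured, so the paths
    # between the two 'where' lists have to be found explicitly.
    raise NotImplementedError("path finding needed for restricted onsite")
```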
(If there are restrictions in the onsite 'where', I don't think the path finding can be omitted.)

So let's come to the efficiencies I have tested so far. I previously showed you the operator __call__() times for the density (psi^dagger_i M_i psi_i) for different methods of calculating it (kwant, python-recursive functions, c-recursive functions, ...). Back then, the result was that the alternating operators are roughly *2 times slower* than the specialized kwant.Density, probably because the recursive functions are still called (once for each site). Now I have checked the new 'Operator Product'. For some reason that I don't understand yet, it is roughly *3 times slower* than the kwant results, although I don't see why we should not reach the same speed as with the alternating operators. For a plot of the data, see 'call-times-dens.pdf' on my GitHub.

Before, I had only checked the __call__ efficiency; now I have also checked the __init__() of the operators, again for the density. First of all, it is indeed (roughly) linear, as expected (no path finding is needed for the density anyway). However, the prefactors are quite different. Compared to the kwant results, the alternating operators are again *2-3 times slower*, and unfortunately the 'Operator Product' is roughly *20-30 times slower* than kwant. Again, I don't see why there should be a difference between the old (alternating operators) and the new (operator products) way, but I have not yet had time to look into it more carefully and find the reasons. However, since the system creation is still *~10 times slower* than the __init__() of the 'Operator Product', the __init__ efficiency is probably not so important anyway, as long as not too many operators are initialized. For a plot of the data, see 'init-times-dens.pdf' on my GitHub.

In the future, I would like to try using C++ maps in Cython in order to avoid the confusing auxiliary lists built from the Python dict.
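For context on those auxiliary lists: the idea is to flatten the Python dict into flat arrays so that tight Cython loops can avoid dict lookups. A rough sketch of such a flattening (CSR-like layout; names and the exact layout in generalOperator.pyx are my own illustration, not the actual code):

```python
def flatten_site_dict(site_dict, num_sites):
    """Flatten {site_id: [positions]} into two auxiliary lists.

    'offsets[s]:offsets[s+1]' delimits the slice of 'positions' that
    belongs to site s, so a compiled loop can iterate over plain
    integer arrays instead of touching a Python dict.
    """
    offsets = [0]
    positions = []
    for site in range(num_sites):
        positions.extend(site_dict.get(site, []))
        offsets.append(len(positions))
    return offsets, positions
```

A std::map (or std::unordered_map) via Cython's libcpp wrappers could replace this two-array detour entirely.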
Moreover, I would like to understand why there is a difference in efficiency between the old (alternating operators) and the new (operator products) way, and to improve the efficiency if possible.

Concerning the efficiencies, I would like to hear your opinions. Of course, in the best case, the old operators (Density, Current, Source) would be as fast with the generalized Operator class as with the specialized kwant operators, both in calculation and in initialization. However, I guess this will not be perfectly achievable. Still, since creating the system as well as calculating the scattering states is in most cases heavier than the operator anyway, this loss in efficiency might be unimportant compared to the benefits of the generalized Operator: easily creating new operators (see the beginning of generalOperator.pyx on my GitHub for examples such as the Density, Current, etc.) and having the calculation routines only once (so far, each of Density, Current and Source has its own _operate(), although it is very similar in all cases). I would be very interested to hear what you consider an acceptable loss in efficiency, or in which cases efficiency is of major importance.

To finish, I have a technical question. So far, for each __call__ of the operator, _get_orbs(..) is called in _eval_operator(..) to get the number of orbitals and the starting index in the wave function for a given site. However, I expect that for a finalized system these quantities cannot change (e.g. there cannot be a parameter whose value would change the ordering of the wave function or the number of orbitals on a site), right? If this is the case, _get_orbs could already be done in __init__. This might be a minor win in efficiency.

As always, I am looking forward to your comments and suggestions. I assume I might not always be 100% clear, so if anything is confusing, please let me know.

Best regards,
Phillipp
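P.S. To make the _get_orbs question concrete, here is a minimal sketch of what I mean by precomputing in __init__ (names are hypothetical, not the real kwant internals), assuming the orbital layout of a finalized system is indeed fixed:

```python
class CachedOrbitalOperator:
    """Precompute orbital offsets once at __init__ time.

    Assumes a finalized system where the site ordering and the number
    of orbitals per site cannot change afterwards.
    """
    def __init__(self, norbs_per_site):
        # One pass at init: starting index of each site in the wave
        # function and its number of orbitals, instead of calling a
        # _get_orbs-like helper on every __call__.
        self._norbs = list(norbs_per_site)
        self._offsets = [0]
        for n in self._norbs:
            self._offsets.append(self._offsets[-1] + n)

    def orbs(self, site):
        """Return (start_index, num_orbitals) for 'site' -- an O(1) lookup."""
        return self._offsets[site], self._norbs[site]
```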