diegobruno...@gmail.com wrote:

> Kwant Community, I would like to know if it is possible to do parallel
> computing using kwant?
> To do parallel computing can be used numba, CUDA, multiprocessing,
> Parallel. This work for kwant?

If you search the list for “parallel”, you will find various discussions
about the topic.  In the following, I summarize the current situation.

If your computing workload consists of independent bits (like averaging
an observable over different disorder realizations), then it is best to
parallelize on this level.  Depending on your hardware and your
preferences you can then use concurrent.futures, mpi4py or any other
parallelization framework.  If you go this way, be sure to disable
OpenBLAS parallelization [1].
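
As a rough illustration of this kind of task-level parallelization, here
is a minimal sketch using concurrent.futures.  The names
observable_for_realization and disorder_average are made up for this
example and are not part of Kwant; the body of the first one is only a
stand-in for a real Kwant calculation.  Setting OPENBLAS_NUM_THREADS (and
OMP_NUM_THREADS) to 1 before the numerical libraries are imported is one
common way to keep each worker single-threaded; see [1] for details.

```python
import os

# Assumption: limit BLAS/OpenMP to one thread per worker process.  This
# has to happen before NumPy/SciPy/Kwant are imported to take effect.
os.environ.setdefault("OPENBLAS_NUM_THREADS", "1")
os.environ.setdefault("OMP_NUM_THREADS", "1")

import concurrent.futures
import random
import statistics


def observable_for_realization(seed):
    """Hypothetical stand-in for one independent calculation.

    In real use this would create a system whose disorder is derived
    from `seed`, solve it, and return the observable of interest.
    """
    rng = random.Random(seed)
    return sum(rng.uniform(0.0, 1.0) for _ in range(1000)) / 1000


def disorder_average(executor, seeds):
    """Average the observable over `seeds` using the given executor."""
    values = list(executor.map(observable_for_realization, seeds))
    return statistics.fmean(values)


if __name__ == "__main__":
    # One worker process per core by default; tasks are independent.
    with concurrent.futures.ProcessPoolExecutor() as pool:
        print(disorder_average(pool, range(100)))
```

Since disorder_average only relies on the executor's map method, the
same function should also work with other executors that follow the
concurrent.futures interface (for example the one from mpi4py.futures).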

If your workload cannot be split up in the above way, we have to
differentiate between the three phases of system construction,
Hamiltonian evaluation and solving.

• The system construction phase (i.e. using kwant.Builder) is inherently
  sequential due to the way CPython works internally (search for GIL,
  the infamous “Global Interpreter Lock”).  Luckily, for large systems,
  the construction phase is seldom the bottleneck since the time it
  takes is O(N) where N is the number of sites.  If it is problematic in
  your case, you could bypass the builder by creating a custom low-level
  system.  One day we might provide a much faster builder (successor).

• The Hamiltonian evaluation phase is what happens inside the method
  ‘hamiltonian_submatrix’.  This phase runs in Python and is sequential
  as well.  It involves evaluating all the value functions (wherever
  these are used) and currently this happens site by site and hopping by
  hopping.  Speeding this up is something we have been working on for
  a long time and the next Kwant release should provide a way to
  vectorize Hamiltonian evaluation.  The process will still be
  sequential, but should get a very significant speed up.  Still, even
  today, for large systems Hamiltonian evaluation is typically not the
  bottleneck since it also takes O(N) time.

• For large systems it is typically the solving phase that dominates
  total running time.  This is to be expected since it takes time that
  is polynomial in the size of the system.  (The exponent depends on
  the system dimensionality and the solver used.)

  Very often the solving phase will involve Kwant calling into a linear
  algebra library that ultimately calls some sort of BLAS.  The default
  setup on the platforms I know best (Debian-like) is such that OpenBLAS
  is used and it is parallelized using OpenMP so that all available
  cores are used.  This is not a very useful parallelization, but it is
  better than nothing.  See [1].

  The solvers use different libraries that support parallelization to
  different degrees.  No one has seriously worked on this because
  vectorization needs to be done first (this is being worked on), and
  quite often people have the opportunity to parallelize on the level of
  independent tasks.

  For very large systems, solving (and hence the total running time)
  is likely dominated by the MUMPS library, which is inherently
  parallel but which Kwant currently uses only sequentially.  Enabling
  parallel MUMPS could most likely be done with little effort, but no
  one has finished that work so far.  There is an open issue: [2].  I do
  not know how well the MUMPS parallelization works for the kind of
  linear systems that Kwant creates.
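
To find out which of the three phases dominates in a particular case, one
can simply time them separately.  The following is a minimal sketch:
timed and profile_phases are hypothetical helper names (not part of
Kwant), and the three callables are placeholders for your own
construction code, a call to hamiltonian_submatrix, and a solver call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in `results`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def profile_phases(build, evaluate, solve):
    """Time the three phases separately and return a dict of seconds.

    build()          -> returns the (finalized) system
    evaluate(system) -> evaluates the Hamiltonian of that system
    solve(system)    -> runs the solver on that system
    """
    timings = {}
    with timed("construction", timings):
        system = build()
    with timed("hamiltonian evaluation", timings):
        evaluate(system)
    with timed("solving", timings):
        solve(system)
    return timings
```

Note that a solver call typically evaluates the Hamiltonian again
internally, so the "solving" figure obtained this way includes that
evaluation time as well.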

Hope that the above provides some insight into the state of things.
Christoph

[1] https://mail.python.org/archives/list/kwant-discuss@python.org/thread/SSU7PFE3BKHY2EMFSSWDVLXFIDDM6O3H/
[2] https://gitlab.kwant-project.org/kwant/kwant/-/issues/54