Hi,

Thanks Christoph, this is just what was needed!


> Note that the point (3) is not compatible with the “dumb” format of (1),
> since with the “dumb” format there’s no way (other than searching) to even
> know whether some site or hopping is present in the system.

I totally agree.


> We would like to have site/hopping-indexable arrays [4] soon, but we don’t
> want to introduce something now that will be inconsistent with something
> else that we’re working on.  I think the best way forward is to finalize a
> new low-level format as soon as possible.

Probably correct.


> (a) Multiple translational symmetries.  (It looks like more general
> symmetries can be added seamlessly later.)

I can vaguely see how this can be achieved, given that the translational
part can be split off from the point group part, but the details are not
clear to me.


> (d) Smartness: high-level sites and site families are a concept that is
> known to low-level systems.  It must be possible to check (at O(1) cost)
> whether a system contains some site/hopping and what the corresponding value
> is.  This should also work in a vectorized way.

I believe that all that is necessary is that the low-level system be
able to map between high-level sites (and hoppings) and integers.
This mapping is purely a convenience for the user;
the solvers don't need this information.

The upshot of this is that the low-level system does not actually need
to "know" about high-level sites and families beyond how to map them to
integers and vice versa.

The advantage is that this would allow concrete implementations to
define what their high-level sites look like (and we don't need to make
the restriction that sites are tagged by vectors of integers).

On the other hand, if we also want vectorized operations then we need
a way to efficiently represent sequences of sites, which essentially
limits us to sites tagged by vectors of integers, and once we have
made this restriction we might as well make a site a concept known
to low-level systems.
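
As a rough sketch of the "mapping only" variant (the class and method
names are hypothetical, nothing here is existing Kwant API), the system
would only need to hold a two-way table. Sites can then be any hashable
object; the batch method is a plain loop here, and it is exactly this
part that would push us towards integer-tagged sites if we want real
vectorization:

```python
class SiteIndexMap:
    """Hypothetical two-way map between high-level sites and integers."""

    def __init__(self, sites):
        # index -> site is a list; site -> index is a dict (O(1) on average).
        self._sites = list(sites)
        self._index = {site: i for i, site in enumerate(self._sites)}
        if len(self._index) != len(self._sites):
            raise ValueError("sites must be unique")

    def index(self, site):
        """Return the integer of `site`; raises KeyError if it is absent."""
        return self._index[site]

    def site(self, i):
        """Return the high-level site with integer index `i`."""
        return self._sites[i]

    def __contains__(self, site):
        return site in self._index

    def indices(self, sites):
        """Batch lookup (not truly vectorized: one dict lookup per site)."""
        return [self._index[s] for s in sites]


# Assumption for illustration: a site is a (family_name, tag) pair.
sites = [("a", (0, 0)), ("a", (1, 0)), ("b", (0, 0))]
site_map = SiteIndexMap(sites)
```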


> (e) Storage efficiency: it shouldn’t be necessary to store the graph (think
> of large regular graphs).  That’s why we need a low-level system *API*
> rather than a format.  But since most often the graph will be stored, and
> the API should not stand in the way, I propose to think of a format, and
> then design an API that would be mostly transparent for that format but also
> could be implemented for large regular graphs without any storage.  (Note
> that the CSR sparse matrix format can be reformulated as an API in a
> transparent way.)

Couldn't one already do this with the current API for `Graph`?


> • An *entry* is a small submatrix of the Hamiltonian that is  identified by a
> pair of sites and a group element.  Entries can  correspond either to on-site
> entries of the Hamiltonian (in that  case both sites are the same and the
> group element is 0), or to  hoppings (all the other cases).

Just to clarify: does the pair of sites and the group element *uniquely*
identify the *entry*? I remember at some point there was talk of
having multiple entries with the same pair of sites and group element,
with the meaning that they should be summed in the final Hamiltonian.
(I don't really have an opinion either way, but disallowing this
seems simpler.)


> • The entries are enumerated by integers starting with the  hopping-entries,
> followed by the onsite-entries.  Every  hopping-entry occurs only once thus
> defining a direction for  each hopping (this direction has no physical
> consequences and is  only a convention e.g. for storing currents).  Thus,
> each  hopping gets assigned exactly one integer, between 0 and the  number
> of hopping-like entries minus 1.  (This numbering can be  used for storing
> data *on* the hoppings.)

This means that hoppings are not "repeated" as they currently are,
correct?
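
To check my understanding of the numbering, here is a minimal sketch,
assuming a system without symmetries (so the group element is dropped)
and treating (a, b) and (b, a) as the same hopping. The function name
and the input layout are made up for illustration:

```python
def enumerate_entries(sites, hoppings):
    """Number hopping entries first (each once), then onsite entries.

    `hoppings` is an iterable of (to_site, from_site) pairs.  A reversed
    or repeated pair is skipped, so the first occurrence fixes the
    direction of the hopping.
    """
    entries = []
    seen = set()
    for a, b in hoppings:
        if a == b:
            raise ValueError("without symmetries, a == b is an onsite entry")
        key = frozenset((a, b))
        if key in seen:
            continue  # reversed or duplicate hopping: already numbered
        seen.add(key)
        entries.append((a, b))
    n_hoppings = len(entries)
    # Onsite entries follow the hoppings: both sites are the same.
    entries.extend((s, s) for s in sites)
    return entries, n_hoppings


entries, n_hoppings = enumerate_entries(
    sites=[0, 1, 2],
    hoppings=[(0, 1), (1, 0), (1, 2)],
)
```

Data living *on* the hoppings would then be an array of length
`n_hoppings`, indexed directly by the entry number.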


> • An entry range or *term* is a consecutive sub-sequence of  entries that all
> share the same source & destination site range  and the same group element.
> Terms can be efficiently stored in  a sequence of (entry_start, term_params).

What are `term_params` in the general case? Some opaque object?

Am I correct in thinking that for finalized builders this will be a
number / matrix / function, i.e. we store the matrix elements
of the Hamiltonian only term-wise, as opposed to for individual
*entry*s, as we currently do with `onsite_hamiltonians` and `hoppings`?
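
If that reading is right, a lookup could be sketched as below. This is
only an illustration under my own assumptions: `TermTable` is a
hypothetical name, `term_params` is taken to be either a constant or a
callable, and the callable receives the offset of the entry within its
term (the real signature is exactly what I am asking about):

```python
from bisect import bisect_right


class TermTable:
    """Hypothetical storage of terms as sorted (entry_start, term_params)."""

    def __init__(self, terms, n_entries):
        self._starts = [start for start, _ in terms]
        self._params = [params for _, params in terms]
        self._n_entries = n_entries
        if not self._starts or self._starts[0] != 0:
            raise ValueError("the first term must start at entry 0")
        if any(a >= b for a, b in zip(self._starts, self._starts[1:])):
            raise ValueError("entry_start values must strictly increase")

    def term_of(self, entry):
        """Return the index of the term containing `entry` (binary search)."""
        if not 0 <= entry < self._n_entries:
            raise IndexError(entry)
        return bisect_right(self._starts, entry) - 1

    def value(self, entry):
        """Evaluate the term parameters for a single entry."""
        t = self.term_of(entry)
        params = self._params[t]
        if callable(params):
            # Assumption: value functions take the offset within the term.
            return params(entry - self._starts[t])
        return params


# Entries 0..3 share one constant, 4..5 use a function, 6..7 another constant.
table = TermTable([(0, -1.0), (4, lambda k: 0.5 * k), (6, 4.0)], n_entries=8)
```

Storage then scales with the number of terms rather than the number of
entries, and finding the term of an entry costs a binary search over the
term starts.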


> Now follows a sketch of the public API of system objects.  The basic idea is
> to use [1], but instead of specifying a storage format, provide methods that
> take sequence of integers (may take also slices).

Ah, this is the crucial change that will allow us to handle implicit
systems etc., right?


> • Site ranges and entry ranges are stored explicitly as (typically  short)
> sequences of entries, just like it is proposed for site  ranges in [1].

When you say "stored as sequences of entries", do you mean "entries" in the
general sense, rather than *entry* in the nomenclature defined previously?


> • A method takes a sequence of entry indices and returns a  sequence of
> small matrices.  Since this is a low-level system,  we may pose limitations,
> i.e. that all entries must be from the  same range.

This is like a vectorized version of the method currently known as
`hamiltonian`, yes?


> • A method takes a sequence of entry indices and returns a  sequence of pairs
> of site indices.  This method could be  combined with the previous one.

Combining this with the previous one, and using the *site ranges*,
we can construct the full Hamiltonian in sparse format.
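
A minimal sketch of that assembly, with several assumptions of mine: the
two methods are represented by plain lists (`entries` and `values`), the
site ranges are reduced to a per-site orbital offset, an entry
(to_site, from_site) fills the block H[to, from], and the Hermitian
conjugate of each hopping is added explicitly because hoppings are
stored in one direction only. The output is a dict of matrix elements,
which is easy to convert to any sparse format:

```python
def assemble(entries, values, orb_offsets, n_hoppings):
    """Return the Hamiltonian as a dict {(row, col): element}.

    entries     -- list of (to_site, from_site) index pairs
    values      -- one small matrix (list of rows) per entry
    orb_offsets -- orb_offsets[s] is the first orbital index of site s
    n_hoppings  -- entries [0, n_hoppings) are hoppings, the rest onsite
    """
    ham = {}
    for k, ((to, frm), block) in enumerate(zip(entries, values)):
        r0, c0 = orb_offsets[to], orb_offsets[frm]
        for i, row in enumerate(block):
            for j, elem in enumerate(row):
                ham[r0 + i, c0 + j] = elem
                if k < n_hoppings:
                    # Hoppings are stored once: add the conjugate block.
                    ham[c0 + j, r0 + i] = elem.conjugate()
    return ham


# Site 0 has one orbital, site 1 has two, so the matrix is 3 x 3.
ham = assemble(
    entries=[(1, 0), (0, 0), (1, 1)],
    values=[
        [[1j], [2.0]],               # hopping block, shape 2 x 1
        [[0.5]],                     # onsite block of site 0
        [[1.0, 0.0], [0.0, -1.0]],   # onsite block of site 1
    ],
    orb_offsets=[0, 1],
    n_hoppings=1,
)
```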


> • We want to support high-level sites.  This can be done by  providing a
> method that takes a site and returns the  corresponding index.  There must
> be an opposite method as well  (index -> site).  This functionality of
> course needs to be  suitably vectorized, for example using “SiteArray”
> objects.  The  “high-level sites” here are not necessarily full builder
> sites,  perhaps something intermediary that is better C-compatible.

This is related to my reply to (d) above. Do we even need to concretely
specify what these sites are? Isn't it sufficient to just specify that
there is a mapping, and specific system implementations can decide
what "high-level sites" mean for them?


> • An observable object simply needs to point to the system (it  does that),
> and store a list of integers of “where” the  observable is defined.
> 
> • The result of an observable is simply an array of numbers, but  with a
> reference to an observable, everything is known.
> 
> As you can see, there is minimal duplication of storage of things.

This is nice. We will also be able to get rid of a bunch of code in the
operator module for doing this conversion for finalized builders.

--------------------------------------------------

This all seems pretty solid, but I may have other comments as I let the
concepts sink in more.

Joe
