Re: ReadHITRAN

Richard Larsson Mon, 20 Sep 2021 00:06:05 -0700

Hi Patrick,

We could of course optimize the reading routine, but there's little point
in doing so.  The methods that read external catalogs should only ever be
used once per update of the external catalog, so it's fine if they are
slow, as long as they are not too slow.


New memory is always allocated for every absorption line.  This is because
we keep the line data local, and neither the line shape model nor the local
quantum numbers have to be known at compile time.

Additionally, the line data is pushed into arrays, so their capacity grows
(typically doubling) every time they fill up.
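
A minimal sketch of that growth behaviour, using plain std::vector rather
than the ARTS Array type.  LineStub and count_reallocations are
hypothetical names made up for this illustration, and the exact growth
factor depends on the standard library in use.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one absorption line; not the ARTS type.
struct LineStub {
  double f0;  // line centre frequency
  double i0;  // line strength
};

// Append n lines one at a time and count how often the vector's capacity
// changes, i.e. how often its storage had to be reallocated.
std::size_t count_reallocations(std::size_t n) {
  std::vector<LineStub> lines;
  std::size_t reallocations = 0;
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t before = lines.capacity();
    lines.push_back(LineStub{static_cast<double>(i), 1.0});
    if (lines.capacity() != before) ++reallocations;
  }
  return reallocations;
}
```

Because the capacity grows geometrically, a million appended lines cause
only a few dozen reallocations, not a million.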

If we knew the number of lines, broadening species and local quantum
numbers in advance, these allocations could happen once for the entire
band, but we don't in ReadHITRAN or any of the other external reading
routines.  So you will have many, many system calls asking for more
memory.  This of course also means that you are over-allocating memory,
since that's how Arrays work in ARTS (because that's standard C++).  Again,
this is fine, since the external catalog, when read again, will allocate
exactly what is required.
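
To illustrate the difference a known element count makes, here is a small
sketch, again with plain std::vector and not the actual ARTS reading code.
FillReport and fill are hypothetical names; count_known stands for the
case where the file states its size up front.

```cpp
#include <cstddef>
#include <vector>

struct FillReport {
  std::size_t reallocations;  // capacity changes seen while appending
  std::size_t unused_slots;   // capacity() - size() after the fill
};

// Fill a vector with n values.  If count_known is true, storage for all n
// values is reserved before appending, as a reader could do when the
// number of lines is known in advance.
FillReport fill(std::size_t n, bool count_known) {
  std::vector<double> values;
  if (count_known) values.reserve(n);

  FillReport report{0, 0};
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t before = values.capacity();
    values.push_back(static_cast<double>(i));
    if (values.capacity() != before) ++report.reallocations;
  }
  report.unused_slots = values.capacity() - values.size();
  return report;
}
```

With count_known set, the append loop never reallocates.  Without it, the
vector reallocates repeatedly and usually ends up with unused slots.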

With hope,
--Richard

Den mån 20 sep. 2021 kl 08:09 skrev Patrick Eriksson <
patrick.eriks...@chalmers.se>:

> Richard,
>
> Thanks for the clarification.
>
> Is the allocation of more memory done in fixed chunks? Or something
> "smart" in the process? If the former and the chunks are too small, then
> maybe I am doing a lot of reallocations. My impression was that memory
> usage increased quite monotonically, not in noticeable steps.
>
> If the lines have to be sorted into bands, then the complexity of the
> reading will increase in line with what I have noticed. And likely not
> much to do about it.
>
> Bye,
>
> Patrick
>
>
>
> > There are two possible slowdowns there could be still. One is that you
> > hit some line count where you need to reallocate the array of lines
> > because you have too many. The other is that the search for placing the
> > line in the correct band is slow when there are more bands to look
> > through.
> >
> > The former would be just pure bad luck, so there's nothing to do about
> > it.
> >
> > I would suspect the latter is your problem.  You need to search through
> > the existing bands for every new line to find where it belongs.  Since
> > bands are often clustered closely together in frequency, this could slow
> > down the reading as you get more and more bands. A smaller frequency
> > range means fewer bands to look through.
> >
> > //Richard
> >
> > On Sun, Sep 19, 2021, 22:39 Patrick Eriksson
> > <patrick.eriks...@chalmers.se> wrote:
> >
> >     Richard,
> >
> >      > It's expected to take a somewhat arbitrary time.  It reads ASCII.
> >
> >     I have tried multiple times and the pattern is not changing.
> >
> >
> >      > The start-up time is going to be large because of having to find
> >      > the first frequency, which means you have to parse the text
> >      > nonetheless.
> >
> >     Understood. But that overhead seems to be relatively small. In my
> >     test, it seemed to take 4-7 s to reach the first frequency. Anyhow,
> >     this goes in the other direction. To minimise the parsing to reach
> >     the first frequency, it should be better to read all in one go, and
> >     not in parts (which is the case for me).
> >
> >     Bye,
> >
> >     Patrick
> >
>
