"Yes, a bit perhaps."

I tested it, and it is of no consequence (at least for my applications). Given
one transaction per second for a full year, fetching a random day through a
+Ref +String index takes a fraction of a second on my PC, which is equipped
with an SSD.

Note that only the collect at the end takes a fraction of a second; the
insertions do NOT. Here is the code:

(class +Transaction +Entity)
(rel amount (+Number))
(rel createdAt (+Ref +String))

(dbs
   (4 +Transaction)
   (4 (+Transaction createdAt)) )

(pool "/opt/picolisp/projects/test/db/db" *Dbs)

(setq Sday (date 2013 01 01))
(setq Eday (+ Sday 364))
(setq F (db: +Transaction))

(for (D Sday (>= Eday D) (inc D))
   (for (S 0 (> 86400 S) (inc S))  # time values run from 0 (00:00:00) to 86399 (23:59:59)
      (let Stamp (stamp D S)
         (println Stamp)
         (new F '(+Transaction) 'amount 100 'createdAt Stamp) ) )
   (commit)
   (prune) )

(commit)
(prune T)

(println
   (collect 'createdAt '+Transaction
      "2013-10-05 00:00:00"
      "2013-10-05 23:59:59" ) )

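# Optional sanity check, a sketch that was not part of the original test run.
# With one transaction per second, a full day should hold about 86400
# entries. Assumption: 'bench' is only available in debug mode (e.g. when
# the interpreter is started with a trailing '+'), so drop it otherwise.
(bench
   (println
      (length
         (collect 'createdAt '+Transaction
            "2013-10-05 00:00:00"
            "2013-10-05 23:59:59" ) ) ) )
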
(bye)




On Sat, Feb 8, 2014 at 5:44 PM, Alexander Burger <a...@software-lab.de> wrote:

> Hi Henrik,
>
> On Fri, Feb 07, 2014 at 08:29:07PM +0700, Henrik Sarvell wrote:
> > Given a very large number of external objects, representing for instance
> > transactions, what would be the quickest way of handling the creation
> > stamp with regards to future lookups by way of start stamp and end stamp?
> >
> > It seems to me that using two relations might be optimal, one +Ref +Date
> > and an extra +Ref +Time. Then a lookup could first use the +Date relation
> > to filter out all transactions that weren't created during the specified
> > days followed by (optionally) a filter by +Time.
>
> You could use two separate relations, but then I would definitely
> combine them with '+Aux'
>
>    (rel d (+Aux +Ref +Date) (t)) # Date
>    (rel t (+Time))               # Time
>
> In this way a single B-Tree access is sufficient to find any time range.
> For example, to find all entities between today noon and tomorrow noon:
>
>    (collect 'd '+Mup
>       (list (date) (time 12 0 0))
>       (list (inc (date)) (time 11 59 59)) )
>
>
> Another possibility is using not two separate relations, but a single
> bag relation
>
>    (rel ts (+Ref +Bag) ((+Date)) ((+Time)))  # Timestamp
>
> This saves a little space in the objects, but results in the same index
> entry format.
>
>
> But anyway, in both cases a single index tree is used. In the first case
> you also have the option to define the time as
>
>    (rel t (+Ref +Time))          # Time
>
> with an additional separate index, so that you can search also for
> certain time ranges only (no matter what the date is).
>
>
> > Or am I over-thinking it, is a simple +Ref +Number with a UNIX timestamp
> > an easier approach that is just as fast?
>
> I think this would not make any difference in speed (regarding index
> access), but would have some disadvantages, like having to convert this
> format to/from PicoLisp date and time values, and being limited in range
> (the Unix timestamp cannot represent dates before 1970).
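
A rough sketch of the conversion overhead mentioned above, assuming UTC and
timestamps from 1970 onwards ('unixStamp' and 'fromUnix' are hypothetical
helper names, not part of PicoLisp):

   # Days since 1970-01-01 times 86400, plus seconds since midnight
   (de unixStamp (Dat Tim)
      (+ (* 86400 (- Dat (date 1970 1 1))) Tim) )

   # Returns a (date time) list; only meaningful for non-negative N
   (de fromUnix (N)
      (list
         (+ (date 1970 1 1) (/ N 86400))
         (% N 86400) ) )

   (unixStamp (date 1970 1 2) 0)  # -> 86400
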
>
>
> > A +Ref +String storing the result of a call to stamp would be ideal as
> > the information is human readable without conversions. However, I suspect
> > that a start-end lookup on it would be much slower than the above, or?
>
> Yes, a bit perhaps. Parsing and printing human readable date and time
> values is simple in PicoLisp (e.g. with 'date', 'stamp', 'datStr' and
> related functions, see http://software-lab.de/doc/refD.html#date).
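
A minimal sketch of such a round trip, assuming the "YYYY-MM-DD HH:MM:SS"
format produced by 'stamp' ('parseStamp' is a hypothetical helper name, not
a built-in):

   # Split the stamp at the space, then parse each half separately
   (de parseStamp (Str)
      (let L (split (chop Str) " ")
         (list
            ($dat (pack (car L)) "-")    # date part, "-" as separator
            ($tim (pack (cadr L))) ) ) ) # time part

   # Feeding the parsed (date time) list back into 'stamp' should
   # reproduce the original string
   (apply stamp (parseStamp "2013-10-05 12:34:56"))
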
>
> ♪♫ Alex
