"Yes, a bit perhaps."
I tested it, and it is of no consequence (at least for my applications).
Given one transaction per second for a full year, fetching a random day
through a +Ref +String index takes a fraction of a second on my PC equipped
with an SSD. Note that only the collect at the end takes a fraction of a
second; the insertions do NOT. Here is the code:

   (class +Transaction +Entity)
   (rel amount (+Number))
   (rel createdAt (+Ref +String))

   (dbs
      (4 (+Transaction createdAt)) )

   (pool "/opt/picolisp/projects/test/db/db" *Dbs)

   # One transaction per second for every day of 2013
   (setq Sday (date 2013 01 01))
   (setq Eday (+ Sday 364))
   (setq F (db: +Transaction))
   (for (D Sday (>= Eday D) (inc D))
      (for (S 1 (>= 86400 S) (inc S))
         (let Stamp (stamp D S)
            (new F '(+Transaction) 'amount 100 'createdAt Stamp) ) ) )

(println (collect 'createdAt '+Transaction "2013-10-05 00:00:00"
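
For reference, a complete one-day range query might look like the sketch
below. It relies on 'stamp' strings having a fixed width, so they sort
lexicographically in chronological order and 'collect' can take them as the
range bounds. 'dayTransactions' is a hypothetical helper name, and the
"23:59:59" end-of-day bound is an assumption, not something taken from the
code above.

```picolisp
# Sketch only: 'dayTransactions' is a hypothetical helper, and the
# end-of-day bound is an assumption.
(de dayTransactions (Dat)
   (collect 'createdAt '+Transaction
      (stamp Dat (time 0 0 0))          # lower bound, "YYYY-MM-DD 00:00:00"
      (stamp Dat (time 23 59 59)) ) )   # upper bound, "YYYY-MM-DD 23:59:59"

# Count the transactions found for 2013-10-05
(println (length (dayTransactions (date 2013 10 5))))
```

Passing a date value instead of a literal string keeps the caller independent
of the stamp format.
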
On Sat, Feb 8, 2014 at 5:44 PM, Alexander Burger <a...@software-lab.de> wrote:
> Hi Henrik,
> On Fri, Feb 07, 2014 at 08:29:07PM +0700, Henrik Sarvell wrote:
> > Given a very large number of external objects, representing for instance
> > transactions, what would be the quickest way of handling their creation
> > with regard to future lookups by way of start stamp and end stamp?
> > It seems to me that using two relations might be optimal, one +Ref +Date
> > and an extra +Ref +Time. Then a lookup could first use the +Date relation
> > to filter out all transactions that weren't created during the specified
> > days followed by (optionally) a filter by +Time.
> You could use two separate relations, but then I would definitely
> combine them with '+Aux':
>
>    (rel d (+Aux +Ref +Date) (t))  # Date
>    (rel t (+Time))                # Time
>
> In this way a single B-Tree access is sufficient to find any time range.
> For example, to find all entities between today noon and tomorrow noon:
>
>    (collect 'd '+Mup
>       (list (date) (time 12 0 0))
>       (list (inc (date)) (time 11 59 59)) )
>
> Another possibility is using not two separate relations, but a single
> bag relation:
>
>    (rel ts (+Ref +Bag) ((+Date)) ((+Time)))  # Timestamp
>
> This saves a little space in the objects, but results in the same index
> entry format.
> But anyway, in both cases a single index tree is used. In the first case
> you also have the option to define the time as
>
>    (rel t (+Ref +Time))  # Time
>
> with an additional separate index, so that you can search also for
> certain time ranges only (no matter what the date is).
> > Or am I over-thinking it? Is a simple +Ref +Number with a UNIX timestamp
> > an easier approach that is just as fast?
> I think this would not make any difference in speed (regarding index
> access), but would have some disadvantages, like having to convert this
> format to/from PicoLisp date and time values, and being limited in range
> (the Unix timestamp cannot represent dates before 1970).
> > A +Ref +String storing the result of a call to stamp would be ideal as
> > information is human readable without conversions. However, I suspect
> > a start-end lookup on it would be much slower than the above, or?
> Yes, a bit perhaps. Parsing and printing human readable date and time
> values is simple in PicoLisp (e.g. with 'date', 'stamp', 'datStr' and
> related functions, see http://software-lab.de/doc/refD.html#date).
> ♪♫ Alex