"Yes, a bit perhaps."

I tested it, and it is of no consequence (at least for my applications). Given one transaction per second for a full year, fetching a random day through the +Ref +String index takes a fraction of a second on my SSD-equipped PC. Here is the code:
Note that only the collect at the end takes a fraction of a second, the insertions do NOT.

(class +Transaction +Entity)
(rel amount (+Number))
(rel createdAt (+Ref +String))

(dbs
   (4 +Transaction)
   (4 (+Transaction createdAt)) )

(pool "/opt/picolisp/projects/test/db/db" *Dbs)

(setq Sday (date 2013 01 01))
(setq Eday (+ Sday 364))
(setq F (db: +Transaction))

# One transaction per second (S runs from 1 to 86400) for every day of 2013
(for (D Sday (>= Eday D) (inc D))
   (for (S 1 (>= 86400 S) (inc S))
      (let Stamp (stamp D S)
         (println Stamp)
         (new F '(+Transaction) 'amount 100 'createdAt Stamp) ) )
   # Commit and prune after each day
   (commit)
   (prune) )
(commit)
(prune T)

# Fetch all transactions of a single day through the createdAt index
(println
   (collect 'createdAt '+Transaction
      "2013-10-05 00:00:00"
      "2013-10-05 23:59:59" ) )
(bye)


On Sat, Feb 8, 2014 at 5:44 PM, Alexander Burger <a...@software-lab.de> wrote:

> Hi Henrik,
>
> On Fri, Feb 07, 2014 at 08:29:07PM +0700, Henrik Sarvell wrote:
> > Given a very large amount of external objects, representing for instance
> > transactions, what would be the quickest way of handling the creation stamp
> > be with regards to future lookups by way of start stamp and end stamp?
> >
> > It seems to me that using two relations might be optimal, one +Ref +Date
> > and an extra +Ref +Time. Then a lookup could first use the +Date relation
> > to filter out all transactions that weren't created during the specified
> > days followed by (optionally) a filter by +Time.
>
> You could use two separate relations, but then I would definitely
> combine them with '+Aux'
>
>    (rel d (+Aux +Ref +Date) (t))  # Date
>    (rel t (+Time))  # Time
>
> In this way a single B-Tree access is sufficient to find any time range.
> For example, to find all entities between today noon and tomorrow noon:
>
>    (collect 'd '+Mup
>       (list (date) (time 12 0 0))
>       (list (inc (date)) (time 11 59 59)) )
>
>
> Another possibility is using not two separate relations, but a single
> bag relation
>
>    (rel ts (+Ref +Bag) ((+Date)) ((+Time)))  # Timestamp
>
> This saves a little space in the objects, but results in the same index
> entry format.
>
>
> But anyway, in both cases a single index tree is used. In the first case
> you also have the option to define the time as
>
>    (rel t (+Ref +Time))  # Time
>
> with an additional separate index, so that you can search also for
> certain time ranges only (no matter what the date is).
>
>
> > Or am I over-thinking it, is a simple +Ref +Number with a UNIX timestamp an
> > easier approach that is just as fast?
>
> I think this would not make any difference in speed (regarding index
> access), but would have some disadvantages, like having to convert this
> format to/from PicoLisp date and time values, and being limited in range
> (the Unix timestamp cannot represent dates before 1970).
>
>
> > A +Ref +String storing the result of a call to stamp would be ideal as the
> > information is human readable without conversions. However, I suspect that
> > a start-end lookup on it would be much slower than the above, or?
>
> Yes, a bit perhaps. Parsing and printing human readable date and time
> values is simple in PicoLisp (e.g. with 'date', 'stamp', 'datStr' and
> related functions, see http://software-lab.de/doc/refD.html#date).
>
> ♪♫ Alex
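
A side note on why a start/end lookup on a +Ref +String of 'stamp' values can be a plain range query: the strings are fixed-width with the most significant field first, so string order should agree with chronological order. The following is only a minimal sketch to check that assumption outside the database; the variable names D, From, Mid, Till and Next are illustrative and not part of the script above.

```picolisp
# Sketch only: D, From, Mid, Till and Next are illustrative names
(setq D (date 2013 10 5))                  # Day number for 2013-10-05
(setq From (stamp D (time 0 0 0)))         # First second of that day
(setq Mid  (stamp D (time 12 0 0)))        # Noon of that day
(setq Till (stamp D (time 23 59 59)))      # Last second of that day
(setq Next (stamp (inc D) (time 0 0 0)))   # Midnight of the following day

(println From Till)

# Compare the stamps as strings; if string order matches chronological
# order, both checks should succeed
(println (< From Mid Till))   # Is noon inside the range?
(println (< Till Next))       # Is the next day past the upper bound?
```

If both comparisons hold, the two bounds passed to 'collect' in the script delimit exactly the stamps of that one day, without any conversion to date and time numbers.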