Stefan -

128GB of memory to hold everything? That's not so much; I think we may be able 
to work around the limitations of just dumping it all in memory, perhaps.

Have you calculated I/O speeds based on memory specs? If so, please let me 
know which memory specs you are calculating with. In my testing, high-end 
SSDs were not too shabby for lots of random read/write tasks, but I don't 
know what the data you are getting from Wikipedia looks like, and I'd have 
to confirm those results anyway since they were from testing I did about 5 
years ago. I think we had an engineering sample of an NVMe PCIe x16 card 
from Micron, and we were stocking Intel enterprise SSDs via SATA 3.0; we 
might have had an Intel NVMe too, but they didn't take my opinion when it 
came time to order hardware ;)

So I'm looking at clustering with RDMA or something, since high-end NICs 
can also approach memory bus speeds, or maybe surpass them. Although this 
might be a huge, fundamental misunderstanding on my part: memory speed on, 
say, video cards is measured in GB/s, but NICs are in Gb/s. So for a latest 
NVIDIA card's memory bus at 6.4GB/s versus a 45Gb/s RoCE NIC (may be 
prohibitively expensive, unsure), it's the same-ish, since 6.4GB/s (bytes 
into bits) is 51.2Gb/s. It's close, and anyway, that's GPU land.
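The bytes-to-bits comparison above can be sanity-checked in a couple of lines (the 6.4GB/s and 45Gb/s figures are just the example numbers from this email, not measured specs):

```python
# Compare a memory bus rated in GB/s (bytes) against a NIC rated in Gb/s (bits).
gpu_mem_GBps = 6.4    # example GPU memory bandwidth, gigaBYTES per second
nic_Gbps = 45.0       # example RoCE NIC line rate, gigaBITS per second

gpu_mem_Gbps = gpu_mem_GBps * 8  # bytes -> bits
print(f"memory bus: {gpu_mem_Gbps:.1f} Gb/s vs NIC: {nic_Gbps:.1f} Gb/s")
# 6.4 GB/s * 8 = 51.2 Gb/s, so a 45 Gb/s NIC is in the same ballpark
```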

Plus, there would be huge performance hits when switching to a network 
stack, due to memory addressing vs network addressing, unless maybe you 
could work directly with hardware addressing somehow on the network stack? 
Not sure what that might look like, and it really just brings us back to 
IP land, maybe. I don't know; I was never a network engineer, so the OSI 
model is very faded at this point.

I try to work with commodity hardware though, since 1Gbps NICs are still 
far more common than 10Gbps NICs in consumer land in the USA; pretty much 
no one's home internet is faster than 1Gbps anyway. I think even the local 
webhosting company hosts 20k-30k servers, with god knows how many webapps, 
on like 4-5 redundant 10Gbps links.

OR you could avoid most of that and find a software solution, maybe. Here 
is an idea I had:

Maybe consider a caching mechanism on the server: say, a disk image 
containing a local copy of Wikipedia is mounted read-only, updated by some 
other, separate process, and presented as an immutable, read-only volume 
for your application (wikify)'s threads to consume according to some set 
of rules (rules will be necessary beyond an ACL, in order to ensure data 
consistency on read). Then you can let the OS handle memory caching if 
you want.
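A minimal sketch of the "let the OS handle caching" part: map the dump file read-only and let the page cache keep hot pages in RAM. The file here is a stand-in I create on the fly; the real image path and format are up to you:

```python
import mmap
import os
import tempfile

# Create a tiny stand-in for the read-only wikipedia image (hypothetical data).
fd, path = tempfile.mkstemp()
os.write(fd, b"WIKI" + b"\x00" * 4092)
os.close(fd)

# Map it read-only: the OS page cache keeps recently used pages in memory,
# so many reader threads can share one copy without an app-level cache,
# and ACCESS_READ guarantees the mapping itself is immutable.
with open(path, "rb") as f:
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    magic = m[:4]  # random access anywhere in the mapping
    m.close()

os.remove(path)
print(magic)
```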

You could also look at sharding the dataset across multiple disks for 
better threading, but that's more expensive and an unnecessary 
optimization right now anyway.
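If you ever do go that route, the placement logic can be as simple as a stable hash over article titles. Everything here (mount paths, function name) is hypothetical, just to show the shape:

```python
import hashlib

# Hypothetical mount points, one per physical disk.
MOUNTS = ["/mnt/disk0", "/mnt/disk1", "/mnt/disk2"]

def shard_for(title: str) -> str:
    """Pick a disk for an article title.

    Uses a stable hash (not Python's salted built-in hash()) so the
    same title always lands on the same disk across restarts.
    """
    h = int.from_bytes(hashlib.sha1(title.encode("utf-8")).digest()[:8], "big")
    return MOUNTS[h % len(MOUNTS)]

print(shard_for("Alan Turing"))  # deterministic for a given title
```

Reader threads can then fan out across disks instead of contending on one spindle or controller queue.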


Let me know if you have any thoughts!




------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T6322565b7d29a2a0-M84c7bfd6bd548751277ca98e
Delivery options: https://agi.topicbox.com/groups/agi/subscription
