On 29.03.2013 20:17, Jeff Archer wrote:
I have previously made an apparently bad assumption about this, so now I
would like to go back to the beginning of the problem and ask the most
basic question first, without any preconceived ideas.
This use case is from an image processing application. I have a large
amount of intermediate data (it far exceeds the physical memory on my 24 GB
machine), so I need to store it temporarily on disk until getting to the next
phase of processing. I am planning to use a large SSD dedicated to holding
this temporary data. I do not need any recoverability in case of hardware,
power or other failure. Each item to be stored is 9 DWORDs, 4 doubles and
2 variable-sized BLOBs, which are images.
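For concreteness, one such item could be declared as a single SQLite row
roughly like this (the column names here are only placeholders, not anything
decided yet):

    import sqlite3

    con = sqlite3.connect("scratch.db")
    # 9 DWORDs -> INTEGER columns, 4 doubles -> REAL, 2 images -> BLOB.
    # Column names are placeholders only.
    con.execute("""
        CREATE TABLE IF NOT EXISTS item (
            id INTEGER PRIMARY KEY,
            d1 INTEGER, d2 INTEGER, d3 INTEGER, d4 INTEGER, d5 INTEGER,
            d6 INTEGER, d7 INTEGER, d8 INTEGER, d9 INTEGER,
            x1 REAL, x2 REAL, x3 REAL, x4 REAL,
            img_a BLOB, img_b BLOB
        )
    """)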
I could write directly to a file myself, but I would need to provide some
minimal indexing, some housekeeping to manage the variable-sized BLOBs, and
some minimal synchronization so that multiple instances of the same
application could operate simultaneously on a single set of data.
So then I thought that SQLite could manage these things nicely for me, so
that I don't have to write and debug indexing and housekeeping code that
already exists in SQLite.
So, the question is: what is the way to get the fastest possible performance
from SQLite when I am willing to give up all recoverability guarantees?
Or is it simply that I should just write directly to a file myself?
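For reference, the settings that trade SQLite's crash recovery away for speed
are PRAGMAs; a minimal sketch (these values are just an example, not tuned for
this workload) might be:

    import sqlite3

    con = sqlite3.connect("scratch.db")
    # Trade all crash recovery for speed: no fsync, rollback journal kept in RAM.
    # (A crash can then leave the database file corrupt, which is acceptable here.)
    con.execute("PRAGMA synchronous = OFF")
    con.execute("PRAGMA journal_mode = MEMORY")
    con.execute("PRAGMA temp_store = MEMORY")
    # If several processes really do write at the same time, WAL mode
    # (PRAGMA journal_mode = WAL) is the usual compromise instead.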
Suggestion: put the small, fixed-size data into an SQLite database. You
won't be searching inside the blobs with the database engine, and the
amount of data you have to process is large; to make it fast, you should
write the image data into plain files. The other data that is necessary for
processing (ordering, indexing, searching, comparison) is best stored in an
SQLite database.
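For example (table and column names are just made up for illustration), the
SQLite row then only describes where the image bytes live:

    import sqlite3

    con = sqlite3.connect("index.db")
    # Searchable metadata plus the *location* of each image; the pixels
    # themselves stay in ordinary files on the SSD.
    con.execute("""
        CREATE TABLE IF NOT EXISTS item (
            id       INTEGER PRIMARY KEY,
            chunk_no INTEGER,   -- which numbered data file holds the image
            offset   INTEGER,   -- byte position inside that file
            length   INTEGER    -- unpadded size of the image in bytes
        )
    """)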
To improve the access speed for your images, use full pages: pad smaller
images up to the next page boundary (4 KB, 8 KB, ... for example). Split
long files into smaller, sequentially numbered chunks (16 to 64 MB); this
makes OS file operations faster, because the OS has to cache the file's
block index while opening and processing it, and smaller files keep that
index small. The positions can then be indexed in SQLite.
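A rough sketch of that bookkeeping, with page and chunk sizes picked
arbitrarily and file names invented, could look like this:

    PAGE = 4096                # pad each image up to this boundary
    CHUNK_MAX = 64 * 1024**2   # start a new, numbered file beyond this size

    def pad(data):
        # Pad the image bytes with zeros up to the next page boundary.
        return data + b"\0" * ((-len(data)) % PAGE)

    def store(images):
        # Write padded images into sequentially numbered chunk files,
        # yielding (chunk_no, offset, length) for indexing in SQLite.
        chunk_no, offset = 0, 0
        f = open("images.%06d.dat" % chunk_no, "wb")
        for img in images:
            blob = pad(img)
            if offset + len(blob) > CHUNK_MAX:
                f.close()
                chunk_no, offset = chunk_no + 1, 0
                f = open("images.%06d.dat" % chunk_no, "wb")
            yield chunk_no, offset, len(img)
            f.write(blob)
            offset += len(blob)
        f.close()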
I have a similar application for the vectorized digitization of old
handwritten scripts, and I use a database for the searchable information
while using external files (split as described) for the raster images and
vector files. SQLite would be too slow for blobs of the size you need, so
put them outside but keep the indexes inside. Another advantage of this
approach is that you can process many binary files simultaneously, whereas
by putting them inside a database like SQLite you have only one writer.
The use of transactions makes inserting data faster, especially when you
have indexes. Also, try to create your indexes only after fully inserting
your data, because that makes the process faster.
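Roughly, reusing the invented table from before (recreated here so the
snippet stands on its own):

    import sqlite3

    con = sqlite3.connect("index.db")
    con.execute("CREATE TABLE IF NOT EXISTS item"
                "(id INTEGER PRIMARY KEY, chunk_no INTEGER,"
                " offset INTEGER, length INTEGER)")

    # Dummy rows standing in for the real (chunk_no, offset, length) records.
    rows = [(i, i // 1000, (i % 1000) * 4096, 4096) for i in range(100000)]

    # One explicit transaction around the whole bulk load instead of an
    # implicit commit per INSERT.
    with con:
        con.executemany(
            "INSERT OR REPLACE INTO item(id, chunk_no, offset, length)"
            " VALUES (?, ?, ?, ?)", rows)

    # Create the index only after the data is in; maintaining it row by row
    # during the load would be slower.
    con.execute("CREATE INDEX IF NOT EXISTS item_by_chunk"
                " ON item(chunk_no, offset)")
    con.close()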