Re: [sqlite] BLOB data performance?
Hi all, I don't want to drive anyone away from SQLite (I don't think that I could anyway :-)), but a good solution for storing large amounts of data is HDF5. HTH -- ds

2007/11/15, Roger Binns <[EMAIL PROTECTED]>:
> Asif Lodhi wrote:
> > Interestingly, Microsoft's SourceSafe (at least VS 6.0) apparently
> > uses the file system
>
> It basically uses a whole bunch of directories and a scheme very
> similar to RCS to store the versioned content.
>
> > while SVN uses Berkeley DB, as I read once.
>
> SVN initially had only Berkeley DB, and it drove people nuts. In
> particular, it used to keep getting wedged and required manual
> administrator intervention to fix. See this and the two following
> questions:
>
> http://subversion.tigris.org/faq.html#stuck-bdb-repos
>
> SVN then added a filesystem-based backend that uses a directory to
> store the deltas for each revision, and that is by far the most popular.
>
> The moral of the tale is to make sure your backend database library
> never needs human attention. I always wondered why they didn't use
> SQLite.
>
> Roger
Re: [sqlite] Saving binary files
Hello John, this is extremely helpful. Thanks a lot!!! Dimitris
Re: [sqlite] Saving binary files
In the sense that the legacy code produces files of ~100MB each. The collection is not legacy; that's what I am trying to set up. Unless I don't understand what you mean.

2007/3/19, guenther <[EMAIL PROTECTED]>:
On Sun, 2007-03-18 at 23:51 +0200, Dimitris Servis wrote:
> in my wildest dreams... if you read carefully, *each* file is about
> 100-200MB. I now end up with a collection of 100-200 of them and need to
> bundle them in one file

Yes, I did read carefully. 100 (source) files, each 100 MByte, stuffed into a single (target, database) file results in that database file being 100*100 MByte. Considering "possibly 200 or more", this easily could result in a single 64+ GByte file.

So, in what way was this meant to be a response regarding my concerns? ;)

guenther
Re: [sqlite] Saving binary files
Hello Guenther,

in my wildest dreams... if you read carefully, *each* file is about 100-200MB. I now end up with a collection of 100-200 of them and need to bundle them in one file.

BR

dimitris

2007/3/18, guenther <[EMAIL PROTECTED]>:
Well, actually I did not mean to post at this stage but to resort to lurking and learning, since I am still doing some rather basic experimenting with SQLite. Anyway, I followed this thread and it strikes me as a crack idea. But aren't these the most fun to hack on? ;)

On Sun, 2007-03-18 at 01:06 +0200, Dimitris P. Servis wrote:
> I want to do the following: save a set of 100-200 (or even more) binary
> files into a single DB file. The binary files are 100-200 (or a bit
> more :-) ) MB each. My requirements are:

One thing that just popped up in my mind when reading this thread... The above calculates to 10-40 (or more) GByte. On the other hand, you recently mentioned "good old untouchable legacy software". It may be just me, but "legacy" and "10 GByte files" just don't mix. Did you think about this yet? Does your legacy system use a file storage backend that can easily handle files of this size?

Just a thought...

guenther
Re: [sqlite] Saving binary files
Hello John,

> You do not have to load the entire file into memory. The best way is to
> memory-map it and use the returned pointer to copy it into the RDBMS. You
> can retrieve it to a file in a similar way. It helps if you store the file
> size in the DB so that you can create a file of the correct size to act as
> a destination for your memcpy. It is only a few lines of code to wrap such
> logic around the current SQLite API.

That's just a great idea. Is there an API in SQLite or should I wrap the native OS APIs?

THANKS!!!

dimitris
Re: [sqlite] Saving binary files
Hello Daniel,

> Personally I think that files should be saved like files, on the filesystem.

Personally I think that each tool should be used for the purpose it was created, just to generalize what you said above. Nevertheless, there are situations like mine, where you need good old untouchable legacy software that once ran on a standalone platform to work over a network in a parallel computing scheme. So you either develop a full transaction/communication/locking etc. system yourself, or you try to use what's already there and robust enough to do it...

BR

dimitris
Re: [sqlite] Saving binary files
Hello Eduardo,

this is one of the alternatives, for sure. It would bundle many files into one very effectively, and even without compression you would have a filesystem. However, my real problem is that I don't want to develop software for handling file access, locking, concurrency etc. myself. What interests me, though, is your suggestion to combine the zipped (tarred or whatever) file with SQLite.

Thanks a lot!!!

BR

dimitris

2007/3/18, Eduardo Morras <[EMAIL PROTECTED]>:
At 19:00 18/03/2007, you wrote:
>Hello John,
>
>thanks for the valuable piece of advice. The idea is that either
>
>1) I store data in tabular form and work with them
>2) I create a table of blobs and each blob is the binary content of a file
>
>(2) is my method in question; for (1) we all know it works. So I turned to
>SQLite just because it seems to be a lightweight single-file database.
>So, even if I don't like (2), I can set up an implementation where I have
>a file system inside a fully portable file.
>
>BR
>
>dimitris

You can use a zip archive (via zlib) to do what you want. It has functions to add and delete files, it's a flat file, and it provides medium/good compression. You can store your file metadata in SQLite: the zip filename, the name of the binary file, an abstract, or even a password for the zip file.

HTH
Re: [sqlite] Saving binary files
Hello John,

thanks for the valuable piece of advice. The idea is that either

1) I store data in tabular form and work with them
2) I create a table of blobs where each blob is the binary content of a file

(2) is my method in question; for (1) we all know it works. So I turned to SQLite just because it seems to be a lightweight single-file database. So, even if I don't like (2), I can set up an implementation where I have a file system inside a fully portable file.

BR

dimitris

2007/3/18, John Stanton <[EMAIL PROTECTED]>:
A word of warning if you use the traditional method, an RDBMS table with descriptive data and a reference to the name of the file storing the binary data. If you store a lot of files in a directory you can get into trouble. A robust design uses some form of tree structure of directories to limit the size of individual directories to a value which the system utilities can handle. It is very tedious to discover that "ls" does not work on your directory!

Martin Jenkins wrote:
> Dimitris P. Servis wrote:
>> I have to provide evidence that such an unorthodox solution is also
>> feasible
>
> If it was me I'd "investigate" the problem by doing the "right" thing in
> the first place, by which time I'd know enough to knock up the "wrong"
> solution for the doubters before presenting the "proper" solution as a
> fait accompli.
>
>> I have to compare access performance with flat binary files
>
> If I remember correctly, there's no random access to BLOBs, so all you'd
> be doing is storing a chunk of data and reading the whole lot back. I
> don't think that's a realistic test - the time it takes SQLite to find
> the pages/data will be a tiny fraction of the time it will take to read
> that data off the disk. You can't compare performance against reading
> "records" out of the flat file because "they" won't let you do that. In
> all, it doesn't sound very scientific. ;)
>
> Martin
Re: [sqlite] Saving binary files
Hello Martin,

> If it was me I'd "investigate" the problem by doing the "right" thing in
> the first place, by which time I'd know enough to knock up the "wrong"
> solution for the doubters before presenting the "proper" solution as a
> fait accompli.

That's already been done. It is more or less that I now harvest the misinterpretation of my own words and have to implement pretty nonsensical stuff. It seems like you're right inside my mind ;-)

> If I remember correctly, there's no random access to BLOBs, so all you'd
> be doing is storing a chunk of data and reading the whole lot back. I
> don't think that's a realistic test - the time it takes SQLite to find
> the pages/data will be a tiny fraction of the time it will take to read
> that data off the disk. You can't compare performance against reading
> "records" out of the flat file because "they" won't let you do that. In
> all, it doesn't sound very scientific. ;)

That's absolutely correct; that's why I am so relaxed, considering that I have to prove that elephants rarely fly. I'll be using SQLite as a file system, as already pointed out, so the only overhead compared to reading the flat binary file is the tiny bit of time needed to access the record. Unless I am missing something, there will be no penalty there. However, collecting the (i,j)-th element of each array stored in a relational database into a vector will be much faster with a nice schema than digging into the binary files. This is my major use case.

Thanks a lot for the help

dimitris
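The "(i,j)-th element of each array" use case above is where the relational schema pays off: stored element-wise, the cross-file gather becomes a single indexed query instead of a seek into every binary file. A minimal sketch, with an invented `elements` table and tiny 2x2 arrays standing in for the real 100 MB result files:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per array element; the composite primary key doubles as the
# index that makes the (i, j) lookup cheap.
conn.executescript(
    """CREATE TABLE elements (
           file_id INTEGER NOT NULL,
           i INTEGER NOT NULL,
           j INTEGER NOT NULL,
           value REAL NOT NULL,
           PRIMARY KEY (file_id, i, j)
       );"""
)

# Two tiny arrays, file_id 1 and 2.
rows = [
    (1, 0, 0, 1.0), (1, 0, 1, 2.0), (1, 1, 0, 3.0), (1, 1, 1, 4.0),
    (2, 0, 0, 5.0), (2, 0, 1, 6.0), (2, 1, 0, 7.0), (2, 1, 1, 8.0),
]
conn.executemany("INSERT INTO elements VALUES (?, ?, ?, ?)", rows)

def column_across_files(conn, i, j):
    """Gather element (i, j) from every stored array, ordered by file."""
    return [
        v for (v,) in conn.execute(
            "SELECT value FROM elements WHERE i = ? AND j = ? ORDER BY file_id",
            (i, j),
        )
    ]
```

One caveat on the query plan: with this key order the (i, j) predicate cannot use the primary-key index directly, so a dedicated index on `(i, j)` would be the natural addition once the table grows.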
Re: [sqlite] Saving binary files
That's not a bad idea at all and I'll check it out. However, since the data is written by a client, I can only do arbitrary chopping, without separating the chunks in a sensible manner. Maybe I don't need that though, as I could use it to set up a paging system in memory.

Thanks!!!

2007/3/18, Teg <[EMAIL PROTECTED]>:
Hello Dimitris,

If I was going to do this, I'd chop up the binary file into some manageable length, say 1 meg, and insert each chunk as a blob, including an index record for each chunk so you can select them in order and re-assemble them piece by piece as you enumerate through the records. That way you never have to hold the entire 200 meg file in memory, and you get some kind of random access. Basically you're using SQLite as a file system and each record becomes a "cluster". I think that's very doable as far as storage is concerned. Don't know about the locking part. Since the bottleneck is the disk drive, I'd probably use a single worker and a queue to serialize access to the DB.

Saturday, March 17, 2007, 7:06:46 PM, you wrote:

DPS> Hi all,
DPS> I want to do the following: save a set of 100-200 (or even more) binary
DPS> files into a single DB file. The binary files are 100-200 (or a bit
DPS> more :-) ) MB each. My requirements are:
DPS> 1) Many clients should be able to connect to the database to save their
DPS> files. Clients are actually client programs that calculate the binary
DPS> file, so the DB server must be able to handle the concurrency of requests.
DPS> 2) The file should be portable and movable (i.e. copy-paste will do, no
DPS> special arrangement to move it around).
DPS> Ideally I would like to provide client programs with a stream to read
DPS> and write files. I guess the files should be stored as blob records in a
DPS> single table within the database. So my question is whether all this is
DPS> possible, since I am not very familiar with SQLite (I have been
DPS> redirected here).
DPS> Just to straighten things out: I know this is not the orthodox use of a
DPS> DBMS, i.e. I should store my nice scientific data in tables and define
DPS> good relations and stuff. I really do believe in this scheme. However,
DPS> at this point I have to provide evidence that such an unorthodox
DPS> solution is also feasible (not to mention that I have to compare access
DPS> performance with flat binary files :-/ ).
DPS> TIA
DPS> -- ds

--
Best regards,
Teg
Re: [sqlite] Saving binary files
Hello Richard,

I have to admit you're right. Probably no DBMS is the right tool for that... However, I have to prove that it can be done, though I of course favor a relational-tables solution.

thanks a lot

-- dimitris

2007/3/18, [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
"Dimitris P. Servis" <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I want to do the following: save a set of 100-200 (or even more) binary
> files into a single DB file. The binary files are 100-200 (or a bit
> more :-) ) MB each.

SQLite is not really the right tool for that.

--
D. Richard Hipp <[EMAIL PROTECTED]>