Re: [sqlite] New SQLite Forum established - this mailing list is deprecated
I am happy to do that, but I can't promise any kind of timeline as it will have to be in my free time only. Can't really sign well, though, as I am totally blind. But I guess we'll cross that bridge if I come up with a working audio captcha solution. I'll have a look at the Fossil captcha code this evening and write back if I have questions. Should I write to you directly, or to the Fossil forum? Thanks! Philip On 3/13/2020 5:16 PM, Richard Hipp wrote: On 3/13/20, Philip Bennefall wrote: I submitted a thread a while back offering to work on an audio captcha for Fossil, I don't recall that thread. But if you want to submit code that generates an audio file of some kind that speaks the text of a Fossil Captcha, that would be great. I will build it into the system, assuming it works, does not have onerous external dependencies, and you can get me a signed CLA. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] New SQLite Forum established - this mailing list is deprecated
If you are comfortable doing that for any blind users who stop by, then of course that would be a quick fix. But it feels like taking a step backwards, from being able to subscribe without assistance to having to contact the author of whatever Fossil forum this applies to. I submitted a thread a while back offering to work on an audio captcha for Fossil, but cannot locate the exact post. I did not receive much response at the time, but am wondering if this would be of interest? Thanks! Kind regards, Philip Bennefall On 3/13/2020 4:53 PM, Richard Hipp wrote: On 3/13/20, Philip Bennefall wrote: Is there a solution in the pipeline for the inaccessible captcha in the forum for visually impaired users? The solution is for you to send me a private email asking for me to create your account for you, as doing that is way, way easier than trying to engineer an audible captcha. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] New SQLite Forum established - this mailing list is deprecated
Hi Richard, Is there a solution in the pipeline for the inaccessible captcha in the forum for visually impaired users? I brought this up when the forum was first released and you kindly created an account manually for me, but it doesn't seem as though the problem was actually solved in the system itself. I just tried to create an account on the SqLite forum, but could not. The absolute majority of captcha systems offer some kind of audio version now, which is very important for me and other blind users. Kind regards, Philip Bennefall On 3/13/2020 4:22 PM, Richard Hipp wrote: On 3/13/20, Huỳnh Trần Khanh wrote: [On a mailing nlist] I can filter the posts, sort them, search through them, archive them, forward them to a friend, You can do all of that with the SQLite Forum. Remember, all content is still delivered directly to your in-box, just like with a mailing list, so anything you can do with content received from a mailing list can also be done with content from the forum. But there are many things that the forum provides that a mailing list does now. For example, if you want an archive of the forum activity, you can clone the entire history with one command: fossil clone https://sqlite.org/forum sqlite-forum.fossil Then periodically "sync" to keep your private archive up-to-date. Now you have all historical content, neatly packaged in an SQL database. You can extract and search and manage the content in this archive in any way you want. The only thing that you can do with a mailing list that the SQLite Forum does not allow is to submit new postings via email. You must use the web interface in order to post a message. In my experience, this forces people to take a little extra time to think about what they are saying, and to format and arrange their thoughts for clarity, and hence results in a better experience for the readers. There are other important features that the forum provides that mailinglists typically do not: 1. You can format your postings using Markdown 2. You can add hyperlinks to your postings that are consistently displayed and are not dependent on the idiosyncrasies of various email clients. 3. You can edit prior posts to fix typos or mistakes. 4. Your email address is never displayed, even to subscribers. 5. It is much easier to contribute anonymously to a web-based forum than it is to contribute on a mailing list. There is no verification process to go through. You just type in what you want to say and press "Submit". 6. Moderators have much better control over spam and other malicious content. The first point (use of Markdown) is the killer feature for me. There was a recent thread on this mailing list that involved people posting EXPLAIN output. That text gets hopelessly jumbled on most email readers. If those messages had been formatted with Markdown, they would have been much easier to read and understand. I've been using both this mailing list and the Forum on Fossil regularly for two years now. The forum is so much nicer that I have come to dread having to work with the legacy mailing list, at least for complex subjects. It is time for a switch to better technology. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance vs. memory trade-off question
Hi Richard, I can see the advantage of both, but personally I prefer the greater speed with slightly more memory approach (e.g. what we have now). Kind regards, Philip Bennefall On 12/14/2019 2:27 PM, Richard Hipp wrote: A new feature on a branch has the following disadvantages: (1) It uses about 0.25% more CPU cycles. (Each release of SQLite is normally about 0.5% faster, so enabling this feature is sort of like going back by one-half of a release cycle.) (2) The code space (the size of the library) is between 400 and 500 bytes larger (depending on compiler and optimization settings). The this one advantage: (3) Each database connection uses about 72 KB less heap space. QUESTION: Should this feature be default-on or default-off? What's more important to you? 0.25% fewer CPU cycles or about 72KB less heap space used per database connection? The feature can be activated or deactivated at start-time, but you take the disadvantages (the performance hit and slightly larger library size) regardless, unless you disable the feature at compile-time. If the feature is compile-time disabled, then the corresponding code is omitted and and it cannot be turned on at start-time. If you have opinions, you can reply to this mailing list, or directly to me. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Tutorial on SQLite Internals - 2019-11-05 in Houston, TX
I will unfortunately not be able to participate, but I am wondering if it will be recorded and put on YouTube or similar? I have seen some talks of yours from TCL and other conferences and enjoyed them very much. Kind regards, Philip Bennefall On 9/3/2019 3:12 AM, Richard Hipp wrote: There will be a full-day tutorial on SQLite Internals on Tuesday 2019-11-05 in Houston TX. See details at: https://www.tcl.tk/community/tcl2019/tutorials.html#drh:sqlite-tour1 This will be an intensive tutorial. Bring your laptop, with a C-compiler already installed, and also with TCL development libraries (needed to compile SQLite from canonical sources) and Fossil installed, and be comfortable using the compiler tools before you arrive, as we have a lot of ground to cover and will need to move quickly. This tutorial is part of the 2019 Tcl/Tk conference, which will continue on the following three days. To maximize your learning experience, plan stay as long as you can, as these events normally involve a lot of socializing with a group of very smart people. I will be giving the tutorial (obviously) but I am not a conference organizer and am not up on all the details. If you have questions, please ask and I will see if I can find answers. Or, some of the conference organizers are on this mailing list, so perhaps they will speak up. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] database like file archive
The earliest version of the shell which ships with the archive support is 3.22.0, according to the page I linked to. If you have an earlier version you could simply grab the Mac OSX precompiled binaries from the download page on sqlite.org and you'll be good to go. Kind regards, Philip Bennefall On 8/27/2019 4:18 PM, Peng Yu wrote: There is the sqlar archive format, which you can test using the official sqlite3 command line shell. There is also a library for it as part of the Sqlite3 repository. https://www.sqlite.org/sqlar.html https://sqlite.org/sqlar/doc/trunk/README.md This is good to know. How to install it? In homebrew's sqlite package, I don't find sqlar. I use Mac OS X. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] database like file archive
There is the sqlar archive format, which you can test using the official sqlite3 command line shell. There is also a library for it as part of the Sqlite3 repository. https://www.sqlite.org/sqlar.html Kind regards, Philip Bennefall On 8/27/2019 3:56 PM, Peng Yu wrote: Hi, I haven't found an archive format that allows in-place delete (I know that .zip, .7z and .tar don't). This means that whenever delete is needed, the original archive must be copied first. This can be problematic when the archive is large and the file to delete is small. Something along the line of the ability of sqlite3 to perform in-place delete might be a useful feature for archives. But I haven't found any such archive format. Does anybody know one? Thanks. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] CTE to Get Path In a Tree
On 5/12/2019 2:14 PM, Igor Tandetnik wrote: On 5/12/2019 6:19 AM, Philip Bennefall wrote: Hi everyone, I have a tree of folders and I want to find the complete path from any arbitrary point back to the top level directory. The schema is: CREATE TABLE IF NOT EXISTS folders( id INTEGER PRIMARY KEY, parentFolderId INTEGER REFERENCES folders(id) ON DELETE CASCADE ON UPDATE CASCADE, name TEXT NOT NULL); I made the following CTE: WITH RECURSIVE folderTree (id, path) AS(SELECT id, name FROM folders WHERE parentFolderId IS NULL UNION ALL SELECT folders.id, folderTree.path || '/' || folders.name FROM folders, folderTree WHERE folders.parentFolderId = folderTree.id) SELECT path FROM folderTree WHERE id=?1; This produces the correct result, but I am wondering if there is a more efficient way? This query seems to generate the entire tree and then do a table scan to find just the one row I am looking for. Can I start from the given row and *only* traverse upwards through the levels until I find a node with no parent? Just reverse the conditions. Something like this (not tested): WITH RECURSIVE folderPath(id, parentId, path) AS(SELECT id, parentFolderId, name FROM folders WHERE id=?1) UNION ALL SELECT f.id, f.parentFolderId, f.name || '/' || fp.path FROM folders f join folderPath fp on (f.id = fp.parentId)) SELECT path FROM folderPath WHERE parentId is null; Thanks, that seems to work with a couple of very minor tweaks. The query plans are somewhat different and my gut feeling is that the one you wrote is better, but I will measure against some larger datasets just to be sure. Thanks for the quick response! Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] CTE to Get Path In a Tree
Hi everyone, I have a tree of folders and I want to find the complete path from any arbitrary point back to the top level directory. The schema is: CREATE TABLE IF NOT EXISTS folders( id INTEGER PRIMARY KEY, parentFolderId INTEGER REFERENCES folders(id) ON DELETE CASCADE ON UPDATE CASCADE, name TEXT NOT NULL); I made the following CTE: WITH RECURSIVE folderTree (id, path) AS(SELECT id, name FROM folders WHERE parentFolderId IS NULL UNION ALL SELECT folders.id, folderTree.path || '/' || folders.name FROM folders, folderTree WHERE folders.parentFolderId = folderTree.id) SELECT path FROM folderTree WHERE id=?1; This produces the correct result, but I am wondering if there is a more efficient way? This query seems to generate the entire tree and then do a table scan to find just the one row I am looking for. Can I start from the given row and *only* traverse upwards through the levels until I find a node with no parent? Thanks in advance for any pointers. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fossil Delta Compression in SqLite
I am far from an expert in this field myself so I don't know whether including it in the text section of the binary would be enough, and the main issue for me is when clients of mine redistribute middleware in their turn as I mentioned in an earlier post. But either way, Richard already cleared this up so there's no more ambiguity on my end. Kind regards, Philip On 5/7/2018 11:11 PM, Warren Young wrote: On May 7, 2018, at 9:53 AM, Philip Bennefall <phi...@blastbay.com> wrote: It was merely an idea to possibly avoid some potential ambiguity regarding public domain, which is a bit of a gray area in many places. So take the code under the explicit license, then. In my non-expert opinion, the worry over attribution is bogus. Restriction 2 in the 2-clause BSD license just requires that the license text itself be in the binary, not that you “attribute” the software in your documentation or in a startup banner as is required by some other licenses. Fossil itself doesn’t bother to do even that: $ strings `which fossil` | grep 'IMPLIED WARRANTIES' ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fossil Delta Compression in SqLite
It was merely an idea to possibly avoid some potential ambiguity regarding public domain, which is a bit of a gray area in many places. Obviously not a requirement for anyone to do anything, it was but a friendly question. Kind regards, Philip On 5/7/2018 5:44 PM, R Smith wrote: On 2018/05/07 5:33 PM, Philip Bennefall wrote: Thanks very much for that information, Richard! :) I don't know if it would make any difference legally, but perhaps this could be made explicit in the comments? So it's not enough to get it free... the free giver has to now put some extra labour for zero reward into convincing people it is free to use. This is a crazy World. :) ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fossil Delta Compression in SqLite
Thanks very much for that information, Richard! :) I don't know if it would make any difference legally, but perhaps this could be made explicit in the comments? Thanks again. Kind regards, Philip On 5/7/2018 5:22 PM, Richard Hipp wrote: You are welcomed to use the public-domain version of the delta encoding routines found in the SQLite source tree for whatever purpose you want, without attribution. I am the sole author of that code, and I am a citizen of a country that allows people to disavow intellectual property claims, so it is possible for me to say this. On 5/7/18, Philip Bennefall <phi...@blastbay.com> wrote: As far as I can judge, you need to include the entire license - or at least the majority of it - in the documentation (not just a single line). For an end user product that's fine, but I would rather not have to ask clients to do so if I am distributing middleware simply because of a component that I use internally and which they never see. It's a pretty big difference from public domain in that respect, though I realize that it is a difference that many people don't care about. I would be curious to hear what the developers think about this, since this license differs from the rest of the SqLite codebase. Of course it is an extension so you don't need to include it, but I'm curious nonetheless. Thanks in advance for any clarification. Kind regards, Philip Bennefall On 5/7/2018 4:54 PM, Peter Da Silva wrote: On 5/6/18, 11:23 AM, "sqlite-users on behalf of Philip Bennefall" <sqlite-users-boun...@mailinglists.sqlite.org on behalf of phi...@blastbay.com> wrote: Only the requirement for attribution in binaries. That can be significant in certain use cases. One line of text in the documentation provided with the distribution doesn't seem burdensome. It's not like the advertising clause in the original BSD license... is that what you're thinking of? ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fossil Delta Compression in SqLite
As far as I can judge, you need to include the entire license - or at least the majority of it - in the documentation (not just a single line). For an end user product that's fine, but I would rather not have to ask clients to do so if I am distributing middleware simply because of a component that I use internally and which they never see. It's a pretty big difference from public domain in that respect, though I realize that it is a difference that many people don't care about. I would be curious to hear what the developers think about this, since this license differs from the rest of the SqLite codebase. Of course it is an extension so you don't need to include it, but I'm curious nonetheless. Thanks in advance for any clarification. Kind regards, Philip Bennefall On 5/7/2018 4:54 PM, Peter Da Silva wrote: On 5/6/18, 11:23 AM, "sqlite-users on behalf of Philip Bennefall" <sqlite-users-boun...@mailinglists.sqlite.org on behalf of phi...@blastbay.com> wrote: Only the requirement for attribution in binaries. That can be significant in certain use cases. One line of text in the documentation provided with the distribution doesn't seem burdensome. It's not like the advertising clause in the original BSD license... is that what you're thinking of? ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Fossil Delta Compression in SqLite
Only the requirement for attribution in binaries. That can be significant in certain use cases. Kind regards, Philip Bennefall On 5/6/2018 6:19 PM, Richard Hipp wrote: On 5/6/18, Philip Bennefall <phi...@blastbay.com> wrote: Hi all, I had a quick question regarding the licensing of the delta compression code found in the sqldiff and the RBU extensions for SqLite. I see that this code is extracted from Fossil, which is under the BSD license. But the header of the source files in the SqLite repository which contain the delta compression code are marked as public domain. Does that mean that the delta compression code extracted from Fossil is also public domain? Could I borrow it from SqLite and use it without being bound by the BSD license? Does it matter whether the code is public domain or two-clause BSD? There are no restrictions on its use in either case, are there? ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Fossil Delta Compression in SqLite
Hi all, I had a quick question regarding the licensing of the delta compression code found in the sqldiff and the RBU extensions for SqLite. I see that this code is extracted from Fossil, which is under the BSD license. But the header of the source files in the SqLite repository which contain the delta compression code are marked as public domain. Does that mean that the delta compression code extracted from Fossil is also public domain? Could I borrow it from SqLite and use it without being bound by the BSD license? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] sqlite3_expert Status
Dear SqLite developers, I was curious to know the status of the sqlite3_expert extension? I followed its development with great interest earlier in the year and was wondering if there are any plans to merge it to trunk? What kind of work remains (if any) before it can be considered complete? Thanks, Philip Bennefall ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Question about Memsys5 Internals
Hi Richard, Our library runs in video games (on both consoles and mobile devices), and many game developers detest calls to malloc/free at run-time but they are happy to give me a sizable chunk in one go. So I'm using memsys5 as an optional memory pool in an attempt to reduce the number of *user* allocations. The user is able to plug in their own malloc/free replacements as well in which case the memory pool is redundant, but the pool is there as a shield in the default configuration. Kind regards, Philip Bennefall On 9/5/2016 7:39 PM, Richard Hipp wrote: On 9/5/16, Philip Bennefall <phi...@blastbay.com> wrote: I am using [memsys5] as a general purpose memory pool in my application. There is a strong feeling in some parts of the open-source community that nobody should ever attempt to build their own memory allocator. The malloc/free from the standard library is believed to be good enough for every situation and any attempt to work around standard malloc/free is discouraged. SQLite uses system malloc/free by default, as you know. But even the presence of memsys5 in the source tree as a compile-time option is controversial in some circles. I put memsys5 there for good reasons, but they are obscure reasons, which is why memsys5 is usually omitted from the build. So I'm wondering: Why are *you* using memsys5 rather than malloc/free from the standard library? ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Question about Memsys5 Internals
Thanks, Richard. I actually arrived at that by experimenting, but I had no idea whether I was doing it right. Thanks for the explanation - it confirms that I've got it set up correctly. Kind regards, Philip Bennefall On 9/5/2016 12:13 PM, Richard Hipp wrote: On 9/5/16, Philip Bennefall <phi...@blastbay.com> wrote: Hi all, First, if this is the wrong place to ask this question, please let me know and I will ask elsewhere. I have a question about memsys5, which I am using as a general purpose memory pool in my application. I am allocating fixed size objects, and I have set mReq to the size of that object (it is a power of 2). I know that I will only be using N objects at any given time. Is there an easy way to calculate how much space I need to reserve in order to hold N objects simultaneously? If I allocate a total of N*sizeof(object) bytes, it doesn't seem to let me store N objects at the same time. Am I doing something wrong or is this expected? Let sz be the number of bytes of memory you provide to the memory allocator. The number of minimum-size objects that can be stored is sz/(mnReq+1). Not sz/mnReq. The extra +1 is some space taken from the front of the provided memory and used for internal bookkeeping. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Question about Memsys5 Internals
Hi all, First, if this is the wrong place to ask this question, please let me know and I will ask elsewhere. I have a question about memsys5, which I am using as a general purpose memory pool in my application. I am allocating fixed size objects, and I have set mReq to the size of that object (it is a power of 2). I know that I will only be using N objects at any given time. Is there an easy way to calculate how much space I need to reserve in order to hold N objects simultaneously? If I allocate a total of N*sizeof(object) bytes, it doesn't seem to let me store N objects at the same time. Am I doing something wrong or is this expected? Thanks in advance for any tips! Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Sweet 16
Incredible. Congratulations to Richard, Dan, Joe and the rest of the team on this incredible achievement. I'm not very active on this list, but I've been a casual sqLite user for years and it just keeps getting better. Keep up the great work! Kind regards, Philip Bennefall On 5/29/2016 7:28 PM, Richard Hipp wrote: The first check-in of SQLite code occurred 16 years ago today. https://www.sqlite.org/src/timeline?c=2000-05-29 ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Delta Compression in RBU
Thanks for the clarification, Dan. Might be too picky but perhaps a short note should be added to the sources verifying this for people as paranoid as myself? :D Kind regards, Philip Bennefall From: sqlite-users-bounces at mailinglists.sqlite.org [sqlite-users-bounces at mailinglists.sqlite.org] on behalf of Dan Kennedy [danielk1...@gmail.com] Sent: Friday, November 13, 2015 3:11 PM To: sqlite-users at mailinglists.sqlite.org Subject: Re: [sqlite] Delta Compression in RBU On 11/13/2015 08:06 AM, Philip Bennefall wrote: > Something I forgot in my prior post; I found the delta creation code > in sqldiff.c so my question really concerns the combined delta code > found in RBU and sqldiff.c (both creating and applying deltas). The versions of the delta creation and application code checked in to the sqlite source project are public domain. The original authors of the code re-licensed it. Dan. ___ sqlite-users mailing list sqlite-users at mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Delta Compression in RBU
Something I forgot in my prior post; I found the delta creation code in sqldiff.c so my question really concerns the combined delta code found in RBU and sqldiff.c (both creating and applying deltas). Kind regards, Philip Bennefall
[sqlite] Delta Compression in RBU
Hi Richard and others, I am looking at the RBU extension, and the delta compression functionality in particular. I am interested in using the delta compression code (both as part of the RBU extension itself but possibly also externally by extracting it from RBU). I see that the delta compression was lifted from Fossil which is under the BSD license, but of course I also see that the RBU extension has a public domain dedication at the top just like all the other official SqLite extensions. So I really just wanted to verify that even though it is lifted from Fossil, the delta compression code has the same public domain clearance guarantee as the rest of sqLite? Is this assumption true, or could there potentially be contributions from other Fossil developers that are still under the terms of the BSD license? Thanks! Kind regards, Philip Bennefall
[sqlite] fts5aux?
Thanks, Dan. I'll keep my eyes on this. Thanks for the great work on FTS5! Kind regards, Philip Bennefall On 10/16/2015 11:10 AM, Dan Kennedy wrote: > On 10/15/2015 03:35 AM, Philip Bennefall wrote: >> Hi all, >> >> Are there any plans in fts5 to add support for something equivalent >> to fts4aux? This would be of great use to me. > > There is, but it's not actually finished or documented yet. It still > does a linear scan of the entire FTS index for every query. To create > the table: > > CREATE VIRTUAL TABLE vvv USING fts5vocab(, ); > > where may be either "row" or "column". > > Code is here: http://www.sqlite.org/src/artifact/85ebf2e93089c > > Dan. > > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > . >
[sqlite] fts5aux?
Hi all, Are there any plans in fts5 to add support for something equivalent to fts4aux? This would be of great use to me. Thanks! Kind regards, Philip Bennefall
[sqlite] Possible documentation error regarding recursive triggers
Sorry for the unrelated content below my last message; I responded to a prior post to make sure I got the address right and forgot to clear it. On 9/18/2015 12:32 AM, Philip Bennefall wrote: > Hi all, > > I have found what I believe is a mistake in the SqLite documentation. > On the page listing the supported pragmas, in the section called > recursive_triggers, it says: > > Support for recursive triggers was added in version 3.6.18 but was > initially turned OFF by default, for compatibility. Recursive triggers > may be turned on by default in future versions of SQLite. > > However, in sqlite.org/limits.html it says: > > Beginning with version 3.7.0, recursive triggers are enabled by > default but can be manually disabled using PRAGMA recursive_triggers. > > Kind regards, > > Philip Bennefall > > On 9/17/2015 10:49 PM, Scott Hess wrote: >> On Thu, Sep 17, 2015 at 1:24 PM, Ralf Junker wrote: >> >>> On 17.09.2015 20:14, Scott Hess wrote: >>> >>>> The problem is that there are LOCALE settings where tolower() does >>>> things >>>> C >>>> programmers don't expect. I think tr_TR was one case, the handling >>>> of 'I' >>>> (Google "tr_tr locale bug" and you'll see lots of people hitting >>>> the same >>>> general problem). It isn't a problem of type safety, it's a >>>> problem that >>>> the same inputs might have different outputs for certain library >>>> functions >>>> when you change environment variables. I don't remember whether there >>>> were >>>> specific problems with other ctype functions, or if I just thought >>>> it was >>>> a >>>> good idea to be careful, once I realized the class of problem. >>>> >>> And this check-in therefore misses the point as it does not address >>> this >>> LOCALE problem IMHO: >>> >>> http://www.sqlite.org/src/info/6713e35b8a8c997a >> >> Hmm. Well, it might miss _a_ point, while solidly landing some other >> point >> :-). >> >> Current fts3 seems to spread this code differently, under fts2.c it was >> like: >> >> https://code.google.com/p/chromium/codesearch#chromium/src/third_party/sqlite/src/ext/fts2/fts2.c=336 >> >> >> where it's not a wrapper around the library, instead it was a direct >> implementation of the functions which only acted on 7-bit values in the >> ASCII set. >> >> I think these should really be in terms of sqlite3UpperToLower >> and sqlite3CtypeMap. That might be an issue to expose to an extension >> sensibly. >> >> -scott >> ___ >> sqlite-users mailing list >> sqlite-users at mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >> . >> >
[sqlite] Possible documentation error regarding recursive triggers
Hi all, I have found what I believe is a mistake in the SqLite documentation. On the page listing the supported pragmas, in the section called recursive_triggers, it says: Support for recursive triggers was added in version 3.6.18 but was initially turned OFF by default, for compatibility. Recursive triggers may be turned on by default in future versions of SQLite. However, in sqlite.org/limits.html it says: Beginning with version 3.7.0, recursive triggers are enabled by default but can be manually disabled using PRAGMA recursive_triggers. Kind regards, Philip Bennefall On 9/17/2015 10:49 PM, Scott Hess wrote: > On Thu, Sep 17, 2015 at 1:24 PM, Ralf Junker wrote: > >> On 17.09.2015 20:14, Scott Hess wrote: >> >>> The problem is that there are LOCALE settings where tolower() does things >>> C >>> programmers don't expect. I think tr_TR was one case, the handling of 'I' >>> (Google "tr_tr locale bug" and you'll see lots of people hitting the same >>> general problem). It isn't a problem of type safety, it's a problem that >>> the same inputs might have different outputs for certain library functions >>> when you change environment variables. I don't remember whether there >>> were >>> specific problems with other ctype functions, or if I just thought it was >>> a >>> good idea to be careful, once I realized the class of problem. >>> >> And this check-in therefore misses the point as it does not address this >> LOCALE problem IMHO: >> >> http://www.sqlite.org/src/info/6713e35b8a8c997a > > Hmm. Well, it might miss _a_ point, while solidly landing some other point > :-). > > Current fts3 seems to spread this code differently, under fts2.c it was > like: > > https://code.google.com/p/chromium/codesearch#chromium/src/third_party/sqlite/src/ext/fts2/fts2.c=336 > where it's not a wrapper around the library, instead it was a direct > implementation of the functions which only acted on 7-bit values in the > ASCII set. > > I think these should really be in terms of sqlite3UpperToLower > and sqlite3CtypeMap. That might be an issue to expose to an extension > sensibly. > > -scott > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > . >
[sqlite] Possible error in FTS5 docs
Thanks, Dan. Do you have any idea when FTS5 will be merged into the amalgamation, if at all? Kind regards, Philip Bennefall On 6/19/2015 8:54 PM, Dan Kennedy wrote: > On 06/19/2015 08:34 PM, Philip Bennefall wrote: >> Hi all, >> >> While reading through the draft documentation for FTS5, I noticed the >> following in section 4.1.1: >> >> "The built-in auxiliary function bm25() returns a real value >> indicating how well the current row matches the full-text query. The >> better the match, the >> larger the value returned." >> >> Then, a little further down it says: >> >> "In order to avoid this pitfall, the FTS5 implementation of BM25 >> multiplies the result by -1 before returning it, ensuring that better >> matches are assigned >> numerically lower scores." >> >> Is this a mistake, or did I misunderstand something? > > It's a mistake. Should be "The better the match, the numerically lower > the value returned" or similar. > > Thanks for pointing this out. > > Dan. > > ___ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > . >
[sqlite] Possible error in FTS5 docs
Hi all, While reading through the draft documentation for FTS5, I noticed the following in section 4.1.1: "The built-in auxiliary function bm25() returns a real value indicating how well the current row matches the full-text query. The better the match, the larger the value returned." Then, a little further down it says: "In order to avoid this pitfall, the FTS5 implementation of BM25 multiplies the result by -1 before returning it, ensuring that better matches are assigned numerically lower scores." Is this a mistake, or did I misunderstand something? Also, I am curious to know whether the developers are able to estimate roughly when the FTS5 extension might appear in an official release of SqLite? Kind regards, Philip Bennefall
[sqlite] Extending VFS documentation
Thanks, Jeff. That's an interesting project; redirecting to iostream. I'm sure it'll be useful, though an authoritative source for documentation on the SqLite website itself would be ideal. The methods themselves and their arguments seem easy enough; my main concern is what error codes I may return when, etc. Kind regards, Philip Bennefall On 5/24/2015 6:19 PM, Jeff M wrote: > I searched stack overflow for "sqlite3_vfs_register" and got a few hits. For > example, this was a good starting point on which I based my own VFS. > > http://stackoverflow.com/a/3842409/236415 > > Jeff > > >> On May 24, 2015, at 4:39 AM, Philip Bennefall wrote: >> >> Hi all, >> >> I have been interested in the VFS layer of SqLite for a while, and found the >> following article to be very useful: >> >> http://www.sqlite.org/vfs.html >> >> However, it ends right when it is about to go into the actual details of how >> to implement a VFS. It would be great if that essay could be completed. I >> have looked at the example VFS implementations, but it is hard to determine >> which parts of the code that are implementation details as opposed to being >> part of the stable public API so to speak. >> >> Kind regards, >> >> Philip Bennefall >> > . >
[sqlite] Extending VFS documentation
Hi Roger, For experimentation, I definitely agree that using a higher level language is better. What I am after is an authoritative source for exact behavior of the various functions, what different errors they may return and what the ramifications of not implementing various methods are, etc etc. There is some information of that nature in the reference, but it would be ideal if the article on the sqLite website itself would be completed as that would eliminate some guesswork. The rest of the sqLite documentation, and indeed the beginning of the VFS article, is excellent. It allows me to implement things immediately without having to search third party sources and hack around to find what works and what doesn't. It would be ideal if the same could be true for the VFS layer. Kind regards, Philip Bennefall On 5/24/2015 10:38 PM, Roger Binns wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > On 05/24/2015 02:39 AM, Philip Bennefall wrote: >> I have looked at the example VFS implementations, but it is hard >> to determine which parts of the code that are implementation >> details as opposed to being part of the stable public API so to >> speak. > The answer is that it matters what you are exposing. A vfs for a csv > file is a very different beast that one exposing bits of Amazon's > services. I wrote one that exposes couchdb, but it wouldn't help > someone else exposing something different. > > What I recommend you do is use one of the higher level languages that > exposes the VFS in their SQLite bindings. You will get better error > messages, be able to write the code quicker, and explore behaviour > easier (eg xBestIndex). ie you'll get a better on the VFS specific > issues, rather than having to build out C level plumbing. Once you > have that understood, converting to C is simpler than having started > with C. > > For example my Python wrapper (APSW) exposes the VFS, along with an > example and documentation. It also easily lets you "inherit" from an > existing VFS so you only have to write methods where you want > different behaviour (eg mangling file names, or modifying how the > database file contents are stored). > >http://rogerbinns.github.io/apsw/vfs.html > > You should find similar bindings in your higher level languages of choice. > > Roger > -BEGIN PGP SIGNATURE- > Version: GnuPG v1 > > iEUEARECAAYFAlViNrAACgkQmOOfHg372QRrrQCYxRX6XbpyS39O5V+tJyyjXjx6 > vACgxZ/Uu+TDvT22u1FN083YFMu2muc= > =wo2G > -END PGP SIGNATURE- > . >
[sqlite] Extending VFS documentation
Hi all, I have been interested in the VFS layer of SqLite for a while, and found the following article to be very useful: http://www.sqlite.org/vfs.html However, it ends right when it is about to go into the actual details of how to implement a VFS. It would be great if that essay could be completed. I have looked at the example VFS implementations, but it is hard to determine which parts of the code that are implementation details as opposed to being part of the stable public API so to speak. Kind regards, Philip Bennefall
Re: [sqlite] Checking whether a given date is valid
Hi Keith, Thanks for the explanation. From what I can see in your example, the only difference from mine is that I used date(x) and you used datetime(x). If I am understanding things correctly, this would give the same behavior if I am only concerned about dates? In this case I don't care about the time of day; just that the date is valid. So based on your description I gather that I'm doing it correctly in my example? I hope? Kind regards, Philip Bennefall On 8/10/2014 12:00 AM, Keith Medcalf wrote: The result of converting an "invalid" datetime string with julianday() is a floating point julianday value which corresponds to the normalized datetime. In your example, date(julianday('2005-02-29')) is '2005-03-01' which is the correct normalized date. Therefore, a function which will validate that a string is a valid normalized date / datetime / time string is the following: datetime(julianday(x))=datetime(x) This returns 1 if the x is a valid normal date, datetime, or time and 0 if not; and, can be used as a check constraint or otherwise. sqlite> create table mytable ...> ( ...>data text collate nocase check (datetime(julianday(data)) = datetime(data)) ...> ); sqlite> insert into mytable values('01:00'); sqlite> insert into mytable values('24:00'); Error: CHECK constraint failed: mytable sqlite> insert into mytable values('2004-02-29'); sqlite> insert into mytable values('2005-02-29'); Error: CHECK constraint failed: mytable sqlite> insert into mytable values('-01-29'); sqlite> insert into mytable values('-0502-02-29'); Error: CHECK constraint failed: mytable sqlite> insert into mytable values('-0502-01-29'); Of course, inserting a "numeric" value also passes the test since all strictly numeric values can be converted to valid datetime strings. -Original Message- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- boun...@sqlite.org] On Behalf Of Philip Bennefall Sent: Saturday, 9 August, 2014 14:53 To: General Discussion of SQLite Database Subject: [sqlite] Checking whether a given date is valid Hi all, I'm working on a project where I need to verify that a given date actually existed (like February 29 in a particular year). I am using SqLite to store a lot of data already and I didn't feel like hunting for a datetime library online. I figured that there should be a way to use SqLite's date functions to check this, and came up with the following query: select 1 where date(julianday('2004-02-29'))=date('2004-02-29'); The above query returns 1 as expected. The following one returns nothing, as I would also expect: select 1 where date(julianday('2005-02-29'))=date('2005-02-29'); Now, my question is simply this. Is the query sound? All the tests I have run thus far have given correct results (invalid days of months, leap years etc) but are there any pitfalls that I should be aware of? Is it safe to rely on the conversion between the floating point julianday and the date string being accurate if the date is in fact valid? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users . ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Checking whether a given date is valid
Hi Simon, That is what I tried to do in my query examples that I included in the original message. Based on those, would you say that I am doing it correctly? Kind regards, Philip Bennefall On 8/9/2014 11:49 PM, Simon Slavin wrote: On 9 Aug 2014, at 9:53pm, Philip Bennefall <phi...@blastbay.com> wrote: Is it safe to rely on the conversion between the floating point julianday and the date string being accurate if the date is in fact valid? Probably not. The safest thing is to turn the string into a number (probably Julian day or epoch) and then turn it back into a date again. If you get the original string back, it's a legit date. If you get something else, it might be something like the 30th of February. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Checking whether a given date is valid
Hi all, I'm working on a project where I need to verify that a given date actually existed (like February 29 in a particular year). I am using SqLite to store a lot of data already and I didn't feel like hunting for a datetime library online. I figured that there should be a way to use SqLite's date functions to check this, and came up with the following query: select 1 where date(julianday('2004-02-29'))=date('2004-02-29'); The above query returns 1 as expected. The following one returns nothing, as I would also expect: select 1 where date(julianday('2005-02-29'))=date('2005-02-29'); Now, my question is simply this. Is the query sound? All the tests I have run thus far have given correct results (invalid days of months, leap years etc) but are there any pitfalls that I should be aware of? Is it safe to rely on the conversion between the floating point julianday and the date string being accurate if the date is in fact valid? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] [Spellfix] Searching for short words is very slow
Hi Richard, My application is basically just to take a text file as a command line argument and run the spellchecker on it, showing an alert for each word that is not found in the dictionary and giving the user some options. After a bit of experimentation I concluded that one way to speed things up is to store the entire dictionary in memory as a hash map and look for exact matches. Only when an exact match isn't found do I fall back to the spellfix table. This allowed me to scan a document with just over 86000 words in less than 500 milliseconds, which is more than acceptable for my needs. Certainly not ideal if you aren't on a workstation, but it's a reasonable tradeoff if memory is not an issue. Perhaps something similar could be done in the spellfix table itself? Have an indexed integer column containing a crc32 or similar for each word in the dictionary so that we can look for exact matches very quickly. We only fall back to the fuzzy search if no match is found. Can you see any obvious drawbacks with this? If not, I'd like to put this optimization forth as an initial suggestion. I'll write again if I can think of anything else after reading the code more thoroughly. Kind regards, Philip Bennefall On 7/24/2014 12:25 AM, Richard Hipp wrote: On Wed, Jul 23, 2014 at 6:18 PM, Philip Bennefall <phi...@blastbay.com <mailto:phi...@blastbay.com>> wrote: I have to amend my last message. The timings I just gave was for looking up that word 10 times, not 1. So the longest time I've seen would be about 150 ms. However, if you have a document with a few thousand words we would still be looking at a significant total searching time. Is this to be expected? There is no expectation. Spellfix is an experiment in doing fuzzy matching. It was designed for a specific customer who is doing spell-checking in real-time, as the text is being entered. Spellfix works way faster than the end user can enter text, so performance is not an issue in its original purpose. Perhaps you are using spellfix in a different way? You are welcomed to do so. If you want to contribute ideas on how to improve spellfix for use in different scenarios, we will welcome your input. There are comments in the code explaining how spellfix works. Please review the principles of operation and then perhaps run a performance analysis using gprof or cachegrind. Then describe exactly what you are doing and why it isn't working out for you and perhaps we can help. -- D. Richard Hipp d...@sqlite.org <mailto:d...@sqlite.org> ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] [Spellfix] Searching for short words is very slow
I have to amend my last message. The timings I just gave was for looking up that word 10 times, not 1. So the longest time I've seen would be about 150 ms. However, if you have a document with a few thousand words we would still be looking at a significant total searching time. Is this to be expected? Kind regards, Philip Bennefall On 7/23/2014 11:57 PM, Philip Bennefall wrote: Hi all, I have been running some tests with spellfix using a table containing about 30 words, extracted from the Moby project's single word list as well as names and places. Moby can be found at: http://icon.shef.ac.uk/Moby/ I have noticed that searching for medium length to very long words is very fast, but when I start searching for short words like "hi" and "bye", the search time skyrockets. I think the longest search time was about 1500 milliseconds (the average is somewhere around the 500 ms mark). My table is set up as follows: create virtual table if not exists dictionary using spellfix1(edit_cost_table=editcosts); When searching, I specify top=5 to get these timings. Is there anything I can tweak to speed up the search for short words, or is there anything that can be done by the developers to optimize this further? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] [Spellfix] Searching for short words is very slow
Hi all, I have been running some tests with spellfix using a table containing about 30 words, extracted from the Moby project's single word list as well as names and places. Moby can be found at: http://icon.shef.ac.uk/Moby/ I have noticed that searching for medium length to very long words is very fast, but when I start searching for short words like "hi" and "bye", the search time skyrockets. I think the longest search time was about 1500 milliseconds (the average is somewhere around the 500 ms mark). My table is set up as follows: create virtual table if not exists dictionary using spellfix1(edit_cost_table=editcosts); When searching, I specify top=5 to get these timings. Is there anything I can tweak to speed up the search for short words, or is there anything that can be done by the developers to optimize this further? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] [Spellfix] Avoiding exact duplicates
Hi all, Is it possible to tell the spellfix extension not to accept duplicate words if the rank and langid are exactly the same? I am collecting words from many different sources and after inserting them I very often get the same word back several times. Thanks in advance for any tips. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Questions from a novice - basic browsing of records in a listview.
How about using prepared statements in conjunction with bind? http://www.sqlite.org/c3ref/bind_blob.html Kind regards, Philip Bennefall On 2014-07-09 15:03, - wrote: Hello all, I'm quite new at SQLite3, and have a bit of a problem with grasping the handling of a database. After having used the OFFSET and LIMIT 1 method (in conjuction with a userdata listview) and finding a past post into this forum describing it as a rookie mistake I'm now trying to implement the "scrolling cursor" method in that same post. It leads to a few questions though. For the above method to work for any database it means I need, for each-and-every next/previous page request, to send *all* the bottom/top records data back to the SQLite engine so it knows where to continue. Even when assuming the default maximum of columns the accumulated column names and related data for the "WHERE" clause could get quite big. Add to that a possible the "SORT BY" clause and I'm looking at quite a large query, which has to be created and transferred for every "scroll". Which is something I do not really like ... 1) Is it possible to refer to the columns in a kind of shorthand (index perhaps) ? 2) Is it possible to have the SQLite engine initialize and remember certain WHERE and ORDER clauses (without creating another database please :-) ), so they can be used again-and-again (for the duration of a connection). 3) Is it possible, for the above 'scrolling cursor' method, to refer to a starting record other than by sending the exact data of such a record back to the SQLite engine ? Ofcourse, feel (very) free to include other things that I've not thought about and could be usefull. :-) Regards, Rudy Wieser ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users . ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Hints for the query planner
Hi Richard, What about "probability" or "likelyhood"? This works in both the case where the likelyhood is great as well as when it is low. From the list you provided, I would pick "unlikely". Kind regards, Philip Bennefall - Original Message - From: "Richard Hipp" <d...@sqlite.org> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Tuesday, September 10, 2013 9:26 PM Subject: [sqlite] Hints for the query planner There is a survey question at the bottom of this message. But first some context... Over on the sqlite-dev mailing list, a debate has been going on about the best way to provide some useful hints to the query planner. The query under discussion looks like this: SELECT DISTINCT aname FROM album, composer, track WHERE cname LIKE '%bach%' AND composer.cid=track.cid AND album.aid=track.aid; Assuming that the schema has appropriate indices and ANALYZE has been run, SQLite does a good job of selecting an efficient query plan for the above. But the query planner lacks a key piece of information that could help it to do a better job. In particular, the query planner does not know how often the subexpression "cname LIKE '%bach%'" will be true. But, it turns out, the best query plan depends critically on this one fact. By default, the query planner (in SQLite 3.8.0) assumes that a subexpression that cannot use an index will always be true. Probably this will be tweaked in 3.8.1 so that such subexpressions will be assumed to usually, but not always, be true. Either way, it would be useful to be able to convey to the query planner the other extreme - that a subexpression is usually not true. (Pedantic detail: "not true" is not the same as "false" in SQL because NULL is neither true nor false.) There is currently code in a branch that provides a hinting mechanism using a magic "unlikely()" function. Subexpressions contained within "unlikely()" are assumed to usually not be true. Other than this hint to the query planner, the unlikely() function is a complete no-op and optimized out of the VDBE code so that it does not consume any CPU cycles. The only purpose of the unlikely() function is to let the query planner know that the subexpression contained in its argument is not commonly true. So, if an application developer knows that the string "bach" seldom occurs in composer names, then she might rewrite the query like this: SELECT DISTINCT aname FROM album, composer, track WHERE unlikely(cname LIKE '%bach%') AND composer.cid=track.cid AND album.aid=track.aid; The query planner might use this "likelihood" hint to choose a different query plan that works better when the subexpression is commonly false. Or it might decide that the original query plan was good enough and ignore the hint. The query planner gets to make that decision. The application developer is not telling the query planner what to do. The application developer has merely provided a small amount of meta-information about the likelihood of the subexpression being true, meta-information which the query planner may or may not use. Note that the subexpression does not have to be a LIKE operator. PostgreSQL, to name one example, estimates how often a LIKE operator will be true based on the pattern on its right-hand side, and adjust query plans accordingly, and some have argued for this sort of thing in SQLite. But I want a more general solution. Suppose the subexpression involves one or more calls to application-defined functions about which the query planner cannot possible know anything. A general mechanism for letting the query planner know that subexpressions are commonly not true is what is desired - not a technique for making LIKE operators more efficient. SURVEY QUESTION: The question for today is what to call this magic hint function: (1) unlikely(EXPR) (2) selective(EXPR) (3) seldom(EXPR) (4) seldom_true(EXPR) (5) usually_not_true(EXPR) Please feel free to suggest other names if you think of any. ADDITIONAL INFORMATION: The current implementation allows a second argument which must be a floating point constant between 0.0 and 1.0, inclusive. The second argument is an estimate of the probability that the expression in the first argument will be true. The default is 0.05. Names like "unlikely" or "seldom" work well when this probability is small, but if the second argument is close to 1.0, then those names seem backwards. I don't know if this matters. The optional second argument is not guaranteed to make it into an actually release. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] DLL Size differences and other info requested
Hi Dan, Thanks for that info. Do you have any views on compiling SqLite optimized for speed rather than size? Is the difference in performance generally small enough to be ignored? I am using Vc++ 2010 express and have been optimizing for speed up until now. Kind regards, Philip Bennefall - Original Message - From: "Dan Kennedy" <danielk1...@gmail.com> To: <sqlite-users@sqlite.org> Sent: Thursday, July 04, 2013 5:12 PM Subject: Re: [sqlite] DLL Size differences and other info requested On 07/04/2013 05:49 AM, Philip Bennefall wrote: Hi Stephen, I don't know what compiler is used to build the official SqLite dll, but provided it is some version of Vc++ my experience is that MinGw often produces larger and sometimes significantly slower binaries on Windows than VC++ does. In an unrelated project of mine, the binary size dropped by about 500 KB when I switched from Dev-C++/MinGw to Code::Blocks/MSVC++. This is mere speculation on my part in this case, however, as I don't actually know what compiler that is being used to build the official dll. Those on the website are built with mingw 4.5.2 using: -Os -DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_RTREE -DSQLITE_ENABLE_COLUMN_METADATA In case anyone was wondering. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] DLL Size differences and other info requested
Hi Stephen, I don't know what compiler is used to build the official SqLite dll, but provided it is some version of Vc++ my experience is that MinGw often produces larger and sometimes significantly slower binaries on Windows than VC++ does. In an unrelated project of mine, the binary size dropped by about 500 KB when I switched from Dev-C++/MinGw to Code::Blocks/MSVC++. This is mere speculation on my part in this case, however, as I don't actually know what compiler that is being used to build the official dll. Kind regards, Philip Bennefall - Original Message - From: "Stephen Chrzanowski" <pontia...@gmail.com> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Thursday, July 04, 2013 12:37 AM Subject: [sqlite] DLL Size differences and other info requested Just for kicks because I wanted to learn "how it was done", I decided to download the amalgamation code and compile to a Win32 DLL. The secondary purpose of my doing so was that I also wanted to write a 3rd party app that'd export all headers of the amalgamation to a different format that I can then use to build into my own units in Delphi. Basically convert the C headers into Delphi headers so I can link to the DLL of the same version. You know, just for "fun". Whether or not my rearranging code works or not.. well.. Its all on my dime, a learning experience, and probably to the professionals in the crowd it'd seem like a waste of time. To them I say "meh". ;) The one thing I DID notice though is that my compiled version of the DLL in size is much different than what the compiled version is on the sqlite.orgsite. Is there a particular reason for this? What compiler options do I set to bring my the size of the DLL down or is it just the nature of the amalgamation? Just for reference, my DLL, which I baked in the oven about 30 minutes ago now, weighs in at about 850k while the pre-compiled Win32 download version looks to be about 620k. The wife says I need to cut back on calories, so I thought I might start with my hard drives and see how that goes. ;) I'm compiling under Bloodshed Dev-C++ v4.9.9.2. The other thing I wanted to know is whether or not it'd be possible to get source code snapshots of the amalgamation code starting from v3.0 (EDIT: Seemingly 3.2?). or whatever this process started at starting at the major build version 3, and get the incremental "published builds". I've looked at the timeline and at http://www.sqlite.org/cgi/src/brlist and I see that the individual files have been modified and checked in, etc, but I don't seem to be able to find sqlite3.c and sqlite.h, or even how I can download a snapshot of the package of that revision even if it is just the individual C files. Is everything in the downloaded archive basically a merge (Think "copy file1+file2+file3 file_final") of all of the required C files when a full build to make the different pre-built packages are done? I'm also a complete newb when it comes to the version control being used for SQLite. I know of version control software, and I use a Win32 only VCS, so have absolutely zero clue on how to obtain a particular version of the code through the means graciously provided. If someone has the time, I'd greatly appreciate a walk through on what to do, and what software has to be put onto my dev machines. I'm a pack rat when it comes to any and all software I come across, so what I'm doing I think flies in the face of what my wife has asked me to do in regards to calories. {smirk} ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] regexp.c
- Original Message - From: Richard Hipp To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Friday, June 07, 2013 1:14 PM Subject: Re: [sqlite] regexp.c On Fri, Jun 7, 2013 at 5:30 AM, Philip Bennefall <phi...@blastbay.com> wrote: Hello all, I was having a quick look at the extensions provided in ext/misc in the source tree, and I am pleasantly surprised at the number of useful things that I found in there that I didn't know about. I was wondering about the regular expression parser in particular. Does it support capture groups? The source seems to indicate no, but I just wanted to make sure. Capture groups are not supported by the regexp.c implementation in ext/misc. On the other hand, that means that run-time is guaranteed linear in the size of the input. Thanks, Richard. That does make sense. On another note, generally how well tested are these extensions? Should I assume that since they're not part of the SqLite amalgamation, they aren't as extensively tested and maintained as the rest of SqLite's codebase? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] regexp.c
Hello all, I was having a quick look at the extensions provided in ext/misc in the source tree, and I am pleasantly surprised at the number of useful things that I found in there that I didn't know about. I was wondering about the regular expression parser in particular. Does it support capture groups? The source seems to indicate no, but I just wanted to make sure. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: Donald Griggs To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Thursday, June 06, 2013 3:13 PM Subject: Re: [sqlite] Serialize an in-memory database Hi Philip, Maybe neither of these two thoughts are helpful, but fyi: 1. Licensing for existing memory vfs. Regarding this memory vfs implementation referenced earlier: http://article.gmane.org/gmane.comp.db.sqlite.general/46450 http://spserver.googlecode.com/files/spmemvfs-0.1.src.tar.gz Would it not be worth an email to the author ( gmail user stephen.nil ) to see if he might quickly release his code (already open source) to public domain or another acceptable license? 2. Performance of existing solutions. Regarding, "I would like to avoid saving the data out to a temporary disk file... That seems wasteful to me" I can certainly understand why you'd write that, but it's different from saying, "I've tested that solution (or put numerical bounds on its maximum likely performance) and found its performance to be unacceptable for my intended use." Of course, using sqlite at all for your purpose (mainly to avoid writing custom sorts, as I understand) is wasteful in some sense of the word -- but I suspect its nevertheless an entirely appropriate application. One of Donald Knuth's famous quotes was, "Premature optimization is the root of all evil (or at least most of it) in programming." If there's a chance that's applicable here (maybe its not), then since the code to backup to a temp file is already present, would it be worth a try? Respectfully, Donald G. (definitely NOT Donald K!) Hi Donald, You have several good points. Let's see if I can respond to them properly: 1. I can definitely drop a line to the author and ask about the licensing. But one appealing part of the vfs idea is actually sitting down and learning enough to implement it myself. If I write the code, I will also be able to debug it much more easily and will learn something new to boot. I have no urgent need for a solution to this problem, so while I certainly would use an existing memory vfs if it was available in SqLite itself I would also enjoy the challenge of writing my own. 2. Let me change the word wasteful to unnecessary. I certainly have no doubt that writing the temporary file and reading it back in would still be acceptable performance wise in my scenario, and I have no figures to prove otherwise. But it seems unnecessary to do so if an alternative method exists. Copying it from one memory location to another seems a lot more elegant if nothing else. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Simon Slavin" <slav...@bigfraud.org> To: <phi...@blastbay.com>; "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Thursday, June 06, 2013 1:45 PM Subject: Re: [sqlite] Serialize an in-memory database On 6 Jun 2013, at 10:45am, Philip Bennefall <phi...@blastbay.com> wrote: I have a bunch of data structures in memory that I am looking to replace with an SqLite database, primarily for the purpose of avoiding reinventing the wheel with various sorts etc. I would then like to serialize the data into a memory buffer and do additional processing before finally rendering it to disk. The additional processing might include compression, encryption, and a few other things specific to my application. Two problems: Unlike the SQLite file format, the format SQLite uses when it keeps things in memory is not published, and changes from version to version. Because the writers of SQLite expect the in-memory format to be accessed only by things built into the SQLite API, you have to read the source code to know what's going on. So any routines you come up will have to just deal with whatever they find rather than trying to understand its structure. Also your data will be able to restored only back to versions of SQLite where the internal data format hasn't changed. SQLite does not, by its nature, keep everything in one long block of memory. It allocates and frees smaller blocks of memory as data is stored or deleted, and also as it needs to create temporary structures such as indexes needed to speed up a specific command. So turning a stored database into one stream of octets takes more than just reading a section of memory. Rather than try to mess with the internals of SQLite I suspect you would be better served by doing the following: 1) Using SQLite's existing in-memory databases to keep your data in memory while your app executes. 2) Writing your own routine in your preferred programming language to dump your data into text or octets in memory or disk in whatever format you want. One standard way to do this is to generate the SQL commands needed to reproduce your database. Since these are very repetitive standard ASCII commands they compress down extremely well and you can do encryption at the same time using any of a number of standard libraries. Data in this format has the added advantages that it is human-readable (after decompression) and can be passed straight to sqlite3_exec() to rebuild the database. However, you might prefer to invent your own format, perhaps more like CSV, that makes implicit use of your data structures. Simon.= Hi Simon, Oh I never intended to attempt to rip the data right out of an SqLite memory database. I realize that it is not at all the same as the disk file that I could create with, say, the backup API. I am considering two options: 1. Writing a memory vfs that I use when I want to save my data, backing up the existing in-memory database to a new database that uses this memory vfs and then taking the data from the resulting block where SqLite writes what it thinks is the database file. 2. Doing something like .dump in the shell, but writing the output to memory and then processing that. This seems to be the simplest approach, but would waste a lot of space and import/export would be slower as far as I can judge. This would primarily be the case if I export as SQL, as I would then not be able to reuse prepared statements with parameters but would have to use sqlite3_exec. The memory vfs seems like the most appealing choice in the longterm, but the second approach is much more straightforward. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Simon Slavin" <slav...@bigfraud.org> To: <phi...@blastbay.com>; "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Thursday, June 06, 2013 10:51 AM Subject: Re: [sqlite] Serialize an in-memory database On 6 Jun 2013, at 9:01am, Philip Bennefall <phi...@blastbay.com> wrote: Since I don't believe that Windows for example has tmpfs (seems to be a Unix thing), would the idea of constructing a vfs that just reads and writes a huge memory block be doable? Doable ? Yes. Use the code from one of the file-based VFSes and replace all the 'read/write to file offset N" with 'read/write to memory offset N', then sort out locking and a few other problems. Tedious and annoying to do but doable if you write C. Of advantage to many users ? I don't know. I don't see what the advantage of doing this is over SQLite's standard ways of storing data in memory or in a file. Who would use this ? Also, I have a question. How big do you expect to make that block of memory you grab when someone creates a new database ? One of the advantages of SQL databases is that they grow as you get more data. You can't do this if you're going to pre-grab a continuous block of memory. Do you expect to use the C function realloc() a lot ? Simon. Hi Simon, For my own part, I would usually have a database that is no more than a few megabyte in size. A generic solution would be a lot harder than the one I am considering for my own project, where I can cut corners due to the fact that I know the size of my data at least roughly. What I want to achieve is to serialize the data in such a way so that I can do other processing on it before I render it to disk, such as custom compression and/or other things. I am aware that there is an SqLite add-on to do this, but aside from the fact that I cannot afford it I don't need to do this processing on the fly either. I just want to take an in-memory database and put it in a compressed and possibly encrypted file on disk in the end, without having to use a temporary file as an intermediary. I write C, so would have no trouble modifying one of the existing vfs example implementations. Correct me if I am wrong, but do I really need to do any kind of locking if I am not working with disk files? I am not working with shared cache, either. I would have one database connection that would only be accessed from one thread. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Paolo Bolzoni" <paolo.bolzoni.br...@gmail.com> To: <phi...@blastbay.com>; "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Thursday, June 06, 2013 10:33 AM Subject: Re: [sqlite] Serialize an in-memory database What is you use case? Why do you need this? I am asking because maybe it helps thinking alternate solutions... Hi Paolo, I have a bunch of data structures in memory that I am looking to replace with an SqLite database, primarily for the purpose of avoiding reinventing the wheel with various sorts etc. I would then like to serialize the data into a memory buffer and do additional processing before finally rendering it to disk. The additional processing might include compression, encryption, and a few other things specific to my application. I would like to avoid saving the data out to a temporary disk file, reading it back in, doing my processing, writing it out into a new file and then finally deleting the temporary file. That seems wasteful to me, and so that's why I am looking into solutions that avoid the temporary files. So far, the memory vfs seems like the best approach. Kind regards, Philip Bennefall P.S. I have looked at the encryption and compression add-ons for SqLite, but I don't need encryption/compression on the fly (just on the entire database in one go), and I don't have the money to purchase the code in the first place. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Paolo Bolzoni" <paolo.bolzoni.br...@gmail.com> To: <phi...@blastbay.com>; "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Thursday, June 06, 2013 10:02 AM Subject: Re: [sqlite] Serialize an in-memory database Sorry I am missing a bit, What is the problem of using sqlite3_backup again? Hi Paolo, I would like to avoid using a temporary file, but rather just save and load the database as a memory block. Serialize to and from memory, in other words. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Simon Slavin" <slav...@bigfraud.org> To: <phi...@blastbay.com>; "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Thursday, June 06, 2013 12:15 AM Subject: Re: [sqlite] Serialize an in-memory database On 5 Jun 2013, at 8:38pm, Philip Bennefall <phi...@blastbay.com> wrote: On 5 Jun 2013, at 8:32pm, Petite Abeille <petite.abei...@gmail.com> wrote: write to tmpfs… read the file into byte[]… do what you meant to do… to reload… write byte[] do tmpfs… open db… and be merry… or something along these lines... I don't want it in a file, however. I want it in a memory block. That's why you read from tmpfs (or any other file stored in any other file system) into byte[]. Once your data is in byte[] you will have entire SQLite database in one run of memory. You can't usefully store a memory database of SQLite because SQLite's data in memory isn't all in one big run of memory. The data is stored in various little chunks, some here, some there. If you tried to read the data directly out of those chunks you would have to read lots of little chunks, not one big run of continuous memory. And you'd be storing lots of pointers to various locations in memory. When you 'restore' the data back into memory you're not going to be allocated the same locations in memory so those pointers won't mean anything any more. A database stored in a file, however, has pointers to locations in that file instead of pointers to locations in memory. If you save and restore the whole file in one big run, those pointers will become valid again: the same bits of data will be at the same offsets of the file. Doesn't have to be tmpfs. You can use any file system that SQLite thinks is file storage rather than memory storage. Simon. Hi Simon, Since I don't believe that Windows for example has tmpfs (seems to be a Unix thing), would the idea of constructing a vfs that just reads and writes a huge memory block be doable? If so, how difficult of a task do you estimate that this might be? I want to reuse as much of the existing vfs code as possible (e.g. I don't want to reimplement randomness, date etc). Could you possibly give me some pointers? I read the chapter about the virtual file systems, but it seems incomplete. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Petite Abeille" <petite.abei...@gmail.com> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Wednesday, June 05, 2013 10:04 PM Subject: Re: [sqlite] Serialize an in-memory database On Jun 5, 2013, at 10:02 PM, Philip Bennefall <phi...@blastbay.com> wrote: That is exactly the sort of thing I am looking for. If anything, I think it'd be great to have such a vfs in SqLite if only for completeness/customizability, seeing as how there are so many different allocators for example. It'd be great to have the same level of freedom with the vfs backends. Of course, I want to use the existing methods in the native vfs for randomization, time etc - just override the file I/O methods to operate on a large byte array/blob. memvfs? http://article.gmane.org/gmane.comp.db.sqlite.general/46450 I looked into that one, but the license is unclear. There is a copyright notice in the source but no explicit license. I need something public domain, just like SqLite is. Otherwise I will attempt to roll my own. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Jay A. Kreibich" <j...@kreibi.ch> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Wednesday, June 05, 2013 9:55 PM Subject: Re: [sqlite] Serialize an in-memory database On Wed, Jun 05, 2013 at 09:15:21PM +0200, Petite Abeille scratched on the wall: On Jun 5, 2013, at 9:10 PM, Philip Bennefall <phi...@blastbay.com> wrote: > Yes, I have seen the backup API. But I would like to avoid the disk > file entirely and just serialize to and from memory. Lateral thinking? write your db to tmpfs? A few months back, wasn't there talk about a VFS that used a giant byte array, rather than a file? If someone actually wrote one of those, you could use the Backup API to blast the DB into a big chunk of memory. If such a VFS does not actually exist, it shouldn't be all that hard to write, and might come in useful for this and other reasons. -j Hi there, That is exactly the sort of thing I am looking for. If anything, I think it'd be great to have such a vfs in SqLite if only for completeness/customizability, seeing as how there are so many different allocators for example. It'd be great to have the same level of freedom with the vfs backends. Of course, I want to use the existing methods in the native vfs for randomization, time etc - just override the file I/O methods to operate on a large byte array/blob. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Petite Abeille" <petite.abei...@gmail.com> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Wednesday, June 05, 2013 9:54 PM Subject: Re: [sqlite] Serialize an in-memory database On Jun 5, 2013, at 9:44 PM, Philip Bennefall <phi...@blastbay.com> wrote: I use Windows. This looks like it is purely for Unix variants? I suspect one call these 'RAM disk/drive' as well... http://en.wikipedia.org/wiki/List_of_RAM_drive_software I need something that operates wherever SqLite does, so can't be system dependent. SQLite runs on pretty much anything. Are you targeting, say, watches? No. All I am saying is that I don't want to limit myself to platforms that implement this particular feature. I would prefer a generic solution that works everywhere, such as a regular malloc:ed chunk that works like a vfs in SqLite. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Petite Abeille" <petite.abei...@gmail.com> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Wednesday, June 05, 2013 9:39 PM Subject: Re: [sqlite] Serialize an in-memory database On Jun 5, 2013, at 9:38 PM, Philip Bennefall <phi...@blastbay.com> wrote: I don't want it in a file, however. I want it in a memory block. So tmpfs wouldn't do the trick from what I gather. … tmpfs *is* memory… just looks like a file system… http://en.wikipedia.org/wiki/Tmpfs I use Windows. This looks like it is purely for Unix variants? I need something that operates wherever SqLite does, so can't be system dependent. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Petite Abeille" <petite.abei...@gmail.com> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Wednesday, June 05, 2013 9:32 PM Subject: Re: [sqlite] Serialize an in-memory database On Jun 5, 2013, at 9:25 PM, Philip Bennefall <phi...@blastbay.com> wrote: Doesn't that still create a file, just a temporary one? I need the serialized content in a char* or similar so I can memcpy it etc, and then feed it back to SqLite at a later time. I guess I could make a toy vfs that uses a malloc:ed chunk that pretends to be the disk drive, and backup to/from that to a regular in-memory database. Thoughts? Yes… write to tmpfs… read the file into byte[]… do what you meant to do… to reload… write byte[] do tmpfs… open db… and be merry… or something along these lines... I don't want it in a file, however. I want it in a memory block. So tmpfs wouldn't do the trick from what I gather. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Roman Fleysher" <roman.fleys...@einstein.yu.edu> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Wednesday, June 05, 2013 9:26 PM Subject: Re: [sqlite] Serialize an in-memory database Read section on URI Filemanes, particularly mode for memory databases: http://www.sqlite.org/c3ref/open.html DB Connection in backup API does not have to point to a file, it can point to in-memory database if URIs are enabled. (I learned it from someone else on the list, i use SQLite for less than a month.) Roman From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Petite Abeille [petite.abei...@gmail.com] Sent: Wednesday, June 05, 2013 3:15 PM To: General Discussion of SQLite Database Subject: Re: [sqlite] Serialize an in-memory database On Jun 5, 2013, at 9:10 PM, Philip Bennefall <phi...@blastbay.com> wrote: Yes, I have seen the backup API. But I would like to avoid the disk file entirely and just serialize to and from memory. Lateral thinking… write your db to tmpfs… ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users Hi Roman, Oh I know that I can copy content from one in-memory database to another. That's trivial with the backup API. My concern is more that I want to copy it to a blob. In short, the contents that would have been fed to the file - I want that in a memory buffer. I also want to be able to feed that back into SqLite at a later time, probably backing it up to a regular in-memory database again. The malloc vfs is the only solution I can come up with, but it seems overkill. I am hoping there is a cleaner way. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Petite Abeille" <petite.abei...@gmail.com> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Wednesday, June 05, 2013 9:15 PM Subject: Re: [sqlite] Serialize an in-memory database On Jun 5, 2013, at 9:10 PM, Philip Bennefall <phi...@blastbay.com> wrote: Yes, I have seen the backup API. But I would like to avoid the disk file entirely and just serialize to and from memory. Lateral thinking… write your db to tmpfs… Doesn't that still create a file, just a temporary one? I need the serialized content in a char* or similar so I can memcpy it etc, and then feed it back to SqLite at a later time. I guess I could make a toy vfs that uses a malloc:ed chunk that pretends to be the disk drive, and backup to/from that to a regular in-memory database. Thoughts? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Serialize an in-memory database
- Original Message - From: "Andreas Kupries" <andre...@activestate.com> To: <phi...@blastbay.com>; "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Wednesday, June 05, 2013 9:07 PM Subject: Re: [sqlite] Serialize an in-memory database On Wed, Jun 5, 2013 at 11:55 AM, Philip Bennefall <phi...@blastbay.com> wrote: Hello all, This may be a somewhat strange question, but I can't find an answer to it on the website so I figured I would give it a shot. Is it possible to put the full contents of an SqLite in-memory database into a string/blob in memory? If you can work with a file instead of string the backup/restore APIs should be able to do what you want, i.e. backup of a live database to a file, and back, and the live database can be in-memory, of course. http://www.sqlite.org/backup.html See example 1. http://www.sqlite.org/c3ref/backup_finish.html#sqlite3backupinit Hi Andreas, Yes, I have seen the backup API. But I would like to avoid the disk file entirely and just serialize to and from memory. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Serialize an in-memory database
Hello all, This may be a somewhat strange question, but I can't find an answer to it on the website so I figured I would give it a shot. Is it possible to put the full contents of an SqLite in-memory database into a string/blob in memory? I would then like to take the same string and convert that back into an in-memory database later, if that makes any sense. Serialization of the database, basically. Is there any reasonable, non hack-ish way of doing this? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Wiki out of date?
Hello, I was looking at the following page on the wiki: http://www.sqlite.org/cvstrac/wiki?p=LibraryRoutineCalledOutOfSequence And it states that one of the causes for SQLITE_MISUSE being returned is: "Trying to use the same database connection at the same instant in time from two or more threads." Isn't this outdated? Doesn't this depend on the SqLite threading mode nowadays? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] vfs?
In that case you want to specify a NULL pointer for that argument. Then SqLite will pick an appropriate vfs to use for whatever platform it's being run on. Kind regards, Philip Bennefall - Original Message - From: "Arbol One" <arbol...@programmer.net> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Sunday, June 17, 2012 8:46 PM Subject: Re: [sqlite] vfs? Thanks for your prompt respnse. Well, no, really. I don't know where my application will be used. Maybe someone will use it under linux or OS2, all I know is that now I am coding in Win32. - Original Message - From: Igor Tandetnik Sent: 06/17/12 02:02 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] vfs? Arbol One <arbol...@programmer.net> wrote: > Using C/C++, is there a way to extract which file system is SQLite using? You already know that - it's the one you've specified in a call to sqlite3_open_v2. What problem are you trying to solve? -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Strange behavior with fts4
I hate to be cluttering up the list in this fashion, but I have come across an issue that I cannot seem to find a solution for. I am using two fts tables, one that uses the normal tokenizer and another that uses the porter stemmer, so that I can search the same dataset with and without porter. For the porter stemmer table, I have set the content option to point to the other fts table. Like this: CREATE VIRTUAL TABLE main_brain USING fts4(phrase, response, matchinfo=fts3, tokenize=simple, order=desc); CREATE VIRTUAL TABLE porter_brain USING fts4(content="main_brain", phrase, response, matchinfo=fts3, tokenize=porter, order=desc); Then I do the following: INSERT INTO main_brain(phrase, response) VALUES('hello', 'Hi there!'); So you'd now think that main_brain should have this content in it, but the porter_brain table should be empty. At least, that is what the documentation on fts4 seems to indicate as it points out that it is my own responsibility to make sure the tables are in sync. But: SELECT * FROM porter_brain; Produces: hello|Hi there! And: SELECT * FROM main_brain; Gives the same result. So it seems as though some internal synchronization between these tables is going on. How should I handle this? The documentation suggests a statement like the following in one of its examples: INSERT INTO t3(docid, b, c) SELECT id, b, c FROM t2; Where I guess t3 would correspond to porter_brain and t2 would be main_brain in my case. But I don't seem to need to do this at all. Can anyone throw some light on this? Thanks in advance. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Porter Stemmer
Understood. Thank you very much for your quick help. Now I have all the information I need to get coding. And thanks once again for a great library! Kind regards, Philip Bennefall - Original Message - From: Richard Hipp To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Friday, June 15, 2012 3:39 PM Subject: Re: [sqlite] Porter Stemmer On Fri, Jun 15, 2012 at 9:26 AM, Philip Bennefall <phi...@blastbay.com> wrote: I understand that, but let's say that I already have a virtual fts table created that I set to use the Porter tokenizer, how then would I go about rebuilding and retokenizing this table with the simple tokenizer at a later time? Would I need to create an entirely new table? What I'm wondering is basically how I might take an existing fts virtual table, change its tokenizer and then rebuild the index? Yes. You'll need to DROP or RENAME the original table, then CREATE the new one. Kind regards, Philip Bennefall - Original Message - From: Richard Hipp To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Friday, June 15, 2012 3:14 PM Subject: Re: [sqlite] Porter Stemmer On Fri, Jun 15, 2012 at 9:00 AM, Philip Bennefall <phi...@blastbay.com> wrote: I had another quick question. If I have built an fts table using the stemmer tokenizer, and then I later decide that I want to change to the simple one, is there an easy way to do this? I see the "rebuild" command, can I somehow tell that to change the tokenizer as well? I see the reference to custom ones, but what about the internal implementations? If you change your tokenizer, you need to retokenize all of the source text. Kind regards, Philip Bennefall - Original Message - From: Richard Hipp To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Friday, June 15, 2012 1:03 PM Subject: Re: [sqlite] Porter Stemmer On Fri, Jun 15, 2012 at 5:51 AM, Philip Bennefall <phi...@blastbay.com> wrote: Hi all, Is the algorithm used in the stemming tokenizer in SqLite's fts extension equivalent to the C implementation found at http://tartarus.org/~martin/PorterStemmer/ The built-in Porter stemmer is a copy/paste from the above link. ? I am asking this because some sources say that there are improved versions of this algorithm released much later than 2000/2001. Does SqLite's implementation differ in any significant ways from the C implementation found at the above URL? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Porter Stemmer
I understand that, but let's say that I already have a virtual fts table created that I set to use the Porter tokenizer, how then would I go about rebuilding and retokenizing this table with the simple tokenizer at a later time? Would I need to create an entirely new table? What I'm wondering is basically how I might take an existing fts virtual table, change its tokenizer and then rebuild the index? Kind regards, Philip Bennefall - Original Message - From: Richard Hipp To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Friday, June 15, 2012 3:14 PM Subject: Re: [sqlite] Porter Stemmer On Fri, Jun 15, 2012 at 9:00 AM, Philip Bennefall <phi...@blastbay.com> wrote: I had another quick question. If I have built an fts table using the stemmer tokenizer, and then I later decide that I want to change to the simple one, is there an easy way to do this? I see the "rebuild" command, can I somehow tell that to change the tokenizer as well? I see the reference to custom ones, but what about the internal implementations? If you change your tokenizer, you need to retokenize all of the source text. Kind regards, Philip Bennefall - Original Message - From: Richard Hipp To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Friday, June 15, 2012 1:03 PM Subject: Re: [sqlite] Porter Stemmer On Fri, Jun 15, 2012 at 5:51 AM, Philip Bennefall <phi...@blastbay.com> wrote: Hi all, Is the algorithm used in the stemming tokenizer in SqLite's fts extension equivalent to the C implementation found at http://tartarus.org/~martin/PorterStemmer/ The built-in Porter stemmer is a copy/paste from the above link. ? I am asking this because some sources say that there are improved versions of this algorithm released much later than 2000/2001. Does SqLite's implementation differ in any significant ways from the C implementation found at the above URL? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Porter Stemmer
I had another quick question. If I have built an fts table using the stemmer tokenizer, and then I later decide that I want to change to the simple one, is there an easy way to do this? I see the "rebuild" command, can I somehow tell that to change the tokenizer as well? I see the reference to custom ones, but what about the internal implementations? Kind regards, Philip Bennefall - Original Message - From: Richard Hipp To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Friday, June 15, 2012 1:03 PM Subject: Re: [sqlite] Porter Stemmer On Fri, Jun 15, 2012 at 5:51 AM, Philip Bennefall <phi...@blastbay.com> wrote: Hi all, Is the algorithm used in the stemming tokenizer in SqLite's fts extension equivalent to the C implementation found at http://tartarus.org/~martin/PorterStemmer/ The built-in Porter stemmer is a copy/paste from the above link. ? I am asking this because some sources say that there are improved versions of this algorithm released much later than 2000/2001. Does SqLite's implementation differ in any significant ways from the C implementation found at the above URL? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Porter Stemmer
Thanks, Richard. That's good to know because I am trying to decide whether to add a new tokenizer with some custom processing, as opposed to using the built in stemmer. Kind regards, Philip Bennefall - Original Message - From: Richard Hipp To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Friday, June 15, 2012 1:03 PM Subject: Re: [sqlite] Porter Stemmer On Fri, Jun 15, 2012 at 5:51 AM, Philip Bennefall <phi...@blastbay.com> wrote: Hi all, Is the algorithm used in the stemming tokenizer in SqLite's fts extension equivalent to the C implementation found at http://tartarus.org/~martin/PorterStemmer/ The built-in Porter stemmer is a copy/paste from the above link. ? I am asking this because some sources say that there are improved versions of this algorithm released much later than 2000/2001. Does SqLite's implementation differ in any significant ways from the C implementation found at the above URL? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Porter Stemmer
Hi all, Is the algorithm used in the stemming tokenizer in SqLite's fts extension equivalent to the C implementation found at http://tartarus.org/~martin/PorterStemmer/ ? I am asking this because some sources say that there are improved versions of this algorithm released much later than 2000/2001. Does SqLite's implementation differ in any significant ways from the C implementation found at the above URL? Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full text search without full phrase matches
The main issue I have is that the Apache license, as well as the gpl/lgpl licenses, force you to include the entire license with any redistribution of the software (even if it is compiled as part of a derivative work) if my understanding is correct. I am not too keen on doing that. I certainly don't mind giving credit where credit is due, but I don't want to include this huge blob of legal text. This is why I like public domain software so much, as well as software distributed under licenses such as the Boost Software license, the Zlib/Libpng license and one or two others that are similar. Kind regards, Philip Bennefall - Original Message - From: Black, Michael (IS) To: phi...@blastbay.com Sent: Thursday, June 14, 2012 9:34 PM Subject: Re: [sqlite] Full text search without full phrase matches Apache license is about as liberal as you can get. Damned near public domain (may as well be for all intents and purposes). Muy I ask what the problem is? Perhaps something I should aware of? Michael D. Black Senior Scientist Advanced Analytics Directorate Advanced GEOINT Solutions Operating Unit Northrop Grumman Information Systems -- From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Philip Bennefall [phi...@blastbay.com] Sent: Thursday, June 14, 2012 2:09 PM To: General Discussion of SQLite Database Subject: EXT :Re: [sqlite] Full text search without full phrase matches Hi Michael, That seems to be under either gpl/lgpl/apache licenses, which I cannot use in my project for various reasons. The reason I am so interested in SqLite is because it's public domain. I appreciate the tip though. Kind regards, Philip Bennefall - Original Message - From: Black, Michael (IS) To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Thursday, June 14, 2012 9:03 PM Subject: Re: [sqlite] Full text search without full phrase matches Sounds to me like you want Lucene instead of SQLite http://lucene.apache.org/core/ Michael D. Black Senior Scientist Advanced Analytics Directorate Advanced GEOINT Solutions Operating Unit Northrop Grumman Information Systems -- From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Philip Bennefall [phi...@blastbay.com] Sent: Thursday, June 14, 2012 1:32 PM To: General Discussion of SQLite Database Subject: EXT :Re: [sqlite] Full text search without full phrase matches Hi Simon, The ordering is not really the issue I am having. That, I can do if I just get a result back that doesn't necessarily match all the keywords. In the query you showed as an example, all the keywords would still have to match in order for a row to be returned. The sorting is a separate problem that is not really that difficult once I get a smaller dataset. Then I can order it manually. The problem is that it only returns a match if every single word is present. I would like it to return matches if, say, mor than 2 or 3 of the specified keywords are found. Kind regards, Philip Bennefall - Original Message - From: "Simon Slavin" <slav...@bigfraud.org> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Thursday, June 14, 2012 8:24 PM Subject: Re: [sqlite] Full text search without full phrase matches On 14 Jun 2012, at 7:13pm, Philip Bennefall <phi...@blastbay.com> wrote: > That is unfortunate, if it is true that there's no way to accomplish this > with SqLite. To do just plain matching I can use an unordered hash map, so > I wouldn't need a database for that. The trouble with a string distance > function is that I can't really process the entire dataset with it. SqLite > technically has all the features I'm after, I just don't want it to > necessarily match all the words in a query. If I can get it to match all > as well as some, that would be enough. I could then do distancing on a > considerably smaller dataset which would be the result of the broader > SqLite search. > > So I guess my main question is, is there absolutely no way to match a > subset of the words in a query? Well, you could write that string distance function and add it to your copy of SQLite as an external function. Then you could do things like SELECT string_distance(theText, 'this new piece of text'), theText FROM oldChats WHERE string_distance(theText, 'this new piece of text') < 10 ORDER BY string_distance(theText, 'this new piece of text') (I don't know whether SQLite will o
Re: [sqlite] Full text search without full phrase matches
Hi Michael, That seems to be under either gpl/lgpl/apache licenses, which I cannot use in my project for various reasons. The reason I am so interested in SqLite is because it's public domain. I appreciate the tip though. Kind regards, Philip Bennefall - Original Message - From: Black, Michael (IS) To: phi...@blastbay.com ; General Discussion of SQLite Database Sent: Thursday, June 14, 2012 9:03 PM Subject: Re: [sqlite] Full text search without full phrase matches Sounds to me like you want Lucene instead of SQLite http://lucene.apache.org/core/ Michael D. Black Senior Scientist Advanced Analytics Directorate Advanced GEOINT Solutions Operating Unit Northrop Grumman Information Systems -- From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Philip Bennefall [phi...@blastbay.com] Sent: Thursday, June 14, 2012 1:32 PM To: General Discussion of SQLite Database Subject: EXT :Re: [sqlite] Full text search without full phrase matches Hi Simon, The ordering is not really the issue I am having. That, I can do if I just get a result back that doesn't necessarily match all the keywords. In the query you showed as an example, all the keywords would still have to match in order for a row to be returned. The sorting is a separate problem that is not really that difficult once I get a smaller dataset. Then I can order it manually. The problem is that it only returns a match if every single word is present. I would like it to return matches if, say, mor than 2 or 3 of the specified keywords are found. Kind regards, Philip Bennefall - Original Message - From: "Simon Slavin" <slav...@bigfraud.org> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Thursday, June 14, 2012 8:24 PM Subject: Re: [sqlite] Full text search without full phrase matches On 14 Jun 2012, at 7:13pm, Philip Bennefall <phi...@blastbay.com> wrote: > That is unfortunate, if it is true that there's no way to accomplish this > with SqLite. To do just plain matching I can use an unordered hash map, so > I wouldn't need a database for that. The trouble with a string distance > function is that I can't really process the entire dataset with it. SqLite > technically has all the features I'm after, I just don't want it to > necessarily match all the words in a query. If I can get it to match all > as well as some, that would be enough. I could then do distancing on a > considerably smaller dataset which would be the result of the broader > SqLite search. > > So I guess my main question is, is there absolutely no way to match a > subset of the words in a query? Well, you could write that string distance function and add it to your copy of SQLite as an external function. Then you could do things like SELECT string_distance(theText, 'this new piece of text'), theText FROM oldChats WHERE string_distance(theText, 'this new piece of text') < 10 ORDER BY string_distance(theText, 'this new piece of text') (I don't know whether SQLite will optimise that to avoid executing the same function many times, or whether you can name a column and use that name to do the same thing yourself.) Here's the documentation for external functions: <http://www.sqlite.org/c3ref/create_function.html> Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full text search without full phrase matches
Hi Simon, The ordering is not really the issue I am having. That, I can do if I just get a result back that doesn't necessarily match all the keywords. In the query you showed as an example, all the keywords would still have to match in order for a row to be returned. The sorting is a separate problem that is not really that difficult once I get a smaller dataset. Then I can order it manually. The problem is that it only returns a match if every single word is present. I would like it to return matches if, say, mor than 2 or 3 of the specified keywords are found. Kind regards, Philip Bennefall - Original Message - From: "Simon Slavin" <slav...@bigfraud.org> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Thursday, June 14, 2012 8:24 PM Subject: Re: [sqlite] Full text search without full phrase matches On 14 Jun 2012, at 7:13pm, Philip Bennefall <phi...@blastbay.com> wrote: That is unfortunate, if it is true that there's no way to accomplish this with SqLite. To do just plain matching I can use an unordered hash map, so I wouldn't need a database for that. The trouble with a string distance function is that I can't really process the entire dataset with it. SqLite technically has all the features I'm after, I just don't want it to necessarily match all the words in a query. If I can get it to match all as well as some, that would be enough. I could then do distancing on a considerably smaller dataset which would be the result of the broader SqLite search. So I guess my main question is, is there absolutely no way to match a subset of the words in a query? Well, you could write that string distance function and add it to your copy of SQLite as an external function. Then you could do things like SELECT string_distance(theText, 'this new piece of text'), theText FROM oldChats WHERE string_distance(theText, 'this new piece of text') < 10 ORDER BY string_distance(theText, 'this new piece of text') (I don't know whether SQLite will optimise that to avoid executing the same function many times, or whether you can name a column and use that name to do the same thing yourself.) Here's the documentation for external functions: <http://www.sqlite.org/c3ref/create_function.html> Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full text search without full phrase matches
Hi Simon, That is unfortunate, if it is true that there's no way to accomplish this with SqLite. To do just plain matching I can use an unordered hash map, so I wouldn't need a database for that. The trouble with a string distance function is that I can't really process the entire dataset with it. SqLite technically has all the features I'm after, I just don't want it to necessarily match all the words in a query. If I can get it to match all as well as some, that would be enough. I could then do distancing on a considerably smaller dataset which would be the result of the broader SqLite search. So I guess my main question is, is there absolutely no way to match a subset of the words in a query? Kind regards, Philip Bennefall - Original Message - From: "Simon Slavin" <slav...@bigfraud.org> To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org> Sent: Thursday, June 14, 2012 7:24 PM Subject: Re: [sqlite] Full text search without full phrase matches On 14 Jun 2012, at 6:12pm, Philip Bennefall <phi...@blastbay.com> wrote: The trouble I have is that in my query, all the keywords don't necessarily have to be present in order for a successful match to be made. SqLite's fts only seems to match if all the keywords are present, which I don't require. You will have to do some of this in your own programming. In fact you may end up doing all of it in your own programming and using the FTS feature only to match single words and single word-fragments. The usual way to do this is to define a 'distance' metric for comparing two strings and finding how far apart they are. A score of 0 means they match exactly. Numbers bigger than a certain amount don't matter: anything bigger than, say, 100 means they're not at all alike. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full text search without full phrase matches
Hi Paul, and thank you for your reply. The trouble I have is that in my query, all the keywords don't necessarily have to be present in order for a successful match to be made. SqLite's fts only seems to match if all the keywords are present, which I don't require. I am not familiar with Perl, but am working exclusively in C++. The input I am processing is arbitrary, and so is the data that I am searching through in the index. The incoming data is user messages, and the index contains old messages that the robot has given to users (stemmed and stripped in various ways to make matches more probable), and then there's another column which contains an appropriate answer if that query is matched. I want it to match as many keywords as possible but not necessarily all, and order by: 1. How many keywords were matched, with some minimum threshold below which no match is made. 2. How well the ordering matched. Do you have any tips? Kind regards, Philip Bennefall - Original Message - From: <pc...@sympatico.ca> To: <sqlite-users@sqlite.org> Sent: Thursday, June 14, 2012 7:01 PM Subject: [sqlite] Full text search without full phrase matches I had to implement something like this for comparing passages from statutes (see the Introduction in Douglas Hay and Paul Craven, *Masters, Servants and Magistrates in Britain and the Empire, 1562-1955* [UNCP Press, 2004] for an illustration). You need to isolate the keywords, in whatever order, count them, and measure the distances (number of words) between them. SqLite is great for managing the tables of keywords, the lists of texts that contain them, and tables of distances. But it is not the optimal tool for breaking down the texts and extracting the keywords and distances. I used Perl for this job, and found that I could easily adapt recipes from the Perl Cookbook and similar repositories to build my routines. I wrote the disaggregated lists of keywords, distances and texts as sql tables and analysed them in SqLite. Paul Craven York University -- Date: Wed, 13 Jun 2012 23:09:35 +0200 From: Philip Bennefall <phi...@blastbay.com> To: <sqlite-users@sqlite.org> Subject: [sqlite] Full text search without full phrase matches Message-ID: <A12309DB130E42BBA0590D664F66922A@chicken> Content-Type: text/plain; charset="iso-8859-1" Hi all, I am new to this maling list and to SqLite, so I wanted to start by thanking all of those who make this project a reality. It is a great tool. Now, to my question. I am trying to use the full text search feature to find rough matches for a chat robot. Basically I want to match as many keywords as possible, but not necessarily all of them. The results should be sorted based on how many keywords were found in the phrase and how closely ordered they are to the query. In other words the ordering doesn't have to be exact, but the closer it is, the higher the result should rank. Similarly, even if only one or two words in the phrase are found it should match, but rank higher the more of the words that are present. I have read the reference and I see the NEAR statement and the matchinfo function, as well as the example of how to use it, but I cannot figure out how to apply this knowledge to my specific problem. Does anyone have any suggestions? Thanks in advance for your help. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Full text search without full phrase matches
Hi all, I am new to this maling list and to SqLite, so I wanted to start by thanking all of those who make this project a reality. It is a great tool. Now, to my question. I am trying to use the full text search feature to find rough matches for a chat robot. Basically I want to match as many keywords as possible, but not necessarily all of them. The results should be sorted based on how many keywords were found in the phrase and how closely ordered they are to the query. In other words the ordering doesn't have to be exact, but the closer it is, the higher the result should rank. Similarly, even if only one or two words in the phrase are found it should match, but rank higher the more of the words that are present. I have read the reference and I see the NEAR statement and the matchinfo function, as well as the example of how to use it, but I cannot figure out how to apply this knowledge to my specific problem. Does anyone have any suggestions? Thanks in advance for your help. Kind regards, Philip Bennefall ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users