Re: [sqlite] New SQLite Forum established - this mailing list is deprecated

2020-03-13 Thread Philip Bennefall
I am happy to do that, but I can't promise any kind of timeline as it 
will have to be in my free time only. Can't really sign well, though, as 
I am totally blind. But I guess we'll cross that bridge if I come up 
with a working audio captcha solution.


I'll have a look at the Fossil captcha code this evening and write back 
if I have questions. Should I write to you directly, or to the Fossil forum?


Thanks!

Philip


On 3/13/2020 5:16 PM, Richard Hipp wrote:

On 3/13/20, Philip Bennefall  wrote:

I submitted
a thread a while back offering to work on an audio captcha for Fossil,

I don't recall that thread.  But if you want to submit code that
generates an audio file of some kind that speaks the text of a Fossil
Captcha, that would be great.  I will build it into the system,
assuming it works, does not have onerous external dependencies, and
you can get me a signed CLA.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] New SQLite Forum established - this mailing list is deprecated

2020-03-13 Thread Philip Bennefall
If you are comfortable doing that for any blind users who stop by, then 
of course that would be a quick fix. But it feels like taking a step 
backwards, from being able to subscribe without assistance to having to 
contact the author of whatever Fossil forum this applies to. I submitted 
a thread a while back offering to work on an audio captcha for Fossil, 
but cannot locate the exact post. I did not receive much response at the 
time, but am wondering if this would be of interest?


Thanks!

Kind regards,

Philip Bennefall
On 3/13/2020 4:53 PM, Richard Hipp wrote:

On 3/13/20, Philip Bennefall  wrote:

Is there a solution in the pipeline for the inaccessible captcha in the
forum for visually impaired users?

The solution is for you to send me a private email asking for me to
create your account for you, as doing that is way, way easier than
trying to engineer an audible captcha.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] New SQLite Forum established - this mailing list is deprecated

2020-03-13 Thread Philip Bennefall

Hi Richard,


Is there a solution in the pipeline for the inaccessible captcha in the 
forum for visually impaired users? I brought this up when the forum was 
first released and you kindly created an account manually for me, but it 
doesn't seem as though the problem was actually solved in the system 
itself. I just tried to create an account on the SqLite forum, but could 
not. The absolute majority of captcha systems offer some kind of audio 
version now, which is very important for me and other blind users.



Kind regards,


Philip Bennefall


On 3/13/2020 4:22 PM, Richard Hipp wrote:

On 3/13/20, Huỳnh Trần Khanh  wrote:

[On a mailing nlist] I can
filter the posts, sort them, search through them, archive them,
forward them to a friend, 

You can do all of that with the SQLite Forum.  Remember, all content
is still delivered directly to your in-box, just like with a mailing
list, so anything you can do with content received from a mailing list
can also be done with content from the forum.  But there are many
things that the forum provides that a mailing list does now.  For
example, if you want an archive of the forum activity, you can clone
the entire history with one command:

 fossil clone https://sqlite.org/forum sqlite-forum.fossil

Then periodically "sync" to keep your private archive up-to-date.  Now
you have all historical content, neatly packaged in an SQL database.
You can extract and search and manage the content in this archive in
any way you want.

The only thing that you can do with a mailing list that the SQLite
Forum does not allow is to submit new postings via email.  You must
use the web interface in order to post a message.  In my experience,
this forces people to take a little extra time to think about what
they are saying, and to format and arrange their thoughts for clarity,
and hence results in a better experience for the readers.

There are other important features that the forum provides that
mailinglists typically do not:

1.  You can format your postings using Markdown

2.  You can add hyperlinks to your postings that are consistently
displayed and are not dependent on the idiosyncrasies of various email
clients.

3.  You can edit prior posts to fix typos or mistakes.

4.  Your email address is never displayed, even to subscribers.

5.  It is much easier to contribute anonymously to a web-based forum
than it is to contribute on a mailing list.  There is no verification
process to go through.  You just type in what you want to say and
press "Submit".

6.  Moderators have much better control over spam and other malicious content.

The first point (use of Markdown) is the killer feature for me.  There
was a recent thread on this mailing list that involved people posting
EXPLAIN output.  That text gets hopelessly jumbled on most email
readers.  If those messages had been formatted with Markdown, they
would have been much easier to read and understand.

I've been using both this mailing list and the Forum on Fossil
regularly for two years now.  The forum is so much nicer that I have
come to dread having to work with the legacy mailing list, at least
for complex subjects.  It is time for a switch to better technology.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Performance vs. memory trade-off question

2019-12-14 Thread Philip Bennefall

Hi Richard,


I can see the advantage of both, but personally I prefer the greater 
speed with slightly more memory approach (e.g. what we have now).



Kind regards,


Philip Bennefall


On 12/14/2019 2:27 PM, Richard Hipp wrote:

A new feature on a branch has the following disadvantages:

(1)  It uses about 0.25% more CPU cycles.  (Each release of SQLite is
normally about 0.5% faster, so enabling this feature is sort of like
going back by one-half of a release cycle.)

(2)  The code space (the size of the library) is between 400 and 500
bytes larger (depending on compiler and optimization settings).

The this one advantage:

(3)  Each database connection uses about 72 KB less heap space.

QUESTION:  Should this feature be default-on or default-off?

What's more important to you?  0.25% fewer CPU cycles or about 72KB
less heap space used per database connection?

The feature can be activated or deactivated at start-time, but you
take the disadvantages (the performance hit and slightly larger
library size) regardless, unless you disable the feature at
compile-time.  If the feature is compile-time disabled, then the
corresponding code is omitted and and it cannot be turned on at
start-time.

If you have opinions, you can reply to this mailing list, or directly to me.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Tutorial on SQLite Internals - 2019-11-05 in Houston, TX

2019-09-03 Thread Philip Bennefall
I will unfortunately not be able to participate, but I am wondering if 
it will be recorded and put on YouTube or similar? I have seen some 
talks of yours from TCL and other conferences and enjoyed them very much.



Kind regards,


Philip Bennefall


On 9/3/2019 3:12 AM, Richard Hipp wrote:

There will be a full-day tutorial on SQLite Internals on Tuesday
2019-11-05 in Houston TX. See details at:

https://www.tcl.tk/community/tcl2019/tutorials.html#drh:sqlite-tour1

This will be an intensive tutorial.  Bring your laptop, with a
C-compiler already installed, and also with TCL development libraries
(needed to compile SQLite from canonical sources) and Fossil
installed, and be comfortable using the compiler tools before you
arrive, as we have a lot of ground to cover and will need to move
quickly.

This tutorial is part of the 2019 Tcl/Tk conference, which will
continue on the following three days.  To maximize your learning
experience, plan stay as long as you can, as these events normally
involve a lot of socializing with a group of very smart people.

I will be giving the tutorial (obviously) but I am not a conference
organizer and am not up on all the details.  If you have questions,
please ask and I will see if I can find answers.  Or, some of the
conference organizers are on this mailing list, so perhaps they will
speak up.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database like file archive

2019-08-27 Thread Philip Bennefall
The earliest version of the shell which ships with the archive support 
is 3.22.0, according to the page I linked to. If you have an earlier 
version you could simply grab the Mac OSX precompiled binaries from the 
download page on sqlite.org and you'll be good to go.



Kind regards,


Philip Bennefall


On 8/27/2019 4:18 PM, Peng Yu wrote:

There is the sqlar archive format, which you can test using the official
sqlite3 command line shell. There is also a library for it as part of
the Sqlite3 repository.

https://www.sqlite.org/sqlar.html
https://sqlite.org/sqlar/doc/trunk/README.md

This is good to know.

How to install it? In homebrew's sqlite package, I don't find sqlar. I
use Mac OS X.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database like file archive

2019-08-27 Thread Philip Bennefall
There is the sqlar archive format, which you can test using the official 
sqlite3 command line shell. There is also a library for it as part of 
the Sqlite3 repository.



https://www.sqlite.org/sqlar.html

Kind regards,

Philip Bennefall
On 8/27/2019 3:56 PM, Peng Yu wrote:

Hi,

I haven't found an archive format that allows in-place delete (I know
that .zip, .7z and .tar don't). This means that whenever delete is
needed, the original archive must be copied first. This can be
problematic when the archive is large and the file to delete is small.

Something along the line of the ability of sqlite3 to perform in-place
delete might be a useful feature for archives. But I haven't found any
such archive format. Does anybody know one? Thanks.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] CTE to Get Path In a Tree

2019-05-12 Thread Philip Bennefall



On 5/12/2019 2:14 PM, Igor Tandetnik wrote:

On 5/12/2019 6:19 AM, Philip Bennefall wrote:

Hi everyone,

I have a tree of folders and I want to find the complete path from 
any arbitrary point back to the top level directory. The schema is:


CREATE TABLE IF NOT EXISTS folders(
 id INTEGER PRIMARY KEY,
 parentFolderId INTEGER REFERENCES folders(id) ON DELETE CASCADE 
ON UPDATE CASCADE,

 name TEXT NOT NULL);

I made the following CTE:

WITH RECURSIVE folderTree (id, path)
AS(SELECT id, name FROM folders WHERE parentFolderId IS NULL
UNION ALL
SELECT folders.id, folderTree.path || '/' || folders.name
FROM folders, folderTree WHERE folders.parentFolderId = folderTree.id)
  SELECT path FROM folderTree WHERE id=?1;

This produces the correct result, but I am wondering if there is a 
more efficient way? This query seems to generate the entire tree and 
then do a table scan to find just the one row I am looking for. Can I 
start from the given row and *only* traverse upwards through the 
levels until I find a node with no parent?


Just reverse the conditions. Something like this (not tested):

WITH RECURSIVE folderPath(id, parentId, path)
AS(SELECT id, parentFolderId, name FROM folders WHERE id=?1)
UNION ALL
SELECT f.id, f.parentFolderId, f.name || '/' || fp.path
FROM folders f join folderPath fp on (f.id = fp.parentId))
SELECT path FROM folderPath WHERE parentId is null;


Thanks, that seems to work with a couple of very minor tweaks. The query 
plans are somewhat different and my gut feeling is that the one you 
wrote is better, but I will measure against some larger datasets just to 
be sure.


Thanks for the quick response!

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] CTE to Get Path In a Tree

2019-05-12 Thread Philip Bennefall

Hi everyone,

I have a tree of folders and I want to find the complete path from any 
arbitrary point back to the top level directory. The schema is:


CREATE TABLE IF NOT EXISTS folders(
id INTEGER PRIMARY KEY,
parentFolderId INTEGER REFERENCES folders(id) ON DELETE CASCADE ON 
UPDATE CASCADE,

name TEXT NOT NULL);

I made the following CTE:

WITH RECURSIVE folderTree (id, path)
AS(SELECT id, name FROM folders WHERE parentFolderId IS NULL
UNION ALL
SELECT folders.id, folderTree.path || '/' || folders.name
FROM folders, folderTree WHERE folders.parentFolderId = folderTree.id)
 SELECT path FROM folderTree WHERE id=?1;

This produces the correct result, but I am wondering if there is a more 
efficient way? This query seems to generate the entire tree and then do 
a table scan to find just the one row I am looking for. Can I start from 
the given row and *only* traverse upwards through the levels until I 
find a node with no parent?


Thanks in advance for any pointers.

Kind regards,

Philip Bennefall

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fossil Delta Compression in SqLite

2018-05-07 Thread Philip Bennefall
I am far from an expert in this field myself so I don't know whether 
including it in the text section of the binary would be enough, and the 
main issue for me is when clients of mine redistribute middleware in 
their turn as I mentioned in an earlier post. But either way, Richard 
already cleared this up so there's no more ambiguity on my end.


Kind regards,

Philip
On 5/7/2018 11:11 PM, Warren Young wrote:

On May 7, 2018, at 9:53 AM, Philip Bennefall <phi...@blastbay.com> wrote:

It was merely an idea to possibly avoid some potential ambiguity regarding 
public domain, which is a bit of a gray area in many places.

So take the code under the explicit license, then.

In my non-expert opinion, the worry over attribution is bogus.  Restriction 2 
in the 2-clause BSD license just requires that the license text itself be in 
the binary, not that you “attribute” the software in your documentation or in a 
startup banner as is required by some other licenses.

Fossil itself doesn’t bother to do even that:

 $ strings `which fossil` | grep 'IMPLIED WARRANTIES'


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fossil Delta Compression in SqLite

2018-05-07 Thread Philip Bennefall
It was merely an idea to possibly avoid some potential ambiguity 
regarding public domain, which is a bit of a gray area in many places. 
Obviously not a requirement for anyone to do anything, it was but a 
friendly question.



Kind regards,

Philip
On 5/7/2018 5:44 PM, R Smith wrote:


On 2018/05/07 5:33 PM, Philip Bennefall wrote:

Thanks very much for that information, Richard! :)

I don't know if it would make any difference legally, but perhaps 
this could be made explicit in the comments?


So it's not enough to get it free... the free giver has to now put 
some extra labour for zero reward into convincing people it is free to 
use.


This is a crazy World. :)


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fossil Delta Compression in SqLite

2018-05-07 Thread Philip Bennefall

Thanks very much for that information, Richard! :)

I don't know if it would make any difference legally, but perhaps this 
could be made explicit in the comments?


Thanks again.

Kind regards,

Philip
On 5/7/2018 5:22 PM, Richard Hipp wrote:

You are welcomed to use the public-domain version of the delta
encoding routines found in the SQLite source tree for whatever purpose
you want, without attribution.  I am the sole author of that code, and
I am a citizen of a country that allows people to disavow intellectual
property claims, so it is possible for me to say this.

On 5/7/18, Philip Bennefall <phi...@blastbay.com> wrote:

As far as I can judge, you need to include the entire license - or at
least the majority of it - in the documentation (not just a single
line). For an end user product that's fine, but I would rather not have
to ask clients to do so if I am distributing middleware simply because
of a component that I use internally and which they never see. It's a
pretty big difference from public domain in that respect, though I
realize that it is a difference that many people don't care about.

I would be curious to hear what the developers think about this, since
this license differs from the rest of the SqLite codebase. Of course it
is an extension so you don't need to include it, but I'm curious
nonetheless.

Thanks in advance for any clarification.

Kind regards,

Philip Bennefall
On 5/7/2018 4:54 PM, Peter Da Silva wrote:

On 5/6/18, 11:23 AM, "sqlite-users on behalf of Philip Bennefall"
<sqlite-users-boun...@mailinglists.sqlite.org on behalf of
phi...@blastbay.com> wrote:
  Only the requirement for attribution in binaries. That can be
  significant in certain use cases.

One line of text in the documentation provided with the distribution
doesn't seem burdensome. It's not like the advertising clause in the
original BSD license... is that what you're thinking of?


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users





___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fossil Delta Compression in SqLite

2018-05-07 Thread Philip Bennefall
As far as I can judge, you need to include the entire license - or at 
least the majority of it - in the documentation (not just a single 
line). For an end user product that's fine, but I would rather not have 
to ask clients to do so if I am distributing middleware simply because 
of a component that I use internally and which they never see. It's a 
pretty big difference from public domain in that respect, though I 
realize that it is a difference that many people don't care about.


I would be curious to hear what the developers think about this, since 
this license differs from the rest of the SqLite codebase. Of course it 
is an extension so you don't need to include it, but I'm curious 
nonetheless.


Thanks in advance for any clarification.

Kind regards,

Philip Bennefall
On 5/7/2018 4:54 PM, Peter Da Silva wrote:

On 5/6/18, 11:23 AM, "sqlite-users on behalf of Philip Bennefall" 
<sqlite-users-boun...@mailinglists.sqlite.org on behalf of phi...@blastbay.com> wrote:
 Only the requirement for attribution in binaries. That can be
 significant in certain use cases.

One line of text in the documentation provided with the distribution doesn't 
seem burdensome. It's not like the advertising clause in the original BSD 
license... is that what you're thinking of?
  


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fossil Delta Compression in SqLite

2018-05-06 Thread Philip Bennefall
Only the requirement for attribution in binaries. That can be 
significant in certain use cases.



Kind regards,


Philip Bennefall


On 5/6/2018 6:19 PM, Richard Hipp wrote:

On 5/6/18, Philip Bennefall <phi...@blastbay.com> wrote:

Hi all,


I had a quick question regarding the licensing of the delta compression
code found in the sqldiff and the RBU extensions for SqLite. I see that
this code is extracted from Fossil, which is under the BSD license. But
the header of the source files in the SqLite repository which contain
the delta compression code are marked as public domain. Does that mean
that the delta compression code extracted from Fossil is also public
domain? Could I borrow it from SqLite and use it without being bound by
the BSD license?

Does it matter whether the code is public domain or two-clause BSD?
There are no restrictions on its use in either case, are there?



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fossil Delta Compression in SqLite

2018-05-06 Thread Philip Bennefall

Hi all,


I had a quick question regarding the licensing of the delta compression 
code found in the sqldiff and the RBU extensions for SqLite. I see that 
this code is extracted from Fossil, which is under the BSD license. But 
the header of the source files in the SqLite repository which contain 
the delta compression code are marked as public domain. Does that mean 
that the delta compression code extracted from Fossil is also public 
domain? Could I borrow it from SqLite and use it without being bound by 
the BSD license?



Kind regards,


Philip Bennefall

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] sqlite3_expert Status

2017-10-11 Thread Philip Bennefall

Dear SqLite developers,

I was curious to know the status of the sqlite3_expert extension? I 
followed its development with great interest earlier in the year and was 
wondering if there are any plans to merge it to trunk? What kind of work 
remains (if any) before it can be considered complete?


Thanks,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Question about Memsys5 Internals

2016-09-05 Thread Philip Bennefall

Hi Richard,

Our library runs in video games (on both consoles and mobile devices), 
and many game developers detest calls to malloc/free at run-time but 
they are happy to give me a sizable chunk in one go. So I'm using 
memsys5 as an optional memory pool in an attempt to reduce the number of 
*user* allocations. The user is able to plug in their own malloc/free 
replacements as well in which case the memory pool is redundant, but the 
pool is there as a shield in the default configuration.


Kind regards,

Philip Bennefall
On 9/5/2016 7:39 PM, Richard Hipp wrote:

On 9/5/16, Philip Bennefall <phi...@blastbay.com> wrote:

I am using [memsys5] as a general purpose
memory pool in my application.

There is a strong feeling in some parts of the open-source community
that nobody should ever attempt to build their own memory allocator.
The malloc/free from the standard library is believed to be good
enough for every situation and any attempt to work around standard
malloc/free is discouraged.

SQLite uses system malloc/free by default, as you know.  But even the
presence of memsys5 in the source tree as a compile-time option is
controversial in some circles.  I put memsys5 there for good reasons,
but they are obscure reasons, which is why memsys5 is usually omitted
from the build.

So I'm wondering:  Why are *you* using memsys5 rather than malloc/free
from the standard library?


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Question about Memsys5 Internals

2016-09-05 Thread Philip Bennefall
Thanks, Richard. I actually arrived at that by experimenting, but I had 
no idea whether I was doing it right. Thanks for the explanation - it 
confirms that I've got it set up correctly.


Kind regards,

Philip Bennefall

On 9/5/2016 12:13 PM, Richard Hipp wrote:

On 9/5/16, Philip Bennefall <phi...@blastbay.com> wrote:

Hi all,

First, if this is the wrong place to ask this question, please let me
know and I will ask elsewhere.

I have a question about memsys5, which I am using as a general purpose
memory pool in my application. I am allocating fixed size objects, and I
have set mReq to the size of that object (it is a power of 2). I know
that I will only be using N objects at any given time. Is there an easy
way to calculate how much space I need to reserve in order to hold N
objects simultaneously? If I allocate a total of N*sizeof(object) bytes,
it doesn't seem to let me store N objects at the same time. Am I doing
something wrong or is this expected?


Let sz be the number of bytes of memory you provide to the memory
allocator.  The number of minimum-size objects that can be stored is
sz/(mnReq+1).  Not sz/mnReq.  The extra +1 is some space taken from
the front of the provided memory and used for internal bookkeeping.

--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Question about Memsys5 Internals

2016-09-05 Thread Philip Bennefall

Hi all,

First, if this is the wrong place to ask this question, please let me 
know and I will ask elsewhere.


I have a question about memsys5, which I am using as a general purpose 
memory pool in my application. I am allocating fixed size objects, and I 
have set mReq to the size of that object (it is a power of 2). I know 
that I will only be using N objects at any given time. Is there an easy 
way to calculate how much space I need to reserve in order to hold N 
objects simultaneously? If I allocate a total of N*sizeof(object) bytes, 
it doesn't seem to let me store N objects at the same time. Am I doing 
something wrong or is this expected?


Thanks in advance for any tips!

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sweet 16

2016-05-29 Thread Philip Bennefall
Incredible. Congratulations to Richard, Dan, Joe and the rest of the 
team on this incredible achievement. I'm not very active on this list, 
but I've been a casual sqLite user for years and it just keeps getting 
better. Keep up the great work!


Kind regards,

Philip Bennefall

On 5/29/2016 7:28 PM, Richard Hipp wrote:

The first check-in of SQLite code occurred 16 years ago today.
https://www.sqlite.org/src/timeline?c=2000-05-29



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Delta Compression in RBU

2015-11-13 Thread Philip Bennefall
Thanks for the clarification, Dan. Might be too picky but perhaps a short note 
should be added to the sources verifying this for people as paranoid as myself? 
:D

Kind regards,

Philip Bennefall

From: sqlite-users-bounces at mailinglists.sqlite.org [sqlite-users-bounces at 
mailinglists.sqlite.org] on behalf of Dan Kennedy [danielk1...@gmail.com]
Sent: Friday, November 13, 2015 3:11 PM
To: sqlite-users at mailinglists.sqlite.org
Subject: Re: [sqlite] Delta Compression in RBU

On 11/13/2015 08:06 AM, Philip Bennefall wrote:
> Something I forgot in my prior post; I found the delta creation code
> in sqldiff.c so my question really concerns the combined delta code
> found in RBU and sqldiff.c (both creating and applying deltas).


The versions of the delta creation and application code checked in to
the sqlite source project are public domain. The original authors of the
code re-licensed it.

Dan.

___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Delta Compression in RBU

2015-11-13 Thread Philip Bennefall
Something I forgot in my prior post; I found the delta creation code in 
sqldiff.c so my question really concerns the combined delta code found 
in RBU and sqldiff.c (both creating and applying deltas).

Kind regards,

Philip Bennefall


[sqlite] Delta Compression in RBU

2015-11-13 Thread Philip Bennefall
Hi Richard and others,

I am looking at the RBU extension, and the delta compression 
functionality in particular. I am interested in using the delta 
compression code (both as part of the RBU extension itself but possibly 
also externally by extracting it from RBU). I see that the delta 
compression was lifted from Fossil which is under the BSD license, but 
of course I also see that the RBU extension has a public domain 
dedication at the top just like all the other official SqLite 
extensions. So I really just wanted to verify that even though it is 
lifted from Fossil, the delta compression code has the same public 
domain clearance guarantee as the rest of sqLite? Is this assumption 
true, or could there potentially be contributions from other Fossil 
developers that are still under the terms of the BSD license?

Thanks!

Kind regards,

Philip Bennefall


[sqlite] fts5aux?

2015-10-16 Thread Philip Bennefall
Thanks, Dan. I'll keep my eyes on this. Thanks for the great work on FTS5!

Kind regards,

Philip Bennefall
On 10/16/2015 11:10 AM, Dan Kennedy wrote:
> On 10/15/2015 03:35 AM, Philip Bennefall wrote:
>> Hi all,
>>
>> Are there any plans in fts5 to add support for something equivalent 
>> to fts4aux? This would be of great use to me.
>
> There is, but it's not actually finished or documented yet. It still 
> does a linear scan of the entire FTS index for every query. To create 
> the table:
>
>   CREATE VIRTUAL TABLE vvv USING fts5vocab(, );
>
> where  may be either "row" or "column".
>
> Code is here: http://www.sqlite.org/src/artifact/85ebf2e93089c
>
> Dan.
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> .
>



[sqlite] fts5aux?

2015-10-14 Thread Philip Bennefall
Hi all,

Are there any plans in fts5 to add support for something equivalent to 
fts4aux? This would be of great use to me.

Thanks!

Kind regards,

Philip Bennefall


[sqlite] Possible documentation error regarding recursive triggers

2015-09-18 Thread Philip Bennefall
Sorry for the unrelated content below my last message; I responded to a 
prior post to make sure I got the address right and forgot to clear it.

On 9/18/2015 12:32 AM, Philip Bennefall wrote:
> Hi all,
>
> I have found what I believe is a mistake in the SqLite documentation. 
> On the page listing the supported pragmas, in the section called 
> recursive_triggers, it says:
>
> Support for recursive triggers was added in version 3.6.18 but was 
> initially turned OFF by default, for compatibility. Recursive triggers 
> may be turned on by default in future versions of SQLite.
>
> However, in sqlite.org/limits.html it says:
>
> Beginning with version 3.7.0, recursive triggers are enabled by 
> default but can be manually disabled using PRAGMA recursive_triggers.
>
> Kind regards,
>
> Philip Bennefall
>
> On 9/17/2015 10:49 PM, Scott Hess wrote:
>> On Thu, Sep 17, 2015 at 1:24 PM, Ralf Junker  wrote:
>>
>>> On 17.09.2015 20:14, Scott Hess wrote:
>>>
>>>> The problem is that there are LOCALE settings where tolower() does 
>>>> things
>>>> C
>>>> programmers don't expect.  I think tr_TR was one case, the handling 
>>>> of 'I'
>>>> (Google "tr_tr locale bug" and you'll see lots of people hitting 
>>>> the same
>>>> general problem).  It isn't a problem of type safety, it's a 
>>>> problem that
>>>> the same inputs might have different outputs for certain library 
>>>> functions
>>>> when you change environment variables.  I don't remember whether there
>>>> were
>>>> specific problems with other ctype functions, or if I just thought 
>>>> it was
>>>> a
>>>> good idea to be careful, once I realized the class of problem.
>>>>
>>> And this check-in therefore misses the point as it does not address 
>>> this
>>> LOCALE problem IMHO:
>>>
>>> http://www.sqlite.org/src/info/6713e35b8a8c997a
>>
>> Hmm.  Well, it might miss _a_ point, while solidly landing some other 
>> point
>> :-).
>>
>> Current fts3 seems to spread this code differently, under fts2.c it was
>> like:
>>
>> https://code.google.com/p/chromium/codesearch#chromium/src/third_party/sqlite/src/ext/fts2/fts2.c=336
>>  
>>
>> where it's not a wrapper around the library, instead it was a direct
>> implementation of the functions which only acted on 7-bit values in the
>> ASCII set.
>>
>> I think these should really be in terms of sqlite3UpperToLower
>> and sqlite3CtypeMap.  That might be an issue to expose to an extension
>> sensibly.
>>
>> -scott
>> ___
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> .
>>
>



[sqlite] Possible documentation error regarding recursive triggers

2015-09-18 Thread Philip Bennefall
Hi all,

I have found what I believe is a mistake in the SqLite documentation. On 
the page listing the supported pragmas, in the section called 
recursive_triggers, it says:

Support for recursive triggers was added in version 3.6.18 but was 
initially turned OFF by default, for compatibility. Recursive triggers 
may be turned on by default in future versions of SQLite.

However, in sqlite.org/limits.html it says:

Beginning with version 3.7.0, recursive triggers are enabled by default 
but can be manually disabled using PRAGMA recursive_triggers.

Kind regards,

Philip Bennefall

On 9/17/2015 10:49 PM, Scott Hess wrote:
> On Thu, Sep 17, 2015 at 1:24 PM, Ralf Junker  wrote:
>
>> On 17.09.2015 20:14, Scott Hess wrote:
>>
>>> The problem is that there are LOCALE settings where tolower() does things
>>> C
>>> programmers don't expect.  I think tr_TR was one case, the handling of 'I'
>>> (Google "tr_tr locale bug" and you'll see lots of people hitting the same
>>> general problem).  It isn't a problem of type safety, it's a problem that
>>> the same inputs might have different outputs for certain library functions
>>> when you change environment variables.  I don't remember whether there
>>> were
>>> specific problems with other ctype functions, or if I just thought it was
>>> a
>>> good idea to be careful, once I realized the class of problem.
>>>
>> And this check-in therefore misses the point as it does not address this
>> LOCALE problem IMHO:
>>
>> http://www.sqlite.org/src/info/6713e35b8a8c997a
>
> Hmm.  Well, it might miss _a_ point, while solidly landing some other point
> :-).
>
> Current fts3 seems to spread this code differently, under fts2.c it was
> like:
>
> https://code.google.com/p/chromium/codesearch#chromium/src/third_party/sqlite/src/ext/fts2/fts2.c=336
> where it's not a wrapper around the library, instead it was a direct
> implementation of the functions which only acted on 7-bit values in the
> ASCII set.
>
> I think these should really be in terms of sqlite3UpperToLower
> and sqlite3CtypeMap.  That might be an issue to expose to an extension
> sensibly.
>
> -scott
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> .
>



[sqlite] Possible error in FTS5 docs

2015-06-19 Thread Philip Bennefall
Thanks, Dan. Do you have any idea when FTS5 will be merged into the 
amalgamation, if at all?

Kind regards,

Philip Bennefall

On 6/19/2015 8:54 PM, Dan Kennedy wrote:
> On 06/19/2015 08:34 PM, Philip Bennefall wrote:
>> Hi all,
>>
>> While reading through the draft documentation for FTS5, I noticed the 
>> following in section 4.1.1:
>>
>> "The built-in auxiliary function bm25() returns a real value 
>> indicating how well the current row matches the full-text query. The 
>> better the match, the
>> larger the value returned."
>>
>> Then, a little further down it says:
>>
>> "In order to avoid this pitfall, the FTS5 implementation of BM25 
>> multiplies the result by -1 before returning it, ensuring that better 
>> matches are assigned
>> numerically lower scores."
>>
>> Is this a mistake, or did I misunderstand something?
>
> It's a mistake. Should be "The better the match, the numerically lower 
> the value returned" or similar.
>
> Thanks for pointing this out.
>
> Dan.
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> .
>



[sqlite] Possible error in FTS5 docs

2015-06-19 Thread Philip Bennefall
Hi all,

While reading through the draft documentation for FTS5, I noticed the 
following in section 4.1.1:

"The built-in auxiliary function bm25() returns a real value indicating 
how well the current row matches the full-text query. The better the 
match, the
larger the value returned."

Then, a little further down it says:

"In order to avoid this pitfall, the FTS5 implementation of BM25 
multiplies the result by -1 before returning it, ensuring that better 
matches are assigned
numerically lower scores."

Is this a mistake, or did I misunderstand something?

Also, I am curious to know whether the developers are able to estimate 
roughly when the FTS5 extension might appear in an official release of 
SqLite?

Kind regards,

Philip Bennefall


[sqlite] Extending VFS documentation

2015-05-25 Thread Philip Bennefall
Thanks, Jeff. That's an interesting project; redirecting to iostream. 
I'm sure it'll be useful, though an authoritative source for 
documentation on the SqLite website itself would be ideal. The methods 
themselves and their arguments seem easy enough; my main concern is what 
error codes I may return when, etc.

Kind regards,

Philip Bennefall
On 5/24/2015 6:19 PM, Jeff M wrote:
> I searched stack overflow for "sqlite3_vfs_register" and got a few hits.  For 
> example, this was a good starting point on which I based my own VFS.
>
>   http://stackoverflow.com/a/3842409/236415
>
> Jeff
>
>
>> On May 24, 2015, at 4:39 AM, Philip Bennefall  wrote:
>>
>> Hi all,
>>
>> I have been interested in the VFS layer of SqLite for a while, and found the 
>> following article to be very useful:
>>
>> http://www.sqlite.org/vfs.html
>>
>> However, it ends right when it is about to go into the actual details of how 
>> to implement a VFS. It would be great if that essay could be completed. I
>> have looked at the example VFS implementations, but it is hard to determine 
>> which parts of the code that are implementation details as opposed to being
>> part of the stable public API so to speak.
>>
>> Kind regards,
>>
>> Philip Bennefall
>>
> .
>



[sqlite] Extending VFS documentation

2015-05-25 Thread Philip Bennefall
Hi Roger,

For experimentation, I definitely agree that using a higher level 
language is better. What I am after is an authoritative source for exact 
behavior of the various functions, what different errors they may return 
and what the ramifications of not implementing various methods are, etc 
etc. There is some information of that nature in the reference, but it 
would be ideal if the article on the sqLite website itself would be 
completed as that would eliminate some guesswork. The rest of the sqLite 
documentation, and indeed the beginning of the VFS article, is 
excellent. It allows me to implement things immediately without having 
to search third party sources and hack around to find what works and 
what doesn't. It would be ideal if the same could be true for the VFS layer.

Kind regards,

Philip Bennefall
On 5/24/2015 10:38 PM, Roger Binns wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 05/24/2015 02:39 AM, Philip Bennefall wrote:
>> I have looked at the example VFS implementations, but it is hard
>> to determine which parts of the code that are implementation
>> details as opposed to being part of the stable public API so to
>> speak.
> The answer is that it matters what you are exposing.  A vfs for a csv
> file is a very different beast that one exposing bits of Amazon's
> services.  I wrote one that exposes couchdb, but it wouldn't help
> someone else exposing something different.
>
> What I recommend you do is use one of the higher level languages that
> exposes the VFS in their SQLite bindings.  You will get better error
> messages, be able to write the code quicker, and explore behaviour
> easier (eg xBestIndex).  ie you'll get a better on the VFS specific
> issues, rather than having to build out C level plumbing.  Once you
> have that understood, converting to C is simpler than having started
> with C.
>
> For example my Python wrapper (APSW) exposes the VFS, along with an
> example and documentation.  It also easily lets you "inherit" from an
> existing VFS so you only have to write methods where you want
> different behaviour (eg mangling file names, or modifying how the
> database file contents are stored).
>
>http://rogerbinns.github.io/apsw/vfs.html
>
> You should find similar bindings in your higher level languages of choice.
>
> Roger
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1
>
> iEUEARECAAYFAlViNrAACgkQmOOfHg372QRrrQCYxRX6XbpyS39O5V+tJyyjXjx6
> vACgxZ/Uu+TDvT22u1FN083YFMu2muc=
> =wo2G
> -END PGP SIGNATURE-
> .
>



[sqlite] Extending VFS documentation

2015-05-24 Thread Philip Bennefall
Hi all,

I have been interested in the VFS layer of SqLite for a while, and found 
the following article to be very useful:

http://www.sqlite.org/vfs.html

However, it ends right when it is about to go into the actual details of 
how to implement a VFS. It would be great if that essay could be 
completed. I
have looked at the example VFS implementations, but it is hard to 
determine which parts of the code that are implementation details as 
opposed to being
part of the stable public API so to speak.

Kind regards,

Philip Bennefall


Re: [sqlite] Checking whether a given date is valid

2014-08-09 Thread Philip Bennefall

Hi Keith,

Thanks for the explanation. From what I can see in your example, the 
only difference from mine is that I used date(x) and you used 
datetime(x). If I am understanding things correctly, this would give the 
same behavior if I am only concerned about dates? In this case I don't 
care about the time of day; just that the date is valid. So based on 
your description I gather that I'm doing it correctly in my example? I hope?


Kind regards,

Philip Bennefall
On 8/10/2014 12:00 AM, Keith Medcalf wrote:

The result of converting an "invalid" datetime string with julianday() is a 
floating point julianday value which corresponds to the normalized datetime.  In your 
example, date(julianday('2005-02-29')) is '2005-03-01' which is the correct normalized 
date.

Therefore, a function which will validate that a string is a valid normalized 
date / datetime / time string is the following:

datetime(julianday(x))=datetime(x)

This returns 1 if the x is a valid normal date, datetime, or time and 0 if not; 
and, can be used as a check constraint or otherwise.

sqlite> create table mytable
...> (
...>data text collate nocase check (datetime(julianday(data)) = 
datetime(data))
...> );
sqlite> insert into mytable values('01:00');
sqlite> insert into mytable values('24:00');
Error: CHECK constraint failed: mytable
sqlite> insert into mytable values('2004-02-29');
sqlite> insert into mytable values('2005-02-29');
Error: CHECK constraint failed: mytable
sqlite> insert into mytable values('-01-29');
sqlite> insert into mytable values('-0502-02-29');
Error: CHECK constraint failed: mytable
sqlite> insert into mytable values('-0502-01-29');

Of course, inserting a "numeric" value also passes the test since all strictly 
numeric values can be converted to valid datetime strings.



-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
boun...@sqlite.org] On Behalf Of Philip Bennefall
Sent: Saturday, 9 August, 2014 14:53
To: General Discussion of SQLite Database
Subject: [sqlite] Checking whether a given date is valid

Hi all,

I'm working on a project where I need to verify that a given date
actually existed (like February 29 in a particular year). I am using
SqLite to store a lot of data already and I didn't feel like hunting for
a datetime library online. I figured that there should be a way to use
SqLite's date functions to check this, and came up with the following
query:

select 1 where date(julianday('2004-02-29'))=date('2004-02-29');

The above query returns 1 as expected. The following one returns
nothing, as I would also expect:

select 1 where date(julianday('2005-02-29'))=date('2005-02-29');

Now, my question is simply this. Is the query sound? All the tests I
have run thus far have given correct results (invalid days of months,
leap years etc) but are there any pitfalls that I should be aware of? Is
it safe to rely on the conversion between the floating point julianday
and the date string being accurate if the date is in fact valid?

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Checking whether a given date is valid

2014-08-09 Thread Philip Bennefall

Hi Simon,

That is what I tried to do in my query examples that I included in the 
original message. Based on those, would you say that I am doing it 
correctly?


Kind regards,

Philip Bennefall
On 8/9/2014 11:49 PM, Simon Slavin wrote:

On 9 Aug 2014, at 9:53pm, Philip Bennefall <phi...@blastbay.com> wrote:


Is it safe to rely on the conversion between the floating point julianday and 
the date string being accurate if the date is in fact valid?

Probably not.  The safest thing is to turn the string into a number (probably 
Julian day or epoch) and then turn it back into a date again.  If you get the 
original string back, it's a legit date.  If you get something else, it might 
be something like the 30th of February.

Simon.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Checking whether a given date is valid

2014-08-09 Thread Philip Bennefall

Hi all,

I'm working on a project where I need to verify that a given date 
actually existed (like February 29 in a particular year). I am using 
SqLite to store a lot of data already and I didn't feel like hunting for 
a datetime library online. I figured that there should be a way to use 
SqLite's date functions to check this, and came up with the following query:


select 1 where date(julianday('2004-02-29'))=date('2004-02-29');

The above query returns 1 as expected. The following one returns 
nothing, as I would also expect:


select 1 where date(julianday('2005-02-29'))=date('2005-02-29');

Now, my question is simply this. Is the query sound? All the tests I 
have run thus far have given correct results (invalid days of months, 
leap years etc) but are there any pitfalls that I should be aware of? Is 
it safe to rely on the conversion between the floating point julianday 
and the date string being accurate if the date is in fact valid?


Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [Spellfix] Searching for short words is very slow

2014-07-23 Thread Philip Bennefall

Hi Richard,

My application is basically just to take a text file as a command line 
argument and run the spellchecker on it, showing an alert for each word 
that is not found in the dictionary and giving the user some options.


After a bit of experimentation I concluded that one way to speed things 
up is to store the entire dictionary in memory as a hash map and look 
for exact matches. Only when an exact match isn't found do I fall back 
to the spellfix table. This allowed me to scan a document with just over 
86000 words in less than 500 milliseconds, which is more than acceptable 
for my needs. Certainly not ideal if you aren't on a workstation, but 
it's a reasonable tradeoff if memory is not an issue.


Perhaps something similar could be done in the spellfix table itself? 
Have an indexed integer column containing a crc32 or similar for each 
word in the dictionary so that we can look for exact matches very 
quickly. We only fall back to the fuzzy search if no match is found. Can 
you see any obvious drawbacks with this? If not, I'd like to put this 
optimization forth as an initial suggestion. I'll write again if I can 
think of anything else after reading the code more thoroughly.


Kind regards,

Philip Bennefall
On 7/24/2014 12:25 AM, Richard Hipp wrote:




On Wed, Jul 23, 2014 at 6:18 PM, Philip Bennefall <phi...@blastbay.com 
<mailto:phi...@blastbay.com>> wrote:


I have to amend my last message. The timings I just gave was for
looking up that word 10 times, not 1. So the longest time I've
seen would be about 150 ms. However, if you have a document with a
few thousand words we would still be looking at a significant
total searching time. Is this to be expected?


There is no expectation.

Spellfix is an experiment in doing fuzzy matching.  It was designed 
for a specific customer who is doing spell-checking in real-time, as 
the text is being entered.  Spellfix works way faster than the end 
user can enter text, so performance is not an issue in its original 
purpose.


Perhaps you are using spellfix in a different way?  You are welcomed 
to do so.  If you want to contribute ideas on how to improve spellfix 
for use in different scenarios, we will welcome your input.


There are comments in the code explaining how spellfix works.  Please 
review the principles of operation and then perhaps run a performance 
analysis using gprof or cachegrind.  Then describe exactly what you 
are doing and why it isn't working out for you and perhaps we can help.




--
D. Richard Hipp
d...@sqlite.org <mailto:d...@sqlite.org>


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [Spellfix] Searching for short words is very slow

2014-07-23 Thread Philip Bennefall
I have to amend my last message. The timings I just gave was for looking 
up that word 10 times, not 1. So the longest time I've seen would be 
about 150 ms. However, if you have a document with a few thousand words 
we would still be looking at a significant total searching time. Is this 
to be expected?


Kind regards,

Philip Bennefall
On 7/23/2014 11:57 PM, Philip Bennefall wrote:

Hi all,

I have been running some tests with spellfix using a table containing
about 30 words, extracted from the Moby project's single word list
as well as names and places. Moby can be found at:
http://icon.shef.ac.uk/Moby/

I have noticed that searching for medium length to very long words is
very fast, but when I start searching for short words like "hi" and
"bye", the search time skyrockets. I think the longest search time was
about 1500 milliseconds (the average is somewhere around the 500 ms
mark). My table is set up as follows:

create virtual table if not exists dictionary using
spellfix1(edit_cost_table=editcosts);

When searching, I specify top=5 to get these timings.

Is there anything I can tweak to speed up the search for short words, or
is there anything that can be done by the developers to optimize this
further?

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] [Spellfix] Searching for short words is very slow

2014-07-23 Thread Philip Bennefall

Hi all,

I have been running some tests with spellfix using a table containing 
about 30 words, extracted from the Moby project's single word list 
as well as names and places. Moby can be found at:

http://icon.shef.ac.uk/Moby/

I have noticed that searching for medium length to very long words is 
very fast, but when I start searching for short words like "hi" and 
"bye", the search time skyrockets. I think the longest search time was 
about 1500 milliseconds (the average is somewhere around the 500 ms 
mark). My table is set up as follows:


create virtual table if not exists dictionary using 
spellfix1(edit_cost_table=editcosts);


When searching, I specify top=5 to get these timings.

Is there anything I can tweak to speed up the search for short words, or 
is there anything that can be done by the developers to optimize this 
further?


Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] [Spellfix] Avoiding exact duplicates

2014-07-23 Thread Philip Bennefall

Hi all,

Is it possible to tell the spellfix extension not to accept duplicate 
words if the rank and langid are exactly the same? I am collecting words 
from many different sources and after inserting them I very often get 
the same word back several times.


Thanks in advance for any tips.

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Questions from a novice - basic browsing of records in a listview.

2014-07-09 Thread Philip Bennefall

How about using prepared statements in conjunction with bind?

http://www.sqlite.org/c3ref/bind_blob.html

Kind regards,

Philip Bennefall
On 2014-07-09 15:03, - wrote:

Hello all,

I'm quite new at SQLite3, and have a bit of a problem with grasping the
handling of a  database.

After having used the OFFSET and LIMIT 1 method (in conjuction with a
userdata listview) and finding a past post into this forum describing it as
a rookie mistake I'm now trying to implement the "scrolling cursor" method
in that same post.  It leads to a few questions though.

For the above method to work for any database it means I need, for
each-and-every next/previous page request, to send *all* the bottom/top
records data back to the SQLite engine so it knows where to continue.  Even
when assuming the default maximum of columns the accumulated column names
and related data for the "WHERE" clause could get quite big.  Add to that a
possible the "SORT BY" clause and I'm looking at quite a large query, which
has to be created and transferred for every "scroll".  Which is something I
do not really like ...

1) Is it possible to refer to the columns in a kind of shorthand (index
perhaps) ?

2) Is it possible to have the SQLite engine initialize and remember certain
WHERE and ORDER clauses (without creating another database please :-) ), so
they can be used again-and-again (for the duration of a connection).

3) Is it possible, for the above 'scrolling cursor' method, to refer to a
starting record other than by sending the exact data of such a record back
to the SQLite engine ?

Ofcourse, feel (very) free to include other things that I've not thought
about and could be usefull. :-)

Regards,
Rudy Wieser



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Hints for the query planner

2013-09-10 Thread Philip Bennefall

Hi Richard,

What about "probability" or "likelyhood"? This works in both the case where 
the likelyhood is great as well as when it is low. From the list you 
provided, I would pick "unlikely".


Kind regards,

Philip Bennefall
- Original Message - 
From: "Richard Hipp" <d...@sqlite.org>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Tuesday, September 10, 2013 9:26 PM
Subject: [sqlite] Hints for the query planner


There is a survey question at the bottom of this message.  But first some
context...

Over on the sqlite-dev mailing list, a debate has been going on about the
best way to provide some useful hints to the query planner.  The query
under discussion looks like this:

SELECT DISTINCT aname
 FROM album, composer, track
WHERE cname LIKE '%bach%'
  AND composer.cid=track.cid
  AND album.aid=track.aid;

Assuming that the schema has appropriate indices and ANALYZE has been run,
SQLite does a good job of selecting an efficient query plan for the above.
But the query planner lacks a key piece of information that could help it
to do a better job.  In particular, the query planner does not know how
often the subexpression "cname LIKE '%bach%'" will be true.  But, it turns
out, the best query plan depends critically on this one fact.

By default, the query planner (in SQLite 3.8.0) assumes that a
subexpression that cannot use an index will always be true.  Probably this
will be tweaked in 3.8.1 so that such subexpressions will be assumed to
usually, but not always, be true.  Either way, it would be useful to be
able to convey to the query planner the other extreme - that a
subexpression is usually not true.

(Pedantic detail:  "not true" is not the same as "false" in SQL because
NULL is neither true nor false.)

There is currently code in a branch that provides a hinting mechanism using
a magic "unlikely()" function.  Subexpressions contained within
"unlikely()" are assumed to usually not be true.  Other than this hint to
the query planner, the unlikely() function is a complete no-op and
optimized out of the VDBE code so that it does not consume any CPU cycles.
The only purpose of the unlikely() function is to let the query planner
know that the subexpression contained in its argument is not commonly
true.  So, if an application developer knows that the string "bach" seldom
occurs in composer names, then she might rewrite the query like this:

SELECT DISTINCT aname
 FROM album, composer, track
WHERE unlikely(cname LIKE '%bach%')
  AND composer.cid=track.cid
  AND album.aid=track.aid;

The query planner might use this "likelihood" hint to choose a different
query plan that works better when the subexpression is commonly false.  Or
it might decide that the original query plan was good enough and ignore the
hint.  The query planner gets to make that decision.  The application
developer is not telling the query planner what to do. The application
developer has merely provided a small amount of meta-information about the
likelihood of the subexpression being true, meta-information which the
query planner may or may not use.

Note that the subexpression does not have to be a LIKE operator.
PostgreSQL, to name one example, estimates how often a LIKE operator will
be true based on the pattern on its right-hand side, and adjust query plans
accordingly, and some have argued for this sort of thing in SQLite.  But I
want a more general solution.  Suppose the subexpression involves one or
more calls to application-defined functions about which the query planner
cannot possible know anything.  A general mechanism for letting the query
planner know that subexpressions are commonly not true is what is desired -
not a technique for making LIKE operators more efficient.

SURVEY QUESTION:

The question for today is what to call this magic hint function:

(1)  unlikely(EXPR)
(2)  selective(EXPR)
(3)  seldom(EXPR)
(4)  seldom_true(EXPR)
(5)  usually_not_true(EXPR)

Please feel free to suggest other names if you think of any.

ADDITIONAL INFORMATION:

The current implementation allows a second argument which must be a
floating point constant between 0.0 and 1.0, inclusive. The second argument
is an estimate of the probability that the expression in the first argument
will be true.  The default is 0.05.  Names like "unlikely" or "seldom" work
well when this probability is small, but if the second argument is close to
1.0, then those names seem backwards.  I don't know if this matters.  The
optional second argument is not guaranteed to make it into an actually
release.
--
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] DLL Size differences and other info requested

2013-07-04 Thread Philip Bennefall

Hi Dan,

Thanks for that info. Do you have any views on compiling SqLite optimized 
for speed rather than size? Is the difference in performance generally small 
enough to be ignored? I am using Vc++ 2010 express and have been optimizing 
for speed up until now.


Kind regards,

Philip Bennefall
- Original Message - 
From: "Dan Kennedy" <danielk1...@gmail.com>

To: <sqlite-users@sqlite.org>
Sent: Thursday, July 04, 2013 5:12 PM
Subject: Re: [sqlite] DLL Size differences and other info requested


On 07/04/2013 05:49 AM, Philip Bennefall wrote:

Hi Stephen,

I don't know what compiler is used to build the official SqLite dll,
but provided it is some version of Vc++ my experience is that MinGw
often produces larger and sometimes significantly slower binaries on
Windows than VC++ does. In an unrelated project of mine, the binary
size dropped by about 500 KB when I switched from Dev-C++/MinGw to
Code::Blocks/MSVC++. This is mere speculation on my part in this case,
however, as I don't actually know what compiler that is being used to
build the official dll.


Those on the website are built with mingw 4.5.2 using:

  -Os
  -DSQLITE_ENABLE_FTS3
  -DSQLITE_ENABLE_RTREE
  -DSQLITE_ENABLE_COLUMN_METADATA

In case anyone was wondering.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] DLL Size differences and other info requested

2013-07-03 Thread Philip Bennefall

Hi Stephen,

I don't know what compiler is used to build the official SqLite dll, but 
provided it is some version of Vc++ my experience is that MinGw often 
produces larger and sometimes significantly slower binaries on Windows than 
VC++ does. In an unrelated project of mine, the binary size dropped by about 
500 KB when I switched from Dev-C++/MinGw to Code::Blocks/MSVC++. This is 
mere speculation on my part in this case, however, as I don't actually know 
what compiler that is being used to build the official dll.


Kind regards,

Philip Bennefall
- Original Message - 
From: "Stephen Chrzanowski" <pontia...@gmail.com>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Thursday, July 04, 2013 12:37 AM
Subject: [sqlite] DLL Size differences and other info requested


Just for kicks because I wanted to learn "how it was done", I decided to
download the amalgamation code and compile to a Win32 DLL.  The secondary
purpose of my doing so was that I also wanted to write a 3rd party app
that'd export all headers of the amalgamation to a different format that I
can then use to build into my own units in Delphi.  Basically convert the C
headers into Delphi headers so I can link to the DLL of the same version.
You know, just for "fun".  Whether or not my rearranging code works or
not.. well.. Its all on my dime, a learning experience, and probably to the
professionals in the crowd it'd seem like a waste of time.  To them I say
"meh".  ;)

The one thing I DID notice though is that my compiled version of the DLL in
size is much different than what the compiled version is on the
sqlite.orgsite.  Is there a particular reason for this?  What compiler
options do I
set to bring my the size of the DLL down or is it just the nature of the
amalgamation?  Just for reference, my DLL, which I baked in the oven about
30 minutes ago now, weighs in at about 850k while the pre-compiled Win32
download version looks to be about 620k.  The wife says I need to cut back
on calories, so I thought I might start with my hard drives and see how
that goes. ;)  I'm compiling under Bloodshed Dev-C++ v4.9.9.2.

The other thing I wanted to know is whether or not it'd be possible to get
source code snapshots of the amalgamation code starting from v3.0 (EDIT:
Seemingly 3.2?). or whatever this process started at starting at the major
build version 3, and get the incremental "published builds".  I've looked
at the timeline and at http://www.sqlite.org/cgi/src/brlist and I see that
the individual files have been modified and checked in, etc, but I don't
seem to be able to find sqlite3.c and sqlite.h, or even how I can download
a snapshot of the package of that revision even if it is just the
individual C files.

Is everything in the downloaded archive basically a merge (Think "copy
file1+file2+file3 file_final") of all of the required C files when a full
build to make the different pre-built packages are done?  I'm also a
complete newb when it comes to the version control being used for SQLite.
I know of version control software, and I use a Win32 only VCS, so have
absolutely zero clue on how to obtain a particular version of the code
through the means graciously provided.  If someone has the time, I'd
greatly appreciate a walk through on what to do, and what software has to
be put onto my dev machines.

I'm a pack rat when it comes to any and all software I come across, so what
I'm doing I think flies in the face of what my wife has asked me to do in
regards to calories. {smirk}
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] regexp.c

2013-06-07 Thread Philip Bennefall

  - Original Message - 
  From: Richard Hipp 
  To: phi...@blastbay.com ; General Discussion of SQLite Database 
  Sent: Friday, June 07, 2013 1:14 PM
  Subject: Re: [sqlite] regexp.c





  On Fri, Jun 7, 2013 at 5:30 AM, Philip Bennefall <phi...@blastbay.com> wrote:

Hello all,

I was having a quick look at the extensions provided in ext/misc in the 
source tree, and I am pleasantly surprised at the number of useful things that 
I found in there that I didn't know about. I was wondering about the regular 
expression parser in particular. Does it support capture groups? The source 
seems to indicate no, but I just wanted to make sure.


  Capture groups are not supported by the regexp.c implementation in ext/misc.  
On the other hand, that means that run-time is guaranteed linear in the size of 
the input.

  Thanks, Richard. That does make sense. On another note, generally how well 
tested are these extensions? Should I assume that since they're not part of the 
SqLite amalgamation, they aren't as extensively tested and maintained as the 
rest of SqLite's codebase?

  Kind regards,

  Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] regexp.c

2013-06-07 Thread Philip Bennefall

Hello all,

I was having a quick look at the extensions provided in ext/misc in the 
source tree, and I am pleasantly surprised at the number of useful things 
that I found in there that I didn't know about. I was wondering about the 
regular expression parser in particular. Does it support capture groups? The 
source seems to indicate no, but I just wanted to make sure.


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-06 Thread Philip Bennefall

  - Original Message - 
  From: Donald Griggs 
  To: phi...@blastbay.com ; General Discussion of SQLite Database 
  Sent: Thursday, June 06, 2013 3:13 PM
  Subject: Re: [sqlite] Serialize an in-memory database



  Hi Philip,


  Maybe neither of these two thoughts are helpful, but fyi:


  1. Licensing for existing memory vfs.
  Regarding this memory vfs implementation referenced earlier:
 http://article.gmane.org/gmane.comp.db.sqlite.general/46450
 http://spserver.googlecode.com/files/spmemvfs-0.1.src.tar.gz
  Would it not be worth an email to the author ( gmail user  stephen.nil  ) to 
see if he might quickly release his code (already open source)  to public 
domain or another acceptable license?  




  2.  Performance of existing solutions.


  Regarding, "I would like to avoid saving the data out to a temporary disk 
file... That seems wasteful to me"
  I can certainly understand why you'd write that, but it's different from 
saying, "I've tested that solution (or put numerical bounds on its maximum 
likely performance) and found its performance to be unacceptable for my 
intended use."  


  Of course, using sqlite at all for your purpose (mainly to avoid writing 
custom sorts, as I understand) is wasteful in some sense of the word -- but I 
suspect its nevertheless an entirely appropriate application.


  One of Donald Knuth's famous quotes was, "Premature optimization is the root 
of all evil (or at least most of it) in programming."


  If there's a chance that's applicable here (maybe its not), then since the 
code to backup to a temp file is already present, would it be worth a try?


  Respectfully,
  Donald G.  (definitely NOT Donald K!)

  Hi Donald,

  You have several good points. Let's see if I can respond to them properly:

  1. I can definitely drop a line to the author and ask about the licensing. 
But one appealing part of the vfs idea is actually sitting down and learning 
enough to implement it myself. If I write the code, I will also be able to 
debug it much more easily and will learn something new to boot. I have no 
urgent need for a solution to this problem, so while I certainly would use an 
existing memory vfs if it was available in SqLite itself I would also enjoy the 
challenge of writing my own.

  2. Let me change the word wasteful to unnecessary. I certainly have no doubt 
that writing the temporary file and reading it back in would still be 
acceptable performance wise in my scenario, and I have no figures to prove 
otherwise. But it seems unnecessary to do so if an alternative method exists. 
Copying it from one memory location to another seems a lot more elegant if 
nothing else.

  Kind regards,

  Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-06 Thread Philip Bennefall


- Original Message - 
From: "Simon Slavin" <slav...@bigfraud.org>
To: <phi...@blastbay.com>; "General Discussion of SQLite Database" 
<sqlite-users@sqlite.org>

Sent: Thursday, June 06, 2013 1:45 PM
Subject: Re: [sqlite] Serialize an in-memory database



On 6 Jun 2013, at 10:45am, Philip Bennefall <phi...@blastbay.com> wrote:

I have a bunch of data structures in memory that I am looking to replace 
with an SqLite database, primarily for the purpose of avoiding reinventing 
the wheel with various sorts etc. I would then like to serialize the data 
into a memory buffer and do additional processing before finally rendering 
it to disk. The additional processing might include compression, 
encryption, and a few other things specific to my application.


Two problems:

Unlike the SQLite file format, the format SQLite uses when it keeps things 
in memory is not published, and changes from version to version.  Because 
the writers of SQLite expect the in-memory format to be accessed only by 
things built into the SQLite API, you have to read the source code to know 
what's going on.  So any routines you come up will have to just deal with 
whatever they find rather than trying to understand its structure.  Also 
your data will be able to restored only back to versions of SQLite where the 
internal data format hasn't changed.


SQLite does not, by its nature, keep everything in one long block of memory. 
It allocates and frees smaller blocks of memory as data is stored or 
deleted, and also as it needs to create temporary structures such as indexes 
needed to speed up a specific command.  So turning a stored database into 
one stream of octets takes more than just reading a section of memory.


Rather than try to mess with the internals of SQLite I suspect you would be 
better served by doing the following:


1) Using SQLite's existing in-memory databases to keep your data in memory 
while your app executes.


2) Writing your own routine in your preferred programming language to dump 
your data into text or octets in memory or disk in whatever format you want. 
One standard way to do this is to generate the SQL commands needed to 
reproduce your database.  Since these are very repetitive standard ASCII 
commands they compress down extremely well and you can do encryption at the 
same time using any of a number of standard libraries.  Data in this format 
has the added advantages that it is human-readable (after decompression) and 
can be passed straight to sqlite3_exec() to rebuild the database.  However, 
you might prefer to invent your own format, perhaps more like CSV, that 
makes implicit use of your data structures.


Simon.=

Hi Simon,

Oh I never intended to attempt to rip the data right out of an SqLite memory 
database. I realize that it is not at all the same as the disk file that I 
could create with, say, the backup API. I am considering two options:


1. Writing a memory vfs that I use when I want to save my data, backing up 
the existing in-memory database to a new database that uses this memory vfs 
and then taking the data from the resulting block where SqLite writes what 
it thinks is the database file.


2. Doing something like .dump in the shell, but writing the output to memory 
and then processing that. This seems to be the simplest approach, but would 
waste a lot of space and import/export would be slower as far as I can 
judge. This would primarily be the case if I export as SQL, as I would then 
not be able to reuse prepared statements with parameters but would have to 
use sqlite3_exec.


The memory vfs seems like the most appealing choice in the longterm, but the 
second approach is much more straightforward.


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-06 Thread Philip Bennefall


- Original Message - 
From: "Simon Slavin" <slav...@bigfraud.org>
To: <phi...@blastbay.com>; "General Discussion of SQLite Database" 
<sqlite-users@sqlite.org>

Sent: Thursday, June 06, 2013 10:51 AM
Subject: Re: [sqlite] Serialize an in-memory database



On 6 Jun 2013, at 9:01am, Philip Bennefall <phi...@blastbay.com> wrote:

Since I don't believe that Windows for example has tmpfs (seems to be a 
Unix thing), would the idea of constructing a vfs that just reads and 
writes a huge memory block be doable?


Doable ?  Yes.  Use the code from one of the file-based VFSes and replace 
all the 'read/write to file offset N" with 'read/write to memory offset N', 
then sort out locking and a few other problems.  Tedious and annoying to do 
but doable if you write C.


Of advantage to many users ?  I don't know.  I don't see what the advantage 
of doing this is over SQLite's standard ways of storing data in memory or in 
a file.  Who would use this ?


Also, I have a question.  How big do you expect to make that block of memory 
you grab when someone creates a new database ?  One of the advantages of SQL 
databases is that they grow as you get more data.  You can't do this if 
you're going to pre-grab a continuous block of memory.  Do you expect to use 
the C function realloc() a lot ?


Simon.

Hi Simon,

For my own part, I would usually have a database that is no more than a few 
megabyte in size. A generic solution would be a lot harder than the one I am 
considering for my own project, where I can cut corners due to the fact that 
I know the size of my data at least roughly. What I want to achieve is to 
serialize the data in such a way so that I can do other processing on it 
before I render it to disk, such as custom compression and/or other things. 
I am aware that there is an SqLite add-on to do this, but aside from the 
fact that I cannot afford it I don't need to do this processing on the fly 
either. I just want to take an in-memory database and put it in a compressed 
and possibly encrypted file on disk in the end, without having to use a 
temporary file as an intermediary.


I write C, so would have no trouble modifying one of the existing vfs 
example implementations. Correct me if I am wrong, but do I really need to 
do any kind of locking if I am not working with disk files? I am not working 
with shared cache, either. I would have one database connection that would 
only be accessed from one thread.


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-06 Thread Philip Bennefall


- Original Message - 
From: "Paolo Bolzoni" <paolo.bolzoni.br...@gmail.com>
To: <phi...@blastbay.com>; "General Discussion of SQLite Database" 
<sqlite-users@sqlite.org>

Sent: Thursday, June 06, 2013 10:33 AM
Subject: Re: [sqlite] Serialize an in-memory database



What is you use case? Why do you need this?
I am asking because maybe it helps thinking alternate solutions...



Hi Paolo,

I have a bunch of data structures in memory that I am looking to replace 
with an SqLite database, primarily for the purpose of avoiding reinventing 
the wheel with various sorts etc. I would then like to serialize the data 
into a memory buffer and do additional processing before finally rendering 
it to disk. The additional processing might include compression, encryption, 
and a few other things specific to my application. I would like to avoid 
saving the data out to a temporary disk file, reading it back in, doing my 
processing, writing it out into a new file and then finally deleting the 
temporary file. That seems wasteful to me, and so that's why I am looking 
into solutions that avoid the temporary files. So far, the memory vfs seems 
like the best approach.


Kind regards,

Philip Bennefall
P.S. I have looked at the encryption and compression add-ons for SqLite, but 
I don't need encryption/compression on the fly (just on the entire database 
in one go), and I don't have the money to purchase the code in the first 
place. 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-06 Thread Philip Bennefall


- Original Message - 
From: "Paolo Bolzoni" <paolo.bolzoni.br...@gmail.com>
To: <phi...@blastbay.com>; "General Discussion of SQLite Database" 
<sqlite-users@sqlite.org>

Sent: Thursday, June 06, 2013 10:02 AM
Subject: Re: [sqlite] Serialize an in-memory database



Sorry I am missing a bit,
What is the problem of using sqlite3_backup again?



Hi Paolo,

I would like to avoid using a temporary file, but rather just save and load 
the database as a memory block. Serialize to and from memory, in other 
words.


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-06 Thread Philip Bennefall


- Original Message - 
From: "Simon Slavin" <slav...@bigfraud.org>
To: <phi...@blastbay.com>; "General Discussion of SQLite Database" 
<sqlite-users@sqlite.org>

Sent: Thursday, June 06, 2013 12:15 AM
Subject: Re: [sqlite] Serialize an in-memory database



On 5 Jun 2013, at 8:38pm, Philip Bennefall <phi...@blastbay.com> wrote:


On 5 Jun 2013, at 8:32pm, Petite Abeille <petite.abei...@gmail.com> wrote:

write to tmpfs… read the file into byte[]… do what you meant to do… to 
reload…  write byte[] do tmpfs… open db… and be merry… or something along 
these lines...


I don't want it in a file, however. I want it in a memory block.


That's why you read from tmpfs (or any other file stored in any other file 
system) into byte[].  Once your data is in byte[] you will have entire 
SQLite database in one run of memory.


You can't usefully store a memory database of SQLite because SQLite's data 
in memory isn't all in one big run of memory.  The data is stored in various 
little chunks, some here, some there. If you tried to read the data directly 
out of those chunks you would have to read lots of little chunks, not one 
big run of continuous memory.  And you'd be storing lots of pointers to 
various locations in memory.  When you 'restore' the data back into memory 
you're not going to be allocated the same locations in memory so those 
pointers won't mean anything any more.


A database stored in a file, however, has pointers to locations in that file 
instead of pointers to locations in memory.  If you save and restore the 
whole file in one big run, those pointers will become valid again: the same 
bits of data will be at the same offsets of the file.


Doesn't have to be tmpfs.  You can use any file system that SQLite thinks is 
file storage rather than memory storage.


Simon.

Hi Simon,

Since I don't believe that Windows for example has tmpfs (seems to be a Unix 
thing), would the idea of constructing a vfs that just reads and writes a 
huge memory block be doable? If so, how difficult of a task do you estimate 
that this might be? I want to reuse as much of the existing vfs code as 
possible (e.g. I don't want to reimplement randomness, date etc). Could you 
possibly give me some pointers? I read the chapter about the virtual file 
systems, but it seems incomplete.


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-05 Thread Philip Bennefall


- Original Message - 
From: "Petite Abeille" <petite.abei...@gmail.com>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Wednesday, June 05, 2013 10:04 PM
Subject: Re: [sqlite] Serialize an in-memory database



On Jun 5, 2013, at 10:02 PM, Philip Bennefall <phi...@blastbay.com> wrote:

That is exactly the sort of thing I am looking for. If anything, I think 
it'd be great to have such a vfs in SqLite if only for 
completeness/customizability, seeing as how there are so many different 
allocators for example. It'd be great to have the same level of freedom 
with the vfs backends. Of course, I want to use the existing methods in 
the native vfs for randomization, time etc - just override the file I/O 
methods to operate on a large byte array/blob.


memvfs?

http://article.gmane.org/gmane.comp.db.sqlite.general/46450

I looked into that one, but the license is unclear. There is a copyright 
notice in the source but no explicit license. I need something public 
domain, just like SqLite is. Otherwise I will attempt to roll my own.


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-05 Thread Philip Bennefall


- Original Message - 
From: "Jay A. Kreibich" <j...@kreibi.ch>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Wednesday, June 05, 2013 9:55 PM
Subject: Re: [sqlite] Serialize an in-memory database


On Wed, Jun 05, 2013 at 09:15:21PM +0200, Petite Abeille scratched on the 
wall:


On Jun 5, 2013, at 9:10 PM, Philip Bennefall <phi...@blastbay.com> wrote:

> Yes, I have seen the backup API. But I would like to avoid the disk
> file entirely and just serialize to and from memory.

Lateral thinking? write your db to tmpfs?


 A few months back, wasn't there talk about a VFS that used a giant
 byte array, rather than a file?  If someone actually wrote one of
 those, you could use the Backup API to blast the DB into a big chunk
 of memory.

 If such a VFS does not actually exist, it shouldn't be all that hard
 to write, and might come in useful for this and other reasons.

  -j

Hi there,

That is exactly the sort of thing I am looking for. If anything, I think 
it'd be great to have such a vfs in SqLite if only for 
completeness/customizability, seeing as how there are so many different 
allocators for example. It'd be great to have the same level of freedom with 
the vfs backends. Of course, I want to use the existing methods in the 
native vfs for randomization, time etc - just override the file I/O methods 
to operate on a large byte array/blob.


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-05 Thread Philip Bennefall


- Original Message - 
From: "Petite Abeille" <petite.abei...@gmail.com>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Wednesday, June 05, 2013 9:54 PM
Subject: Re: [sqlite] Serialize an in-memory database



On Jun 5, 2013, at 9:44 PM, Philip Bennefall <phi...@blastbay.com> wrote:


I use Windows. This looks like it is purely for Unix variants?


I suspect one call these 'RAM disk/drive' as well...

http://en.wikipedia.org/wiki/List_of_RAM_drive_software

I need something that operates wherever SqLite does, so can't be system 
dependent.


SQLite runs on pretty much anything. Are you targeting, say, watches?

No. All I am saying is that I don't want to limit myself to platforms that 
implement this particular feature. I would prefer a generic solution that 
works everywhere, such as a regular malloc:ed chunk that works like a vfs in 
SqLite. 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-05 Thread Philip Bennefall


- Original Message - 
From: "Petite Abeille" <petite.abei...@gmail.com>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Wednesday, June 05, 2013 9:39 PM
Subject: Re: [sqlite] Serialize an in-memory database



On Jun 5, 2013, at 9:38 PM, Philip Bennefall <phi...@blastbay.com> wrote:

I don't want it in a file, however. I want it in a memory block. So tmpfs 
wouldn't do the trick from what I gather.


… tmpfs *is* memory… just looks like a file system…

http://en.wikipedia.org/wiki/Tmpfs

I use Windows. This looks like it is purely for Unix variants? I need 
something that operates wherever SqLite does, so can't be system dependent. 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-05 Thread Philip Bennefall


- Original Message - 
From: "Petite Abeille" <petite.abei...@gmail.com>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Wednesday, June 05, 2013 9:32 PM
Subject: Re: [sqlite] Serialize an in-memory database



On Jun 5, 2013, at 9:25 PM, Philip Bennefall <phi...@blastbay.com> wrote:

Doesn't that still create a file, just a temporary one? I need the 
serialized content in a char* or similar so I can memcpy it etc, and then 
feed it back to SqLite at a later time. I guess I could make a toy vfs 
that uses a malloc:ed chunk that pretends to be the disk drive, and backup 
to/from that to a regular in-memory database. Thoughts?


Yes… write to tmpfs… read the file into byte[]… do what you meant to do… to 
reload…  write byte[] do tmpfs… open db… and be merry… or something along 
these lines...


I don't want it in a file, however. I want it in a memory block. So tmpfs 
wouldn't do the trick from what I gather.


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-05 Thread Philip Bennefall


- Original Message - 
From: "Roman Fleysher" <roman.fleys...@einstein.yu.edu>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Wednesday, June 05, 2013 9:26 PM
Subject: Re: [sqlite] Serialize an in-memory database



Read section on URI Filemanes, particularly mode for memory databases:

http://www.sqlite.org/c3ref/open.html

DB Connection in backup API does not have to point to a file, it can point 
to in-memory database if URIs are enabled.


(I learned it from someone else on the list, i use SQLite for less than a 
month.)


Roman


From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Petite Abeille [petite.abei...@gmail.com]

Sent: Wednesday, June 05, 2013 3:15 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Serialize an in-memory database

On Jun 5, 2013, at 9:10 PM, Philip Bennefall <phi...@blastbay.com> wrote:

Yes, I have seen the backup API. But I would like to avoid the disk file 
entirely and just serialize to and from memory.


Lateral thinking… write your db to tmpfs…

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Hi Roman,

Oh I know that I can copy content from one in-memory database to another. 
That's trivial with the backup API. My concern is more that I want to copy 
it to a blob. In short, the contents that would have been fed to the file - 
I want that in a memory buffer. I also want to be able to feed that back 
into SqLite at a later time, probably backing it up to a regular in-memory 
database again. The malloc vfs is the only solution I can come up with, but 
it seems overkill. I am hoping there is a cleaner way.


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-05 Thread Philip Bennefall


- Original Message - 
From: "Petite Abeille" <petite.abei...@gmail.com>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Wednesday, June 05, 2013 9:15 PM
Subject: Re: [sqlite] Serialize an in-memory database



On Jun 5, 2013, at 9:10 PM, Philip Bennefall <phi...@blastbay.com> wrote:

Yes, I have seen the backup API. But I would like to avoid the disk file 
entirely and just serialize to and from memory.


Lateral thinking… write your db to tmpfs…

Doesn't that still create a file, just a temporary one? I need the 
serialized content in a char* or similar so I can memcpy it etc, and then 
feed it back to SqLite at a later time. I guess I could make a toy vfs that 
uses a malloc:ed chunk that pretends to be the disk drive, and backup 
to/from that to a regular in-memory database. Thoughts?


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Serialize an in-memory database

2013-06-05 Thread Philip Bennefall
- Original Message - 
From: "Andreas Kupries" <andre...@activestate.com>
To: <phi...@blastbay.com>; "General Discussion of SQLite Database" 
<sqlite-users@sqlite.org>

Sent: Wednesday, June 05, 2013 9:07 PM
Subject: Re: [sqlite] Serialize an in-memory database


On Wed, Jun 5, 2013 at 11:55 AM, Philip Bennefall <phi...@blastbay.com> 
wrote:

Hello all,

This may be a somewhat strange question, but I can't find an answer to it 
on

the website so I figured I would give it a shot.

Is it possible to put the full contents of an SqLite in-memory database 
into

a string/blob in memory?


If you can work with a file instead of string the backup/restore APIs
should be able to do what you want, i.e. backup of a live database to
a file, and back, and the live database can be in-memory, of course.

http://www.sqlite.org/backup.html

   See example 1.

http://www.sqlite.org/c3ref/backup_finish.html#sqlite3backupinit

Hi Andreas,

Yes, I have seen the backup API. But I would like to avoid the disk file 
entirely and just serialize to and from memory.


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Serialize an in-memory database

2013-06-05 Thread Philip Bennefall

Hello all,

This may be a somewhat strange question, but I can't find an answer to it on 
the website so I figured I would give it a shot.


Is it possible to put the full contents of an SqLite in-memory database into 
a string/blob in memory? I would then like to take the same string and 
convert that back into an in-memory database later, if that makes any sense. 
Serialization of the database, basically. Is there any reasonable, non 
hack-ish way of doing this?


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Wiki out of date?

2013-06-04 Thread Philip Bennefall

Hello,

I was looking at the following page on the wiki:

http://www.sqlite.org/cvstrac/wiki?p=LibraryRoutineCalledOutOfSequence


And it states that one of the causes for SQLITE_MISUSE being returned is:

"Trying to use the same database connection at the same instant in time from 
two or more threads."


Isn't this outdated? Doesn't this depend on the SqLite threading mode 
nowadays?


Kind regards,

Philip Bennefall 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] vfs?

2012-06-17 Thread Philip Bennefall
In that case you want to specify a NULL pointer for that argument. Then 
SqLite will pick an appropriate vfs to use for whatever platform it's being 
run on.


Kind regards,

Philip Bennefall
- Original Message - 
From: "Arbol One" <arbol...@programmer.net>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Sunday, June 17, 2012 8:46 PM
Subject: Re: [sqlite] vfs?


Thanks for your prompt respnse.
Well, no, really. I don't know where my application will be used. Maybe 
someone will use it under linux or OS2, all I know is that now I am coding 
in Win32.


- Original Message -
From: Igor Tandetnik
Sent: 06/17/12 02:02 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] vfs?

Arbol One <arbol...@programmer.net> wrote: > Using C/C++, is there a way to 
extract which file system is SQLite using? You already know that - it's the 
one you've specified in a call to sqlite3_open_v2. What problem are you 
trying to solve? -- Igor Tandetnik 
___ sqlite-users mailing list 
sqlite-users@sqlite.org 
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Strange behavior with fts4

2012-06-16 Thread Philip Bennefall
I hate to be cluttering up the list in this fashion, but I have come across an 
issue that I cannot seem to find a solution for.

I am using two fts tables, one that uses the normal tokenizer and another that 
uses the porter stemmer, so that I can search the same dataset with and without 
porter. For the porter stemmer table, I have set the content option to point to 
the other fts table. Like this:

CREATE VIRTUAL TABLE main_brain USING fts4(phrase, response, matchinfo=fts3, 
tokenize=simple, order=desc);

CREATE VIRTUAL TABLE porter_brain USING fts4(content="main_brain", phrase, 
response, matchinfo=fts3, tokenize=porter, order=desc);

 Then I do the following:

INSERT INTO main_brain(phrase, response) VALUES('hello', 'Hi there!');

So you'd now think that main_brain should have this content in it, but the 
porter_brain table should be empty. At least, that is what the documentation on 
fts4 seems to indicate as it points out that it is my own responsibility to 
make sure the tables are in sync. But:

SELECT * FROM porter_brain;

Produces:

hello|Hi there!

And:

SELECT * FROM main_brain;

Gives the same result. So it seems as though some internal synchronization 
between these tables is going on. How should I handle this? The documentation 
suggests a statement like the following in one of its examples:

INSERT INTO t3(docid, b, c) SELECT id, b, c FROM t2;

Where I guess t3 would correspond to porter_brain and t2 would be main_brain in 
my case. But I don't seem to need to do this at all.

Can anyone throw some light on this?

Thanks in advance.

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Philip Bennefall
Understood. Thank you very much for your quick help. Now I have all the 
information I need to get coding. And thanks once again for a great library!

Kind regards,

Philip Bennefall
  - Original Message - 
  From: Richard Hipp 
  To: phi...@blastbay.com ; General Discussion of SQLite Database 
  Sent: Friday, June 15, 2012 3:39 PM
  Subject: Re: [sqlite] Porter Stemmer





  On Fri, Jun 15, 2012 at 9:26 AM, Philip Bennefall <phi...@blastbay.com> wrote:

I understand that, but let's say that I already have a virtual fts table 
created that I set to use the Porter tokenizer, how then would I go about 
rebuilding and retokenizing this table with the simple tokenizer at a later 
time? Would I need to create an entirely new table? What I'm wondering is 
basically how I might take an existing fts virtual table, change its tokenizer 
and then rebuild the index?


  Yes.  You'll need to DROP or RENAME the original table, then CREATE the new 
one.
   

Kind regards,

Philip Bennefall
 - Original Message -
 From: Richard Hipp
 To: phi...@blastbay.com ; General Discussion of SQLite Database

 Sent: Friday, June 15, 2012 3:14 PM
 Subject: Re: [sqlite] Porter Stemmer





 On Fri, Jun 15, 2012 at 9:00 AM, Philip Bennefall <phi...@blastbay.com> 
wrote:

   I had another quick question. If I have built an fts table using the 
stemmer tokenizer, and then I later decide that I want to change to the simple 
one, is there an easy way to do this? I see the "rebuild" command, can I 
somehow tell that to change the tokenizer as well? I see the reference to 
custom ones, but what about the internal implementations?


 If you change your tokenizer, you need to retokenize all of the source 
text.



   Kind regards,

   Philip Bennefall
- Original Message -
From: Richard Hipp
To: phi...@blastbay.com ; General Discussion of SQLite Database
Sent: Friday, June 15, 2012 1:03 PM
Subject: Re: [sqlite] Porter Stemmer






On Fri, Jun 15, 2012 at 5:51 AM, Philip Bennefall <phi...@blastbay.com> 
wrote:

  Hi all,

  Is the algorithm used in the stemming tokenizer in SqLite's fts 
extension equivalent to the C implementation found at 
http://tartarus.org/~martin/PorterStemmer/


The built-in Porter stemmer is a copy/paste from the above link.



  ?

  I am asking this because some sources say that there are improved 
versions of this algorithm released much later than 2000/2001. Does SqLite's 
implementation differ in any significant ways from the C implementation found 
at the above URL?

      Kind regards,

  Philip Bennefall
  ___
  sqlite-users mailing list
  sqlite-users@sqlite.org
  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




--
D. Richard Hipp
d...@sqlite.org
   ___
   sqlite-users mailing list
   sqlite-users@sqlite.org
   http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




 --
 D. Richard Hipp
 d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




  -- 
  D. Richard Hipp
  d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Philip Bennefall
I understand that, but let's say that I already have a virtual fts table 
created that I set to use the Porter tokenizer, how then would I go about 
rebuilding and retokenizing this table with the simple tokenizer at a later 
time? Would I need to create an entirely new table? What I'm wondering is 
basically how I might take an existing fts virtual table, change its tokenizer 
and then rebuild the index?

Kind regards,

Philip Bennefall
  - Original Message - 
  From: Richard Hipp 
  To: phi...@blastbay.com ; General Discussion of SQLite Database 
  Sent: Friday, June 15, 2012 3:14 PM
  Subject: Re: [sqlite] Porter Stemmer





  On Fri, Jun 15, 2012 at 9:00 AM, Philip Bennefall <phi...@blastbay.com> wrote:

I had another quick question. If I have built an fts table using the 
stemmer tokenizer, and then I later decide that I want to change to the simple 
one, is there an easy way to do this? I see the "rebuild" command, can I 
somehow tell that to change the tokenizer as well? I see the reference to 
custom ones, but what about the internal implementations?


  If you change your tokenizer, you need to retokenize all of the source text.

   

Kind regards,

Philip Bennefall
 - Original Message -
 From: Richard Hipp
 To: phi...@blastbay.com ; General Discussion of SQLite Database
 Sent: Friday, June 15, 2012 1:03 PM
 Subject: Re: [sqlite] Porter Stemmer






 On Fri, Jun 15, 2012 at 5:51 AM, Philip Bennefall <phi...@blastbay.com> 
wrote:

   Hi all,

   Is the algorithm used in the stemming tokenizer in SqLite's fts 
extension equivalent to the C implementation found at 
http://tartarus.org/~martin/PorterStemmer/


 The built-in Porter stemmer is a copy/paste from the above link.



   ?

   I am asking this because some sources say that there are improved 
versions of this algorithm released much later than 2000/2001. Does SqLite's 
implementation differ in any significant ways from the C implementation found 
at the above URL?

   Kind regards,

   Philip Bennefall
   ___
   sqlite-users mailing list
   sqlite-users@sqlite.org
   http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




 --
 D. Richard Hipp
 d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




  -- 
  D. Richard Hipp
  d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Philip Bennefall
I had another quick question. If I have built an fts table using the stemmer 
tokenizer, and then I later decide that I want to change to the simple one, is 
there an easy way to do this? I see the "rebuild" command, can I somehow tell 
that to change the tokenizer as well? I see the reference to custom ones, but 
what about the internal implementations?

Kind regards,

Philip Bennefall
  - Original Message - 
  From: Richard Hipp 
  To: phi...@blastbay.com ; General Discussion of SQLite Database 
  Sent: Friday, June 15, 2012 1:03 PM
  Subject: Re: [sqlite] Porter Stemmer





  On Fri, Jun 15, 2012 at 5:51 AM, Philip Bennefall <phi...@blastbay.com> wrote:

Hi all,

Is the algorithm used in the stemming tokenizer in SqLite's fts extension 
equivalent to the C implementation found at 
http://tartarus.org/~martin/PorterStemmer/


  The built-in Porter stemmer is a copy/paste from the above link.

   

?

I am asking this because some sources say that there are improved versions 
of this algorithm released much later than 2000/2001. Does SqLite's 
implementation differ in any significant ways from the C implementation found 
at the above URL?

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




  -- 
  D. Richard Hipp
  d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Porter Stemmer

2012-06-15 Thread Philip Bennefall
Thanks, Richard. That's good to know because I am trying to decide whether to 
add a new tokenizer with some custom processing, as opposed to using the built 
in stemmer.

Kind regards,

Philip Bennefall
  - Original Message - 
  From: Richard Hipp 
  To: phi...@blastbay.com ; General Discussion of SQLite Database 
  Sent: Friday, June 15, 2012 1:03 PM
  Subject: Re: [sqlite] Porter Stemmer





  On Fri, Jun 15, 2012 at 5:51 AM, Philip Bennefall <phi...@blastbay.com> wrote:

Hi all,

Is the algorithm used in the stemming tokenizer in SqLite's fts extension 
equivalent to the C implementation found at 
http://tartarus.org/~martin/PorterStemmer/


  The built-in Porter stemmer is a copy/paste from the above link.

   

?

I am asking this because some sources say that there are improved versions 
of this algorithm released much later than 2000/2001. Does SqLite's 
implementation differ in any significant ways from the C implementation found 
at the above URL?

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users




  -- 
  D. Richard Hipp
  d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Porter Stemmer

2012-06-15 Thread Philip Bennefall
Hi all,

Is the algorithm used in the stemming tokenizer in SqLite's fts extension 
equivalent to the C implementation found at 
http://tartarus.org/~martin/PorterStemmer/

?

I am asking this because some sources say that there are improved versions of 
this algorithm released much later than 2000/2001. Does SqLite's implementation 
differ in any significant ways from the C implementation found at the above URL?

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full text search without full phrase matches

2012-06-14 Thread Philip Bennefall
The main issue I have is that the Apache license, as well as the gpl/lgpl 
licenses, force you to include the entire license with any redistribution of 
the software (even if it is compiled as part of a derivative work) if my 
understanding is correct. I am not too keen on doing that. I certainly don't 
mind giving credit where credit is due, but I don't want to include this huge 
blob of legal text. This is why I like public domain software so much, as well 
as software distributed under licenses such as the Boost Software license, the 
Zlib/Libpng license and one or two others that are similar.

Kind regards,

Philip Bennefall
  - Original Message - 
  From: Black, Michael (IS) 
  To: phi...@blastbay.com 
  Sent: Thursday, June 14, 2012 9:34 PM
  Subject: Re: [sqlite] Full text search without full phrase matches


  Apache license is about as liberal as you can get.  Damned near public domain 
(may as well be for all intents and purposes).



  Muy I ask what the problem is?  Perhaps something I should aware of?





  Michael D. Black

  Senior Scientist

  Advanced Analytics Directorate

  Advanced GEOINT Solutions Operating Unit

  Northrop Grumman Information Systems


--

  From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Philip Bennefall [phi...@blastbay.com]
  Sent: Thursday, June 14, 2012 2:09 PM
  To: General Discussion of SQLite Database
  Subject: EXT :Re: [sqlite] Full text search without full phrase matches


  Hi Michael,

  That seems to be under either gpl/lgpl/apache licenses, which I cannot use in 
my project for various reasons. The reason I am so interested in SqLite is 
because it's public domain. I appreciate the tip though.

  Kind regards,

  Philip Bennefall
- Original Message - 
From: Black, Michael (IS) 
To: phi...@blastbay.com ; General Discussion of SQLite Database 
Sent: Thursday, June 14, 2012 9:03 PM
Subject: Re: [sqlite] Full text search without full phrase matches


Sounds to me like you want Lucene instead of SQLite

http://lucene.apache.org/core/



Michael D. Black

Senior Scientist

Advanced Analytics Directorate

Advanced GEOINT Solutions Operating Unit

Northrop Grumman Information Systems


  --

From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Philip Bennefall [phi...@blastbay.com]
Sent: Thursday, June 14, 2012 1:32 PM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] Full text search without full phrase matches


Hi Simon,

The ordering is not really the issue I am having. That, I can do if I just 
get a result back that doesn't necessarily match all the keywords. In the 
query you showed as an example, all the keywords would still have to match 
in order for a row to be returned. The sorting is a separate problem that 
is 
not really that difficult once I get a smaller dataset. Then I can order it 
manually. The problem is that it only returns a match if every single word 
is present. I would like it to return matches if, say, mor than 2 or 3 of 
the specified keywords are found.

Kind regards,

Philip Bennefall
- Original Message - 
From: "Simon Slavin" <slav...@bigfraud.org>
To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Thursday, June 14, 2012 8:24 PM
Subject: Re: [sqlite] Full text search without full phrase matches



On 14 Jun 2012, at 7:13pm, Philip Bennefall <phi...@blastbay.com> wrote:

> That is unfortunate, if it is true that there's no way to accomplish this 
> with SqLite. To do just plain matching I can use an unordered hash map, 
so 
> I wouldn't need a database for that. The trouble with a string distance 
> function is that I can't really process the entire dataset with it. 
SqLite 
> technically has all the features I'm after, I just don't want it to 
> necessarily match all the words in a query. If I can get it to match all 
> as well as some, that would be enough. I could then do distancing on a 
> considerably smaller dataset which would be the result of the broader 
> SqLite search.
>
> So I guess my main question is, is there absolutely no way to match a 
> subset of the words in a query?

Well, you could write that string distance function and add it to your copy 
of SQLite as an external function.  Then you could do things like

SELECT string_distance(theText, 'this new piece of text'), theText FROM 
oldChats WHERE string_distance(theText, 'this new piece of text') < 10 
ORDER 
BY string_distance(theText, 'this new piece of text')

(I don't know whether SQLite will o

Re: [sqlite] Full text search without full phrase matches

2012-06-14 Thread Philip Bennefall
Hi Michael,

That seems to be under either gpl/lgpl/apache licenses, which I cannot use in 
my project for various reasons. The reason I am so interested in SqLite is 
because it's public domain. I appreciate the tip though.

Kind regards,

Philip Bennefall
  - Original Message - 
  From: Black, Michael (IS) 
  To: phi...@blastbay.com ; General Discussion of SQLite Database 
  Sent: Thursday, June 14, 2012 9:03 PM
  Subject: Re: [sqlite] Full text search without full phrase matches


  Sounds to me like you want Lucene instead of SQLite

  http://lucene.apache.org/core/



  Michael D. Black

  Senior Scientist

  Advanced Analytics Directorate

  Advanced GEOINT Solutions Operating Unit

  Northrop Grumman Information Systems


--

  From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Philip Bennefall [phi...@blastbay.com]
  Sent: Thursday, June 14, 2012 1:32 PM
  To: General Discussion of SQLite Database
  Subject: EXT :Re: [sqlite] Full text search without full phrase matches


  Hi Simon,

  The ordering is not really the issue I am having. That, I can do if I just 
  get a result back that doesn't necessarily match all the keywords. In the 
  query you showed as an example, all the keywords would still have to match 
  in order for a row to be returned. The sorting is a separate problem that is 
  not really that difficult once I get a smaller dataset. Then I can order it 
  manually. The problem is that it only returns a match if every single word 
  is present. I would like it to return matches if, say, mor than 2 or 3 of 
  the specified keywords are found.

  Kind regards,

  Philip Bennefall
  - Original Message - 
  From: "Simon Slavin" <slav...@bigfraud.org>
  To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
  Sent: Thursday, June 14, 2012 8:24 PM
  Subject: Re: [sqlite] Full text search without full phrase matches



  On 14 Jun 2012, at 7:13pm, Philip Bennefall <phi...@blastbay.com> wrote:

  > That is unfortunate, if it is true that there's no way to accomplish this 
  > with SqLite. To do just plain matching I can use an unordered hash map, so 
  > I wouldn't need a database for that. The trouble with a string distance 
  > function is that I can't really process the entire dataset with it. SqLite 
  > technically has all the features I'm after, I just don't want it to 
  > necessarily match all the words in a query. If I can get it to match all 
  > as well as some, that would be enough. I could then do distancing on a 
  > considerably smaller dataset which would be the result of the broader 
  > SqLite search.
  >
  > So I guess my main question is, is there absolutely no way to match a 
  > subset of the words in a query?

  Well, you could write that string distance function and add it to your copy 
  of SQLite as an external function.  Then you could do things like

  SELECT string_distance(theText, 'this new piece of text'), theText FROM 
  oldChats WHERE string_distance(theText, 'this new piece of text') < 10 ORDER 
  BY string_distance(theText, 'this new piece of text')

  (I don't know whether SQLite will optimise that to avoid executing the same 
  function many times, or whether you can name a column and use that name to 
  do the same thing yourself.)

  Here's the documentation for external functions:

  <http://www.sqlite.org/c3ref/create_function.html>

  Simon.
  ___
  sqlite-users mailing list
  sqlite-users@sqlite.org
  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users 

  ___
  sqlite-users mailing list
  sqlite-users@sqlite.org
  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full text search without full phrase matches

2012-06-14 Thread Philip Bennefall

Hi Simon,

The ordering is not really the issue I am having. That, I can do if I just 
get a result back that doesn't necessarily match all the keywords. In the 
query you showed as an example, all the keywords would still have to match 
in order for a row to be returned. The sorting is a separate problem that is 
not really that difficult once I get a smaller dataset. Then I can order it 
manually. The problem is that it only returns a match if every single word 
is present. I would like it to return matches if, say, mor than 2 or 3 of 
the specified keywords are found.


Kind regards,

Philip Bennefall
- Original Message - 
From: "Simon Slavin" <slav...@bigfraud.org>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Thursday, June 14, 2012 8:24 PM
Subject: Re: [sqlite] Full text search without full phrase matches



On 14 Jun 2012, at 7:13pm, Philip Bennefall <phi...@blastbay.com> wrote:

That is unfortunate, if it is true that there's no way to accomplish this 
with SqLite. To do just plain matching I can use an unordered hash map, so 
I wouldn't need a database for that. The trouble with a string distance 
function is that I can't really process the entire dataset with it. SqLite 
technically has all the features I'm after, I just don't want it to 
necessarily match all the words in a query. If I can get it to match all 
as well as some, that would be enough. I could then do distancing on a 
considerably smaller dataset which would be the result of the broader 
SqLite search.


So I guess my main question is, is there absolutely no way to match a 
subset of the words in a query?


Well, you could write that string distance function and add it to your copy 
of SQLite as an external function.  Then you could do things like


SELECT string_distance(theText, 'this new piece of text'), theText FROM 
oldChats WHERE string_distance(theText, 'this new piece of text') < 10 ORDER 
BY string_distance(theText, 'this new piece of text')


(I don't know whether SQLite will optimise that to avoid executing the same 
function many times, or whether you can name a column and use that name to 
do the same thing yourself.)


Here's the documentation for external functions:

<http://www.sqlite.org/c3ref/create_function.html>

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full text search without full phrase matches

2012-06-14 Thread Philip Bennefall

Hi Simon,

That is unfortunate, if it is true that there's no way to accomplish this 
with SqLite. To do just plain matching I can use an unordered hash map, so I 
wouldn't need a database for that. The trouble with a string distance 
function is that I can't really process the entire dataset with it. SqLite 
technically has all the features I'm after, I just don't want it to 
necessarily match all the words in a query. If I can get it to match all as 
well as some, that would be enough. I could then do distancing on a 
considerably smaller dataset which would be the result of the broader SqLite 
search.


So I guess my main question is, is there absolutely no way to match a subset 
of the words in a query?


Kind regards,

Philip Bennefall
- Original Message - 
From: "Simon Slavin" <slav...@bigfraud.org>

To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Thursday, June 14, 2012 7:24 PM
Subject: Re: [sqlite] Full text search without full phrase matches



On 14 Jun 2012, at 6:12pm, Philip Bennefall <phi...@blastbay.com> wrote:

The trouble I have is that in my query, all the keywords don't necessarily 
have to be present in order for a successful match to be made. SqLite's 
fts only seems to match if all the keywords are present, which I don't 
require.


You will have to do some of this in your own programming.  In fact you may 
end up doing all of it in your own programming and using the FTS feature 
only to match single words and single word-fragments.


The usual way to do this is to define a 'distance' metric for comparing two 
strings and finding how far apart they are.  A score of 0 means they match 
exactly.  Numbers bigger than a certain amount don't matter: anything bigger 
than, say, 100 means they're not at all alike.


Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full text search without full phrase matches

2012-06-14 Thread Philip Bennefall

Hi Paul, and thank you for your reply.

The trouble I have is that in my query, all the keywords don't necessarily 
have to be present in order for a successful match to be made. SqLite's fts 
only seems to match if all the keywords are present, which I don't require.


I am not familiar with Perl, but am working exclusively in C++.

The input I am processing is arbitrary, and so is the data that I am 
searching through in the index. The incoming data is user messages, and the 
index contains old messages that the robot has given to users (stemmed and 
stripped in various ways to make matches more probable), and then there's 
another column which contains an appropriate answer if that query is 
matched. I want it to match as many keywords as possible but not necessarily 
all, and order by:
1. How many keywords were matched, with some minimum threshold below which 
no match is made.

2. How well the ordering matched.

Do you have any tips?

Kind regards,

Philip Bennefall
- Original Message - 
From: <pc...@sympatico.ca>

To: <sqlite-users@sqlite.org>
Sent: Thursday, June 14, 2012 7:01 PM
Subject: [sqlite] Full text search without full phrase matches


I had to implement something like this for comparing passages from statutes 
(see the Introduction in Douglas Hay and Paul Craven, *Masters, Servants and 
Magistrates in Britain and the Empire, 1562-1955* [UNCP Press, 2004] for an 
illustration).


You need to isolate the keywords, in whatever order, count them, and measure 
the distances (number of words) between them.  SqLite is great for managing 
the tables of keywords, the lists of texts that contain them, and tables of 
distances.  But it is not the optimal tool for breaking down the texts and 
extracting the keywords and distances.  I used Perl for this job, and found 
that I could easily adapt recipes from the Perl Cookbook and similar 
repositories to build my routines.  I wrote the disaggregated lists of 
keywords, distances and texts as sql tables and analysed them in SqLite.


Paul Craven
York University

--

Date: Wed, 13 Jun 2012 23:09:35 +0200
From: Philip Bennefall <phi...@blastbay.com>
To: <sqlite-users@sqlite.org>
Subject: [sqlite] Full text search without full phrase matches
Message-ID: <A12309DB130E42BBA0590D664F66922A@chicken>
Content-Type: text/plain; charset="iso-8859-1"

Hi all,

I am new to this maling list and to SqLite, so I wanted to start by thanking 
all of those who make this project a reality. It is a great tool.


Now, to my question. I am trying to use the full text search feature to find 
rough matches for a chat robot. Basically I want to match as many keywords 
as possible, but not necessarily all of them. The results should be sorted 
based on how many keywords were found in the phrase and how closely ordered 
they are to the query. In other words the ordering doesn't have to be exact, 
but the closer it is, the higher the result should rank. Similarly, even if 
only one or two words in the phrase are found it should match, but rank 
higher the more of the words that are present. I have read the reference and 
I see the NEAR statement and the matchinfo function, as well as the example 
of how to use it, but I cannot figure out how to apply this knowledge to my 
specific problem. Does anyone have any suggestions?


Thanks in advance for your help.

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Full text search without full phrase matches

2012-06-13 Thread Philip Bennefall
Hi all,

I am new to this maling list and to SqLite, so I wanted to start by thanking 
all of those who make this project a reality. It is a great tool.

Now, to my question. I am trying to use the full text search feature to find 
rough matches for a chat robot. Basically I want to match as many keywords as 
possible, but not necessarily all of them. The results should be sorted based 
on how many keywords were found in the phrase and how closely ordered they are 
to the query. In other words the ordering doesn't have to be exact, but the 
closer it is, the higher the result should rank. Similarly, even if only one or 
two words in the phrase are found it should match, but rank higher the more of 
the words that are present. I have read the reference and I see the NEAR 
statement and the matchinfo function, as well as the example of how to use it, 
but I cannot figure out how to apply this knowledge to my specific problem. 
Does anyone have any suggestions?

Thanks in advance for your help.

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users