Re: [sqlite] How to index data based on custom comparisons?

2017-12-14 Thread Lifepillar

On 14/12/2017 13:14, Richard Hipp wrote:

On 12/14/17, Lifepillar  wrote:


I am not familiar with virtual tables yet, but I see that they are used,
for example, to implement Rtree indexes. Would it be feasible to
implement my own index structure as a virtual table and use it to index
a blob column in a standard table (or even just in the virtual table
itself)?


That would be complicated.


So, it is possible :)


A different idea.  Suppose you have two new UDFs:

ieee754dec(X):  Converts IEEE754-binary number X into IEEE754-decimal.
In other words it takes a "double" input and returns a "blob" output.

ieee754bin(Y):  Converts IEEE754-decimal blob Y and converts it into
IEEE754-binary.

Both routines are approximate because most IEEE754-binary values do
not have an exact equivalent IEEE754-decimal representation and vice
versa.  Your UDFs would need to find something very close.

Given these routines, you could then index your IEEE754-decimal
columns by doing an index on an expression using the new iee754bin()
function.


Thanks, that's another possibility to consider, although one typically
uses decimal values when exactness is needed; using your scheme
requires some care, I think, for example not to miss matches because of
approximations or to filter spurious matches away.

Somehow, I wish SQLite had something like PostgreSQL GiST indexes... ;)

Life.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to index data based on custom comparisons?

2017-12-14 Thread Lifepillar

On 13/12/2017 22:20, Simon Slavin wrote:


On 13 Dec 2017, at 8:34pm, Lifepillar  wrote:


But, (correct me if
I am wrong), if I index the blob column directly, comparisons are
based on memcpy(), which in my case is not what I want. Is it
possible to create an index that somehow uses a custom comparison
function instead? E.g., I have a deccmp(x,y) function that returns
-1 if xy. Can I define an index based on
that?


As Dr H wrote, it can’t be done.  Either store a normalised (numeric) version 
of the number, or store both the BLOB and a normalised version.


Thanks for the suggestion. Storing a normalized version side by side 
with the unnormalized number would work, but it would double the used 
space.  Storing just the normalized number would lose information, though.


Life.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to index data based on custom comparisons?

2017-12-14 Thread Richard Hipp
On 12/14/17, Lifepillar  wrote:

> I am not familiar with virtual tables yet, but I see that they are used,
> for example, to implement Rtree indexes. Would it be feasible to
> implement my own index structure as a virtual table and use it to index
> a blob column in a standard table (or even just in the virtual table
> itself)?

That would be complicated.

A different idea.  Suppose you have two new UDFs:

ieee754dec(X):  Converts IEEE754-binary number X into IEEE754-decimal.
In other words it takes a "double" input and returns a "blob" output.

ieee754bin(Y):  Converts IEEE754-decimal blob Y and converts it into
IEEE754-binary.

Both routines are approximate because most IEEE754-binary values do
not have an exact equivalent IEEE754-decimal representation and vice
versa.  Your UDFs would need to find something very close.

Given these routines, you could then index your IEEE754-decimal
columns by doing an index on an expression using the new iee754bin()
function.
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to index data based on custom comparisons?

2017-12-14 Thread Lifepillar

On 14/12/2017 00:02, Keith Medcalf wrote:


On Wednesday, 13 December, 2017 13:35, Lifepillar  
wrote:


I am implementing an extension for manipulating IEEE754 decimal
numbers. Numbers are stored as blobs using a standard encoding.
Numbers that are mathematically equal may have different
representations, (e.g., 1.0 may have mantissa 10 and exponent -1
while 1.00 may have mantissa 100 and exponent -2).


You have stated something that is impossible, or at least self-contradictory.  Unless, of course, 
you are talking about the "decimal" formats of IEEE754-2008 and not the standard (far 
more common) "binary" formats.


Yes, I am talking about decimal IEEE754-2008 format, not binary. I 
thought this would be clear from my use of the word "decimal" and from 
my example.



On the other hand however if you do NOT need binary64 at all, then there was a minor 
change discussed a while back by someone else where you can "change" the 
default floating-point number format from binary64 to decimal64 and then compile your own 
custom version of SQLite3 ...


Thanks for the tip! I will search for that discussion, although I would 
rather not modify SQlite3 source code if I can find another solution.


I am not familiar with virtual tables yet, but I see that they are used, 
for example, to implement Rtree indexes. Would it be feasible to 
implement my own index structure as a virtual table and use it to index
a blob column in a standard table (or even just in the virtual table 
itself)? I mean, would the optimizer be able to take advantage of such 
an index?


(Sorry if my questions sound naive, I'm still pretty new to SQLite.)

Life.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to index data based on custom comparisons?

2017-12-13 Thread Keith Medcalf

On Wednesday, 13 December, 2017 13:35, Lifepillar  
wrote:

>I am implementing an extension for manipulating IEEE754 decimal
>numbers. Numbers are stored as blobs using a standard encoding.
>Numbers that are mathematically equal may have different
>representations, (e.g., 1.0 may have mantissa 10 and exponent -1
>while 1.00 may have mantissa 100 and exponent -2).

You have stated something that is impossible, or at least self-contradictory.  
Unless, of course, you are talking about the "decimal" formats of IEEE754-2008 
and not the standard (far more common) "binary" formats.

You cannot have an IEEE754 (binary) number stored in denormalized format 
*except* in the circumstance where the exponent indicates that it is a 
denormalized number.  There are only two valid exponents to indicate that the 
number is denormalized, and all denormalized numbers are only used to represent 
numbers between +/- (0 and epsilon).  All other numbers stored in IEEE754 
floating point format are required, by the standard, to be normalized. (That 
is, where the MSB is 1 and that 1 is not stored as part of the significand).

So in order for the numbers to be IEEE754 floating point, the number "1.0" (no 
matter the number of trailing 0's you choose to display) must always be stored 
with a mantissa of 0.5 and an exponent of 1.  Although a mantissa of 0.25 with 
exponent 2 evaluates also to the number 1.0, in IEEE754 format it must always 
have a mantissa of 0.5 and exponent of 1.

Note the above two paragraphs only apply to binary IEE754-2008 numbers.  These 
are the only kind of "floating point" presently understood by SQLite3.

However, if you are talking about the "decimal" IEEE754 then you can indeed 
have different representations of the same "value".  Some "values" can have 
about 800 different representations of the same value.  (Note that the solution 
below would work even if the blobs were arbitrary precision IBM GDAS floating 
point numbers, or any other kind of floating point number, pretty much).


To answer your question however, I would recommend that you consider writing a 
function that return the "value" in a supported format.  My personal 
recommendation would be IEEE754-2008 binary64 (that is, the standard double 
precision floating point format supported by SQLite3).  You would tag this 
function as being CONSTANT/DETERMINISTIC.  

You could then create an index on the result.

CREATE INDEX decimal64blob_to_binary64 ON MyTable 
(ConvertDecimal64toBinary64(binary64blob_field));

and when you search the index ala:

SELECT * FROM MyTable where ConvertDecimal64toBinary64(binary64blob_field) 
between 47.0 and 47.1;

you will use the index (I believe).  And you need not specify conversion to 
string format nor deal with the vagaries of strings.  You just have to deal 
with the standard binary floating point limitations.

On the other hand however if you do NOT need binary64 at all, then there was a 
minor change discussed a while back by someone else where you can "change" the 
default floating-point number format from binary64 to decimal64 and then 
compile your own custom version of SQLite3 ...

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to index data based on custom comparisons?

2017-12-13 Thread Simon Slavin


On 13 Dec 2017, at 8:34pm, Lifepillar  wrote:

> But, (correct me if
> I am wrong), if I index the blob column directly, comparisons are
> based on memcpy(), which in my case is not what I want. Is it
> possible to create an index that somehow uses a custom comparison
> function instead? E.g., I have a deccmp(x,y) function that returns
> -1 if xy. Can I define an index based on
> that?

As Dr H wrote, it can’t be done.  Either store a normalised (numeric) version 
of the number, or store both the BLOB and a normalised version.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to index data based on custom comparisons?

2017-12-13 Thread Richard Hipp
On 12/13/17, Lifepillar  wrote:
>
> if I index the blob column directly, comparisons are
> based on memcpy(), which in my case is not what I want. Is it
> possible to create an index that somehow uses a custom comparison
> function instead?

No.  SQLite always uses memcmp() to compare BLOBs.  You can add a
collating function for strings, but not for BLOBs.


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users