[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-30 Thread Dominique Devienne
On Thu, Apr 30, 2015 at 2:52 AM, James K. Lowden 
wrote:

> On Tue, 28 Apr 2015 09:24:56 +0200 Dominique Devienne 
> wrote:
> > On Tue, Apr 28, 2015 at 4:16 AM, James K. Lowden <
> jklowden at schemamania.org> wrote:
> > > A major hurdle is the memory model: because array-programming
> > > libraries normally mandate the data be in contiguous memory,
> > > there's a cost to converting to/from the DBMS's B+ tree.  The more
> > > array-like the physical storage of the DBMS, the more it cedes
> > > transactional and update capability.
> >
> > Well, just look at how Oracle solved that problem. The row data
> > doesn't hold the blob itself, like in SQLite, but an index to
> > separate blob pages. This proverbial level of indirection brings
> > tremendous benefits, because you can then update a blob w/o having to
> > rewrite the whole blob (you copy the "blob" page(s) being updated
> > only, and copy the row into a new "data" page with an updated "blob
> > index" with just a few entries changed to point to the updated pages.
>
> Does Oracle provide access to the BLOB as a contiguous memory image in
> the Oracle server's virtual address space?
>

Of course not. But the same blob APIs any OCI client uses can be used by a
server extension that runs out-of-process (if native), or in-process (if
pure Java inside the Oracle server-side JVM), and the implementation of
those APIs do take advantage of the locality. No one in his right mind
would allow arbitrary 3rd party code to run in your "enterprisy" DB server
(unless "safe" like Java or sandboxed if native; Oracle doesn't do the
latter).

But processing a page of values is still much more efficient individual
values one at a time (coming from rows), and if pages happen to be
contiguous all the better. Think of it in terms of how aggregate functions
work, if you will. Perhaps you can use your BLAS library as-is if not
contiguous, but again given the aggr-func analogy, you can allocate the
continuous buffer in the begin/init block, do efficient page-sized copies
on the step block, and do all the computation in the end/finalize block on
that temp continuous buffer. Having such blob-specific pages would enable
such a more efficient scenario.


> The benefit you're describing reduces the I/O required to update the
> BLOB and to traverse the table without referencing it.


Don't forget it also enables to grow blobs w/o writing them in full, and
update them in place transactionally, both of which SQLite does not support
right now, which is my main issue with SQLite. Blobs are basically second
class citizens in SQLite, and I'd wish that changes in future versions.


> That's a *very* old technique; it was available in Sybase in 1980s.


So? I never made any claims to the originality or age of the technique. I
merely stated that this (well known, thank you for the precision) technique
is not used by SQLite, and that using it would allow to solve the two main
issues with SQLite regarding blobs.


> To support use of
> arbitrary linear algebra functions on those BLOBs, they would have to
> supplied to user-defined functions in linear memory with known
> addresses.


See my aggregate-function analogy above. My point of that some function
will work just fine working incrementally on the blob data in page-sized
(or multiples of it) chunks. Such that you can use the blob APIs to load
the blob in user-allocated buffer, i.e. a scalar user-defined function can
drive the loop over the blob, but instead if you do some kind of "inversion
of control" where its SQLite that drives the loop of blob pages, passing
control to an aggregate-like user-defined functions that operates
specifically over blobs and there pages, SQLite can pass in a read-only
pointer to its internal page cache for example and avoid copies. Just
thinking aloud.

My main argument is that I'd like the two current limitations on blobs
(can't extend w/o fully rewriting, can't incrementally update w/o either
losing transactionality or being forced to again rewrite in full) gone from
SQLite. Because then SQLite can model a proper transactional random-access
file on top of its blobs, and I can then use SQLite blobs as the back end
to a 3rd party VFS (like for HDF5 for example). I can do that (and did)
with Oracle, and I can't do that with SQLite. And I'd really like to be
able to do it. And then blobs would no longer be second class citizens in
SQLite. --DD


[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-29 Thread James K. Lowden
On Tue, 28 Apr 2015 09:24:56 +0200
Dominique Devienne  wrote:

> On Tue, Apr 28, 2015 at 4:16 AM, James K. Lowden
>  wrote:
> 
> > A major hurdle is the memory model: because array-programming
> > libraries normally mandate the data be in contiguous memory,
> > there's a cost to converting to/from the DBMS's B+ tree.  The more
> > array-like the physical storage of the DBMS, the more it cedes
> > transactional and update capability.
> 
> Well, just look at how Oracle solved that problem. The row data
> doesn't hold the blob itself, like in SQLite, but an index to
> separate blob pages. This proverbial level of indirection brings
> tremendous benefits, because you can then update a blob w/o having to
> rewrite the whole blob (you copy the "blob" page(s) being updated
> only, and copy the row into a new "data" page with an updated "blob
> index" with just a few entries changed to point to the updated pages. 

Does Oracle provide access to the BLOB as a contiguous memory image in
the Oracle server's virtual address space?

The benefit you're describing reduces the I/O required to update the
BLOB and to traverse the table without referencing it.  That's a *very*
old technique; it was available in Sybase in 1980s.  To support use of 
arbitrary linear algebra functions on those BLOBs, they would have to
supplied to user-defined functions in linear memory with known
addresses.  

--jkl


[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-28 Thread Dominique Devienne
On Tue, Apr 28, 2015 at 4:16 AM, James K. Lowden 
wrote:

> A major hurdle is the
> memory model: because array-programming libraries normally mandate the
> data be in contiguous memory, there's a cost to converting to/from the
> DBMS's B+ tree.  The more array-like the physical storage of the DBMS,
> the more it cedes transactional and update capability.
>

Well, just look at how Oracle solved that problem. The row data doesn't
hold the blob itself, like in SQLite, but an index to separate blob pages.
This proverbial level of indirection brings tremendous benefits, because
you can then update a blob w/o having to rewrite the whole blob (you copy
the "blob" page(s) being updated only, and copy the row into a new "data"
page with an updated "blob index" with just a few entries changed to point
to the updated pages. You can thus easily extend a blob w/o rewriting it
all like SQLite does now, or not break transactionality when doing in-place
updates using the SQLite blob API like now too. This is IMHO one of the
biggest problem with SQLite right now, even as its primary purpose of a
"application format". That and the lack of arrays and UDTs. (in this order,
i.e. blobs, arrays, UDTs). FWIW :). I love SQLite. Doesn't make me blind to
what it lacks for my use cases though. My $0.02. --DD


[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-27 Thread James K. Lowden
On Mon, 27 Apr 2015 16:35:11 -0400
Jim Callahan  wrote:

> So, the application of arrays and complex numbers go far beyond the
> one question posed to this list with arrays of integers and reals
> being far more common than arrays of complex numbers.
> 
> Complex numbers are included as types in C, C#, FORTRAN and Python
> and are being added to server databases (Oracle and Microsoft SQL
> Server) as user defined types, so programmers and server database
> administrators may develop an expectation that complex types or user
> defined types will be supported SQLite3.

Useful as complex numbers and arrays are to scientific work, SQLite is
deficient in ordinary math.  Division by zero yields NULL, for
example.  The usual suite of mathematical functions such as declared in
math.h are absent, never mind linear algebra.  Even enforcing the rule
that a column always hold a number is inconvenient at best.  

> I would expect that users of complex numbers would be an
> extremely small subset of data base users

Yes, and database users are a small subset of scientific programmers.
As some of the papers you cited allude to, there is some work (thanks
in part to Jim Gray) to answer the needs of that subset.  Most of that
work, though -- already a subset of subset -- focuses on giant
datasets.  "SQLite for science" is thus a small population by almost any
measure.  

IMO a "scientific DBMS" comprises two separate good ideas.  Neither one
has been solved.  

The first is efficient computation over "array-like" stuff, alerting
the computation engine that it is working with arrays, not sets, and/or
directly invoking existing libraries e.g. BLAS.  A major hurdle is the
memory model: because array-programming libraries normally mandate the
data be in contiguous memory, there's a cost to converting to/from the
DBMS's B+ tree.  The more array-like the physical storage of the DBMS,
the more it cedes transactional and update capability.  

The second is expression of mathematics in a database language.  SciQL
goes some distance in that direction by making tables and arrays
interchangable, but the battle is not yet won.  I suspect a better
answer would arise from APL-for-databases than SQL-for-math. But, see
above, not exactly a popular topic.  

--jkl


[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-27 Thread Simon Slavin

On 27 Apr 2015, at 9:35pm, Jim Callahan  
wrote:

> But, even so, I would expect that users of complex numbers would be an
> extremely small subset of data base users (less than 5%? or even less than
> 1%?).

Right.  There's no way to know this for sure, but I suspect that of the 
literally billions of installations of SQLite, 99% all of them need only 
INTEGER, REAL and TEXT.  Not even NULL.  Of the ones which store two-number 
values, most of them are in SatNav units and store latitude and longitude 
rather than real and imaginary parts.

And of the <1% who store complex numbers, many users are like the OP of this 
thread and just need them stored and retrieved.  They don't need 
access/searching/sorting of individual values.  So storing TEXT serialisations 
the values, even of arrays of values, is faster, involves no loss of precision, 
and leads to less disk space being used.

Right now I'm manipulating data with Red Green Blue and White values.  If 
SQLite had an appropriate type I'd use it, but it doesn't, and I'm happy 
storing the four values in one TEXT column, JSON style.  My laptop and SSD are 
so ridiculously fast that, for example, extracting a Blue value from a stored 
column happens at a speed I find easy to accept.

Simon.


[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-27 Thread Stefan Keller
As of SQL:2003, arrays are standard SQL types.
Examples of standard syntax:
  BIGINT ARRAY
  BIGINT ARRAY[100]
  BIGINT ARRAY[100] ARRAY[200]

It's fully implemented e.g. in PostgreSQL...

-S.


2015-04-27 18:48 GMT+02:00 Drago, William @ CSG - NARDA-MITEQ
:
> All,
>
> I've been enjoying this discussion and have learned a thing or two from all 
> the suggestions.
>
> My particular problem is indeed solved. Adding a simple blob to my original 
> table is the best solution in this case. If you think about the trace data as 
> simply a screenshot of the analyzer display then I think it is easier to 
> understand why I don't want to bother with more complex solutions. Even 
> though they may be technically more correct, they are overkill for what I'm 
> trying to accomplish. I'm just saving a screenshot, minus the graticule and 
> all the other decorations on the display. There's no need to make it any more 
> complicated than that. I don't even have to save the X-axis elements as those 
> are derived from other parameters that are already known. Someone asked if 
> the trace data was time stamped. It is not. Date and time of the test are 
> stored along with all the other relevant information.
>
> If I ever have to deal with complex numbers as my core data I have some 
> pretty good ideas on how to handle them now. BTW, the languages I use most, 
> VEE & C#, both support complex data types. Given the amount of use database 
> programs see in the scientific world I'm surprised there is no native complex 
> data type (I can appreciate the difficulty in implementing such a thing). 
> Array support for each type would be nice too. We have this in VEE and it is 
> so easy to use. But for now every programmer working in science or 
> engineering has to re-invent the wheel.
>
> For the curious here are the analyzers I'm working with:
> http://www.keysight.com/en/pc-101745%3Aepsg%3Apgr/pna-network-analyzers-300-khz-to-11-thz?cc=US=eng
>
> --
> Bill Drago
> Senior Engineer
> L3 Narda-MITEQ
> 435 Moreland Road
> Hauppauge, NY 11788
> 631-272-5947 / William.Drago at L-3COM.com
>
>
>
>> -Original Message-
>> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-
>> users-bounces at mailinglists.sqlite.org] On Behalf Of Jim Callahan
>> Sent: Sunday, April 26, 2015 9:56 PM
>> To: General Discussion of SQLite Database
>> Subject: [sqlite] SQLite and Scientific Computing: Arrays and Complex
>> Numbers
>>
>> The original thread asking about an array of complex numbers has been
>> marked as "solved." The requester has decided to serialize the complex
>> numbers and store them in a blob. Earlier, Keith had suggested storing
>> complex numbers as a pair of real numbers and a separate box table. I
>> extended Keith's suggestion with two or three tables, elements, arrays
>> and optionally coordinates.
>>
>> There is some literature on storing arrays in SQL databases. In
>> addition complex numbers seem to be the orphan stepchild of programming
>> languages (let alone databases). Although FORTRAN IV had complex
>> numbers they were not added to the C standard until C99.
>>
>> Language / Standard / Library
>> 
>> C / C99/ complex.h
>> http://en.wikibooks.org/wiki/C_Programming/C_Reference/complex.h
>> C# / 4.0 / System.Numerics.Complex
>> https://msdn.microsoft.com/en-
>> us/library/system.numerics.complex(v=vs.110).aspx
>>
>> Java /?/ Apache Commons
>> Python/2.6.5/ cmath
>> https://docs.python.org/2/library/cmath.html
>>
>> http://en.wikipedia.org/wiki/Complex_data_type
>>
>> So, I suppose if a company wanted to sponsor it, complex numbers could
>> be supported through an addin library similar to FTS3 and FTS4 for full
>> text searches.
>> http://sqlite.org/fts3.html
>>
>> Here for example, is a discussion on IBM DeveloperWorks concerning the
>> Informix database.
>> https://www.ibm.com/developerworks/community/blogs/gbowerman/entry/sql_
>> and_the_complex_plane?lang=en
>>
>> Some databases have Abstract Defined Type (Oracle) or User Defined
>> Types (Microsoft SQL Server) that could be used for complex numbers.
>>
>> Ironically, the scientific data format NetCDF did not have provision
>> for complex numbers (it was designed for weather data).
>> https://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2011/ms
>> g00027.html
>>
>> There are some discusssions of scientific versions of SQL (such as
>> SciQL):
>>
>> "A Query Language for Multidimensional Arrays:
>> Design, Implementation, 

[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-27 Thread Drago, William @ CSG - NARDA-MITEQ
All,

I've been enjoying this discussion and have learned a thing or two from all the 
suggestions.

My particular problem is indeed solved. Adding a simple blob to my original 
table is the best solution in this case. If you think about the trace data as 
simply a screenshot of the analyzer display then I think it is easier to 
understand why I don't want to bother with more complex solutions. Even though 
they may be technically more correct, they are overkill for what I'm trying to 
accomplish. I'm just saving a screenshot, minus the graticule and all the other 
decorations on the display. There's no need to make it any more complicated 
than that. I don't even have to save the X-axis elements as those are derived 
from other parameters that are already known. Someone asked if the trace data 
was time stamped. It is not. Date and time of the test are stored along with 
all the other relevant information.

If I ever have to deal with complex numbers as my core data I have some pretty 
good ideas on how to handle them now. BTW, the languages I use most, VEE & C#, 
both support complex data types. Given the amount of use database programs see 
in the scientific world I'm surprised there is no native complex data type (I 
can appreciate the difficulty in implementing such a thing). Array support for 
each type would be nice too. We have this in VEE and it is so easy to use. But 
for now every programmer working in science or engineering has to re-invent the 
wheel.

For the curious here are the analyzers I'm working with:
http://www.keysight.com/en/pc-101745%3Aepsg%3Apgr/pna-network-analyzers-300-khz-to-11-thz?cc=US=eng

--
Bill Drago
Senior Engineer
L3 Narda-MITEQ
435 Moreland Road
Hauppauge, NY 11788
631-272-5947 / William.Drago at L-3COM.com



> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-
> users-bounces at mailinglists.sqlite.org] On Behalf Of Jim Callahan
> Sent: Sunday, April 26, 2015 9:56 PM
> To: General Discussion of SQLite Database
> Subject: [sqlite] SQLite and Scientific Computing: Arrays and Complex
> Numbers
>
> The original thread asking about an array of complex numbers has been
> marked as "solved." The requester has decided to serialize the complex
> numbers and store them in a blob. Earlier, Keith had suggested storing
> complex numbers as a pair of real numbers and a separate box table. I
> extended Keith's suggestion with two or three tables, elements, arrays
> and optionally coordinates.
>
> There is some literature on storing arrays in SQL databases. In
> addition complex numbers seem to be the orphan stepchild of programming
> languages (let alone databases). Although FORTRAN IV had complex
> numbers they were not added to the C standard until C99.
>
> Language / Standard / Library
> 
> C / C99/ complex.h
> http://en.wikibooks.org/wiki/C_Programming/C_Reference/complex.h
> C# / 4.0 / System.Numerics.Complex
> https://msdn.microsoft.com/en-
> us/library/system.numerics.complex(v=vs.110).aspx
>
> Java /?/ Apache Commons
> Python/2.6.5/ cmath
> https://docs.python.org/2/library/cmath.html
>
> http://en.wikipedia.org/wiki/Complex_data_type
>
> So, I suppose if a company wanted to sponsor it, complex numbers could
> be supported through an addin library similar to FTS3 and FTS4 for full
> text searches.
> http://sqlite.org/fts3.html
>
> Here for example, is a discussion on IBM DeveloperWorks concerning the
> Informix database.
> https://www.ibm.com/developerworks/community/blogs/gbowerman/entry/sql_
> and_the_complex_plane?lang=en
>
> Some databases have Abstract Defined Type (Oracle) or User Defined
> Types (Microsoft SQL Server) that could be used for complex numbers.
>
> Ironically, the scientific data format NetCDF did not have provision
> for complex numbers (it was designed for weather data).
> https://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2011/ms
> g00027.html
>
> There are some discusssions of scientific versions of SQL (such as
> SciQL):
>
> "A Query Language for Multidimensional Arrays:
> Design, Implementation, and Optimization Techniques"
> http://homepages.inf.ed.ac.uk/libkin/papers/sigmod96a.pdf
>
> "Requirements for Science Data Bases and SciDB"
> http://www-db.cs.wisc.edu/cidr/cidr2009/Paper_26.pdf
>
> "SciQL, A Query Language for Science Applications"
> http://homepages.cwi.nl/~zhang/papers/arraydb11.pdf
>
> Jim
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any 
attachments are solely

[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-27 Thread Jim Callahan
[mailto:sqlite-
> > users-bounces at mailinglists.sqlite.org] On Behalf Of Jim Callahan
> > Sent: Sunday, April 26, 2015 9:56 PM
> > To: General Discussion of SQLite Database
> > Subject: [sqlite] SQLite and Scientific Computing: Arrays and Complex
> > Numbers
> >
> > The original thread asking about an array of complex numbers has been
> > marked as "solved." The requester has decided to serialize the complex
> > numbers and store them in a blob. Earlier, Keith had suggested storing
> > complex numbers as a pair of real numbers and a separate box table. I
> > extended Keith's suggestion with two or three tables, elements, arrays
> > and optionally coordinates.
> >
> > There is some literature on storing arrays in SQL databases. In
> > addition complex numbers seem to be the orphan stepchild of programming
> > languages (let alone databases). Although FORTRAN IV had complex
> > numbers they were not added to the C standard until C99.
> >
> > Language / Standard / Library
> > 
> > C / C99/ complex.h
> > http://en.wikibooks.org/wiki/C_Programming/C_Reference/complex.h
> > C# / 4.0 / System.Numerics.Complex
> > https://msdn.microsoft.com/en-
> > us/library/system.numerics.complex(v=vs.110).aspx
> >
> > Java /?/ Apache Commons
> > Python/2.6.5/ cmath
> > https://docs.python.org/2/library/cmath.html
> >
> > http://en.wikipedia.org/wiki/Complex_data_type
> >
> > So, I suppose if a company wanted to sponsor it, complex numbers could
> > be supported through an addin library similar to FTS3 and FTS4 for full
> > text searches.
> > http://sqlite.org/fts3.html
> >
> > Here for example, is a discussion on IBM DeveloperWorks concerning the
> > Informix database.
> > https://www.ibm.com/developerworks/community/blogs/gbowerman/entry/sql_
> > and_the_complex_plane?lang=en
> >
> > Some databases have Abstract Defined Type (Oracle) or User Defined
> > Types (Microsoft SQL Server) that could be used for complex numbers.
> >
> > Ironically, the scientific data format NetCDF did not have provision
> > for complex numbers (it was designed for weather data).
> > https://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2011/ms
> > g00027.html
> >
> > There are some discusssions of scientific versions of SQL (such as
> > SciQL):
> >
> > "A Query Language for Multidimensional Arrays:
> > Design, Implementation, and Optimization Techniques"
> > http://homepages.inf.ed.ac.uk/libkin/papers/sigmod96a.pdf
> >
> > "Requirements for Science Data Bases and SciDB"
> > http://www-db.cs.wisc.edu/cidr/cidr2009/Paper_26.pdf
> >
> > "SciQL, A Query Language for Science Applications"
> > http://homepages.cwi.nl/~zhang/papers/arraydb11.pdf
> >
> > Jim
> > ___
> > sqlite-users mailing list
> > sqlite-users at mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> CONFIDENTIALITY, EXPORT CONTROL AND DISCLAIMER NOTE:This e-mail and any
> attachments are solely for the use of the addressee and may contain
> information that is privileged or confidential. Any disclosure, use or
> distribution of the information contained herein is prohibited. In the
> event this e-mail contains technical data within the definition of the
> International Traffic in Arms Regulations or Export Administration
> Regulations, it is subject to the export control laws of the
> U.S.Government. The recipient should check this e-mail and any attachments
> for the presence of viruses as L-3 does not accept any liability associated
> with the transmission of this e-mail. If you have received this
> communication in error, please notify the sender by reply e-mail and
> immediately delete this message and any attachments.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-27 Thread Jim Callahan
Oops, left out two links, the Rasdaman 10.0 quote is from the Rasdaman
roadmap.
http://rasdaman.org/roadmap

and the quote about the ISO designation is from the Array SQL Rasdaman page:
http://rasdaman.com/ArraySQL/

But, the conclusion remains the same:

So, Multi-dimensional arrays may be coming to SQL via the ISO Standard 9075
and to SQLite3 via Rasdaman, but only if, Rasdaman decides to use SQLite3
as their default backend in their 10.0.0 release.


Jim Callahan
Orlando, FL

On Mon, Apr 27, 2015 at 12:33 PM, Jim Callahan <
jim.callahan.orlando at gmail.com> wrote:

> Two more thoughts (and I have to get back to work!):
> 1. Rasdaman
> 2. ISO SQL/MDA (multi-dimensional arrays)
>
> In June 2014, British tech website "The Register" reported, "the ISO SQL
> working group agreed to start work on SQL/MDA (multi-dimensional array)
> specs.   ...A separate effort, called Rasdaman  (a
> scalable multi-dimensional array analytics server) has been working for
> some time to apply an SQL-like query language to array databases.
> Rasdaman's backer, Peter Baumann of Jacobs University Bremen in Germany,
> put forward the proposal now adopted by the ISO."
>
> http://www.theregister.co.uk/2014/06/26/sql_to_worlddog_we_doing_big_data_too/
>
> Apparently, Rasdaman runs with PostgreSQL, but is considering having
> SQLite as a default:
>
> "Suggested *big* changes for 10.0
> Improve UX by removing big external dependencies (by default)
>
>- rasserver: use *SQLite*/filesystem backend by default instead of
>PostgreSQL
>- petascope: use embedded ASQLDB instead of PostgreSQL
>- petascope: local deployment with an embedded jetty in $RMANHOME/bin
>for example, rather than in an external Tomcat. This would be managed by
>similarly start/stop_petascope.sh for example
>? "
>
> Rasdaman lists the official ISO designation:
> "ISO Array-SQL:
> ...
> It's ISO: In June 2014, ISO/IEC JTC1 SC32 WG3 has decided to establish ISO
> 9075 Part 15: SQL/MDA as ISO's latest SQL extension, specifically for Big
> Science Data."
>
> The only semi-official (non-Rasdaman) ISO document I could find (with one
> or two Google searches) was this presentation:
>
> "WG3 is incorporating two interesting new capabilities into
> the next version of 9075:
> ? Support for JSON ... [detail omitted]
> ? Support for Multi Dimensional Arrays ... [detail omitted]"
> http://jtc1sc32.org/doc/N2501-2550/32N2528-WG3-Tutorial-Opening-Plenary.pdf
>
> So, Multi-dimensional arrays may be coming to SQL via the ISO Standard
> 9075 and to SQLite3 via Rasdaman, but only if, Rasdaman decides to use
> SQLite3 as their default backend in their 10.0.0 release.
>
> Jim Callahan
> Orlando, FL
>
>
>
> On Mon, Apr 27, 2015 at 11:23 AM, Jim Callahan <
> jim.callahan.orlando at gmail.com> wrote:
>
>> Two more thoughts (and you'll stay with SQLite):
>> 1. SciQL -> MonetDB -> sockets -> embedding R in MonetDB (ala PostgresQL)
>> http://en.wikipedia.org/wiki/MonetDB
>>
>> "there was one issue that we could not solve: Processing data from the
>> database with R required transferring the relevant data over the socket
>> first. This is fine and "fast enough" for  up to ? say ? several thousand
>> rows, but not for much more. We have had a lot of demand for transferring
>> larger amounts of data from users. Hence, we chose to go in a different
>> direction.
>> Starting with the Oct2014 release, MonetDB will ship with a feature we
>> call *R-Integration*. R has some support for running embedded
>>  in another application, a
>> fact we leverage heavily. What it does is make R scripts a first class
>> citizen of the SQL world in MonetDB."
>> https://www.monetdb.org/content/embedded-r-monetdb
>>
>> So, instead of having an embedded database (like SQLite3) embedded in a
>> language, (like Python or R), they windup having the database swallow the
>> language!!! (embeding the language in the database -- instead of the other
>> way arround).
>>
>> 2. ROLAP -> Mondrian -> TomCat
>> As opposed the the scientific community, the business community thinks in
>> terms of "cubes" rather than arrays which gets one to Online Analytic
>> Processing (OLAP) or when implemented over a relational database: ROLAP. Or
>> the whole thing gets rebranded by marketing as "Business Intelligence" or
>> "Analysis Services" or "Pivot Tables".
>> The open source ROLAP tool is "Mondrian".
>>
>> Mondrian can run over SQLite, but seems to run more often with a Java
>> application server such as TomCat or JBoss.
>>
>> Mondrian Cube with SQLite
>> This entry was written by haoccheng, posted on October 7, 2012
>> "Mondrian is an in-memory OLAP cube (cache) built on top of relational
>> database. Cube is a multi-dimensional data structure. A cell holds numeric
>> measurements of a business; aggregation can be executed efficiently along
>> dimensions. MDX is the query language of Mondrian and a query can be broken
>> down to lookup at the 

[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-27 Thread Jim Callahan
Two more thoughts (and I have to get back to work!):
1. Rasdaman
2. ISO SQL/MDA (multi-dimensional arrays)

In June 2014, British tech website "The Register" reported, "the ISO SQL
working group agreed to start work on SQL/MDA (multi-dimensional array)
specs.   ...A separate effort, called Rasdaman  (a
scalable multi-dimensional array analytics server) has been working for
some time to apply an SQL-like query language to array databases.
Rasdaman's backer, Peter Baumann of Jacobs University Bremen in Germany,
put forward the proposal now adopted by the ISO."
http://www.theregister.co.uk/2014/06/26/sql_to_worlddog_we_doing_big_data_too/

Apparently, Rasdaman runs with PostgreSQL, but is considering having SQLite
as a default:

"Suggested *big* changes for 10.0
Improve UX by removing big external dependencies (by default)

   - rasserver: use *SQLite*/filesystem backend by default instead of
   PostgreSQL
   - petascope: use embedded ASQLDB instead of PostgreSQL
   - petascope: local deployment with an embedded jetty in $RMANHOME/bin
   for example, rather than in an external Tomcat. This would be managed by
   similarly start/stop_petascope.sh for example
   ? "

Rasdaman lists the official ISO designation:
"ISO Array-SQL:
...
It's ISO: In June 2014, ISO/IEC JTC1 SC32 WG3 has decided to establish ISO
9075 Part 15: SQL/MDA as ISO's latest SQL extension, specifically for Big
Science Data."

The only semi-official (non-Rasdaman) ISO document I could find (with one
or two Google searches) was this presentation:

"WG3 is incorporating two interesting new capabilities into
the next version of 9075:
? Support for JSON ... [detail omitted]
? Support for Multi Dimensional Arrays ... [detail omitted]"
http://jtc1sc32.org/doc/N2501-2550/32N2528-WG3-Tutorial-Opening-Plenary.pdf

So, Multi-dimensional arrays may be coming to SQL via the ISO Standard 9075
and to SQLite3 via Rasdaman, but only if, Rasdaman decides to use SQLite3
as their default backend in their 10.0.0 release.

Jim Callahan
Orlando, FL



On Mon, Apr 27, 2015 at 11:23 AM, Jim Callahan <
jim.callahan.orlando at gmail.com> wrote:

> Two more thoughts (and you'll stay with SQLite):
> 1. SciQL -> MonetDB -> sockets -> embedding R in MonetDB (ala PostgresQL)
> http://en.wikipedia.org/wiki/MonetDB
>
> "there was one issue that we could not solve: Processing data from the
> database with R required transferring the relevant data over the socket
> first. This is fine and "fast enough" for  up to ? say ? several thousand
> rows, but not for much more. We have had a lot of demand for transferring
> larger amounts of data from users. Hence, we chose to go in a different
> direction.
> Starting with the Oct2014 release, MonetDB will ship with a feature we call
>  *R-Integration*. R has some support for running embedded
>  in another application, a
> fact we leverage heavily. What it does is make R scripts a first class
> citizen of the SQL world in MonetDB."
> https://www.monetdb.org/content/embedded-r-monetdb
>
> So, instead of having an embedded database (like SQLite3) embedded in a
> language, (like Python or R), they windup having the database swallow the
> language!!! (embeding the language in the database -- instead of the other
> way arround).
>
> 2. ROLAP -> Mondrian -> TomCat
> As opposed the the scientific community, the business community thinks in
> terms of "cubes" rather than arrays which gets one to Online Analytic
> Processing (OLAP) or when implemented over a relational database: ROLAP. Or
> the whole thing gets rebranded by marketing as "Business Intelligence" or
> "Analysis Services" or "Pivot Tables".
> The open source ROLAP tool is "Mondrian".
>
> Mondrian can run over SQLite, but seems to run more often with a Java
> application server such as TomCat or JBoss.
>
> Mondrian Cube with SQLite
> This entry was written by haoccheng, posted on October 7, 2012
> "Mondrian is an in-memory OLAP cube (cache) built on top of relational
> database. Cube is a multi-dimensional data structure. A cell holds numeric
> measurements of a business; aggregation can be executed efficiently along
> dimensions. MDX is the query language of Mondrian and a query can be broken
> down to lookup at the cube (cache hit) and direct SQL query against the
> underneath relational database.
> I play with the Mondrian and SQLite.
> #1: Create a sqlite database (I borrow the data model from the Mondrian
> tutorial by Slawomir, architect of Pentaho)."
> https://haoccheng.wordpress.com/2012/10/07/mondrian-cube-with-sqlite/
>
> Creating a basic Mondrian OLAP Cube
> BY SLAWOMIR CHODNICKI, ON JULY 10TH, 2010
> "If you?d like to follow the examples you will need access to a database,
> a copy of Pentaho Kettle and a Mondrian installation. I will be using MySQL
> as RDBMS and JasperServer 3.7. CE
>  for the Mondrian
> installation. Other possibilities include Pentaho 

[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-27 Thread Jim Callahan
Two more thoughts (and you'll stay with SQLite):
1. SciQL -> MonetDB -> sockets -> embedding R in MonetDB (ala PostgresQL)
http://en.wikipedia.org/wiki/MonetDB

"there was one issue that we could not solve: Processing data from the
database with R required transferring the relevant data over the socket
first. This is fine and "fast enough" for  up to ? say ? several thousand
rows, but not for much more. We have had a lot of demand for transferring
larger amounts of data from users. Hence, we chose to go in a different
direction.
Starting with the Oct2014 release, MonetDB will ship with a feature we call
*R-Integration*. R has some support for running embedded
 in another application, a
fact we leverage heavily. What it does is make R scripts a first class
citizen of the SQL world in MonetDB."
https://www.monetdb.org/content/embedded-r-monetdb

So, instead of having an embedded database (like SQLite3) embedded in a
language, (like Python or R), they windup having the database swallow the
language!!! (embeding the language in the database -- instead of the other
way arround).

2. ROLAP -> Mondrian -> TomCat
As opposed the the scientific community, the business community thinks in
terms of "cubes" rather than arrays which gets one to Online Analytic
Processing (OLAP) or when implemented over a relational database: ROLAP. Or
the whole thing gets rebranded by marketing as "Business Intelligence" or
"Analysis Services" or "Pivot Tables".
The open source ROLAP tool is "Mondrian".

Mondrian can run over SQLite, but seems to run more often with a Java
application server such as TomCat or JBoss.

Mondrian Cube with SQLite
This entry was written by haoccheng, posted on October 7, 2012
"Mondrian is an in-memory OLAP cube (cache) built on top of relational
database. Cube is a multi-dimensional data structure. A cell holds numeric
measurements of a business; aggregation can be executed efficiently along
dimensions. MDX is the query language of Mondrian and a query can be broken
down to lookup at the cube (cache hit) and direct SQL query against the
underneath relational database.
I play with the Mondrian and SQLite.
#1: Create a sqlite database (I borrow the data model from the Mondrian
tutorial by Slawomir, architect of Pentaho)."
https://haoccheng.wordpress.com/2012/10/07/mondrian-cube-with-sqlite/

Creating a basic Mondrian OLAP Cube
BY SLAWOMIR CHODNICKI, ON JULY 10TH, 2010
"If you?d like to follow the examples you will need access to a database, a
copy of Pentaho Kettle and a Mondrian installation. I will be using MySQL
as RDBMS and JasperServer 3.7. CE
 for the Mondrian
installation. Other possibilities include Pentaho BI-Server

and
a bare bones Mondrian/JPivot
 and PAT
 installation."
http://type-exit.org/adventures-with-open-source-bi/2010/07/creating-a-basic-mondrian-olap-cube/

So, taking any of the paths (MonetDB or Mondrian) gets you to complex
server setups pretty quickly, the one exception is haoccheng makes a brake
attempt to setup Mondrian over SQLite (but even his attempt requires a
TomCat Java application server):

"#3: Install the Tomcat; verify the http://localhost:8080 up and running.
#4: Download Mondrian; unzip mondrian-version.war to tomcat_home/webapps/
folder.
#5: Get the sqlite jdbc driver: sqlite-jdbc-version.jar; deploy to
tomcat_home/webapps/mondrian/WEB-INF/lib
#6: Change the connection string in Mondrian web.xml
(tomcat_home/webapps/mondrian/WEB-INF) to point to the sqlite database."


Almost makes one pine for MS Excel PivotTables.

Jim Callahan
Orlando, FL

On Sun, Apr 26, 2015 at 9:55 PM, Jim Callahan <
jim.callahan.orlando at gmail.com> wrote:

> The original thread asking about an array of complex numbers has been
> marked as "solved." The requester has decided to serialize the complex
> numbers and store them in a blob. Earlier, Keith had suggested storing
> complex numbers as a pair of real numbers and a separate box table. I
> extended Keith's suggestion with two or three tables, elements, arrays and
> optionally coordinates.
>
> There is some literature on storing arrays in SQL databases. In addition
> complex numbers seem to be the orphan stepchild of programming languages
> (let alone databases). Although FORTRAN IV had complex numbers they were
> not added to the C standard until C99.
>
> Language / Standard / Library
> 
> C / C99/ complex.h
> http://en.wikibooks.org/wiki/C_Programming/C_Reference/complex.h
> C# / 4.0 / System.Numerics.Complex
>
> https://msdn.microsoft.com/en-us/library/system.numerics.complex(v=vs.110).aspx
>
> Java /?/ Apache Commons
> Python/2.6.5/ cmath
> https://docs.python.org/2/library/cmath.html
>
> http://en.wikipedia.org/wiki/Complex_data_type

[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-26 Thread Jim Callahan
The original thread asking about an array of complex numbers has been
marked as "solved." The requester has decided to serialize the complex
numbers and store them in a blob. Earlier, Keith had suggested storing
complex numbers as a pair of real numbers and a separate box table. I
extended Keith's suggestion with two or three tables, elements, arrays and
optionally coordinates.

There is some literature on storing arrays in SQL databases. In addition
complex numbers seem to be the orphan stepchild of programming languages
(let alone databases). Although FORTRAN IV had complex numbers they were
not added to the C standard until C99.

Language / Standard / Library

C / C99/ complex.h
http://en.wikibooks.org/wiki/C_Programming/C_Reference/complex.h
C# / 4.0 / System.Numerics.Complex
https://msdn.microsoft.com/en-us/library/system.numerics.complex(v=vs.110).aspx

Java /?/ Apache Commons
Python/2.6.5/ cmath
https://docs.python.org/2/library/cmath.html

http://en.wikipedia.org/wiki/Complex_data_type

So, I suppose if a company wanted to sponsor it, complex numbers could be
supported through an addin library similar to FTS3 and FTS4 for full text
searches.
http://sqlite.org/fts3.html

Here for example, is a discussion on IBM DeveloperWorks concerning the
Informix database.
https://www.ibm.com/developerworks/community/blogs/gbowerman/entry/sql_and_the_complex_plane?lang=en

Some databases have Abstract Defined Type (Oracle) or User Defined Types
(Microsoft SQL Server) that could be used for complex numbers.

Ironically, the scientific data format NetCDF did not have provision for
complex numbers (it was designed for weather data).
https://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2011/msg00027.html

There are some discusssions of scientific versions of SQL (such as SciQL):

"A Query Language for Multidimensional Arrays:
Design, Implementation, and Optimization Techniques"
http://homepages.inf.ed.ac.uk/libkin/papers/sigmod96a.pdf

"Requirements for Science Data Bases and SciDB"
http://www-db.cs.wisc.edu/cidr/cidr2009/Paper_26.pdf

"SciQL, A Query Language for Science Applications"
http://homepages.cwi.nl/~zhang/papers/arraydb11.pdf

Jim