On Wed, Nov 6, 2019 at 7:22 PM Jens Alfke wrote:
>
> > On Nov 6, 2019, at 9:25 AM, Digital Dog wrote:
> >
> > If there are millions or billions of rows in the data set I consider
> > it big data and the only reasonable format for storing it is a
> > column-store format.
>
> There are many types of stores for "big data". My employer, Couchbase, has
Hi!

I'm late to the party, but I want to contribute. I did not read all the
messages in the thread, but those I've read did not contain a question
about the number of ROWS. We know how many columns you desire, but how
many rows are there?

No matter how much I like SQLite, I would not store this kind of data
""Hi!
On Thu, Oct 17, 2019 at 5:38 PM Jens Alfke wrote:
> Why should SQLite make changes, which would introduce performance problems if
> used, just to save your particular application the trouble of concatenating
> some vectors into single columns, when it uses SQLite for an edge use-case
>
—Jens
> On Oct 17, 2019, at 1:56 AM, Mitar wrote:
>
> So why not increase the limit to 2 billion
Why should SQLite make changes, which would introduce performance problems if
used, just to save your particular application the trouble of concatenating
some vectors into single columns, when
Hi!

Oh, or we could just split the CSV into separate lines and then store
one line per SQLite row, in one column. Then we would not have to use
JSON or something similar.

That would work for CSV files. For other types of inputs we might be
able to find a similar approach.

So generally the
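The one-line-per-row idea above could be sketched roughly like this (a hypothetical illustration, not code from the thread; the table name and sample data are invented):

```python
# Store each CSV data line as a single TEXT column, one SQLite row per line.
import csv
import io
import sqlite3

raw = "a,b,c\n1,2,3\n4,5,6\n"

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dataset (line TEXT)")

# Keep the header out of the data rows, so table row N is data row N.
lines = raw.splitlines()
header, rows = lines[0], lines[1:]
con.executemany("INSERT INTO dataset (line) VALUES (?)", [(l,) for l in rows])

# Reading back: parse each stored line with the csv module on the way out.
out = [next(csv.reader(io.StringIO(l)))
       for (l,) in con.execute("SELECT line FROM dataset")]
print(out)  # [['1', '2', '3'], ['4', '5', '6']]
```

The column limit never comes into play, since however wide the logical record is, SQLite only ever sees one column.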
So if character-separated values (CSV-ish) were originally your preferred
import format, would using that format for the blobs work for you?

E.g., suppose you need to index the first two fields only, and so can use
a blob column for the bulk of the record. If the records were supplied as:
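The original example record is cut off in the digest; a hypothetical sketch of the layout being described, with invented names ("samples", "f1", "f2", "rest") and an invented sample record:

```python
# Index the first two fields as real columns; keep the remainder of the
# record as an opaque CSV-ish blob.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE samples (f1 TEXT, f2 TEXT, rest BLOB)")
con.execute("CREATE INDEX samples_f1_f2 ON samples (f1, f2)")

record = "id42,labelA,0.1,0.2,0.3"
f1, f2, rest = record.split(",", 2)  # split off only the first two fields
con.execute("INSERT INTO samples VALUES (?, ?, ?)", (f1, f2, rest.encode()))

row = con.execute("SELECT rest FROM samples WHERE f1 = 'id42'").fetchone()
print(row[0])  # b'0.1,0.2,0.3'
```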
Hi!

On Thu, Oct 17, 2019 at 3:04 PM Eric Grange wrote:
> my suggestion would be to store them as JSON in a blob, and use the JSON
> functions of SQLite to extract the data

JSON has some crazy limitations: by the standard it does not support the
full floating-point spec, so NaN and infinity cannot be represented.
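This limitation is easy to check with Python's json module (a small demonstration, not from the thread):

```python
# Standard JSON has no representation for NaN/Infinity. Python's json
# module emits them only as a non-standard extension, and rejects them
# when asked to conform strictly.
import json
import math

try:
    json.dumps(math.nan, allow_nan=False)  # strict, standard-conforming mode
    ok = True
except ValueError:
    ok = False
print(ok)  # False: conforming JSON cannot encode NaN

# With the default non-standard extension, the output is not valid JSON:
print(json.dumps(math.nan))  # NaN
```

So a dataset containing NaN or infinity values cannot round-trip through standard JSON without some encoding convention on top.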
> I wrote earlier that for us the use case where we are reading whole
> rows is the most common one.
> [...]
> we are looking for ways to store this in a stable format which will be
> supported for the next 50 years, without modifying the original data
> too much.

If you do not need access to individual columns
Hi!
This is getting a bit off topic.
On Thu, Oct 17, 2019 at 12:07 PM Simon Slavin wrote:
> 1) Almost no piece of software can handle a grid 2 billion cells wide. Excel
> maxes out at 16,384 columns. Matlab can store and retrieve a cell of data
> directly from a file, but it has a max array
On 17 Oct 2019, at 9:56am, Mitar wrote:

> I can understand how supporting a large number of columns might be
> inappropriate when you want to run complicated SQL queries on data,
> but to just store data and then extract all rows to do some data
> processing. Or, as the most complicated query, it
Hi!

I can see how this is a reasonable limit when SQLite is used for the
querying power it provides. In our case we are really focusing on it as
a standard long-term storage format. So the "Appropriate Uses For
SQLite" document [1] has a section called "File archive and/or data
container" and
> On Oct 16, 2019, at 6:08 AM, Mitar wrote:
>
> Quite some of the datasets we are dealing with have 100k or so columns.

There was a thread about this a few months ago. You should not store
every number of a huge vector in a separate column. You don't need to
individually query on every
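The alternative Jens describes, keeping a whole vector in one BLOB column instead of one column per number, might look like this (a sketch; the packing format, little-endian float64 via struct, is my choice, not from the thread):

```python
# One row per vector, the entire vector packed into a single BLOB column.
import sqlite3
import struct

vec = [0.5, 1.5, 2.5]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE vectors (id INTEGER PRIMARY KEY, v BLOB)")

blob = struct.pack(f"<{len(vec)}d", *vec)  # little-endian float64s
con.execute("INSERT INTO vectors (v) VALUES (?)", (blob,))

# Reading a whole vector back is a single-column fetch, regardless of its
# length -- no column limit involved.
(raw,) = con.execute("SELECT v FROM vectors").fetchone()
unpacked = list(struct.unpack(f"<{len(raw) // 8}d", raw))
print(unpacked)  # [0.5, 1.5, 2.5]
```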
SQLite could, in theory, be enhanced (with just a few minor tweaks) to
support up to 2 billion columns. But having a relation with a large
number of columns seems like a very bad idea stylistically. That's
not how relational databases are intended to be used. Normally when a
table acquires more
Hi!

On Wed, Oct 16, 2019 at 3:29 PM Richard Hipp wrote:

> Are you trying to store a big matrix with approx 100k columns? A
> better way to do that in a relational database (*any* relational
> database, not just SQLite) is to store one entry per matrix element:

Sure, this is useful for sparse
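The one-entry-per-matrix-element layout Richard describes can be sketched as follows (table and column names are illustrative, not from the thread); it also handles the sparse case naturally, since zero entries can simply be omitted:

```python
# (row, column, value) triples: one table row per matrix element.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE matrix (i INTEGER, j INTEGER, value REAL, PRIMARY KEY (i, j))"
)

# A 2x3 matrix, stored sparsely by skipping zeros.
m = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 0.0]]
con.executemany(
    "INSERT INTO matrix VALUES (?, ?, ?)",
    [(i, j, v) for i, row in enumerate(m) for j, v in enumerate(row) if v != 0.0],
)

# Any cell comes back with an ordinary query, however many logical
# "columns" the matrix has -- the schema itself stays three columns wide.
(v,) = con.execute("SELECT value FROM matrix WHERE i = 1 AND j = 1").fetchone()
print(v)  # 3.0
```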
On 10/16/19, Mitar wrote:
> Hi!
>
> We are considering using SQLite as a ML dataset archival format for
> datasets in OpenML (https://www.openml.org/). When investigating it,
> we noticed that it has a very low limit on the number of columns. Quite
> some of the datasets we are dealing with have 100k or