We currently use FastBit in read-only mode from our server interface (currently in Java), so "requires data to be on disk already" is our norm: we ETL our raw data into FastBit indexes before querying (if I am correct in assuming that is all that you meant). For now the C API over JNI provides most of what we need, except for an efficient way to run aggregate queries and some occasional concurrency issues that are not yet fully debugged.

However, I recognize that the current JNI interface and its C API underpinning are imperfect. While we have examined other options for wrapping the underlying C++ library in Java or Go, we may ultimately just write a simple C++ server to handle concurrent queries. Language selection mostly comes down to maintainability on our end, and C++ is not in our usual mix.

Regardless of how we proceed, is it correct that the thula example's use of ibis::table::select is the best starting point for implementing a read-only server query interface? The goals are sub-second query response on 20M-record table partitions at a concurrency target of at least 100 req/s, where the aggregate result set will commonly be 1 row and not more than 1000 rows with a group by. ;) Too specific a question?

In part I am surprised that there are no server implementations already available that expose FastBit functionality. It offers a much more focused and accessible solution than a columnar MPP DB, especially for already aggregated data.
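To make the result shapes in that question concrete, here is a minimal, self-contained sketch of the two cases: a plain aggregate that always yields one row, and a group-by aggregate that yields one row per distinct key among the hits. It deliberately does not call FastBit. `Partition`, `Aggregate`, `selectTotals` and `selectByKey` are hypothetical names invented for illustration, and the in-memory columns only stand in for an on-disk partition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for one read-only table partition; NOT a FastBit type.
struct Partition {
    std::vector<int32_t> colA;  // column filtered by the where clause
    std::vector<int32_t> key;   // column used for the optional group by
    std::vector<int64_t> colB;  // column being aggregated
};

// One result row: the values of sum(colB) and count(*).
struct Aggregate {
    int64_t sum = 0;
    uint64_t count = 0;
};

// Equivalent of "select sum(colB), count(*) where lo <= colA < hi".
// Always produces exactly one row, never one row per hit.
Aggregate selectTotals(const Partition& p, int32_t lo, int32_t hi) {
    assert(p.colA.size() == p.colB.size());
    Aggregate out;
    for (std::size_t i = 0; i < p.colA.size(); ++i) {
        if (p.colA[i] >= lo && p.colA[i] < hi) {
            out.sum += p.colB[i];
            ++out.count;
        }
    }
    return out;
}

// Equivalent of "select key, sum(colB), count(*) where lo <= colA < hi
// group by key". Produces one row per distinct key among the hits.
std::map<int32_t, Aggregate> selectByKey(const Partition& p,
                                         int32_t lo, int32_t hi) {
    assert(p.colA.size() == p.colB.size());
    assert(p.colA.size() == p.key.size());
    std::map<int32_t, Aggregate> out;
    for (std::size_t i = 0; i < p.colA.size(); ++i) {
        if (p.colA[i] >= lo && p.colA[i] < hi) {
            Aggregate& row = out[p.key[i]];
            row.sum += p.colB[i];
            ++row.count;
        }
    }
    return out;
}
```

Both functions only read the partition, so in this sketch they can be called from several threads at once without locking. Whether the same holds for the library calls behind the JNI layer is exactly the open concurrency question above. Putting count(*) in the same select also means the hit count arrives in the same row as the sums.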
On Wed, Sep 17, 2014 at 11:55 AM, K. John Wu <[email protected]> wrote:
> Hi, Fred,
>
> Thanks for your interest in FastBit software. If you are planning to
> extend FastBit in some way, it would be much better to do it in C++.
> The Java API is based on a very old C API that requires data to be on
> disk already.
>
> John
>
>
> On 9/10/14 7:33 AM, Fred Oko wrote:
> > Aim is to be able to access aggregates via JNI w/ greater efficiency
> > of not having to pull back all the hit values via get_qualified_ints etc.
> >
> > I started by just attempting to add support for
> > fastbit_build_result_set and passing a select clause with aggregates.
> > But:
> > 1) this returns the aggregate for the wrong column (e.g. if asking for
> > sum(colB) it would return a sum for a different column, as was visible
> > from the returned aggregate and the debug logging showing access to said
> > other column (apparently lexicographically selected))
> > 2) as one starts tracing where that went wrong, one realizes this
> > method will return a record with those aggregates for each hit, as that
> > is what the result set would contain -- given that it seems inefficient,
> > and based on the query.h comment "If any additional functions are needed
> > in the select clause, use the function ibis::table::select instead of
> > using this class", I turned to that
> >
> > From there I took the thula example as a better starting point over
> > tcapi and did manage to get the functionality of thula doQuery into
> > the capi and access it via JNI. However I want to make certain what
> > would be the best way to proceed now that I have validated this was
> > feasible.
> > 1) do you agree ibis::table.select is optimal for the case of wanting
> > a couple of aggregates over a couple of columns (not necessarily the
> > same as in the where clause) for a set of where clauses against a table?
> > 2) do you have a recommendation on how to best expose this -- adding the
> > table facade to FastBitQuery seems cleanest, but for now I'm just
> > exposing a specialized function
> > 3) do you have any arguments against adding a count(*) to the select
> > clause to have one complete select response instead of having to mix in a
> > separate request for num hits? -- it appears if using table.select I
> > won't need to use the query interface, and the table mechanisms are
> > separate for computing hits
> > 4) any concerns with this approach on memory cleanup or optimizations
> > given that these queries will be run within a long lived container?
> >
> > Thx in advance
_______________________________________________ FastBit-users mailing list [email protected] https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
