Hi, Patrik,

Sounds great.  Thanks for the update.

John


On 2/6/13 4:42 AM, Patrik Nisen wrote:
> Hi John,
> 
> and thanks for your help, although this new ibis::filter interface
> does not directly meet my needs, because it is always returning a
> table-object. I just wanted to let you know that I'm pretty happy with
> my current solution.
> 
> So, I'm doing kind of iterative data reduction, where I am trying to
> avoid bringing any real data into memory until the very last moment.
> For this, I am using the query interface to filter the data using the
> indexes and saving the resulting bitvectors in memory for later use.
> If I need to combine some of the results, I use the bit operations
> provided by the bitvector class. When I finally need the data itself, I
> am using the table interface to select the needed columns with a
> previously calculated bitvector, which is potentially paged to limit
> the results. With the resulting table object I can then use
> aggregation if needed.
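
The workflow described above (filter first, keep only bitmasks, combine
them, and fetch real rows at the last moment) can be sketched without
FastBit at all. The block below is only an illustration under stated
assumptions: std::vector<bool> stands in for ibis::bitvector, an in-memory
vector of rows stands in for the data partition, and every name in it
(Row, Mask, evaluate, intersect, selectPage, sumValues) is hypothetical and
not part of the FastBit API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Stand-in for a single data partition: one entry per record.
struct Row {
    std::string text;
    int value;
};

// Stand-in for a hit vector: one bit per row, true means the row matched.
using Mask = std::vector<bool>;

// Evaluate a predicate over every row and keep only the resulting mask.
Mask evaluate(const std::vector<Row>& rows,
              const std::function<bool(const Row&)>& pred) {
    Mask hits(rows.size());
    for (std::size_t i = 0; i < rows.size(); ++i) hits[i] = pred(rows[i]);
    return hits;
}

// Combine two saved masks with a bitwise AND; both must cover the same rows.
Mask intersect(const Mask& a, const Mask& b) {
    assert(a.size() == b.size());
    Mask out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] && b[i];
    return out;
}

// Materialize one page of matching rows: skip `offset` hits, then return
// at most `limit` rows. This is the only step that touches row data.
std::vector<Row> selectPage(const std::vector<Row>& rows, const Mask& mask,
                            std::size_t offset, std::size_t limit) {
    std::vector<Row> page;
    std::size_t seen = 0;
    for (std::size_t i = 0;
         i < rows.size() && i < mask.size() && page.size() < limit; ++i) {
        if (!mask[i]) continue;
        if (seen++ >= offset) page.push_back(rows[i]);
    }
    return page;
}

// Aggregate at the very end, over the rows that were actually fetched.
long sumValues(const std::vector<Row>& page) {
    long total = 0;
    for (const Row& r : page) total += r.value;
    return total;
}
```

The point of the design is that the expensive text filter runs once, its
mask is cheap to keep and to combine with later masks, and only selectPage
and sumValues ever read row data.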
> 
> I hope this clarifies my implementation a bit.
> 
> Best regards
> 
> Patrik Nisen
> 
> 
> On Wed, Dec 19, 2012 at 6:32 PM, K. John Wu wrote:
> 
>     Hi, Patrik,
> 
>     You mentioned that the ibis::query class does not have what you need.
>     Would the ibis::quaere class, or more specifically the ibis::filter
>     class <http://lbl.gov/~kwu/fastbit/doc/html/classibis_1_1filter.html>,
>     be better suited for your work?  Because the return from a select
>     function call on this object is an ibis::table, you are free to
>     perform group by operations on it.  In addition, you can pass the
>     group by operations as the argument of the function
>     ibis::filter::select, which would save you the trouble of calling
>     ibis::table::groupby.
> 
>     It looks to me that what is needed is an additional constructor that
>     takes a bitvector object in place of a where clause and a single data
>     partition.  This constructor has been added to the ibis::filter class.
>     Please take a look at src/filter.h and src/filter.cpp.
> 
>     By the way, the class ibis::bitvector has read and write functions
>     that allow you to read and write the bitvector returned by
>     ibis::query::getHitVector.
> 
>     Let us know if you have a chance to try it.
> 
>     John
> 
> 
>     On 12/19/12 7:17 AM, Patrik Nisen wrote:
>     > Hi,
>     >
>     > I'm sorry I was not able to explain my problem clearly. However, I
>     > found a close enough solution, so let me explain that and perhaps
>     > clarify it a bit.
>     >
>     > So the idea was to save an index to the data based on a query, and
>     > reuse that index later. I want to save the result, because the
>     > filtering is done with pretty expensive "LIKE" queries on text
>     > columns (and previously outside FastBit). So I'm saving the
>     > bitvectors retrieved from an evaluated query object. Then, I want to
>     > run aggregate queries over these results, which can be done mainly
>     > using the table interface. However, the normal interface does not
>     > allow use of bitvectors to limit the query to only those rows defined
>     > in the bitvector, because the "generic" table could be, for instance,
>     > using many partitions (I suppose). I found out that I can populate a
>     > table myself by directly using the bord class and its append method,
>     > and bring in just the records needed. Then I can call groupby for
>     > those results.
>     >
>     > I did not find a straightforward way to reuse these bitvectors with
>     > the query interface, so at the moment I'm simply doing an AND between
>     > the two bitvector results, which is fine when the following queries
>     > are not using text columns. One option could be to use the
>     > getRIDs(bitvector) method to convert the bitvector to a RIDSet and
>     > then use that with the setRIDs() method, but I have not looked into
>     > that yet.
>     >
>     > Thanks!
>     >
>     > Patrik
>     >
>     > On Mon, Dec 10, 2012 at 5:26 PM, K. John Wu wrote:
>     >> Hi, Patrik,
>     >>
>     >> If you would like to write a data table to screen or a file in CSV
>     >> format, use the function ibis::table::dump.  If you plan to write a
>     >> data table out in binary format that can be used for further
>     >> queries, then use the function ibis::table::backup.
>     >>
>     >> John
>     >>
>     >>
>     >> On 12/10/12 4:29 AM, Patrik Nisen wrote:
>     >>> Hi,
>     >>>
>     >>> yes, it would. Would it then be possible to do that and save the
>     >>> intermediary results into a file?
>     >>>
>     >>> Thank you.
>     >>>
>     >>> Patrik
>     >>>
>     >>>
>     >>> On 12/08/12 at 12:56pm, K. John Wu wrote:
>     >>>> Hi, Patrik,
>     >>>>
>     >>>> Would it be possible for you to issue two queries with the same
>     >>>> where clause, but different select clauses?
>     >>>>
>     >>>> John
>     >>>>
>     >>>>
>     >>>> On 12/7/12 4:13 AM, Patrik Nisen wrote:
>     >>>>> Hi,
>     >>>>>
>     >>>>> and thank you for your great work!
>     >>>>>
>     >>>>> I am currently looking into performing operations for pre-filtered
>     >>>>> sets of rows, and I would need some help to understand if this is
>     >>>>> possible at the moment to do with fastbit, or advice to implement it.
>     >>>>>
>     >>>>> I have a dataset saved into a single data partition and my goal is to
>     >>>>> perform filtering with varying conditions, save these results and use
>     >>>>> them as starting sets for later queries. If I have understood
>     >>>>> correctly, this is at the moment possible by retrieving the RIDSet from
>     >>>>> an evaluated query, saving that, and setting it (query::setRIDs) to the
>     >>>>> next query.  However, I would like to use aggregate functions with the
>     >>>>> following queries, but I did not find a way to do similar things with
>     >>>>> the table interface.
>     >>>>>
>     >>>>> So my question is: how could I perform the described aggregation for a
>     >>>>> pre-filtered set of rows? In addition, as I'm only having one partition,
>     >>>>> I would prefer to save the filtering results as bitvectors
>     >>>>> (query::getHitVector) and reuse them later as masks due to their smaller
>     >>>>> size. There's a protected function query::doEvaluate having this
>     >>>>> functionality, and perhaps that could be opened. Would this make any
>     >>>>> sense?
>     >>>>>
>     >>>>> Thanks for your help!
>     >>>>>
>     >>>>>
>     >>>>> Patrik Nisen
>     >>>>> _______________________________________________
>     >>>>> FastBit-users mailing list
>     >>>>> [email protected]
>     >>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
>     >>>>>
>     >
> 
> 
