Whoops, my mistake: deactivate() returns the total number of inactive
rows (2 after the first call, 4 after the second, hence 6), just like it
says in the doc :)
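
To make the arithmetic concrete, here is a minimal, self-contained sketch
of that behaviour. MockPart is a hypothetical stand-in written purely for
illustration, not the FastBit ibis::part class; it only assumes, as noted
above, that deactivate() returns the running total of inactive rows. Row
positions 0-3 stand in for primary keys 1-4.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a data partition; NOT the FastBit ibis::part
// class. It models only one assumption taken from this thread: deactivate()
// returns the total number of inactive rows, not the number newly marked.
class MockPart {
public:
    explicit MockPart(std::size_t nrows) : inactive_(nrows, false), total_(0) {}

    // Mark the given 0-based row positions inactive and return the running
    // total of inactive rows in the partition.
    long deactivate(const std::vector<std::size_t>& rows) {
        for (std::size_t r : rows) {
            if (r < inactive_.size() && !inactive_[r]) {
                inactive_[r] = true;
                ++total_;
            }
        }
        return total_;
    }

private:
    std::vector<bool> inactive_;
    long total_;
};

// Adding up the return values double counts the rows from earlier calls.
long sumOfReturnValues() {
    MockPart part(5);
    long count = 0;
    count += part.deactivate({0, 1});  // returns 2
    count += part.deactivate({2, 3});  // returns 4 (running total)
    return count;
}

// Keeping only the most recent return value gives the real total.
long lastReturnValue() {
    MockPart part(5);
    long count = part.deactivate({0, 1});
    count = part.deactivate({2, 3});
    return count;
}
```

With five rows, sumOfReturnValues() gives 2 + 4 = 6, while
lastReturnValue() gives 4, which matches the 4 rows that purgeInactive()
reported removing in the output quoted below.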

Greg

On Mon, Aug 13, 2012 at 6:11 PM, Greg Barker <[email protected]> wrote:

> Hello John,
>
> Thank you for the updated code, it appears to be working quite well now
> for that case. I really appreciate it.
>
> Another thing I noticed while I was testing is that if you call
> deactivate() multiple times before purgeInactive(), the return value was
> not what I expected. Do I need to call purgeInactive() after each
> deactivate()?
>
> For example:
>
> int deactivatedCount = 0;
> deactivatedCount += existing_part.deactivate("my_primary_key in (1, 2)");
> deactivatedCount += existing_part.deactivate("my_primary_key in (3, 4)");
> existing_part.purgeInactive();
> std::cout << "deactivatedCount = " << deactivatedCount << "\n";
>
> Which yields:
>
> part[existing_dir]::deactivate marked 2 rows as inactive, leaving 3 active
> rows out of 5
> part[existing_dir]::deactivate marked 2 rows as inactive, leaving 1 active
> row out of 5
> part[existing_dir]::purgeInactive to remove 4 out of 5 rows
> deactivatedCount = 6
>
> Thanks again for your work,
>
> Greg
>
>
> On Mon, Aug 13, 2012 at 4:10 PM, K. John Wu <[email protected]> wrote:
>
>> Hi, Greg,
>>
>> Thanks for the test case and test code.  The problem should be fixed
>> in SVN Revision 538.  Please give it a try when you get the chance.
>>
>> One minor change to your test program is needed for it to do what you
>> want.  The following line,
>>
>>      ibis::part existing_part(existing_dir);
>>
>> needs to be changed to
>>
>>      ibis::part existing_part(existing_dir, static_cast<const char*>(0));
>>
>> The version you used will create two directories hidden in .ibis,
>> which are probably not what you want.
>>
>> John
>>
>>
>>
>> On 8/13/12 1:57 AM, Greg Barker wrote:
>> > Hello,
>> >
>> > The type of my_primary_key is a long. I was able to reproduce the
>> > error without the join, I also noticed that it does not hit the seg
>> > fault if the category column is omitted. The following program will
>> > hit the error.
>> >
>> > $ cat first_data_file.csv
>> > 1,93.19,AAA
>> > 2,49.14,BBB
>> > 3,50.41,CCC
>> > 4,58.59,AAA
>> > 5,19.53,CCC
>> >
>> > $ cat second_data_file.csv
>> > 3,49.19,DDD
>> > 4,59.10,EEE
>> > 5,34.48,FFF
>> > 6,91.49,AAA
>> > 7,19.50,BBB
>> >
>> > $ cat loading_error.cc
>> > #include <memory>
>> >
>> > #include <ibis.h>
>> >
>> > int main(int argc, char **argv)
>> > {
>> >     char existing_dir[] = "existing_dir";
>> >     char first_incoming_dir[] = "first_incoming_dir";
>> >     char second_incoming_dir[] = "second_incoming_dir";
>> >
>> >     std::auto_ptr<ibis::tablex> firstTable(ibis::tablex::create());
>> >     firstTable->addColumn("my_primary_key", ibis::LONG);
>> >     firstTable->addColumn("my_double_value", ibis::DOUBLE);
>> >     firstTable->addColumn("my_category_value", ibis::CATEGORY);
>> >     firstTable->readCSV("first_data_file.csv", 0, first_incoming_dir, ",");
>> >     firstTable->write(first_incoming_dir, "working", NULL, NULL, NULL);
>> >     firstTable->clearData();
>> >
>> >     ibis::part existing_part(existing_dir);
>> >     existing_part.append(first_incoming_dir);
>> >     existing_part.commit(first_incoming_dir);
>> >     existing_part.purgeIndexFiles();
>> >     existing_part.buildIndexes();
>> >     existing_part.emptyCache();
>> >
>> >     std::auto_ptr<ibis::tablex> secondTable(ibis::tablex::create());
>> >     secondTable->addColumn("my_primary_key", ibis::LONG);
>> >     secondTable->addColumn("my_double_value", ibis::DOUBLE);
>> >     secondTable->addColumn("my_category_value", ibis::CATEGORY);
>> >     secondTable->readCSV("second_data_file.csv", 0, second_incoming_dir, ",");
>> >     secondTable->write(second_incoming_dir, "working", NULL, NULL, NULL);
>> >     secondTable->clearData();
>> >
>> >     ibis::part second_part(second_incoming_dir);
>> >
>> >     existing_part.deactivate("my_primary_key = 1");
>> >     existing_part.purgeInactive();
>> >
>> >     existing_part.append(second_incoming_dir);
>> > }
>> >
>> > Thank you John,
>> >
>> > Greg
>> >
>> > On Sun, Aug 12, 2012 at 3:27 PM, K. John Wu <[email protected]> wrote:
>> >
>> >     Hi, Greg,
>> >
>> >     Thanks for the information.  Looks like we might have neglected to
>> >     close some index files or somehow mishandled some index files.
>> >     There is only one easy thing for us to check, which is related to
>> >     the handling of categorical values (the columns of type
>> >     ibis::CATEGORY).  Would you mind telling us if my_primary_key is an
>> >     integer column or a CATEGORY column?
>> >
>> >     If it is not a CATEGORY, then we might have something a little bit
>> >     more complex.  We would appreciate a small test case to replicate
>> >     the problem.
>> >
>> >     John
>> >
>> >
>> >     On 8/10/12 5:32 PM, Greg Barker wrote:
>> >     > Hello -
>> >     >
>> >     > I am attempting to append some new data to some existing data, and
>> >     > ran into some trouble. When loading, I join the new data to the
>> >     > existing data on a particular column, and then deactivate &
>> >     > purgeInactive on the matching records. Then when I try to append
>> >     > the new data to the existing data, I hit a seg fault using rev 536.
>> >     > If I call purgeIndexFiles before the append, it seems to avoid the
>> >     > crash, but I wasn't sure if that was recommended?
>> >     >
>> >     > My code is essentially:
>> >     >
>> >     >     ibis::part existing_part("my_data");
>> >     >     ibis::part incoming_part("new_data");
>> >     >     std::auto_ptr<ibis::quaere> join(
>> >     >         ibis::quaere::create(&existing_part, &incoming_part,
>> >     >                              "my_primary_key"));
>> >     >     std::auto_ptr<ibis::table> rs(join->select("my_primary_key"));
>> >     >     // then build the where clause from the matching keys in rs
>> >     >     existing_part.deactivate("my_primary_key in (3, 4, 5)");
>> >     >     existing_part.purgeInactive();
>> >     >     existing_part.append("new_data");
>> >     >
>> >     >
>> >     > Which yields the following:
>> >     >
>> >     >     part[my_data]::deactivate marked 9 rows as inactive, leaving 10
>> >     >     active rows out of 19
>> >     >     part[my_data]::purgeInactive to remove 9 out of 19 rows
>> >     >     Warning -- fileManager::flushDir can not remove in-memory file
>> >     >     (my_data/my_primary_key.idx).  It is in use
>> >     >     Warning -- fileManager::flushDir(my_data) finished with 1 file
>> >     >     still in memory
>> >     >     Constructed a part named my_data
>> >     >     filter::sift1S -- processing data partition my_data
>> >     >     Segmentation fault (core dumped)
>> >     >
>> >     > Many Thanks,
>> >     > Greg
>> >
>> >
>>
>
>
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
