Whoops, my mistake: deactivate() returns the number of inactive rows, just like it says in the doc :)
Greg

On Mon, Aug 13, 2012 at 6:11 PM, Greg Barker <[email protected]> wrote:

> Hello John,
>
> Thank you for the updated code, it appears to be working quite well now
> for that case. I really appreciate it.
>
> Another thing I noticed while I was testing is that if you call
> deactivate() multiple times before purgeInactive(), the return value was
> not what I expected. Do I need to call purgeInactive() after each
> deactivate()?
>
> For example:
>
> int deactivatedCount = 0;
> deactivatedCount += existing_part.deactivate("my_primary_key in (1, 2)");
> deactivatedCount += existing_part.deactivate("my_primary_key in (3, 4)");
> existing_part.purgeInactive();
> std::cout << "deactivatedCount = " << deactivatedCount << "\n";
>
> Which yields:
>
> part[existing_dir]::deactivate marked 2 rows as inactive, leaving 3 active rows out of 5
> part[existing_dir]::deactivate marked 2 rows as inactive, leaving 1 active row out of 5
> part[existing_dir]::purgeInactive to remove 4 out of 5 rows
> deactivatedCount = 6
>
> Thanks again for your work,
>
> Greg
>
>
> On Mon, Aug 13, 2012 at 4:10 PM, K. John Wu <[email protected]> wrote:
>
>> Hi, Greg,
>>
>> Thanks for the test case and test code. The problem should be fixed
>> with SVN Revision 538. Please give it a try when you get the chance.
>>
>> There is one minor change needed in your test program for it to do
>> what you want. The following line,
>>
>> ibis::part existing_part(existing_dir);
>>
>> needs to be changed to
>>
>> ibis::part existing_part(existing_dir, static_cast<const char*>(0));
>>
>> The version you used will create two directories hidden in .ibis,
>> which are probably not what you want.
>>
>> John
>>
>>
>> On 8/13/12 1:57 AM, Greg Barker wrote:
>> > Hello,
>> >
>> > The type of my_primary_key is a long. I was able to reproduce the
>> > error without the join. I also noticed that it does not hit the seg
>> > fault if the category column is omitted. The following program will
>> > hit the error.
>> >
>> > $ cat first_data_file.csv
>> > 1,93.19,AAA
>> > 2,49.14,BBB
>> > 3,50.41,CCC
>> > 4,58.59,AAA
>> > 5,19.53,CCC
>> >
>> > $ cat second_data_file.csv
>> > 3,49.19,DDD
>> > 4,59.10,EEE
>> > 5,34.48,FFF
>> > 6,91.49,AAA
>> > 7,19.50,BBB
>> >
>> > $ cat loading_error.cc
>> > #include <memory>
>> >
>> > #include <ibis.h>
>> >
>> > int main(int argc, char **argv)
>> > {
>> >     char existing_dir[] = "existing_dir";
>> >     char first_incoming_dir[] = "first_incoming_dir";
>> >     char second_incoming_dir[] = "second_incoming_dir";
>> >
>> >     std::auto_ptr<ibis::tablex> firstTable(ibis::tablex::create());
>> >     firstTable->addColumn("my_primary_key", ibis::LONG);
>> >     firstTable->addColumn("my_double_value", ibis::DOUBLE);
>> >     firstTable->addColumn("my_category_value", ibis::CATEGORY);
>> >     firstTable->readCSV("first_data_file.csv", 0, first_incoming_dir, ",");
>> >     firstTable->write(first_incoming_dir, "working", NULL, NULL, NULL);
>> >     firstTable->clearData();
>> >
>> >     ibis::part existing_part(existing_dir);
>> >     existing_part.append(first_incoming_dir);
>> >     existing_part.commit(first_incoming_dir);
>> >     existing_part.purgeIndexFiles();
>> >     existing_part.buildIndexes();
>> >     existing_part.emptyCache();
>> >
>> >     std::auto_ptr<ibis::tablex> secondTable(ibis::tablex::create());
>> >     secondTable->addColumn("my_primary_key", ibis::LONG);
>> >     secondTable->addColumn("my_double_value", ibis::DOUBLE);
>> >     secondTable->addColumn("my_category_value", ibis::CATEGORY);
>> >     secondTable->readCSV("second_data_file.csv", 0, second_incoming_dir, ",");
>> >     secondTable->write(second_incoming_dir, "working", NULL, NULL, NULL);
>> >     secondTable->clearData();
>> >
>> >     ibis::part second_part(second_incoming_dir);
>> >
>> >     existing_part.deactivate("my_primary_key = 1");
>> >     existing_part.purgeInactive();
>> >
>> >     existing_part.append(second_incoming_dir);
>> > }
>> >
>> > Thank you John,
>> >
>> > Greg
>> >
>> > On Sun, Aug 12, 2012 at 3:27 PM, K. John Wu <[email protected]> wrote:
>> >
>> > Hi, Greg,
>> >
>> > Thanks for the information. Looks like we might have neglected to
>> > close some index files or somehow mishandled some index files. There
>> > is only one easy thing for us to check, which is related to the
>> > handling of categorical values (the columns of type ibis::CATEGORY).
>> > Would you mind telling us if my_primary_key is an integer column or a
>> > CATEGORY column?
>> >
>> > If it is not a CATEGORY, then we might have something a little bit
>> > more complex. We would appreciate a small test case to replicate the
>> > problem.
>> >
>> > John
>> >
>> >
>> > On 8/10/12 5:32 PM, Greg Barker wrote:
>> > > Hello -
>> > >
>> > > I am attempting to append some new data to some existing data, and ran
>> > > into some trouble. When loading, I join the new data to the existing
>> > > data on a particular column, and then deactivate & purgeInactive on
>> > > the matching records. Then when I try to append the new data to the
>> > > existing data, I hit a seg fault using rev 536. If I
>> > > call purgeIndexFiles before the append, it seems to avoid the crash,
>> > > but I wasn't sure if that was recommended?
>> > >
>> > > My code is essentially:
>> > >
>> > > ibis::part existing_part("my_data");
>> > > ibis::part incoming_part("new_data");
>> > > std::auto_ptr<ibis::quaere>
>> > >     join(ibis::quaere::create(&existing_part, &incoming_part,
>> > >                               "my_primary_key"));
>> > > std::auto_ptr<ibis::table> rs(join->select("my_primary_key"));
>> > > //then build the where clause
>> > > working_part.deactivate("my_primary_key in (3, 4, 5)");
>> > > working_part.purgeInactive();
>> > > working_part.append(incoming_data);
>> > >
>> > >
>> > > Which yields the following:
>> > >
>> > > part[my_data]::deactivate marked 9 rows as inactive, leaving 10 active rows out of 19
>> > > part[my_data]::purgeInactive to remove 9 out of 19 rows
>> > > Warning -- fileManager::flushDir can not remove in-memory file (my_data/my_primary_key.idx). It is in use
>> > > Warning -- fileManager::flushDir(my_data) finished with 1 file still in memory
>> > > Constructed a part named my_data
>> > > filter::sift1S -- processing data partition my_data
>> > > Segmentation fault (core dumped)
>> > >
>> > > Many Thanks,
>> > > Greg
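
The return-value question raised in the newest message of this thread (deactivatedCount = 6 for 4 deactivated rows) can be illustrated without FastBit. The following is only a sketch, assuming deactivate() reports the number of inactive rows after each call, as that message says the doc states. MockPart, summedCount and latestCount are hypothetical names made up for this illustration; MockPart is not ibis::part and takes row numbers instead of a where clause.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a data partition. It is NOT ibis::part; it only
// mimics the behavior reported in the thread: deactivate() returns the number
// of inactive rows after the call, not the number newly marked by that call.
class MockPart {
public:
    explicit MockPart(std::size_t nrows) : nrows_(nrows) {}

    // Mark the given (zero-based) rows inactive and return the running total
    // of inactive rows.
    long deactivate(const std::vector<std::size_t>& rows) {
        for (std::size_t r : rows) {
            if (r < nrows_) inactive_.insert(r);
        }
        return static_cast<long>(inactive_.size());
    }

    // Drop the inactive rows. The return type is a choice made for this mock.
    void purgeInactive() {
        assert(inactive_.size() <= nrows_);
        nrows_ -= inactive_.size();
        inactive_.clear();
    }

private:
    std::size_t nrows_;
    std::set<std::size_t> inactive_;
};

// Adding up the return values counts the first two rows twice: 2 + 4.
long summedCount() {
    MockPart part(5);
    long count = 0;
    count += part.deactivate({0, 1});
    count += part.deactivate({2, 3});
    part.purgeInactive();
    return count;
}

// Keeping only the most recent return value gives the running total.
long latestCount() {
    MockPart part(5);
    long count = 0;
    count = part.deactivate({0, 1});
    count = part.deactivate({2, 3});
    part.purgeInactive();
    return count;
}
```

With this mock, summedCount() reproduces the 6 from the log above, while latestCount() gives 4, the number of rows actually purged. Under the same assumption, there would be no need to call purgeInactive() after each deactivate() just to get a correct count; assigning the return value instead of accumulating it is enough.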
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
