Hi, Greg,

Thanks for the test case and test code.  The problem should be fixed
as of SVN Revision 538.  Please give it a try when you get the chance.

Your test program needs one minor change in order for it to do
what you want.  The following line,

     ibis::part existing_part(existing_dir);

needs to be changed to

     ibis::part existing_part(existing_dir, static_cast<const char*>(0));

The version you used will create two directories hidden in .ibis,
which are probably not what you want.
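
A note on why the cast is written out.  The sketch below is only an
illustration and does not use FastBit itself: Part is a hypothetical
stand-in, and its two constructors are an assumption about the general
shape of the overloads (one taking a name plus a read-only flag, one
taking an active and a backup directory), not a copy of the real
ibis::part declarations.  With overloads of that shape, a bare 0 as the
second argument converts equally well to bool and to const char*, so
the compiler rejects the call as ambiguous; the static_cast picks the
two-directory form.

```cpp
#include <string>

// Hypothetical stand-in class; NOT the real ibis::part.
struct Part {
    std::string chosen;  // records which constructor was selected

    // Assumed shape 1: (name, read-only flag).
    // A one-argument call such as Part("existing_dir") lands here.
    explicit Part(const char* name, bool readOnly = false)
        : chosen("name") { (void)name; (void)readOnly; }

    // Assumed shape 2: (active directory, backup directory).
    Part(const char* activeDir, const char* backupDir)
        : chosen("directories") { (void)activeDir; (void)backupDir; }
};

// Part p("existing_dir", 0);  // does not compile: the literal 0 converts
//                             // equally well to bool and to const char*

// One argument: treated as a name.
inline std::string one_arg() {
    return Part("existing_dir").chosen;
}

// Explicit null pointer: selects the two-directory constructor.
inline std::string two_arg() {
    return Part("existing_dir", static_cast<const char*>(0)).chosen;
}
```

Here one_arg() selects the name-based constructor and two_arg() selects
the directory-based one, which mirrors the difference between the two
lines quoted above.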

John



On 8/13/12 1:57 AM, Greg Barker wrote:
> Hello,
> 
> The type of my_primary_key is a long. I was able to reproduce the
> error without the join. I also noticed that it does not hit the seg
> fault if the category column is omitted. The following program will
> hit the error.
> 
> $ cat first_data_file.csv
> 1,93.19,AAA
> 2,49.14,BBB
> 3,50.41,CCC
> 4,58.59,AAA
> 5,19.53,CCC
> 
> $ cat second_data_file.csv
> 3,49.19,DDD
> 4,59.10,EEE
> 5,34.48,FFF
> 6,91.49,AAA
> 7,19.50,BBB
> 
> $ cat loading_error.cc
> #include <memory>
> 
> #include <ibis.h>
> 
> int main(int argc, char **argv)
> {
>     char existing_dir[] = "existing_dir";
>     char first_incoming_dir[] = "first_incoming_dir";
>     char second_incoming_dir[] = "second_incoming_dir";
> 
>     std::auto_ptr<ibis::tablex> firstTable(ibis::tablex::create());
>     firstTable->addColumn("my_primary_key", ibis::LONG);
>     firstTable->addColumn("my_double_value", ibis::DOUBLE);
>     firstTable->addColumn("my_category_value", ibis::CATEGORY);
>     firstTable->readCSV("first_data_file.csv", 0, first_incoming_dir, ",");
>     firstTable->write(first_incoming_dir, "working", NULL, NULL, NULL);
>     firstTable->clearData();
> 
>     ibis::part existing_part(existing_dir);
>     existing_part.append(first_incoming_dir);
>     existing_part.commit(first_incoming_dir);
>     existing_part.purgeIndexFiles();
>     existing_part.buildIndexes();
>     existing_part.emptyCache();
> 
>     std::auto_ptr<ibis::tablex> secondTable(ibis::tablex::create());
>     secondTable->addColumn("my_primary_key", ibis::LONG);
>     secondTable->addColumn("my_double_value", ibis::DOUBLE);
>     secondTable->addColumn("my_category_value", ibis::CATEGORY);
>     secondTable->readCSV("second_data_file.csv", 0, second_incoming_dir, ",");
>     secondTable->write(second_incoming_dir, "working", NULL, NULL, NULL);
>     secondTable->clearData();
> 
>     ibis::part second_part(second_incoming_dir);
> 
>     existing_part.deactivate("my_primary_key = 1");
>     existing_part.purgeInactive();
> 
>     existing_part.append(second_incoming_dir);
> }
> 
> Thank you John,
> 
> Greg
> 
> On Sun, Aug 12, 2012 at 3:27 PM, K. John Wu <[email protected]
> <mailto:[email protected]>> wrote:
> 
>     Hi, Greg,
> 
>     Thanks for the information.  Looks like we might have neglected to
>     close some index files or somehow mishandled some index files.  There
>     is only one easy thing for us to check: whether this is related to the
>     handling of categorical values (the columns of type ibis::CATEGORY).
>     Would you mind telling us if my_primary_key is an integer column or a
>     CATEGORY column?
> 
>     If it is not a CATEGORY, then we might have something a little bit
>     more complex.  We would appreciate a small test case to replicate the
>     problem.
> 
>     John
> 
> 
>     On 8/10/12 5:32 PM, Greg Barker wrote:
>     > Hello -
>     >
>     > I am attempting to append some new data to some existing data, and ran
>     > into some trouble. When loading, I join the new data to the existing
>     > data on a particular column, and then deactivate & purgeInactive on
>     > the matching records. Then when I try to append the new data to the
>     > existing data, I hit a seg fault using rev 536. If I
>     > call purgeIndexFiles before the append, it seems to avoid the crash,
>     > but I wasn't sure if that was recommended?
>     >
>     > My code is essentially:
>     >
>     >     ibis::part existing_part("my_data");
>     >     ibis::part incoming_part("new_data");
>     >     std::auto_ptr<ibis::quaere>
>     >     join(ibis::quaere::create(&existing_part, &incoming_part,
>     >     "my_primary_key"));
>     >     std::auto_ptr<ibis::table> rs(join->select("my_primary_key"));
>     >     //then build the where clause
>     >     working_part.deactivate("my_primary_key in (3, 4, 5)");
>     >     working_part.purgeInactive();
>     >     working_part.append(incoming_data);
>     >
>     >
>     > Which yields the following:
>     >
>     >     part[my_data]::deactivate marked 9 rows as inactive, leaving 10
>     >     active rows out of 19
>     >     part[my_data]::purgeInactive to remove 9 out of 19 rows
>     >     Warning -- fileManager::flushDir can not remove in-memory file
>     >     (my_data/my_primary_key.idx).  It is in use
>     >     Warning -- fileManager::flushDir(my_data) finished with 1 file
>     >     still in memory
>     >     Constructed a part named my_data
>     >     filter::sift1S -- processing data partition my_data
>     >     Segmentation fault (core dumped)
>     >
>     > Many Thanks,
>     > Greg
> 
> 
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
