Hi John,

I have been storing data in multiple partitions, using metaTags to identify the partitioning. For example, this query fails because multiple partitions have matches:

$ thula -d . -s "FBchr,count(*)" -w "1=1"
doQuery(1=1) evaluated on T-1 produced 1 hit out of 331230 records
-- begin printing the result table --
Table (in memory) _8PVC (GROUP BY FBchr,count(*) on table SF5D42 (GROUP BY FBchr, COUNT(*) on table OAiQa1)) contsists of 2 columns and 1 row
FBchr UINT (dictionary size: 0)
_1 UINT
1, 331230
-- end printing --

And this one works because the matches are all in one partition:

$ thula -d . -s "FBchr,count(*)" -w "FBchr='1'"
doQuery(FBchr='1') evaluated on T-1 produced 1 hit out of 331230 records
-- begin printing the result table --
Table (in memory) UIpJq2 (GROUP BY FBchr,count(*) on table o0Lu8 (GROUP BY FBchr, COUNT(*) on table _qULt2)) contsists of 2 columns and 1 row
FBchr UINT (dictionary size: 0)
_1 UINT
1, 51976
-- end printing --

I'm not sure whether this also happens with normal CATEGORY columns when the dictionaries differ between partitions. It looks like a bug in some output functions that use one dictionary for a whole column rather than partition-specific dictionaries.

Would it be useful to have a command-line tool that merges dictionaries and updates the .int and .idx files across a set of partitions? It could also drop entries from each merged dictionary that are not used in any of the partitions.

Andrew

_______________________________________________
FastBit-users mailing list
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
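
To make the merge idea above a bit more concrete, here is a rough sketch of the remapping step only. It is not FastBit code and does not read or write the real dictionary, .int or .idx files: the function name `merge_dictionaries`, the in-memory layout (one list of strings plus one list of integer codes per partition) and the 0-based codes are all assumptions made for illustration. Indexes would still have to be rebuilt after the codes change.

```python
def merge_dictionaries(partition_dicts, partition_codes):
    """Merge per-partition dictionaries and remap each partition's codes.

    partition_dicts: one list of strings per partition, where the list
        position is the integer code (0-based here, an assumption).
    partition_codes: one list of integer codes per partition (the column data).
    Returns (merged_dictionary, remapped_codes).
    """
    # Keep only strings that some row actually references, so unused
    # dictionary entries disappear from the merged dictionary.
    used = set()
    for words, codes in zip(partition_dicts, partition_codes):
        used.update(words[c] for c in codes)

    # One shared, sorted dictionary for every partition.
    merged = sorted(used)
    new_code = {word: i for i, word in enumerate(merged)}

    # Rewrite each partition's codes in terms of the shared dictionary.
    remapped = [
        [new_code[words[c]] for c in codes]
        for words, codes in zip(partition_dicts, partition_codes)
    ]
    return merged, remapped


# Hypothetical data: two partitions whose dictionaries disagree on codes.
dicts = [["1", "2", "X"], ["X", "1", "unused"]]
codes = [[0, 0, 1], [0, 1, 0]]
merged, remapped = merge_dictionaries(dicts, codes)
```

After this step the same code means the same string in every partition, so an output function that uses a single dictionary for the whole column would print consistent values.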
