Hi, Andrew,

Depending on whether your Perl script produces CSV files or FastBit binary files, there are different options.

If you produce CSV files, an empty field (just the comma delimiter, with no content for that column) is taken to indicate a NULL value. The program ardea.cpp is designed to recognize this and generate the appropriate NULL masks. In a CSV file, either "" or '' indicates an empty string. In most cases, an empty string is effectively a NULL value.

If you are producing FastBit binary files, you can produce .msk files as follows. Use unsigned 32-bit words, using only the lower 31 bits of each word and leaving the most significant bit as 0. Record a valid value as 1 and a NULL value as 0. Place the bits from the more significant position to the less significant position, so the earliest row in a word sits in the highest of its 31 usable bits.

Each whole word represents the status of 31 rows; any remainder needs another word. Say there are k rows left over: you will need one word to record the value k and another word to record the values of these k bits. In a .msk file, the word recording the value k is the last word, and the k bits are placed in the second-to-last word, stored in its lowest k positions.

Hope this helps.

John

On 8/29/12 8:50 AM, Olson, Andrew wrote:
> I've been converting text files to FastBit partitions in perl and I
> need to be able to create a .msk file because I have some null
> values. What is the format of the .msk file? Is it WAH
> compressed? If so, does FastBit replace an uncompressed .msk file
> automatically? If not, can ardea produce this for me?
>
> Andrew

_______________________________________________
FastBit-users mailing list
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
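
For reference, the .msk word layout described in the reply can be sketched in Python roughly as follows. This is only an illustration of that description, not FastBit code: the function names `encode_mask` and `write_mask` are made up for this example, the little-endian byte order is an assumption (the reply does not say which byte order FastBit expects), and so is writing a trailing count word of 0 when the row count is an exact multiple of 31.

```python
import struct

BITS_PER_WORD = 31  # each 32-bit word carries 31 rows; the top bit stays 0


def pack_bits(flags):
    """Pack flags into an integer, first flag in the most significant position."""
    word = 0
    for flag in flags:
        word = (word << 1) | (1 if flag else 0)
    return word


def encode_mask(valid):
    """Turn per-row flags (True = valid, False = NULL) into a list of words.

    Hypothetical helper: follows the layout described in the reply.
    """
    k = len(valid) % BITS_PER_WORD          # rows left over after whole words
    full = len(valid) - k
    words = [pack_bits(valid[i:i + BITS_PER_WORD])
             for i in range(0, full, BITS_PER_WORD)]
    if k > 0:
        # The k leftover bits occupy the lowest k positions of this word.
        words.append(pack_bits(valid[full:]))
    # Last word records k. Writing it when k == 0 is an assumption.
    words.append(k)
    return words


def write_mask(path, valid, byte_order="<"):
    """Write the words as unsigned 32-bit integers (byte order is an assumption)."""
    words = encode_mask(valid)
    with open(path, "wb") as f:
        f.write(struct.pack(f"{byte_order}{len(words)}I", *words))


# 34 rows: 31 valid rows, then valid, NULL, valid.
rows = [True] * 31 + [True, False, True]
print([hex(w) for w in encode_mask(rows)])  # ['0x7fffffff', '0x5', '0x3']
```

With 34 rows, the first word holds rows 1 to 31, the second word holds the 3 leftover bits in its lowest positions, and the last word holds the count 3. It would be worth checking a file produced this way against one generated by ardea before relying on it.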
