I guess you are looking for a command-line tool that lets you specify the
names and types of the data files and then creates a file named
'-part.txt'. Am I right?
Please take a look at the command-line arguments used by ardea below.
We could make it write the file -part.txt if you don't specify any
data. Would that suit your needs?
John
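Until ardea can do this, a short script may be enough. Below is a minimal
Python sketch, not part of FastBit, that writes '-part.txt' for a directory
of existing binary column files. It assumes each file is named after its
column and holds fixed-width values in the machine's native layout. The
keywords in the output (BEGIN HEADER, Name, Number_of_columns,
Number_of_rows, Begin Column, name, data_type) and the upper-case type names
are recalled from the FastBit data loading documentation and should be
checked against that page or against a '-part.txt' written by ardea. The
function names, the default data set name "mydata", and the demo columns
"a" and "b" are all hypothetical.

```python
import os
import struct
import tempfile

# Bytes per value for fixed-width column types. The upper-case type names
# are an assumption about what FastBit expects in -part.txt.
TYPE_SIZES = {"BYTE": 1, "SHORT": 2, "INT": 4, "LONG": 8, "FLOAT": 4, "DOUBLE": 8}


def render_part_txt(data_dir, column_types, name="mydata"):
    """Return -part.txt text for the binary column files in data_dir.

    column_types maps each column (file) name to one of TYPE_SIZES.
    """
    row_counts = set()
    for col, ctype in column_types.items():
        size = os.path.getsize(os.path.join(data_dir, col))
        width = TYPE_SIZES[ctype]
        if size % width:
            raise ValueError(f"{col}: size {size} is not a multiple of {width}")
        row_counts.add(size // width)
    if len(row_counts) != 1:
        raise ValueError(f"columns disagree on row count: {sorted(row_counts)}")

    # Header keywords are assumed, see the note above the code.
    lines = [
        "BEGIN HEADER",
        f'Name = "{name}"',
        f"Number_of_columns = {len(column_types)}",
        f"Number_of_rows = {row_counts.pop()}",
        "END HEADER",
    ]
    for col, ctype in column_types.items():
        lines += ["", "Begin Column", f'name = "{col}"',
                  f'data_type = "{ctype}"', "End Column"]
    return "\n".join(lines) + "\n"


def write_part_txt(data_dir, column_types, name="mydata"):
    """Write the rendered metadata to data_dir/-part.txt."""
    with open(os.path.join(data_dir, "-part.txt"), "w") as f:
        f.write(render_part_txt(data_dir, column_types, name))


# Demo on two made-up columns of three rows each.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "a"), "wb") as f:
        f.write(struct.pack("<3i", 1, 2, 3))
    with open(os.path.join(demo_dir, "b"), "wb") as f:
        f.write(struct.pack("<3d", 0.5, 1.5, 2.5))
    write_part_txt(demo_dir, {"a": "INT", "b": "DOUBLE"})
    with open(os.path.join(demo_dir, "-part.txt")) as f:
        demo_text = f.read()
print(demo_text)
```

The row count is derived from the file sizes, so a column whose size does
not match its declared type, or whose row count differs from the others, is
reported instead of being written into the metadata.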
PS: The help message from ardea
ardea --help
usage:
/Users/john/src/ibis/examples/.libs/ardea [-c conf-file] [-d
directory-to-write-data] [-n name-of-dataset] [-r a-row-in-ASCII] [-t
text-file-to-read] [-sqldump file-to-read] [-b
break/delimiters-in-text-file][-M metadata-file] [-m
name:type[,name:type,...]] [-m max-rows-per-file] [-tag
name-value-pair] [-select clause] [-where clause] [-v[=| ]verbose_level]
Note:
Column name must start with an alphabet and can only contain
alphanumeric values, and max-rows-per-file must start with a decimal digit
This program only recognize the following column types:
byte, short, int, long, float, double, key, and text
It only checks the first character of the types.
For example, one can load the data in tests/test0.csv either one of
the following command lines:
ardea -d somwhere1 -m a:i,b:i,c:i -t tests/test0.csv
ardea -d somwhere2 -m a:i -m b:f -m c:d -t tests/test0.csv
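With a few hundred columns, the -m argument is easier to generate than to
type. The following is a minimal Python sketch that builds an ardea command
line from a list of (name, type) pairs. It uses only the -d, -m and -t
options and the type names shown in the help message above, and relies on
the note that only the first character of a type is checked. The helper
name build_ardea_args, the output directory "outdir", and the column list
are hypothetical.

```python
import shlex

# Type names listed in the ardea help message.
ARDEA_TYPES = {"byte", "short", "int", "long", "float", "double", "key", "text"}


def build_ardea_args(out_dir, columns, text_file):
    """Return an argv list for ardea from (name, type) pairs."""
    specs = []
    for name, ctype in columns:
        if ctype not in ARDEA_TYPES:
            raise ValueError(f"unsupported column type: {ctype!r}")
        # Per the help message: start with a letter, alphanumeric only.
        if not (name[:1].isalpha() and name.isalnum()):
            raise ValueError(f"invalid column name: {name!r}")
        # ardea only looks at the first character of the type.
        specs.append(f"{name}:{ctype[0]}")
    return ["ardea", "-d", out_dir, "-m", ",".join(specs), "-t", text_file]


# Hypothetical column list with the same three columns as the help example.
columns = [("a", "int"), ("b", "int"), ("c", "int")]
argv = build_ardea_args("outdir", columns, "tests/test0.csv")
print(shlex.join(argv))
```

The returned list can be passed directly to subprocess.run, which avoids
shell quoting problems when the directory or file names contain spaces.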
On 7/27/12 4:42 PM, S M Faisal wrote:
> Hi John,
> Thanks so much for your quick answer. Really appreciate it.
>
> To make sure I understand, does it mean I have to create the -part.txt
> file myself (manually)? Do you happen to
> have a script that generates the -part.txt file when the user
> already has the files in binary format? I ask because
> I'm talking about cases where I have a few hundred columns, and it's
> nearly impossible to generate the file manually.
>
> Any help in this regard?
>
> Thanks,
> faisal
>
> On Fri, Jul 27, 2012 at 4:35 PM, K. John Wu wrote:
>
> Hi, Faisal,
>
> Thanks for your interest in FastBit. If you have the data already in
> binary format (for the machine you are running on), then you need to
> put the data files into a directory and place a file named '-part.txt'
> to tell FastBit the data types of the files. Another thing to note is
> that the file names are taken to be the column names (and the case of
> the file name must match the case used in '-part.txt'). A directory
> is considered a partition of a data table, and a data table could have
> any number of data partitions.
>
> The page <http://lbl.gov/~kwu/fastbit/doc/dataLoading.html> has a
> little more detail.
>
> Hope this helps.
>
> John
>
>
> On 7/27/12 4:10 PM, S M Faisal wrote:
> > Hi,
> > I'm new to FastBit. I see that there are programs for preprocessing
> > and formatting data so that FastBit can be used. But what if my data
> > is already in column files in binary format? That is, my data is
> > already stored as one file per column and in binary format. All that
> > is missing is the -part.txt file.
> >
> > How should I proceed?
> >
> > Thanks in advance!
> >
> > --
> > -----------------------------------
> > faisal
> >
> >
> > _______________________________________________
> > FastBit-users mailing list
> > [email protected] <mailto:[email protected]>
> > https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
> >
>
>
>
>
> --
> -----------------------------------
> faisal
>