Hello, my name is Yoshihiro TANAKA.

I'm interested in GSoC and the MetaDataBase project.

I would like to ask about the file format for the MetaDataBase (SIDB).
For forward compatibility, Wget should be able to ignore items
it does not recognize. To do that, Wget has to know which data belongs to
which item.
So how about CSV, with "|" as the delimiter?

It would look like this:

<-----------------------------------------------------
first  line:Wget Start at MMSSMMHH-DDMMYYYY
second line:SIDB Version:1.13
third  line:Wget invocation configuration
fourth line:titleline:URL|StatusCode|Filepath|MIME-Type|......
fifth  line, and below:data lines bra|bra|bra|bra|bra|bra|...
        data lines bra|bra|bra|bra|bra|bra|...
        data lines bra|bra|bra|bra|bra|bra|...
        data lines bra|bra|bra|bra|bra|bra|...
        data lines bra|bra|bra|bra|bra|bra|...
        data lines bra|bra|bra|bra|bra|bra|...
last line:Wget End at MMSSMMHH-DDMMYYYY
------------------------------------------------------->

The advantages of this format are:
1. Wget can recognize the start and end of a session.
2. Wget can recognize which data belongs to which item.
   (The title line carries this configuration information.)
3. Wget can recognize the version of the SIDB file.
   (It does not have to be the same as Wget's version.)
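
To make point 2 concrete, here is a rough sketch of how a reader could use
the title line to map each field to its item and drop the items it does not
know. It is in Python only for brevity (Wget itself is written in C), and it
is not a proposed implementation. The names KNOWN_ITEMS and
read_sidb_records are made up for this example, and I assume the title line
holds just the item names separated by "|", without the "fourth line:"
labels used in the illustration above.

```python
import csv

# Hypothetical: the item names this particular Wget build understands.
KNOWN_ITEMS = {"URL", "StatusCode", "Filepath"}

def read_sidb_records(text, known_items=KNOWN_ITEMS):
    """Return one dict per data line, keeping only recognized items."""
    lines = text.splitlines()
    # Assumed layout: line 1 start time, line 2 SIDB version,
    # line 3 invocation, line 4 title line, last line end time.
    header = next(csv.reader([lines[3]], delimiter="|"))
    records = []
    for row in csv.reader(lines[4:-1], delimiter="|"):
        if not row:
            continue  # skip blank lines
        # Pair each field with its item name; unknown items are ignored.
        records.append({name: value
                        for name, value in zip(header, row)
                        if name in known_items})
    return records
```

One open point with any delimiter: a value that itself contains "|" would
need quoting or escaping when the file is written.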

Case 1: When an older Wget reads a newer version of the SIDB file,
        it reads only the items it recognizes.

Case 2: When a newer Wget wants to use an old-version SIDB file,
        it can check the file's version and handle it accordingly.

Case 3: When a new Wget wants to use a new-version SIDB file as an
        old-version SIDB file, the version can be specified like:
        # Wget -VSIDB 1.12
        which means that even if the SIDB file version is 1.13, Wget
        treats it as a version 1.12 file.
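
The three cases could be reduced to one decision: which version to interpret
the file as. Below is a minimal sketch of that decision, again in Python for
brevity. The function names are hypothetical, the version line is assumed to
look like "SIDB Version:1.13", and capping the result at the version Wget
supports is my own assumption about how the cases would combine.

```python
def parse_version(text):
    """Turn "1.13" into (1, 13) so that 1.13 compares greater than 1.9."""
    return tuple(int(part) for part in text.split("."))

def read_file_version(version_line):
    """Extract the version from a line like "SIDB Version:1.13"."""
    prefix = "SIDB Version:"
    if not version_line.startswith(prefix):
        raise ValueError("not a SIDB version line: %r" % version_line)
    return parse_version(version_line[len(prefix):].strip())

def effective_version(version_line, supported, override=None):
    """Pick the version to interpret the file as.

    supported: highest SIDB version this Wget understands, e.g. (1, 13).
    override:  version string given by the user (Case 3), or None.
    """
    if override is not None:
        chosen = parse_version(override)             # Case 3
    else:
        chosen = read_file_version(version_line)     # Case 2
    # Case 1: never interpret the file as newer than Wget supports.
    return min(chosen, supported)
```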


Please comment on this file format.
Thank you for your time.
-- 
Yoshihiro TANAKA
SFSU CS Department
